About gcc-compiled x86_64 code and C code optimization


I compiled the following C code:

    typedef struct {
        long x, y, z;
    } foo;

    long bar(foo *f, long i) {
        return f[i].x + f[i].y + f[i].z;
    }

with the command gcc -S -O3 test.c. Here is the bar function in the output:

        .section    __TEXT,__text,regular,pure_instructions
        .globl  _bar
        .align  4, 0x90
    _bar:
    Leh_func_begin1:
        pushq   %rbp
    Ltmp0:
        movq    %rsp, %rbp
    Ltmp1:
        leaq    (%rsi,%rsi,2), %rcx
        movq    8(%rdi,%rcx,8), %rax
        addq    (%rdi,%rcx,8), %rax
        addq    16(%rdi,%rcx,8), %rax
        popq    %rbp
        ret
    Leh_func_end1:

I have a few questions about this assembly code:

  1. What is the purpose of "pushq %rbp", "movq %rsp, %rbp", and "popq %rbp", if neither rbp nor rsp is used in the body of the function?
  2. Why do rsi and rdi automatically contain the arguments to the C function (i and f, respectively) without reading them from the stack?
  3. I tried increasing the size of foo to 88 bytes (11 longs) and the leaq instruction became an imulq. Would it make sense to design structs to have "rounder" sizes to avoid the multiply instructions (in order to optimize array access)? The leaq instruction was replaced with:

    imulq   $88, %rsi, %rcx 

  1. The function is simply building its own stack frame with these instructions. There is nothing unusual about them. You should note, though, that because of this function's small size, it will probably be inlined when used in code. The compiler is still required to produce a "normal" version of the function, though. Also, see what @ouah said in his answer.

  2. This is because that is how the AMD64 ABI specifies that arguments should be passed to functions.

    If the class is INTEGER, the next available register of the sequence %rdi, %rsi, %rdx, %rcx, %r8 and %r9 is used.

    Page 20, AMD64 ABI Draft 0.99.5 – September 3, 2010

  3. This is not directly related to the structure size, but rather to the absolute address that the function has to access. If the size of the structure is 24 bytes, f is the address of the array containing the structures, and i is the index at which the array has to be accessed, then the byte offset of each structure is i*24. Multiplying by 24 in this case is achieved by a combination of lea and SIB addressing. The first lea instruction calculates i*3, then every subsequent instruction uses that i*3 and multiplies it further by 8, therefore accessing the array at the needed absolute byte offset, and then uses immediate displacements to access the individual structure members ((%rdi,%rcx,8), 8(%rdi,%rcx,8), and 16(%rdi,%rcx,8)). If you make the size of the structure 88 bytes, there is no way of doing this swiftly with one lea and this kind of addressing: 88 is 11*8, and a single lea can only scale an index by 1, 2, 4 or 8 and optionally add one register, so it cannot produce i*11. The compiler therefore assumes that a simple imulq will be more efficient at calculating i*88 than a series of shifts, adds, leas or anything else.
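
To make the arithmetic in point 3 concrete, here is a minimal C sketch that spells out the same address computation by hand. The name bar_manual is made up for illustration and is not from the question, and the code assumes sizeof(long) == 8 (so sizeof(foo) == 24), as on typical x86_64 Linux and macOS targets.

```c
#include <assert.h>

typedef struct {
    long x, y, z;
} foo;

/* Hypothetical helper (not from the question): returns the same value as
 * bar(), but with the address arithmetic written out the way the generated
 * assembly does it. Assumes 8-byte long, i.e. a 24-byte struct. */
long bar_manual(foo *f, long i)
{
    assert(sizeof(foo) == 24);            /* assumption stated above */

    const char *base = (const char *)f;   /* f arrives in %rdi */
    long i3 = i + i * 2;                  /* leaq (%rsi,%rsi,2), %rcx */

    /* disp(%rdi,%rcx,8) addresses base + i3*8 + disp, and i3*8 == i*24 */
    long y = *(const long *)(base + i3 * 8 + 8);    /* movq  8(%rdi,%rcx,8) */
    long x = *(const long *)(base + i3 * 8 + 0);    /* addq   (%rdi,%rcx,8) */
    long z = *(const long *)(base + i3 * 8 + 16);   /* addq 16(%rdi,%rcx,8) */

    return y + x + z;
}
```

Calling bar_manual(arr, i) on an array of foo should give the same result as bar(arr, i). With an 88-byte struct the same trick would need i*11 before the final scale by 8, which one lea cannot produce.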

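As a small illustration of the ABI rule quoted in point 2, the sketch below uses a made-up function, sum6, that takes six integer-class arguments. The register named in each comment is what the System V AMD64 ABI prescribes. Compiling it with gcc -S on an x86_64 Linux or macOS target should let you confirm this in the output, although the exact instructions will vary with compiler and version.

```c
#include <assert.h>

/* Hypothetical example (not from the question). Under the System V AMD64
 * ABI, the six integer arguments are passed in registers, in this order: */
long sum6(long a,   /* %rdi */
          long b,   /* %rsi */
          long c,   /* %rdx */
          long d,   /* %rcx */
          long e,   /* %r8  */
          long f)   /* %r9  */
{
    /* The return value goes back to the caller in %rax. */
    return a + b + c + d + e + f;
}

/* Tiny self-check for the helper above. */
void check_sum6(void)
{
    assert(sum6(1, 2, 3, 4, 5, 6) == 21);
}
```

This matches what the question observed for bar: f is the first argument, so it is in %rdi, and i is the second, so it is in %rsi.
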
