asm, Linux: gcc output with and without -O2

Question:

There is a function:

int foo(int num) {
    if (num)
        return 1;
    else
        return 3;
}

I understand the output without optimizations:

foo(int):
  pushq %rbp
  movq %rsp, %rbp
  movl %edi, -4(%rbp)
  cmpl $0, -4(%rbp)
  je .L2
  movl $1, %eax
  jmp .L3
.L2:
  movl $3, %eax
.L3:
  popq %rbp
  ret

But it is not at all clear what is happening with -O2:

foo(int):
  cmpl $1, %edi
  sbbl %eax, %eax
  andl $2, %eax
  addl $1, %eax
  ret

Why use SuBtract with Borrow (sbb) here at all? Besides, if I replace return 3 with return 2, the generated code changes in a way that looks just as strange to me:

foo(int):
  xorl %eax, %eax
  testl %edi, %edi
  sete %al
  addl $1, %eax
  ret

Could someone explain a little what the optimizer is doing here? I just can't work it out.

Answer:

The compiler is not required to generate code that is easy for a human to read. On the other hand, if you work through the calculations on paper for each kind of input, you will see that the logic of the function is exactly the same:

foo(int):
  cmpl $1, %edi    # sets CF if %edi < 1 as unsigned, i.e. only if %edi == 0
  sbbl %eax, %eax  # %eax = CF ? 0xFFFFFFFF : 0
  andl $2, %eax    # %eax &= 2, so depending on CF: %eax == {2|0}
  addl $1, %eax    # %eax += 1, so %eax == {3|1}
  ret

The second case is simpler; it can be rewritten as approximate C pseudocode:

foo(int):
  xorl %eax, %eax   # int rv=0;
  testl %edi, %edi  # if(num==0)
  sete %al          #   rv = 1;
  addl $1, %eax     # rv++;
  ret               # return rv;
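
The same pseudocode can be written as compilable C and compared with the source variant that returns 2 instead of 3. Again a sketch under assumptions: foo2_ref, foo2_setcc and check_foo2_setcc are illustrative names, and the zero flag is modelled as a plain int.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Variant from the question with "return 3" replaced by "return 2". */
static int foo2_ref(int num) {
    if (num)
        return 1;
    else
        return 2;
}

/* Hypothetical model of the xor / test / sete / add sequence. */
static int foo2_setcc(int num) {
    int rv = 0;           /* xorl %eax, %eax  : clear the whole register */
    int zf = (num == 0);  /* testl %edi, %edi : ZF = 1 only when num == 0 */
    rv = zf;              /* sete %al         : low byte = ZF, upper bytes stay 0 */
    rv += 1;              /* addl $1, %eax    : 2 or 1 */
    return rv;
}

/* Compares the model with the reference on a few representative inputs. */
static void check_foo2_setcc(void) {
    const int samples[] = {0, 1, -1, 7, INT_MAX, INT_MIN};
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(foo2_setcc(samples[i]) == foo2_ref(samples[i]));
}
```

In other words, this variant amounts to (num == 0) + 1, which is why no sbb trick is needed: the two return values differ by exactly one, the same amount sete contributes.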

The idea of these optimizations is to get rid of conditional branch instructions. On modern (i586+) CPUs, a branch that the branch predictor guesses wrong causes the pipeline to be flushed, which significantly slows down the computation.
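
The mask trick behind the sbb sequence can also be written by hand for any pair of return values. The helper below, select_on_zero, is a hypothetical name introduced only for illustration. It assumes a two's complement target where converting uint32_t back to int32_t wraps (as gcc documents), and whether a given compiler emits a branch for this or for the plain if/else depends on the compiler, its flags and the target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns if_zero when num == 0, otherwise if_nonzero,
   using a mask instead of a conditional in the source. */
static int32_t select_on_zero(int32_t num, int32_t if_zero, int32_t if_nonzero) {
    /* All ones when num == 0, all zeros otherwise (what sbb leaves in %eax). */
    uint32_t mask = 0u - (uint32_t)(num == 0);
    /* Difference between the two results, computed modulo 2^32. */
    uint32_t diff = (uint32_t)if_zero - (uint32_t)if_nonzero;
    /* Adds the difference only when the mask is all ones. */
    return (int32_t)((uint32_t)if_nonzero + (diff & mask));
}

/* The function from the question expressed through the helper. */
static int32_t foo_masked(int32_t num) {
    return select_on_zero(num, 3, 1);
}
```

With if_zero = 3 and if_nonzero = 1 the difference is 2, which is exactly the constant in andl $2, %eax, and the final addition of if_nonzero corresponds to addl $1, %eax.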
