Question:
There is a function:
int foo(int num) {
    if (num)
        return 1;
    else
        return 3;
}
I understand the output without optimizations:
foo(int):
pushq %rbp
movq %rsp, %rbp
movl %edi, -4(%rbp)
cmpl $0, -4(%rbp)
je .L2
movl $1, %eax
jmp .L3
.L2:
movl $3, %eax
.L3:
popq %rbp
ret
But it is not at all clear what is happening with -O2:
foo(int):
cmpl $1, %edi
sbbl %eax, %eax
andl $2, %eax
addl $1, %eax
ret
Why use sbb (SuBtract with Borrow) here at all? Besides, if you replace return 3 with return 2, the generated code strangely turns out quite different:
foo(int):
xorl %eax, %eax
testl %edi, %edi
sete %al
addl $1, %eax
ret
Could you clarify a little what is going on with these optimizations? Otherwise I can't make sense of it at all.
Answer:
The compiler is not required to generate readable or easy-to-follow code. However, if you work through the calculations on paper for every input, you will see that the logic of the function is exactly the same:
foo(int):
cmpl $1, %edi # sets CF if %edi < 1 as unsigned (a borrow occurs), i.e. only if %edi==0
sbbl %eax, %eax # %eax = %eax - %eax - CF, i.e. CF ? 0xFFFFFFFF : 0
andl $2, %eax # %eax &= 2, i.e. depending on CF: %eax=={2|0}
addl $1, %eax # %eax += 1, i.e. %eax=={3|1}
ret
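For those who find C easier to follow than assembly, the four instructions above can be modeled step by step. This is only an illustrative sketch: the names foo_ref, foo_sbb and foo_sbb_self_check are made up for this example, and the variable cf merely stands in for the carry flag; it is not what the compiler works with internally.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* The function from the question, kept for comparison. */
int foo_ref(int num) {
    return num ? 1 : 3;
}

/* Hypothetical C model of the -O2 sequence; cf plays the role of CF. */
int foo_sbb(int num) {
    uint32_t edi = (uint32_t)num;
    uint32_t cf  = (edi < 1u);  /* cmpl $1, %edi : borrow only when edi == 0 */
    uint32_t eax = 0u - cf;     /* sbbl %eax, %eax : 0 or 0xFFFFFFFF         */
    eax &= 2u;                  /* andl $2, %eax  : 0 or 2                   */
    eax += 1u;                  /* addl $1, %eax  : 1 or 3                   */
    return (int)eax;
}

/* Compares the model with the original on a handful of inputs. */
void foo_sbb_self_check(void) {
    const int inputs[] = { 0, 1, 2, -1, 42, INT_MIN, INT_MAX };
    for (size_t i = 0; i < sizeof inputs / sizeof inputs[0]; ++i)
        assert(foo_sbb(inputs[i]) == foo_ref(inputs[i]));
}
```

Calling foo_sbb_self_check() from main should pass silently, since both functions return 3 for zero and 1 for everything else.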
The second case is simpler; it can be annotated with approximate C pseudocode:
foo(int):
xorl %eax, %eax # int rv=0;
testl %edi, %edi # if(num==0)
sete %al # rv = 1;
addl $1, %eax # rv++;
ret # return rv;
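The pseudocode in the comments above can also be written out as real C. Again this is just a sketch with made-up names (foo2_ref, foo2_setcc, foo2_self_check); the comparison num == 0 models the testl/sete pair, which yields 1 or 0 without any jump.

```c
#include <assert.h>

/* The "return 2" variant from the question, kept for comparison. */
int foo2_ref(int num) {
    return num ? 1 : 2;
}

/* Hypothetical C model of the xor/test/sete/add sequence. */
int foo2_setcc(int num) {
    int rv = 0;        /* xorl %eax, %eax                                  */
    rv = (num == 0);   /* testl %edi, %edi ; sete %al : 1 if num is zero   */
    rv += 1;           /* addl $1, %eax : 2 if num is zero, otherwise 1    */
    return rv;
}

/* Compares the model with the original on a few inputs. */
void foo2_self_check(void) {
    assert(foo2_setcc(0) == foo2_ref(0));
    assert(foo2_setcc(1) == foo2_ref(1));
    assert(foo2_setcc(-7) == foo2_ref(-7));
}
```

This also shows why changing 3 to 2 changes the code: the two results now differ by exactly 1, so the 0/1 value produced by sete can be used directly and no mask is needed.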
The point of these optimizations is to get rid of conditional branch instructions. On modern (i586+) CPUs, a branch that the branch predictor guesses wrong causes a pipeline flush, which significantly slows down execution.
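The same trick generalizes to choosing between any two constants, which may help to see what the compiler is doing. The following is a minimal sketch under stated assumptions: select_branchless and its parameter names are hypothetical, and whether a given compiler actually turns this C into branch-free machine code depends on the compiler and its flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns if_zero when cond == 0, otherwise if_nonzero,
 * using a mask instead of an if/else. */
uint32_t select_branchless(uint32_t cond, uint32_t if_nonzero, uint32_t if_zero) {
    /* All ones when cond == 0, all zeros otherwise (same role as sbb's result). */
    uint32_t mask = 0u - (uint32_t)(cond == 0);
    /* Add the difference only when the mask is set; unsigned wraparound
     * makes this correct even when if_zero < if_nonzero. */
    return if_nonzero + (mask & (if_zero - if_nonzero));
}

/* A few sanity checks, including the 1/3 pair from the question. */
void select_self_check(void) {
    assert(select_branchless(0u, 1u, 3u) == 3u);
    assert(select_branchless(5u, 1u, 3u) == 1u);
    assert(select_branchless(0u, 10u, 4u) == 4u);
}
```

With if_nonzero = 1 and if_zero = 3 the expression reduces to 1 + (mask & 2), which is exactly the and/add pair in the -O2 output.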