I wrote the following C code:

#define SYS_write 1
#define SYS_exit 60

void _start() {
    static const char str[] = "Hello, world!\n";
    volatile register long rax asm("rax");
    volatile register long rdi asm("rdi");
    volatile register long rsi asm("rsi");
    volatile register long rdx asm("rdx");
    rax = SYS_write;
    rdi = 1;
    rsi = (long)str;
    rdx = sizeof(str);
    /* rax = SYS_write; */
    asm("syscall\n\t");
    rax = SYS_exit;
    rdi = 0;
    asm("syscall\n\t");
}

This gets compiled by 'gcc asm.c -o asm -nostdlib' to:

0000000000001000 <_start>:
    1000:  55                    push   %rbp
    1001:  48 89 e5              mov    %rsp,%rbp
    1004:  b8 01 00 00 00        mov    $0x1,%eax
    1009:  bf 01 00 00 00        mov    $0x1,%edi
    100e:  48 8d 05 eb 0f 00 00  lea    0xfeb(%rip),%rax        # 2000 <str.0>
    1015:  48 89 c6              mov    %rax,%rsi
    1018:  ba 0f 00 00 00        mov    $0xf,%edx
    101d:  0f 05                 syscall
    101f:  b8 3c 00 00 00        mov    $0x3c,%eax
    1024:  bf 00 00 00 00        mov    $0x0,%edi
    1029:  0f 05                 syscall
    102b:  90                    nop
    102c:  5d                    pop    %rbp
    102d:  c3                    ret

For some reason, instead of 'lea 0xfeb(%rip),%rsi', the compiler emits:

    lea    0xfeb(%rip),%rax
    mov    %rax,%rsi

Using an extra mov that I didn't write and overwriting rax, resulting in incorrect code.

If I uncomment /* rax = SYS_write; */, then the code works, but still has an extra mov instruction.

0000000000001000 <_start>:
    1000:  55                    push   %rbp
    1001:  48 89 e5              mov    %rsp,%rbp
    1004:  b8 01 00 00 00        mov    $0x1,%eax
    1009:  bf 01 00 00 00        mov    $0x1,%edi
    100e:  48 8d 05 eb 0f 00 00  lea    0xfeb(%rip),%rax        # 2000 <str.0>
    1015:  48 89 c6              mov    %rax,%rsi
    1018:  ba 0f 00 00 00        mov    $0xf,%edx
    101d:  b8 01 00 00 00        mov    $0x1,%eax
    1022:  0f 05                 syscall
    1024:  b8 3c 00 00 00        mov    $0x3c,%eax
    1029:  bf 00 00 00 00        mov    $0x0,%edi
    102e:  0f 05                 syscall
    1030:  90                    nop
    1031:  5d                    pop    %rbp
    1032:  c3                    ret

If I then enable optimizations, with -O or higher, both versions get compiled to:

0000000000001000 <_start>:
    1000:  0f 05                 syscall
    1002:  0f 05                 syscall
    1004:  c3                    ret

Which is wrong. Am I doing something wrong to cause this to happen?
Even if this didn't result in non-working code, why does the compiler lea into rax and mov into rsi instead of directly lea into rsi, which is what I told it to do and would be faster?
This isn't a packaging bug or an issue in a Gentoo package being built by GCC. Please take it to the upstream gcc-help mailing list.
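
For reference while asking upstream: the GCC manual documents local register variables as supported only when they are used as operands of extended asm. A basic asm("syscall") statement has no operands, so the compiler is free to use rax as scratch in between and, with optimization on, to drop the assignments as dead stores. Below is a minimal sketch of the usual alternative for x86-64 Linux, assuming extended asm with register constraints. The helper names raw_syscall3 and say_hello are hypothetical, and the length is sizeof(str) - 1 so that the trailing NUL is not written, which differs from the original report.

```c
/* Sketch for x86-64 Linux, GCC or Clang. Helper names are hypothetical. */
#define SYS_write 1
#define SYS_exit 60

/* Issue a three-argument syscall. The constraints tell the compiler which
 * registers carry the arguments, so it cannot reuse them before the
 * instruction or discard the values when optimizing. */
static inline long raw_syscall3(long nr, long a1, long a2, long a3)
{
    long ret;
    __asm__ volatile ("syscall"
                      : "=a"(ret)                           /* result in rax */
                      : "a"(nr), "D"(a1), "S"(a2), "d"(a3)  /* rax, rdi, rsi, rdx */
                      : "rcx", "r11", "memory");            /* clobbered by syscall */
    return ret;
}

/* Write the greeting to stdout (fd 1) and return the syscall result. */
static long say_hello(void)
{
    static const char str[] = "Hello, world!\n";
    /* sizeof(str) - 1: assumption, skip the terminating NUL byte. */
    return raw_syscall3(SYS_write, 1, (long)str, sizeof(str) - 1);
}
```

In a -nostdlib build, _start could call say_hello() and then raw_syscall3(SYS_exit, 0, 0, 0) instead of returning. Whether the compiler then emits the lea directly into rsi still depends on the optimization level; at -O0 an intermediate register is to be expected.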