Similarly to bug 346821, I can no longer build chromium in my chroot targeting a netbook with -march=atom. I'll attach a compressed build.log.

Reproducible: Always
Created attachment 295795 [details] build.log from failed emerge
Created attachment 295797 [details] emerge --info output
chromium depends on 'nacl-toolchain-newlib'
'nacl-toolchain-newlib' depends on gcc-4.4
gcc-4.4 doesn't support -march=atom.

http://gcc.gnu.org/onlinedocs/gcc-4.4.3/gcc/i386-and-x86_002d64-Options.html
http://gcc.gnu.org/onlinedocs/gcc/i386-and-x86_002d64-Options.html

You could use the following flags:
-march=core2 -mtune=generic -m80387 -mfp-ret-in-387
I'm a bit confused. Can it not be compiled with a gcc newer than 4.4? The ebuild certainly allowed it. If that is the case, can I set up per-package CFLAGS for trouble packages, in a similar way to what is described here: http://en.gentoo-wiki.com/wiki/Portage_TMPDIR_on_tmpfs#By_portage_configuration ? (As I mentioned, xulrunner-based projects currently fail on atom as well.)
From the build log:

tools/ld_bfd/ld: exec /usr/lib/gcc/x86_64-pc-linux-gnu/4.5.3/../../../../x86_64-pc-linux-gnu/bin/ld.bfd -m elf_x86_64 --build-id -static --script=nacl/nacl_helper_bootstrap_linux.x -o /var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release/nacl_helper_bootstrap_raw -z max-page-size=0x1000 --whole-archive /var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release/obj/gen/chrome/libnacl_helper_bootstrap_lib.a --no-whole-archive
make: *** [out/Release/nacl_helper_bootstrap_raw] Error 252
make: *** Waiting for unfinished jobs....
 * ERROR: www-client/chromium-16.0.912.63 failed (compile phase):
 *   emake failed

Could you confirm that /usr/lib/gcc/x86_64-pc-linux-gnu/4.5.3/../../../../x86_64-pc-linux-gnu/bin/ld.bfd exists on your system?

Have you tried with different CFLAGS? (I don't think -march=atom is likely to cause this failure, but we should confirm)

Have you tried 17.x builds?
/usr/lib/gcc/x86_64-pc-linux-gnu/4.5.3/../../../../x86_64-pc-linux-gnu/bin/ld.bfd does indeed exist on the build machine.

I just tried switching to -march=core2 and recompiling v8, nacl, and chromium: same result. Then I unmasked 17.x (as well as v8-3.7.6 and nacl-toolchain-newlib-0_p7311) and tried emerging (again with core2, not atom), and it failed in the same way.

Again, I'm trying to build within a chroot; all my other machines (compiling for themselves) updated just fine.
(In reply to comment #6)
> Again, I'm trying to build with in a chroot, all my other machines (compiling
> for themselves) updated just fine.

Could you post what are the differences between the chroot and machines on which chromium emerged successfully?
Sure, what type of things are you interested in? emerge --info of course, anything else?
Created attachment 296369 [details] emerge --info from successful build machine (non-chroot)
(In reply to comment #8)
> Sure, what type of things are you interested in? emerge --info of course,
> anything else?

I tried to find a minimal set of differences between the emerge --infos, and here's what I came up with:

The GOOD system, on which there is no compile failure:

Portage 2.1.10.41 (default/linux/amd64/10.0/desktop/kde, gcc-4.5.3, glibc-2.13-r4, 3.0.6-gentoo x86_64)
System uname: Linux-3.0.6-gentoo-x86_64-Intel-R-_Core-TM-2_Quad_CPU_Q9550_@_2.83GHz-with-gentoo-2.0.3
Timestamp of tree: Mon, 19 Dec 2011 13:00:01 +0000
distcc 3.1 x86_64-pc-linux-gnu [enabled]
sys-devel/autoconf: 2.13, 2.68
sys-devel/automake: 1.9.6-r3, 1.10.3, 1.11.1
sys-devel/gcc: 4.5.3-r1, 4.6.2
CFLAGS="-O2 -march=core2 -msse4.1 -pipe"
CONFIG_PROTECT=""
CONFIG_PROTECT_MASK="/etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -march=core2 -msse4.1 -pipe"
FEATURES="distcc"
MAKEOPTS="-j15"
USE="cdr cups dvd sse4_1 ssse3 x264"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
INPUT_DEVICES=""
VIDEO_CARDS="nvidia"

The BAD system, on which there is a compile failure:

Portage 2.1.10.11 (default/linux/amd64/10.0/desktop/kde, gcc-4.5.3, glibc-2.13-r4, 3.0.6-gentoo x86_64)
System uname: Linux-3.0.6-gentoo-x86_64-Intel-R-_Xeon-R-_CPU_X5650_@_2.67GHz-with-gentoo-2.0.3
Timestamp of tree: Wed, 14 Dec 2011 11:30:01 +0000
sys-devel/autoconf: 2.68
sys-devel/automake: 1.11.1
sys-devel/gcc: 4.5.3-r1
CFLAGS="-O2 -march=atom -pipe"
CONFIG_PROTECT="/usr/share/openvpn/easy-rsa"
CONFIG_PROTECT_MASK=""
CXXFLAGS="-O2 -march=atom -pipe"
FEATURES="buildpkg fixpackages"
MAKEOPTS="-j13"
USE="laptop wifi"
ALSA_CARDS="hda_intel"
INPUT_DEVICES="synaptics"
VIDEO_CARDS="intel"

I have no obvious conclusions from the above. Some random things you could try:

1. What happens when you disable distcc on the working system? Maybe you were just lucky and the critical parts got compiled on a different system with different settings.

2. One of the different USE flags is cups. It's in chromium's IUSE, and it shouldn't make a difference here, but could you see what happens if you turn it off on the GOOD system and on on the BAD system?

3. We have another mysterious NaCl-related build failure, bug #394645. Could you see the emerge --info from there and look for anything in common?

4. Run "equery files nacl-toolchain-newlib" on both systems and post the diff (sort -u if needed).
Can you try running the failing command line directly?
From the first build log:

export LD_LIBRARY_PATH=/var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release/lib.host:/var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release/lib.target:$LD_LIBRARY_PATH; cd chrome; mkdir -p /var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release; ../tools/ld_bfd/ld -m elf_x86_64 --build-id -static "--script=nacl/nacl_helper_bootstrap_linux.x" -o "/var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release/nacl_helper_bootstrap_raw" -z "max-page-size=0x1000" --whole-archive "/var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release/obj/gen/chrome/libnacl_helper_bootstrap_lib.a" --no-whole-archive

If that reproduces the problem, then next try the exact command printed after "tools/ld_bfd/ld: exec ". My best guess so far is that either there is a problem running the linker or that the linker itself crashes. But it's hard to tell exactly what the failure is through all the build-system layers.
(In reply to comment #10)
> I have no obvious conclusions from the above. Some random things you could try:
>
> 1. What happens when you disable distcc on the working system? Maybe you were
> just lucky and the critical parts got compiled on a different system with
> different settings.

No difference. I have yet a 3rd machine which builds chromium successfully without distcc. Just to be sure, I tried on the exact system that I gave emerge --info from, and it still works.

> 2. One of the different USE flags is cups. It's in chromium's IUSE, and it
> shouldn't make a difference here, but could you see what happens if you turn it
> off on the GOOD system and on on the BAD system?

No difference.

> 3. We have another mysterious NaCl-related build failure, bug #394645 . Could
> you see the emerge --info from there and look for anything in common?

Hmmm nothing my untrained eye can pick up

> 4. Run "equery files nacl-toolchain-newlib" on both systems and post the diff
> (sort -u if needed).

diffs are the same :(
(In reply to comment #11)

Funny, I was in the middle of manually running make inside of the staged /var/tmp/ directory to see if I could sniff something out.

> Can you try running the failing command line directly?
> From the first build log:
> export
> LD_LIBRARY_PATH=/var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release/lib.host:/var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release/lib.target:$LD_LIBRARY_PATH;
> cd chrome; mkdir -p
> /var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release;
> ../tools/ld_bfd/ld -m elf_x86_64 --build-id -static
> "--script=nacl/nacl_helper_bootstrap_linux.x" -o
> "/var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release/nacl_helper_bootstrap_raw"
> -z "max-page-size=0x1000" --whole-archive
> "/var/tmp/portage/www-client/chromium-16.0.912.63/work/chromium-16.0.912.63/out/Release/obj/gen/chrome/libnacl_helper_bootstrap_lib.a"
> --no-whole-archive

That works just fine; after running it I am in the chrome/ directory (as expected).

>
> If that reproduces the problem, then next try the exact command printed after
> "tools/ld_bfd/ld: exec ". My best guess so far is that either there is a
> problem running the linker or that the linker itself crashes. But it's hard to
> tell exactly what the failure is through all the build-system layers.

When I do this, it not only crashes, but exits my chroot. After setting up a second time, and running inside of a second bash, I see that it exits with "Illegal Instruction", and the return code from the exited bash (echo $?) is 132. I'm not sure if that is bash's own code or the aborted linker's return code.

So does it seem that the cross-compiled linker has code specific to -march=atom that the host the chroot is running on cannot execute?
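On the question of where 132 comes from: by shell convention, a status above 128 usually means the process was terminated by signal number (status - 128). Here 132 - 128 = 4, which is SIGILL on Linux, consistent with the "Illegal Instruction" message. Below is a minimal sketch of decoding such a status; the function name describe_exit_status is made up for illustration, and a program can also exit with a value above 128 on its own, so treat the result as a hint rather than proof.

```shell
# Hypothetical helper: interpret a shell exit status.
# Statuses above 128 conventionally mean "killed by signal (status - 128)".
describe_exit_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l <number> prints the signal name without the SIG prefix
        echo "killed by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}
```

For example, running the failing linker command and then calling describe_exit_status "$?" should report signal 4 if the linker really died from an illegal instruction.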
I don't know exactly what explains the problem, but if the linker is crashing then that certainly has nothing to do with chromium or nacl per se. Either it's a bug in the linker itself or a problem with how it was built. If you run the linker under gdb, you should be able to see where it's going wrong. (Or you could "ulimit -c unlimited" before running it and then examine the core file left behind.)
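The core-file route suggested above could look roughly like the following sketch. The function name inspect_core and the core file location are assumptions (where the kernel writes core files depends on the system's core_pattern setting); the gdb options used (-batch, -ex) are standard.

```shell
# Hypothetical helper: print a backtrace, the registers and the faulting
# instruction from a core file. Usage: inspect_core /path/to/program /path/to/core
inspect_core() {
    prog=$1
    core=$2
    if [ ! -r "$core" ]; then
        echo "no readable core file at $core" >&2
        return 1
    fi
    # -batch runs the given commands and exits instead of starting an interactive session
    gdb -batch -ex 'bt' -ex 'info registers' -ex 'x/i $pc' "$prog" "$core"
}

# Typical sequence (run by hand, in the shell that will start the crashing program):
#   ulimit -c unlimited        # allow core files of any size
#   <the failing ld.bfd command>
#   inspect_core /path/to/ld.bfd ./core
```

The 'x/i $pc' command shows only the instruction at the crash address, which is usually enough to spot an unsupported opcode.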
(In reply to comment #14)

Ok so I recompiled binutils (package containing ld.bfd) with -ggdb and splitdebug etc. and got a core dump. The backtrace is:

#0  sha1_process_block (buffer=0x7fff877ab2f0, len=<optimized out>, ctx=0x7fff877ab3d0) at /usr/src/debug/sys-devel/binutils-2.21.1-r1/binutils-2.21.1/libiberty/sha1.c:319
#1  0x000000000042c0e8 in sha1_process_bytes (buffer=<optimized out>, len=64, ctx=0x7fff877ab3d0) at /usr/src/debug/sys-devel/binutils-2.21.1-r1/binutils-2.21.1/libiberty/sha1.c:245
#2  0x00007f12bf386d12 in bfd_elf64_checksum_contents (abfd=0x69a4b0, process=0x42c040 <sha1_process_bytes>, arg=0x7fff877ab3d0) at /usr/src/debug/sys-devel/binutils-2.21.1-r1/binutils-2.21.1/bfd/elfcode.h:1136
#3  0x00000000004230c7 in gldelf_x86_64_write_build_id_section (abfd=0x69a4b0) at eelf_x86_64.c:921
#4  0x00007f12bf38ece0 in _bfd_elf_write_object_contents (abfd=0x69a4b0) at /usr/src/debug/sys-devel/binutils-2.21.1-r1/binutils-2.21.1/bfd/elf.c:5244
#5  0x00007f12bf36ba39 in bfd_close (abfd=0x69a4b0) at /usr/src/debug/sys-devel/binutils-2.21.1-r1/binutils-2.21.1/bfd/opncls.c:699
#6  0x000000000041905c in main (argc=13, argv=0x7fff877ab7b8) at /usr/src/debug/sys-devel/binutils-2.21.1-r1/binutils-2.21.1/ld/ldmain.c:502

And the offending line (319):

======== snip =========
314                   + K           \
315                   + M;          \
316         B = rol( B, 30 );       \
317       } while(0)
318
319   while (words < endp)
320     {
321       sha1_uint32 tm;
322       int t;
323       for (t = 0; t < 16; t++)
======== snip =========

But trying to print the variable words, I get:

(gdb) p words
$4 = <optimized out>
Going to recompile without optimization and see if anything changes
(In reply to comment #15)
> Ok so I recompiled binutils (package containing ld.bfd) with -ggdb and
> splitdebug etc. and got a core dump. The backtrace is:

Please show "info reg" and "disas $pc" at frame 0.
Compiling without any -O level the offending linker command does not crash.

(gdb) info reg
rax            0x98badcfe          2562383102
rbx            0x7fff877ab2f0      140735466353392
rcx            0xefcdab89          4023233417
rdx            0x0                 0
rsi            0x7fff877ab120      140735466352928
rdi            0x7fff877ab2f0      140735466353392
rbp            0x7fff877ab3d0      0x7fff877ab3d0
rsp            0x7fff877ab128      0x7fff877ab128
r8             0x67452301          1732584193
r9             0x98badcfe          2562383102
r10            0x0                 0
r11            0x246               582
r12            0x7fff877ab330      140735466353456
r13            0x10325476          271733878
r14            0xc3d2e1f0          3285377520
r15            0x7fff877ab2f0      140735466353392
rip            0x42ac38            0x42ac38 <sha1_process_block+136>
eflags         0x10246             [ PF ZF IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

and

Dump of assembler code for function sha1_process_block:
   0x000000000042abb0 <+0>:     push %r15
   0x000000000042abb2 <+2>:     mov %rsi,%rax
   0x000000000042abb5 <+5>:     push %r14
   0x000000000042abb7 <+7>:     and $0xfffffffffffffffc,%rax
   0x000000000042abbb <+11>:    push %r13
   0x000000000042abbd <+13>:    add %rdi,%rax
   0x000000000042abc0 <+16>:    push %r12
   0x000000000042abc2 <+18>:    push %rbp
   0x000000000042abc3 <+19>:    push %rbx
   0x000000000042abc4 <+20>:    lea -0x40(%rsp),%rsp
   0x000000000042abc9 <+25>:    mov 0x4(%rdx),%ecx
   0x000000000042abcc <+28>:    mov %rdx,-0x30(%rsp)
   0x000000000042abd1 <+33>:    mov -0x30(%rsp),%rbp
   0x000000000042abd6 <+38>:    mov %rax,-0x18(%rsp)
   0x000000000042abdb <+43>:    mov (%rdx),%eax
   0x000000000042abdd <+45>:    mov %eax,-0x1c(%rsp)
   0x000000000042abe1 <+49>:    mov %eax,%ebx
   0x000000000042abe3 <+51>:    mov 0xc(%rdx),%r13d
   0x000000000042abe7 <+55>:    mov 0x8(%rdx),%eax
   0x000000000042abea <+58>:    mov 0x10(%rdx),%r14d
   0x000000000042abee <+62>:    mov %esi,%edx
   0x000000000042abf0 <+64>:    add 0x14(%rbp),%edx
   0x000000000042abf3 <+67>:    mov %rdi,-0x28(%rsp)
   0x000000000042abf8 <+72>:    mov %edx,0x14(%rbp)
   0x000000000042abfb <+75>:    mov %edx,%edx
   0x000000000042abfd <+77>:    cmp %rdx,%rsi
   0x000000000042ac00 <+80>:    jbe 0x42ac06 <sha1_process_block+86>
   0x000000000042ac02 <+82>:    addl $0x1,0x18(%rbp)
   0x000000000042ac06 <+86>:    cmp -0x18(%rsp),%rdi
   0x000000000042ac0b <+91>:    lea -0x8(%rsp),%r8
   0x000000000042ac10 <+96>:    mov %eax,%r9d
   0x000000000042ac13 <+99>:    mov %r8,-0x10(%rsp)
   0x000000000042ac18 <+104>:   mov %ebx,%r8d
   0x000000000042ac1b <+107>:   jae 0x42c02d <sha1_process_block+5245>
   0x000000000042ac21 <+113>:   nopl 0x0(%rax)
   0x000000000042ac28 <+120>:   xor %edx,%edx
   0x000000000042ac2a <+122>:   mov -0x28(%rsp),%rbx
   0x000000000042ac2f <+127>:   mov -0x10(%rsp),%rsi
   0x000000000042ac34 <+132>:   nopl 0x0(%rax)
=> 0x000000000042ac38 <+136>:   movbe (%rbx,%rdx,1),%eax
   0x000000000042ac3d <+141>:   mov %eax,(%rsi,%rdx,1)
   0x000000000042ac40 <+144>:   lea 0x4(%rdx),%rdx
   0x000000000042ac44 <+148>:   cmp $0x40,%rdx
   0x000000000042ac48 <+152>:   jne 0x42ac38 <sha1_process_block+136>
   0x000000000042ac4a <+154>:   mov -0x8(%rsp),%ebp
   0x000000000042ac4e <+158>:   mov %r8d,%eax
   0x000000000042ac51 <+161>:   rol $0x5,%eax
   0x000000000042ac54 <+164>:   mov -0x4(%rsp),%r12d
   0x000000000042ac59 <+169>:   mov (%rsp),%esi
   0x000000000042ac5c <+172>:   mov 0x4(%rsp),%ebx
   0x000000000042ac60 <+176>:   addq $0x40,-0x28(%rsp)
   0x000000000042ac66 <+182>:   mov 0xc(%rsp),%r11d
   0x000000000042ac6b <+187>:   lea 0x5a827999(%r14,%rbp,1),%edi
   0x000000000042ac73 <+195>:   add %eax,%edi
   0x000000000042ac75 <+197>:   mov %r13d,%eax
   0x000000000042ac78 <+200>:   lea 0x5a827999(%r13,%r12,1),%r10d
   0x000000000042ac80 <+208>:   xor %r9d,%eax
   0x000000000042ac83 <+211>:   mov 0x14(%rsp),%r14d
   0x000000000042ac88 <+216>:   and %ecx,%eax
   0x000000000042ac8a <+218>:   xor %r13d,%eax
   0x000000000042ac8d <+221>:   mov 0x10(%rsp),%r13d
   0x000000000042ac92 <+226>:   rol $0x1e,%ecx
   0x000000000042ac95 <+229>:   add %eax,%edi
   0x000000000042ac97 <+231>:   mov %ecx,%eax
   0x000000000042ac99 <+233>:   mov %edi,%edx
   0x000000000042ac9b <+235>:   xor %r9d,%eax
   0x000000000042ac9e <+238>:   and %r8d,%eax
   0x000000000042aca1 <+241>:   xor %r9d,%eax
   0x000000000042aca4 <+244>:   lea 0x5a827999(%r9,%rsi,1),%r9d
   0x000000000042acac <+252>:   rol $0x1e,%r8d
   0x000000000042acb0 <+256>:   add %eax,%r10d
"movbe" is an instruction that doesn't exist on all chip variants. I don't know off hand which chips do or don't have it. With a new-enough kernel "grep '^flags' /proc/cpuinfo" will or won't have "movbe" in its list. So indeed it seems likely that you are just using a linker built with compilation options that require a different chip than the one you're running on.
Indeed the host (Xeon 5650) does not have that instruction. What would the proper way to work around this bug be? I would like to build packages for -march=atom, but obviously the ones that need to run on the building host (ld, gcc, python) cannot run if they use the movbe instruction.

(In reply to comment #19)
> "movbe" is an instruction that doesn't exist on all chip variants.
> I don't know off hand which chips do or don't have it.
> With a new-enough kernel "grep '^flags' /proc/cpuinfo" will or won't have
> "movbe" in its list.
>
> So indeed it seems likely that you are just using a linker built with
> compilation options that require a different chip than the one you're running
> on.
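The /proc/cpuinfo check quoted above can be wrapped into a small helper so the same test can be run against the build host and against a copy of the target's cpuinfo. This is only a sketch: the function name has_cpu_flag is invented here, and it assumes the kernel is recent enough to report the flag at all.

```shell
# Hypothetical helper: print "yes" if a CPU feature flag is listed in a
# cpuinfo file, "no" otherwise. The file defaults to /proc/cpuinfo.
has_cpu_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    # Take the first "flags" line, put each word on its own line,
    # then look for an exact whole-line match (so "sse" does not match "sse2").
    if grep -m1 '^flags' "$cpuinfo" | tr ' \t' '\n\n' | grep -qx "$flag"; then
        echo yes
    else
        echo no
    fi
}
```

On the build host, has_cpu_flag movbe would be expected to print "no" given the findings in this bug; passing a saved copy of the netbook's /proc/cpuinfo as the second argument checks the target instead.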
That's a question for Gentoo experts, which I am not at all. Having verified that this actually has nothing to do with chromium or nacl, I'm going to bow out now.
(In reply to comment #21) Well I think a huge "thank you" is in order in any case for helping diagnose!
If I understand this correctly, you will need to install binutils in the chroot with CFLAGS that your host/build system supports. If you want to create a copy that is optimized for your target system, you could use emerge --buildpkgonly to create a binpkg with a different set of CFLAGS. Then, once you transfer the image to your target system, re-install binutils from the binpkg with emerge --usepkg.
(In reply to comment #23)

Well, for the linker I guess I don't care so much if it's "optimized" on the target platform. So that I don't need to keep switching around CFLAGS in /etc/make.conf, I created the file /etc/portage/env/sys-devel/binutils containing the lines:

CFLAGS="-O2 -march=native -pipe"
CXXFLAGS="${CFLAGS}"

and that worked perfectly. It only applies those CFLAGS/CXXFLAGS to that package. I've used a similar trick with FEATURES="splitdebug" for just the glibc package so that valgrind works.
(In reply to comment #24) I would be more conservative and drop -march=native since it may generate instructions not supported on your target system. Also, you probably know this, but you may have other binaries that have instructions that are invalid on your Xeon system. You just haven't run into them yet.
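One rough way to look for other affected binaries is to disassemble them and count occurrences of the instruction in question. The sketch below assumes objdump from binutils is available; the function names count_insn and find_binaries_using are invented for illustration. It is only a heuristic: a match does not prove the instruction is ever executed, since some programs select code paths at run time after checking the CPU.

```shell
# Hypothetical helper: count lines of objdump-style disassembly on stdin
# that contain the given mnemonic as a whole word.
count_insn() {
    grep -cw "$1"
}

# Hypothetical helper: report which of the given files contain the instruction.
# Usage: find_binaries_using movbe /usr/bin/ld.bfd /usr/bin/python ...
find_binaries_using() {
    insn=$1
    shift
    for f in "$@"; do
        # Non-ELF files make objdump fail; the count is then simply 0
        n=$(objdump -d "$f" 2>/dev/null | count_insn "$insn")
        if [ "$n" -gt 0 ]; then
            echo "$f: $n"
        fi
    done
    return 0
}
```

Disassembling a whole system this way is slow, so it is probably best limited to tools that must run on the build host.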
(In reply to comment #25)
>
> I would be more conservative and drop -march=native since it may generate
> instructions not supported on your target system.

Ahh good call, I'll choose something they both support

>
> Also, you probably know this, but you may have other binaries that have
> instructions that are invalid on your Xeon system. You just haven't run into
> them yet.

Yeah I have a similar problem building firefox/thunderbird, and hope this solution (or some variant) will work. As long as the toolchain runs on this host, I'm good :) But some large projects that bootstrap themselves with tools they build internally will fail.
(In reply to comment #20)
> Indeed the host (Xeon 5650) does not have that instruction. What would the
> proper way to workaround this bug be?

There is no bug (more on that below). Thank you Roland for helping with the diagnosis. I very much appreciate that. Now that the problem is understood I can offer more advice, possibly off-bugzilla.

> I would like to build packages for
> march=atom, but obviously ones that need to run on the building host (ld, gcc,
> python) cannot run if they use the movbe instruction

This would be more suitable for a forums.gentoo.org thread (feel free to start one and send me the link). Something that should work is -march=i686 -mtune=atom (maybe there is a higher common subset than i686, you'd have to check that). I wouldn't worry too much about "missing optimizations".
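To check for a higher common subset, one possible starting point is to compare the CPU feature flags reported by the two machines. The sketch below assumes a copy of /proc/cpuinfo has been saved from each machine; the function names cpu_flags and common_cpu_flags are made up. Kernel flag names do not map one-to-one onto GCC -march values or -m options, so the output is an input to choosing flags, not an answer by itself.

```shell
# Hypothetical helper: print the feature flags from a cpuinfo file, one per line, sorted.
cpu_flags() {
    grep -m1 '^flags' "$1" | cut -d: -f2 | tr -s ' ' '\n' | grep -v '^$' | sort -u
}

# Hypothetical helper: print the flags present in both cpuinfo files.
# Usage: common_cpu_flags host-cpuinfo.txt target-cpuinfo.txt
common_cpu_flags() {
    a=$(mktemp)
    b=$(mktemp)
    cpu_flags "$1" > "$a"
    cpu_flags "$2" > "$b"
    # comm -12 keeps only the lines common to both sorted inputs
    comm -12 "$a" "$b"
    rm -f "$a" "$b"
}
```

Any instruction set extension missing from the common list (movbe, in this bug) must not be enabled for packages that have to run on both machines.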
*** Bug 405333 has been marked as a duplicate of this bug. ***
*** Bug 435908 has been marked as a duplicate of this bug. ***
*** Bug 436414 has been marked as a duplicate of this bug. ***