Summary: | sys-devel/gcc-4.[45]: building packages with -mavx causes crashes | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Brian <caligatio> |
Component: | Current packages | Assignee: | Gentoo Toolchain Maintainers <toolchain> |
Status: | RESOLVED OBSOLETE | ||
Severity: | normal | CC: | aidanamarks, chemacg, clemente.aguiar, deduktionstheorem, f5d8fd51ed1e804c9e8d0357e8614e0493b06e96, gentoo, lee, marduk, mephinet, shiningarcanine, stardub, thomas.lindroth |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | AMD64 | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- |
Description
Brian
2011-01-23 13:56:32 UTC
check dmesg to see if it is logging an error, also you will need to provide a backtrace.

Nothing is being logged to dmesg. I attempted to get a backtrace with gdb (following the directions at http://www.gentoo.org/proj/en/qa/backtraces.xml) but it says "No Stack." Here's the c&p from GDB:

gdb firefox
GNU gdb (Gentoo 7.2 p1) 7.2
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>...
Reading symbols from /usr/bin/firefox...done.
(gdb) run
Starting program: /usr/bin/firefox
[Thread debugging using libthread_db enabled]
[New Thread 0x7fffe5df6710 (LWP 28403)]
[New Thread 0x7fffe55f5710 (LWP 28404)]
[New Thread 0x7fffe4df4710 (LWP 28405)]
[New Thread 0x7fffe45f3710 (LWP 28406)]
[Thread 0x7fffe45f3710 (LWP 28406) exited]
[Thread 0x7fffe4df4710 (LWP 28405) exited]
[Thread 0x7fffe55f5710 (LWP 28404) exited]
[Thread 0x7fffe5df6710 (LWP 28403) exited]
process 28400 is executing new program: /usr/lib64/firefox/firefox
[Thread debugging using libthread_db enabled]
[New Thread 0x7fffe5bf2710 (LWP 28407)]
[New Thread 0x7fffe53f1710 (LWP 28408)]
[New Thread 0x7fffe4bf0710 (LWP 28409)]
[New Thread 0x7fffe43ef710 (LWP 28410)]
[New Thread 0x7fffe3bee710 (LWP 28411)]
[New Thread 0x7fffe1e34710 (LWP 28412)]
[Thread 0x7fffe43ef710 (LWP 28410) exited]
[Thread 0x7fffe1e34710 (LWP 28412) exited]
[Thread 0x7fffe3bee710 (LWP 28411) exited]
[Thread 0x7fffe4bf0710 (LWP 28409) exited]
[Thread 0x7fffe53f1710 (LWP 28408) exited]
[Thread 0x7fffe5bf2710 (LWP 28407) exited]
Program exited with code 01.
(gdb) set logging file backtrace.log
(gdb) set logging on
Copying output to backtrace.log.
(gdb) bt
No stack.
(gdb) set logging off
Done logging to backtrace.log.
(gdb) quit

I also attempted to run /usr/lib64/firefox/firefox and got the same result. After some Googling, I decided to fiddle with "-profilemanager" and via some combination of deleting .mozilla and -profilemanager, got Firefox to launch. (Un)fortunately it's segfaulting regularly but I did manage to get a backtrace of a segfault:

#0 nsDataDocumentContentPolicy::ShouldLoad (this=<value optimized out>, aContentType=3, aContentLocation=<value optimized out>, aRequestingLocation=0x19d60d0, aRequestingContext=0x1d549b0, aMimeGuess=..., aExtra=0xc, aDecision=0x7ffff5e1ca3b) at nsDataDocumentContentPolicy.cpp:61
#1 0x00007ffff57cfa90 in nsContentPolicy::CheckPolicy (this=0xb4d000, policyMethod=&virtual nsIContentPolicy::ShouldLoad(PRUint32, nsIURI*, nsIURI*, nsISupports*, nsACString_internal const&, nsISupports*, PRInt16*), contentType=<value optimized out>, contentLocation=<value optimized out>, requestingLocation=0x19d60d0, requestingContext=0x1d549b0, mimeType=..., extra=0xc, decision=0x7ffff5e1ca3b) at nsContentPolicy.cpp:157
#2 0x00007ffff57cf7ef in nsContentPolicy::ShouldLoad (this=0x7fffffff6300, contentType=<value optimized out>, contentLocation=<value optimized out>, requestingLocation=<value optimized out>, requestingContext=<value optimized out>, mimeType=<value optimized out>, extra=0xc, decision=0x7ffff5e1ca3b) at nsContentPolicy.cpp:218
#3 0x00007ffff5e106de in NS_InvokeByIndex_P (that=0x7fffffff6300, methodIndex=30755248, paramCount=4125602640, params=0x19d60d0) at xptcinvoke_x86_64_unix.cpp:208
#4 0x00007ffff554e5d4 in XPCWrappedNative::CallMethod (ccx=..., mode=<value optimized out>) at xpcwrappednative.cpp:2722
#5 0x00007ffff5556947 in XPC_WN_CallMethod (cx=0xbb6760, obj=0x1fd89c0, argc=6, argv=0xc4d4c8, vp=<value optimized out>) at xpcwrappednativejsops.cpp:1740
#6 0x00007ffff6931d47 in js_Invoke (cx=0xbb6760, argc=28040096, vp=0xc4d4b8, flags=<value optimized out>) at jsinterp.cpp:1360
#7 0x00007ffff692221c in js_Interpret (cx=0xbb6760) at jsops.cpp:2240
#8 0x00007ffff693224d in js_Invoke (cx=0xbb6760, argc=28040096, vp=0xc4d2f8, flags=<value optimized out>) at jsinterp.cpp:1368
#9 0x00007ffff554b526 in nsXPCWrappedJSClass::CallMethod (this=0xbdc020, wrapper=<value optimized out>, methodIndex=<value optimized out>, info=0x932f20, nativeParams=<value optimized out>) at xpcwrappedjsclass.cpp:1696
#10 0x00007ffff5e112a9 in PrepareAndDispatch (self=0x13c73e0, methodIndex=<value optimized out>, args=0x7fffffff7850, gpregs=0x7fffffff77d0, fpregs=0x7fffffff7800) at xptcstubs_x86_64_linux.cpp:153
#11 0x00007ffff5e1076b in SharedStub () from /usr/lib64/xulrunner-1.9.2/libxul.so
#12 0x00007ffff583d986 in nsEventListenerManager::HandleEventSubType(struct {...} *, nsIDOMEventListener *, nsIDOMEvent *, nsPIDOMEventTarget *, PRUint32) (this=<value optimized out>, aListenerStruct=0x13d0c58, aListener=0x13c73e0, aDOMEvent=0x1c330b0, aCurrentTarget=0xe80810, aPhaseFlags=28040096) at nsEventListenerManager.cpp:1041
#13 0x00007ffff583dd50 in nsEventListenerManager::HandleEvent (this=<value optimized out>, aPresContext=<value optimized out>, aEvent=0x1bc3340, aDOMEvent=<value optimized out>, aCurrentTarget=<value optimized out>, aFlags=<value optimized out>, aEventStatus=0x7fffffff7bb8) at nsEventListenerManager.cpp:1147
#14 0x00007ffff5852b56 in nsEventTargetChainItem::HandleEvent (this=0xc4bc58, aVisitor=..., aFlags=2, aMayHaveNewListenerManagers=0) at nsEventDispatcher.cpp:246
#15 0x00007ffff5852d02 in nsEventTargetChainItem::HandleEventTargetChain (this=<value optimized out>, aVisitor=..., aFlags=6, aCallback=0x0, aMayHaveNewListenerManagers=30755248) at nsEventDispatcher.cpp:332
#16 0x00007ffff585330a in nsEventDispatcher::Dispatch (aTarget=<value optimized out>, aPresContext=<value optimized out>, aEvent=0x1bc3340, aDOMEvent=<value optimized out>, aEventStatus=0x7fffffff7d1c, aCallback=<value optimized out>, aTargets=0x0) at nsEventDispatcher.cpp:573
#17 0x00007ffff58535a6 in nsEventDispatcher::DispatchDOMEvent (aTarget=0x100c4f0, aEvent=<value optimized out>, aDOMEvent=0x1c330b0, aPresContext=0x0, aEventStatus=0x7fffffff7d1c) at nsEventDispatcher.cpp:636
#18 0x00007ffff57d585c in nsContentUtils::DispatchChromeEvent (aDoc=0x1786820, aTarget=<value optimized out>, aEventName=<value optimized out>, aCanBubble=<value optimized out>, aCancelable=<value optimized out>, aDefaultAction=0x0) at nsContentUtils.cpp:3257
#19 0x00007ffff585226f in nsPLDOMEvent::Run (this=0x1d54b40) at nsPLDOMEvent.cpp:62
#20 0x00007ffff5e057c7 in nsThread::ProcessNextEvent (this=0x6c9f30, mayWait=0, result=0x7fffffff7ddc) at nsThread.cpp:527
#21 0x00007ffff5dd9fb5 in NS_ProcessNextEvent_P (thread=0x7fffffff6300, mayWait=30755248) at nsThreadUtils.cpp:250
#22 0x00007ffff5d6cf5e in mozilla::ipc::MessagePump::Run (this=0x6bba70, aDelegate=0x6c7000) at MessagePump.cpp:110
#23 0x00007ffff5dae930 in MessageLoop::Run (this=0x6c7000) at ./src/base/message_loop.cc:173
#24 0x00007ffff5ce9029 in nsBaseAppShell::Run (this=0x7b1120) at nsBaseAppShell.cpp:174
#25 0x00007ffff5bba2d6 in nsAppStartup::Run (this=0x80cdb0) at nsAppStartup.cpp:183
#26 0x00007ffff5524803 in XRE_main (argc=<value optimized out>, argv=<value optimized out>, aAppData=<value optimized out>) at nsAppRunner.cpp:3483
#27 0x0000000000401c07 in main (argc=1, argv=0x7fffffffdf28) at nsXULStub.cpp:583

Additional information provided.

As mentioned in the linked forum thread, a workaround for this problem was found. Compiling xulrunner and firefox/thunderbird without -march=native fixes the problem. It appears to be affecting those with Sandy Bridge processors.

(In reply to comment #5)
> As mentioned in the linked forum thread, a workaround for this problem was found.
> Compiling xulrunner and firefox/thunderbird without -march=native fixes the
> problem. It appears to be affecting those with Sandy Bridge processors.
>
Makes this a toolchain problem.
echo "int main() { return 0; }" | gcc -march=native -v -E - 2>&1 | grep march

(In reply to comment #7)
> echo "int main() { return 0; }" | gcc -march=native -v -E - 2>&1 | grep march
>
/usr/libexec/gcc/x86_64-pc-linux-gnu/4.4.4/cc1 -E -quiet -v - -D_FORTIFY_SOURCE=2 -march=core2 -mcx16 -msahf -maes -mpclmul -mpopcnt -mavx --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=256 -mtune=core2

Can you start from "-O2 -march=core2 -mcx16 -msahf -maes -mpclmul -mpopcnt -mavx", make sure that fails, and then remove flags one by one off the back until it works? (it'll probably be -mavx)

Confirmed on forums it's -mavx. Can someone try gcc-4.4.5 or gcc-4.5.2?

Xulrunner stuff always seems to have issues with stack alignment (eg. bug #270120) and I know I've seen a few gcc bug reports regarding AVX alignment.

(In reply to comment #10)
> Confirmed on forums it's -mavx. Can someone try gcc-4.4.5 or gcc-4.5.2?

Confirming for gcc-4.5.2. I'm using gcc-4.5.2 on a Sandy Bridge system and I have the same problem. Removing -mavx from the CFLAGS helps. BTW: It seems to be enough to rebuild xulrunner without -mavx, I did not have to rebuild firefox or thunderbird.

If xulrunner has a history of problems with stack alignment, can't we just filter the -mavx cflag in the xulrunner ebuild?

stack misalignment on x86 isn't specific to xulrunner. any package using zlib can hit it.

(In reply to comment #13)
> stack misalignment on x86 isn't specific to xulrunner. any package using zlib
> can hit it.
>
Would you recommend a system-wide CFLAGS="-march=native -mno-avx -O2 -pipe" instead of CFLAGS="-march=native -O2 -pipe" as workaround for Sandy Bridge CPUs?

Hit this one today after upgrading from firefox 3.6.15 to 3.6.16. firefox wouldn't start up, even with no profile or safe mode, silently exits. went back to 3.6.15, still broken. I have march=native with gcc 4.5.2 on sandy bridge.
I put just -O2 -pipe in /etc/portage/env for xulrunner and firefox and recompiled and all is back working with my original profile.

Hi

I can not confirm this on ARM:

Yesterday I tried to emerge xulrunner-2.0.1 on my armv7 (cortex-a9) smartbook - Configure failed because of -mno-avx while compiling a conftest.c.
I edited my temp/environment and removed the -mno-avx and ran ebuild .. configure compile... ...success

Test:
# echo "int main(){return 0;}" | gcc -E -mno-avx - > /dev/null
cc1: error: unrecognized command line option "-mno-avx"

# gcc-config -l
[1] armv7a-unknown-linux-gnueabi-4.4.5 *

(In reply to comment #16)
> I can not confirm this on ARM:

Huh? ARM? AVX is an instruction set addition for new Intel x86 CPUs. I think you got confused. :)

(In reply to comment #16)
> Hi
>
> I can not confirm this on ARM:
>
> Yesterday I tried to emerge xulrunner-2.0.1 on my armv7 (cortex-a9) smartbook -
> Configure failed because of -mno-avx while compiling a conftest.c.
> I edited my temp/environment and removed the -mno-avx and ran ebuild ..
> configure compile... ...success
>
> Test:
> # echo "int main(){return 0;}" | gcc -E -mno-avx - > /dev/null
> cc1: error: unrecognized command line option "-mno-avx"
>
> # gcc-config -l
> [1] armv7a-unknown-linux-gnueabi-4.4.5 *

# Ensure we do not fail on i{3,5,7} processors that support -mavx
if use amd64 || use x86; then
    append-flags -mno-avx
fi

Straight from the ebuild, if you're still seeing a problem on arm then your tree is out of sync.
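The ARM failure reported above comes from handing -mno-avx to a compiler that does not know the option. A minimal sketch of probing for a flag before using it, modelled on the manual test in that comment. `cc_accepts_flag` is a hypothetical helper name and is not taken from the ebuild or from any eclass:

```shell
# Hypothetical helper (not part of any eclass): succeed only if the given
# compiler accepts the given flag when preprocessing a trivial program.
cc_accepts_flag() {
    local cc="$1" flag="$2"
    # Same probe as the manual test: preprocess a one-line program from stdin
    # and look only at the exit status.
    "${cc}" -E "${flag}" - >/dev/null 2>&1 <<'EOF'
int main(){return 0;}
EOF
}

# Usage sketch: only add -mno-avx when the compiler understands it.
# if cc_accepts_flag gcc -mno-avx; then CFLAGS="${CFLAGS} -mno-avx"; fi
```

On the armv7 toolchain from the report this probe would be expected to fail for -mno-avx, so the flag would simply be skipped there.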
# Ensure we do not fail on i{3,5,7} processors that support -mavx
if use amd64 || use x86; then
    append-flags -mno-avx
fi

Adding that in xulrunner ebuild may be sufficient for Firefox, but for Thunderbird you also need to add it to its ebuild or it won't start

(In reply to comment #19)
> # Ensure we do not fail on i{3,5,7} processors that support -mavx
> if use amd64 || use x86; then
> append-flags -mno-avx
> fi
>
> Adding that in xulrunner ebuild may be sufficient for Firefox, but for
> Thunderbird you also need to add it to its ebuild or it won't start

Thunderbird-3.3_alpha* already includes it as well.

*** Bug 356397 has been marked as a duplicate of this bug. ***
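The flag-by-flag narrowing requested earlier in this bug (start from the full set, confirm it fails, then drop flags off the back until it works) can be sketched as a loop. `find_breaking_flag` is a hypothetical name, and the check command is a stand-in for whatever rebuild-and-run test is actually used; it receives the remaining flags as its arguments:

```shell
# Hypothetical helper: print the flag whose removal makes the check pass.
#   $1 = command that returns 0 when a build with the given flags works
#   $2 = space-separated flag list, e.g. "-O2 -march=core2 ... -mavx"
# Limitation: two-word options such as "--param x=y" are not kept together.
find_breaking_flag() {
    local check="$1" flags="$2" removed
    # The full set has to fail first, otherwise there is nothing to find.
    if "${check}" ${flags}; then
        return 1
    fi
    while [ -n "${flags}" ]; do
        removed="${flags##* }"              # last flag in the list
        case "${flags}" in
            *" "*) flags="${flags% *}" ;;   # drop it from the back
            *)     flags="" ;;
        esac
        if "${check}" ${flags}; then
            printf '%s\n' "${removed}"
            return 0
        fi
    done
    return 2                                # still failing with no flags at all
}
```

With a check that fails whenever -mavx is present, the function prints -mavx for the flag set quoted in this bug.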
> # Ensure we do not fail on i{3,5,7} processors that support -mavx
> if use amd64 || use x86; then
> append-flags -mno-avx
> fi
>
> Straight from the ebuild, if you're still seeing a problem on arm then your tree
> is out of sync.
If you're going to do that then make it dependent on >=4.4 so you don't break older versions (and preferably only on versions you know are broken).
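A minimal sketch of such a version gate, assuming the GCC version is available as a plain string (for example from `gcc -dumpversion`). The function names are hypothetical, and only the 4.4 and 4.5 series are treated as broken because those are the ones named in this bug:

```shell
# Hypothetical helper: succeed if this GCC version is one of the series
# reported broken with -mavx in this bug (4.4.x and 4.5.x).
needs_no_avx() {
    local ver="$1"
    case "${ver}" in
        4.4|4.4.*|4.5|4.5.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: print the flags, with -mno-avx appended only when the
# compiler version is known to be affected.
filter_avx_flags() {
    local ver="$1" flags="$2"
    if needs_no_avx "${ver}"; then
        printf '%s\n' "${flags} -mno-avx"
    else
        printf '%s\n' "${flags}"
    fi
}

# Usage sketch: CFLAGS=$(filter_avx_flags "$(gcc -dumpversion)" "${CFLAGS}")
```

Older compilers never see the option this way, which addresses the concern about breaking versions that do not recognise -mno-avx.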
*** Bug 373941 has been marked as a duplicate of this bug. ***

Regarding bug 373941 specifically: app-office/libreoffice-3.3.3 builds with "-mno-avx".

This also affects GCC 4.4.6.

By the way, a possible issue with -mno-avx is that -mavx disables SSE optimizations and it is uncertain (until someone examines GCC) that -mno-avx will turn them back on. It is probably better to specify optimizations manually:

"-march=core2 -mcx16 -msahf -maes -mpclmul -msse4.1 -msse4.2 --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=256 -mtune=core2"

You can get GCC to show you what -march=native does with "gcc -march=native -v --help=target". Since -mavx is used, the "-msse4.1 -msse4.2" options will be missing.

Correction, here are the optimizations you want (in addition to -O2 and -pipe):

-march=core2 -mcx16 -msahf -maes -mpclmul -mpopcnt -msse4.1 -msse4.2 --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=256 -mtune=core2

I forgot to include -mpopcnt.

> "-march=core2 -mcx16 -msahf -maes -mpclmul -msse4.1 -msse4.2 --param
> l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=256
> -mtune=core2"

NB: this is processor-specific so people shouldn't blindly copy it. Also there are a few packages that call strip-flags that choke on --param options.

> You can get GCC to show you what -march=native does with "gcc -march=native -v
> --help=target". Since -mavx is used, the "-msse4.1 -msse4.2" options will be
> missing.

Use `echo "" | gcc -march=native -v -E - 2>&1 | grep cc1`. --help=target can miss some options enabled by other options because it's evaluated early.

> By the way, a possible issue with -mno-avx is that -mavx disables SSE
> optimizations and it is uncertain (until someone examines GCC) that -mno-avx
> will turn them back on.

So what does -march=native -mno-avx tell you?

Nothing for mozilla team to do here, readd later if needed.
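To compare what -march=native expands to with and without -mno-avx, the cc1 line printed by `gcc -v -E` (as used earlier in this bug) can be reduced to its target options. A minimal sketch; `extract_target_flags` is a hypothetical helper name, and it only keeps `-m...` options and `--param` pairs:

```shell
# Hypothetical helper: read one cc1 command line on stdin and print only the
# target-related options (-m... and "--param name=value"), one per line.
extract_target_flags() {
    local line word want_param=0
    read -r line || [ -n "${line}" ] || return 0
    for word in ${line}; do                 # unquoted on purpose: split on spaces
        if [ "${want_param}" -eq 1 ]; then
            printf '%s %s\n' --param "${word}"
            want_param=0
            continue
        fi
        case "${word}" in
            --param) want_param=1 ;;        # value follows as the next word
            -m*)     printf '%s\n' "${word}" ;;
        esac
    done
}

# Usage sketch (run once with and once without -mno-avx, then diff the output):
# echo "" | gcc -march=native -v -E - 2>&1 | grep cc1 | extract_target_flags
```

Whether -msse4.1 and -msse4.2 show up in the second listing is exactly the open question raised in the comments above.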
re-open if it's still an issue with gcc-4.9+