Summary: | =www-client/chromium-28.0.1500.{11,20,36,45} - Many pages fail to load, bookmark manager broken; problem with seccomp filter sandbox. | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Chris Smith <chris> |
Component: | Current packages | Assignee: | Chromium Project <chromium> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | gentoo, timo.breitner, ua_gentoo_bugzilla |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: | http://code.google.com/p/chromium/issues/detail?id=255063 | ||
Whiteboard: | ht-wanted | ||
Package list: | | Runtime testing required: | --- |
Attachments: |
Stack trace of deadlocked thread
chromium-gpu-sandbox-r0.patch |
Description
Chris Smith
2013-05-24 20:20:37 UTC
I am running google-chrome-28.0.1500.20 and all the pages you mentioned load and work fine for me. The bookmark manager works as well. Seems like the problems must be limited to chromium.

(In reply to comment #1)
> I am running google-chrome-28.0.1500.20 and all the pages you mentioned load
> and work fine for me. The bookmark manager works as well. Seems like the
> problems must be limited to chromium.

It could be a >=dev-lang/v8-3.18 issue as well as these Chromium ebuilds require it. Don't really know except that they just don't work.

I am running chromium Version 28.0.1500.20 (201172) and all of those pages plus the bookmark manager work fine.

(In reply to Graham Murray from comment #3)
> I am running chromium Version 28.0.1500.20 (201172) and all of those pages
> plus the bookmark manager work fine.

64 bit? nVidia? same toolchain? must be some difference...

Chromium is actually segfaulting:

May 25 22:59:43 sartre kernel: [ 120.665471] Watchdog[2431]: segfault at 0 ip 00007fb2a3b9c16e sp 00007fb28c61b060 error 6 in chrome[7fb2a2e4f000+4855000]
May 25 22:59:55 sartre kernel: [ 132.866345] Watchdog[2465]: segfault at 0 ip 00007f3a0c81616e sp 00007f39f5295060 error 6 in chrome[7f3a0bac9000+4855000]
May 25 23:00:07 sartre kernel: [ 145.014030] Watchdog[2477]: segfault at 0 ip 00007f4ab523516e sp 00007f4a9dcb4060 error 6 in chrome[7f4ab44e8000+4855000]

100% failure when trying to load get.webgl.org. Previous version - 27.x works fine.

If I set CHROMIUM_FLAGS="--disable-seccomp-filter-sandbox" in /etc/chromium/default the browser appears to work. It was a setting I needed some time ago but was able to do without until now. Seems to be some sort of regression.

Same thing for me. Chromium 28.0.1500.20, nvidia-drivers 319.23, kernel 3.9.3 x64. Without --disable-seccomp-filter-sandbox pages seem to be fully loaded but it takes another ~10 seconds before it draws (until that page tab is white).
[14417:14424:0526/091740:ERROR:gpu_watchdog_thread.cc(209)] The GPU process hung. Terminating after 10000 ms.

Same problem here, on nvidia hardware/binary drivers. There is a significant delay before the first webpage is displayed after starting the browser. A lot of Watchdog crashes, as Chris reported. After the initial delay however, the browser seems to work fine. Bookmark manager, Google Maps, everything works.

A sample backtrace from the crash:

#0 0x00007f12d3ae0ae8 in content::GpuWatchdogThread::DeliberatelyTerminateToRecoverFromHang() [clone .part.8] ()
#1 0x00007f12d3b4ccd9 in base::MessageLoop::RunTask(base::PendingTask const&) ()
#2 0x00007f12d3b4e230 in base::MessageLoop::DeferOrRunPendingTask(base::PendingTask const&) ()
#3 0x00007f12d3b4f2b4 in base::MessageLoop::DoDelayedWork(base::TimeTicks*) ()
#4 0x00007f12d3b530f6 in base::MessagePumpDefault::Run(base::MessagePump::Delegate*) ()
#5 0x00007f12d3b52b62 in base::MessageLoop::RunInternal() ()
#6 0x00007f12d3b697a8 in base::RunLoop::Run() ()
#7 0x00007f12d3b4c2e5 in base::MessageLoop::Run() ()
#8 0x00007f12d3b7f341 in base::Thread::ThreadMain() ()
#9 0x00007f12d3b7a0d9 in base::(anonymous namespace)::ThreadFunc(void*) ()
#10 0x00007f12d292af3b in start_thread (arg=0x7f12bd178700) at pthread_create.c:308
#11 0x00007f12c65bb50d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Problem still exists with chromium-28.0.1500.36.

(In reply to Chris Smith from comment #9)
> Problem still exists with chromium-28.0.1500.36.

FYI I've asked upstream sandbox experts for help, https://groups.google.com/a/chromium.org/d/msg/chromium-dev/Ax1d9pw02q0/nKzxY0QqCVYJ

Still waiting for their response. It'd be interesting to check what on your system triggers the problem (it's still likely a bug, just trying to figure out how to repro).

(In reply to Paweł Hajdan, Jr. from comment #10)
> (In reply to Chris Smith from comment #9)
> > Problem still exists with chromium-28.0.1500.36.
>
> FYI I've asked upstream sandbox experts for help,
> https://groups.google.com/a/chromium.org/d/msg/chromium-dev/Ax1d9pw02q0/nKzxY0QqCVYJ
>
> Still waiting for their response. It'd be interesting to check what on your
> system triggers the problem (it's still likely a bug, just trying to figure
> out how to repro).

Wish I knew what it was as well :-) Might be nVidia related. Looks like .45 is available - I'll see if that makes a difference.

I'm currently using chromium-29 on a daily basis, but I don't recall having serious issues with chromium-28.

I'm also using latest nvidia-drivers on a 3.9-series (3.9.2) kernel.

Problem still exists with chromium-28.0.1500.45.

If I don't pass the "--disable-seccomp-filter-sandbox" flag to Chromium it simply is worthless. Works fine with the flag (like 27.x does without it) but I think there's some sort of security issue involved in setting it.

Just want to add that it may not be the total problem but one is certainly a breakage in OpenGL (WebGL?) - basically any page using it refuses to load properly, such as http://get.webgl.org/ .

(In reply to Mike Gilbert from comment #12)
> I'm currently using chromium-29 on a daily basis, but I don't recall having
> serious issues with chromium-28.
>
> I'm also using latest nvidia-drivers on a 3.9-series (3.9.2) kernel.

Just installed chromium-29.0.1521.3 to test and I still have the problem. It also needs the flag passed or pages are slow to load (the webgl page eventually loads but it takes some time). Currently using nvidia-drivers-319.23 and gentoo-sources-3.9.5.

I have to revert to the 28. series as mongodb doesn't build with v8-3.19.

- If you could run Chrome (not Chromium) dev channel, enable crash reports and give a crash ID, it would be very helpful.
- Can you give the content of these two files on your system around the mentioned line numbers?
#10 0x00007f12d292af3b in start_thread (arg=0x7f12bd178700) at pthread_create.c:308
#11 0x00007f12c65bb50d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Alternatively, provide a pointer to your exact glibc source.

(In reply to Julien Tinnes from comment #16)
> - If you could run Chrome (not Chromium) dev channel, enable crash reports
> and give a crash ID, it would be very helpful.
> - Can you give the content of these two files on your system around the
> mentioned line numbers?
>
> #10 0x00007f12d292af3b in start_thread (arg=0x7f12bd178700) at
> pthread_create.c:308
> #11 0x00007f12c65bb50d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:113

Locate doesn't find a file named pthread_create.c or clone.S on my system.

> Alternatively, provide a pointer to your exact glibc source.

Running sys-libs/glibc-2.17 in portage (~amd64).

Not seeing any errors with www-client/google-chrome-29.0.1541.2_alpha207000 outside of these:

=============================
NVIDIA: could not open the device file /dev/nvidia0 (Operation not permitted).
NVIDIA: could not open the device file /dev/nvidia0 (Operation not permitted).
=============================

# ls -al /dev/nvidia0
crw-rw---- 1 root video 195, 0 Jun 18 15:23 /dev/nvidia0

I had read that thread too fast. This is just the watchdog terminating the task. We still don't know what went wrong. If someone can reproduce with Chrome and submit a crash ID it'll be helpful.

Meanwhile, --disable-gpu-sandbox would be a better workaround than --disable-seccomp-filter-sandbox.

(In reply to Julien Tinnes from comment #18)
> Meanwhile, --disable-gpu-sandbox would be a better workaround than
> --disable-seccomp-filter-sandbox.

It'd be useful to get confirmation from someone who can repro this bug whether the above workaround works.

(In reply to Paweł Hajdan, Jr. from comment #19)
> (In reply to Julien Tinnes from comment #18)
> > Meanwhile, --disable-gpu-sandbox would be a better workaround than
> > --disable-seccomp-filter-sandbox.
>
> It'd be useful to get confirmation from someone who can repro this bug
> whether the above workaround works.

Crazy - after installing Chrome to test with I no longer get the error with Chromium ??

I now get with both the:

NVIDIA: could not open the device file /dev/nvidia0 (Operation not permitted).
NVIDIA: could not open the device file /dev/nvidia0 (Operation not permitted).

errors. But webgl is now working in both without disabling sandboxing.

(In reply to Chris Smith from comment #20)
> Crazy - after installing Chrome to test with I no longer get the error with
> Chromium ??

Just wondering - what happens when you reboot and try Chromium without using Chrome before? Or does uninstalling Chrome make the error go away (could be a hardcoded path)? This could

> I now get with both the:
> NVIDIA: could not open the device file /dev/nvidia0 (Operation not
> permitted).
> NVIDIA: could not open the device file /dev/nvidia0 (Operation not
> permitted).

I think these can be ignored.

Created attachment 351832 [details]
Stack trace of deadlocked thread
Hi, after a couple of tests I now think this bug is only triggered when _not_ using tcmalloc, which is the default case on Gentoo due to known issues (#413637). Enabling/unmasking tcmalloc makes the bug disappear, as does LD_PRELOADing libtcmalloc.so from google-perftools.
For an explanation please have a look at the attached stack trace. It shows a (GPU process) thread which, I think, deadlocks during a free() call:
glibc's free() implementation tries to open() a file (/proc/sys/vm/overcommit_memory), and the sandbox traps that call. A "broker process" then matches the file name against a white-list with the help of some STL magic. The std::string that this (implicitly) creates calls operator new internally and hence malloc()....
There is a clear correlation with the nvidia binary drivers; however, this seems to be a coincidence, since the actual free() call happens inside libX11. As far as I can tell, pretty much any call to (glibc's) free() could trigger the deadlock.
Anyway, using tcmalloc (in combination with nvidia drivers) is not a good idea as long as the issue mentioned above is unresolved, so the currently best (i.e. least invasive) workaround indeed seems to be "--disable-gpu-sandbox" (which I can confirm to be working as expected).
On a side note: /dev/nvidia0 is not on the sandbox's white-list, which explains the error messages ("NVIDIA: could not open the device file....").
Hope this helps.
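
The re-entrancy problem described in this comment can be sketched as a toy model. This is not Chromium or glibc code. It assumes the allocator guards its state with a single non-reentrant lock (the comment implies this but does not state it), and the names ToyAllocator and sandbox_trap are hypothetical. The timeout exists only so the demo finishes; a real allocator would block forever at that point.

```python
import threading


class ToyAllocator:
    """Hypothetical stand-in for an allocator guarded by one non-reentrant lock."""

    def __init__(self):
        self._lock = threading.Lock()  # non-reentrant, unlike threading.RLock
        self.log = []

    def malloc(self, timeout=0.2):
        # A real allocator would wait here indefinitely; the timeout only
        # lets this sketch report the would-be deadlock and carry on.
        if not self._lock.acquire(timeout=timeout):
            self.log.append("malloc: lock already held, would deadlock")
            return False
        try:
            self.log.append("malloc: ok")
            return True
        finally:
            self._lock.release()

    def free(self, open_hook):
        with self._lock:
            self.log.append("free: lock taken")
            # Models free() opening a file while the allocator lock is held.
            return open_hook("/proc/sys/vm/overcommit_memory")


def sandbox_trap(allocator, whitelist):
    """Hypothetical trap handler: checking the path needs a fresh allocation."""

    def hook(path):
        # Models the implicit std::string creation, which ends up in malloc().
        allocated = allocator.malloc()
        return allocated and path in whitelist

    return hook


alloc = ToyAllocator()
result = alloc.free(sandbox_trap(alloc, {"/proc/sys/vm/overcommit_memory"}))
```

Here `result` is `False` and `alloc.log` records that malloc() found the lock already held by the same thread. Once free() has returned, a plain `alloc.malloc()` succeeds, which matches the observation that the hang only occurs when the trapped call happens inside the allocator.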
(In reply to Timo Breitner from comment #23)
> For an explanation please have a look at the attached stack trace. It shows
> a (GPU process) thread which, I think, deadlocks during a free() call:
> glibc's free() implementation tries to open() a file
> (/proc/sys/vm/overcommit_memory) which is trapped by the sandbox. A "broker
> process" then matches the file name against a white-list with the help of
> some STL magic. A hereby (implicitly) created std::string internally calls
> new() and hence malloc()....

Thank you for a truly excellent analysis. This helps a lot.

Upstream is now tracking this issue.

Created attachment 352106 [details, diff]
chromium-gpu-sandbox-r0.patch
Please test the following patch (successfully applies against chromium-28.0.1500.52).
This is intended to be a workaround, not a fix.
I have found that using --reduce-gpu-sandbox also "fixes" the issue in my case. I can't find specifics on the differences between this flag and --disable-gpu-sandbox, though. The only information that I found was that the former (--reduce...) makes the GPU sandbox less strict.

I have applied the r0 patch to chromium-28.0.1500.89, and it eliminates the symptoms of the problem. As Paweł mentioned, this is a workaround, but at least it is better than disabling all GPU sandboxing within Chromium. Thank you, Paweł.

+ 17 Jul 2013; Mike Gilbert <floppym@gentoo.org>
+ +files/chromium-bug471198.patch, chromium-28.0.1500.71.ebuild,
+ chromium-28.0.1500.89.ebuild:
+ Apply upstream fix for bug 471198.