Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!

Bug 923013

Summary: sys-apps/sandbox: dev-cpp/gtest death tests die with SIGSEGV due to stack overconsumption in sandbox
Product: Portage Development Reporter: Sv. Lockal <lockalsash>
Component: SandboxAssignee: Sandbox Maintainers <sandbox>
Status: RESOLVED FIXED    
Severity: normal CC: eike, ionen, lockalsash, plevine457, proxy-maint, qt, sam, vowstar
Priority: Normal Keywords: PullRequest
Version: unspecified   
Hardware: All   
OS: Linux   
See Also: https://bugs.gentoo.org/show_bug.cgi?id=629620
https://bugs.gentoo.org/show_bug.cgi?id=688162
https://bugs.gentoo.org/show_bug.cgi?id=692464
https://bugs.gentoo.org/show_bug.cgi?id=918664
https://bugs.gentoo.org/show_bug.cgi?id=915695
https://bugs.gentoo.org/show_bug.cgi?id=553918
https://github.com/gentoo/sandbox/pull/26
https://bugs.gentoo.org/show_bug.cgi?id=922886
Whiteboard:
Package list:
Runtime testing required: ---

Description Sv. Lockal 2024-01-27 10:42:54 UTC
Hi, there is still an ongoing issue similar to https://bugs.gentoo.org/629620, https://bugs.gentoo.org/688162 and https://bugs.gentoo.org/692464. Previously it was fixed by patches in gtest, but at this moment no patches are applied to gtest and it fails again.

This issue affects dev-libs/oneDNN-3.3.3 tests, for example.

https://github.com/google/googletest/blob/v1.14.0/googletest/src/gtest-death-test.cc#L1307

On x86-64 gtest still allocates 8192 bytes for `clone`:

static pid_t ExecDeathTestSpawnChild(char* const* argv, int close_fd) {
    const auto stack_size = static_cast<size_t>(getpagesize() * 2);
    ...
    child_pid = clone(&ExecDeathTestChildMain, stack_top, SIGCHLD, &args);


After that small ExecDeathTestChildMain function in gtest calls execv, which is intercepted by libsandbox.so, which allocates 8192 + more bytes multiple times on stack, causing SIGSEGV (instead of expected types of crashes).

Increasing multiplier in gtest to bigger one helps, but I suggest this time to fix it on sandbox level to decrease maintenance burden on gtest and make sandbox more transparent. I'll provide a patch that allows execv_DEFAULT to fit 8192 bytes.
Comment 1 Sv. Lockal 2024-01-27 10:51:58 UTC
PR submitted: https://github.com/gentoo/sandbox/pull/26
Comment 2 Ionen Wolkens gentoo-dev 2024-01-27 15:54:50 UTC
Assume this should allow to drop the patch on qtbase eventually too (this was previously improved upstream but SIGSTKSZ aka 8192 was likewise found to be still too small for some people with our sandbox, so been patching it downstream to a safer 32768).

Amazing that it still worked for "most" people (me included) despite the original value of 4096 though (was never able to reproduce the segfaults making it a pain).
Comment 3 Larry the Git Cow gentoo-dev 2024-01-27 18:04:59 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/sandbox.git/commit/?id=1f7d3654498e17e0a91c83f57e6265e08628d5fe

commit 1f7d3654498e17e0a91c83f57e6265e08628d5fe
Author:     Sv. Lockal <lockalsash@gmail.com>
AuthorDate: 2024-01-27 10:44:55 +0000
Commit:     Mike Gilbert <floppym@gentoo.org>
CommitDate: 2024-01-27 18:04:16 +0000

    Fix SIGSEGV in gtest death tests due to small stack
    
    In https://github.com/google/googletest/blob/v1.14.0/googletest/src/gtest-death-test.cc#L1307
    on x86-64 gtest sallocates 8192 bytes for `clone`:
    
    ```
    static pid_t ExecDeathTestSpawnChild(char* const* argv, int close_fd) {
        const auto stack_size = static_cast<size_t>(getpagesize() * 2);
        ...
        child_pid = clone(&ExecDeathTestChildMain, stack_top, SIGCHLD, &args);
    ```
    
    After that attempt to call execv is intercepted by libsandbox.so, which
    allocates 8192 + more bytes multiple times on stack, causing SIGSEGV
    (instead of expected types of crashes).
    
    This PR moves all allocations for related function to heap, so now
    call path fits `getpagesize() * 2` with large margin.
    
    Bug: https://bugs.gentoo.org/923013
    Closes: https://github.com/gentoo/sandbox/pull/26
    Signed-off-by: Sv. Lockal <lockalsash@gmail.com>
    Signed-off-by: Mike Gilbert <floppym@gentoo.org>

 libsandbox/libsandbox.c | 34 +++++++++++++++++++++++++++++-----
 1 file changed, 29 insertions(+), 5 deletions(-)
Comment 4 Larry the Git Cow gentoo-dev 2024-01-27 18:12:51 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=3831293313552e0635385333e6d51604a02d732c

commit 3831293313552e0635385333e6d51604a02d732c
Author:     Ionen Wolkens <ionen@gentoo.org>
AuthorDate: 2024-01-27 16:04:03 +0000
Commit:     Ionen Wolkens <ionen@gentoo.org>
CommitDate: 2024-01-27 18:10:34 +0000

    dev-qt/qtbase: note the sandbox bug in childstack patch
    
    Not that need to be in a hurry to tentatively drop this (harmless),
    esp. given (unlike gtest) this affects building and not just tests
    and need to wait for most users to be updated after release+stable.
    Just so do not forget entirely.
    
    Bug: https://bugs.gentoo.org/923013
    Signed-off-by: Ionen Wolkens <ionen@gentoo.org>

 dev-qt/qtbase/files/qtbase-6.6.1-forkfd-childstack-size.patch | 4 ++++
 1 file changed, 4 insertions(+)
Comment 5 Sv. Lockal 2024-01-27 18:37:45 UTC
Ionen Wolkens, when I encountered it in gtest/oneDNN (https://bugs.gentoo.org/922886) it was reproducible with ~90% rate. I suppose, as stack area was aligned by 64 with size of 2*4096 bytes, and real stack was larger, but not too much, libsandbox wrote some data over the buffer but in some cases did not cause page fault (randomly).

Note that I did not test my patch against Qt. I added fix specifically for execv, but if Qt uses something else, it won't work (but at a glance it looks the same).
Comment 6 Ionen Wolkens gentoo-dev 2024-01-27 19:06:04 UTC
(In reply to Sv. Lockal from comment #5)
> Note that I did not test my patch against Qt. I added fix specifically for
> execv, but if Qt uses something else, it won't work (but at a glance it
> looks the same).
Yeah, I had a quick look too and as far as I can tell the problematic bit is only used by startProcess() which calls either execv or execve (the EXEC_MY_ENV differences seem harmless for execve).
Comment 7 Larry the Git Cow gentoo-dev 2024-06-27 18:39:35 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=23af26a6cc7fa2537e7f7d5ea4a65c7ad68c5eee

commit 23af26a6cc7fa2537e7f7d5ea4a65c7ad68c5eee
Author:     Mike Gilbert <floppym@gentoo.org>
AuthorDate: 2024-06-27 18:38:15 +0000
Commit:     Mike Gilbert <floppym@gentoo.org>
CommitDate: 2024-06-27 18:39:32 +0000

    sys-apps/sandbox: add 2.39
    
    Closes: https://bugs.gentoo.org/923013
    Closes: https://bugs.gentoo.org/921581
    Signed-off-by: Mike Gilbert <floppym@gentoo.org>

 sys-apps/sandbox/Manifest            |  1 +
 sys-apps/sandbox/sandbox-2.39.ebuild | 64 ++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)