Bug 933110 - net-misc/nextcloud-client-3.12.3: segmentation fault when LTO is enabled
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages (show other bugs)
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: Bernard Cafarelli
Blocks: lto
Reported: 2024-05-29 08:22 UTC by S. Martindale
Modified: 2025-04-26 11:18 UTC (History)

Attachments
emerge --info net-misc/nextcloud-client (nextcloud-client-3.12.3-emerge-info.txt, 7.99 KB, text/plain)
2024-05-29 08:22 UTC, S. Martindale

Description S. Martindale 2024-05-29 08:22:18 UTC
Created attachment 894595
emerge --info net-misc/nextcloud-client

With USE="lto", `net-misc/nextcloud-client` builds but fails at runtime with a segmentation fault.

I tested the following versions:

- net-misc/nextcloud-client-3.12.3 (which is currently masked under ~amd64)
- net-misc/nextcloud-client-3.11.1 (currently showing as stable in Portage)

Both exhibited the problem. I could not test 3.13.0 because of #930943.

The runtime segmentation fault no longer occurs if LTO is disabled for the `net-misc/nextcloud-client` package by adding `-Wno-error=odr -Wno-error=lto-type-mismatch -Wno-error=strict-aliasing -fno-lto`, as recommended in the wiki. (Since it is a runtime fault, presumably only the last of those flags, `-fno-lto`, matters.)

The attached `emerge --info` output was taken with the workaround applied. The only difference between the failing configuration and the one shown is the addition of the no-LTO flags above.
Comment 1 Kostadin Shishmanov 2024-06-02 13:30:02 UTC
There is an upstream bug for this but it was closed due to inactivity: https://github.com/nextcloud/desktop/issues/2790

See also: 
https://github.com/nextcloud/desktop/issues/3090
https://github.com/nextcloud/desktop/issues/4924
Comment 2 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-17 16:20:38 UTC
(In reply to Kostadin Shishmanov from comment #1)
> There is an upstream bug for this but it was closed due to inactivity:
> https://github.com/nextcloud/desktop/issues/2790
> 

The backtrace on the upstream bug is:
```
Starting program: /usr/bin/nextcloud --version
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7fffdf728700 (LWP 343396)]

Thread 1 "nextcloud" received signal SIGSEGV, Segmentation fault.
doActivate<false> (sender=0x0, signal_index=9, argv=argv@entry=0x7fffffffbf10) at kernel/qobject.cpp:3768
3768    kernel/qobject.cpp: No such file or directory.
(gdb) bt full
#0  doActivate<false> (sender=0x0, signal_index=9, argv=argv@entry=0x7fffffffbf10) at kernel/qobject.cpp:3768
        sp = <optimized out>
        signal_spy_set = <optimized out>
        empty_argv = {0x0}
        senderDeleted = <optimized out>
#1  0x00007fffebba1860 in QMetaObject::activate (sender=<optimized out>, m=m@entry=0x7fffeca9a0a0 <QGuiApplication::staticMetaObject>, local_signal_index=local_signal_index@entry=1, argv=argv@entry=0x7fffffffbf10) at kernel/qobject.cpp:3946
        signal_index = <optimized out>
#2  0x00007fffec521522 in QGuiApplication::screenAdded (this=<optimized out>, _t1=<optimized out>, _t1@entry=0x555555961950) at .moc/moc_qguiapplication.cpp:389
        _a = {0x0, 0x7fffffffbf08}
#3  0x00007fffec50746c in QWindowSystemInterface::handleScreenAdded (ps=ps@entry=0x55555588fd10, isPrimary=<optimized out>) at kernel/qwindowsysteminterface.cpp:826
        screen = 0x555555961950
#4  0x00007fffdfc86e00 in QXcbConnection::initializeScreens (this=this@entry=0x55555586e640) at qxcbconnection_screens.cpp:413
        screen = 0x55555588fd10
        __for_range = @0x55555586e930: {<QListSpecialMethods<QXcbScreen*>> = {<No data fields>}, {p = {static shared_null = {ref = {atomic = {_q_value = {<std::__atomic_base<int>> = {static _S_alignment = 4, _M_i = -1}, static is_always_lock_free = true}}}, alloc = 0, 
                begin = 0, end = 0, array = {0x0}}, d = 0x5555558a1140}, d = 0x5555558a1140}}
        __for_begin = {i = <optimized out>}
        __for_end = {i = <optimized out>}
        it = {data = 0x555555878034, rem = 0, index = 12192}
        xcbScreenNumber = <optimized out>
        primaryScreen = 0x55555588fd10
#5  0x00007fffdfc62390 in QXcbConnection::QXcbConnection (this=0x55555586e640, nativeInterface=<optimized out>, canGrabServer=<optimized out>, defaultVisualId=<optimized out>, displayName=<optimized out>) at qxcbconnection.cpp:103
        focusInDelay = <optimized out>
        focusInDelay = <optimized out>
#6  0x00007fffdfc65113 in QXcbIntegration::QXcbIntegration (this=0x55555586e520, parameters=..., argc=@0x7fffffffdb7c: 2, argv=<optimized out>) at qxcbintegration.cpp:197
        displayName = <optimized out>
        noGrabArg = <optimized out>
        doGrabArg = <optimized out>
        underDebugger = <optimized out>
        conn = 0x0
        numParameters = 0
        canNotGrabEnv = false
        displayName = <optimized out>
        noGrabArg = <optimized out>
        doGrabArg = <optimized out>
        underDebugger = <optimized out>
        numParameters = <optimized out>
        conn = <optimized out>
        j = <optimized out>
        i = <optimized out>
        arg = {d = <optimized out>}
        ok = <optimized out>
        i = <optimized out>
        display = {static null = {<No data fields>}, d = <optimized out>}
        qt_category_enabled = <optimized out>
        qt_category_enabled = <optimized out>
#7  0x00007ffff7fc446f in QXcbIntegrationPlugin::create (this=<optimized out>, system=..., argv=0x7fffffffddd8, argc=@0x7fffffffdb7c: 2, parameters=...) at qxcbmain.cpp:56
        xcbIntegration = <optimized out>
[...]
```

It indeed looks the same as bug 754021. It also (somewhat) resolves a question I had about bug 754021, namely why Debian hadn't hit it: I guess they *had* (or at least could), and just didn't notice because other issues with LTO led them to stop using it for Wireshark (bug 941890).
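
One way to compare the two traces mechanically is to pull the function name out of each `#N` frame line and line the frames up from `doActivate<false>` onwards. A minimal Python sketch, assuming gdb's usual `#N [0xADDR in] function (args)` frame layout; the helper names `frame_functions` and `frames_from` are hypothetical, and the regular expression is not a complete gdb parser (it will mis-split names such as `operator()`):

```python
import re

# Matches gdb frame lines such as
#   "#1  0x00007fffebba1860 in QMetaObject::activate (sender=..."
# and captures the frame number and the function name. The address part is
# optional because gdb omits it for the innermost and for inlined frames.
FRAME_RE = re.compile(r"^\s*#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(.+?)(?: \(|\s*$)")


def frame_functions(backtrace):
    """Return the function name of every '#N' frame line, in order."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(2))
    return names


def frames_from(backtrace, anchor):
    """Return the frame names starting at the first frame called `anchor`."""
    names = frame_functions(backtrace)
    return names[names.index(anchor):] if anchor in names else []
```

Applied to the upstream nextcloud backtrace above and to the Wireshark backtrace in bug 754021, the frames from `doActivate<false>` onwards begin with `QMetaObject::activate` and `QGuiApplication::screenAdded` in both.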
Comment 3 Larry the Git Cow gentoo-dev 2025-04-18 04:23:32 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=fad8ff8a45afc83559f8df695cf96dfec51d3e8a

commit fad8ff8a45afc83559f8df695cf96dfec51d3e8a
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2025-04-18 04:21:42 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2025-04-18 04:23:01 +0000

    net-analyzer/wireshark: fix runtime with LTO
    
    Qt's qcompilerdetection.h currently checks for whether -fPIE is being used
    along with QT_USE_PROTECTED_VISIBILITY ("reduce relocations", which Qt
    automatically uses if supported). It bails out if -fPIE is used, as -fPIC
    is required instead.
    
    If LTO is used, when one does something like:
    (1) g++ -c -flto -fPIC qtlto.cc
    (2) g++     -pie -fPIE qtlto.o -o qtlto
    
    At point (1), the Qt check in the headers fires, and everything is fine,
    because we're indeed using -fPIC, and GCC doesn't automatically add -fPIE
    when built with --enable-default-pie if -fPIC is present on the command line.
    
    GCC may apply optimisations at this point given Qt is using -mno-direct-extern-access
    and it was built with -fPIC not -fPIE.
    
    Later, at point (2), -fPIE is passed. This happens in Wireshark because
    `CMAKE_POSITION_INDEPENDENT_CODE` gets set in CMakeLists.txt. With LTO,
    there's no opportunity for the Qt sanity check in headers to fire again,
    as everything is already long-preprocessed and GCC will have applied some
    optimisations already assuming the -fPIC code model in (1). But as slyfox
    says at https://bugs.gentoo.org/754021#c12, GCC merges -fPIC -fPIE to -fPIE
    at LTO-time (-fPIC coming from the earlier LTO object in (1), and -fPIE
    was just-passed on the command line).
    
    qtlto (or Wireshark) then crashes. For Wireshark, this looks like:
    ```
     #0  0x00007ff40e529cf0 in QScopedPointer<QObjectData, QScopedPointerDeleter<QObjectData> >::get (this=<optimized out>)
         at /usr/src/debug/dev-qt/qtbase-6.8.3/qtbase-everywhere-src-6.8.3/src/corelib/tools/qscopedpointer.h:112
     #1  qGetPtrHelper<QScopedPointer<QObjectData, QScopedPointerDeleter<QObjectData> > > (ptr=<optimized out>)
         at /usr/src/debug/dev-qt/qtbase-6.8.3/qtbase-everywhere-src-6.8.3/src/corelib/global/qtclasshelpermacros.h:128
     #2  QObject::d_func (this=<optimized out>) at /usr/src/debug/dev-qt/qtbase-6.8.3/qtbase-everywhere-src-6.8.3/src/corelib/kernel/qobject.h:108
     #3  QObjectPrivate::get (o=<optimized out>) at /usr/src/debug/dev-qt/qtbase-6.8.3/qtbase-everywhere-src-6.8.3/src/corelib/kernel/qobject_p.h:150
     #4  doActivate<false> (sender=0x0, signal_index=9, argv=argv@entry=0x7ffe59a73c30) at /usr/src/debug/dev-qt/qtbase-6.8.3/qtbase-everywhere-src-6.8.3/src/corelib/kernel/qobject.cpp:4003
     #5  0x00007ff40e4d2809 in QMetaObject::activate
         (sender=<optimized out>, m=m@entry=0x7ff40f44f6c0 <QGuiApplication::staticMetaObject>, local_signal_index=local_signal_index@entry=1, argv=argv@entry=0x7ffe59a73c30)
         at /usr/src/debug/dev-qt/qtbase-6.8.3/qtbase-everywhere-src-6.8.3/src/corelib/kernel/qobject.cpp:4183
     #6  0x00007ff40ead5676 in QGuiApplication::screenAdded (this=<optimized out>, _t1=<optimized out>)
    [...]
    ```
    
    We need to drop -fPIE somehow at link-time accordingly. There's a few
    ways of doing this but I've gone for not calling `check_pie_supported()`
    (see (7) below).
    
    (The right fix for other packages may depend on whether they install any
    static libraries via CMake: if -fPIC were no longer passed for those, we
    would have a problem. I'd tried to use POSITION_INDEPENDENT_CODE at first,
    but then -fPIC gets dropped everywhere as well, and setting the target
    property to false for just the Wireshark executable also doesn't work
    because it'll pass -no-pie, which isn't what we want.)
    
    There are some questions:
    (3) Why doesn't this happen with Clang, given that Clang has -fno-direct-access-external-data
        (equivalent to GCC's -mno-direct-extern-access), even when Qt is built
        with bfd (not lld)?
    
        The answer seems to be that Clang doesn't implement the optimisation
        yet to avoid copy-relocations where possible. GCC implemented that in
        5.x in r5-5573-g77ad54d911dd7c.
    
    (4) Why doesn't this (seem to) happen in other distributions?
    
        nextcloud-client suffers from the same issue analysed here, see
        https://bugs.gentoo.org/933110. The upstream bug at https://github.com/nextcloud/desktop/issues/2790
        was reported by a Debian developer (cgzones), so it's a reasonable assumption
        that it can happen on Debian.
    
        Debian is one of few distributions (we're another) to use --enable-default-pie
        in GCC rather than just passing it to all package builds in the package manager:
        it's possible that some distros are just disabling -fPIE or adding a workaround
        like we did for https://bugs.gentoo.org/552440. Not many distros build
        with LTO either.
    
        Debian also stopped building Wireshark with LTO because of a bug in Wireshark
        itself (https://bugs.gentoo.org/941890), so I guess they disabled LTO
        and didn't notice this crash.
    
        (This is enough for me to be more confident in my analysis, anyway.)
    
    (5) Could Qt communicate this somehow automatically?
    
        I think it might be able to if statically linking Qt and Qt was built
        with LTO.
    
        Otherwise, I think the only option would be an ELF .note. pkg-config
        could maybe work but you can't assume all Qt consumers use that...
    
        See the discussion around <https://bugreports.qt.io/browse/QTBUG-45755?focusedId=282483&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-282483>:
        > Thiago Macieira added a comment - 22 May '15 17:11
        > There aren't that many autoconf-based Qt5 builds and we've never exported the flag anyway.
    
        (It might be worth bringing this .note idea up to Thiago and/or H.J. but
        I'm not sure yet if it'll work.)
    
        On the Qt side, -fPIC gets passed in to various places before because
        Qt's CMake config files have INTERFACE_COMPILE_OPTIONS w/ -fPIC. Maybe
        the answer is for Qt packages to never use CMAKE_POSITION_INDEPENDENT_CODE
        instead. This came up in https://gitlab.kitware.com/cmake/cmake/-/issues/15570.
    
    (6) Could we just disable "reduce relocations" in Qt itself, given that
        the workaround here will need to be applied in various Qt consumers?
    
        This would significantly impact startup times of applications using Qt
        and there don't seem to be too many applications doing this (only 2
        known so far in Gentoo: Wireshark and nextcloud-client).
    
    (7) Is the mechanism used to fix this brittle?
    
        Yes, we're relying on a CMake bug/feature for now at https://gitlab.kitware.com/cmake/cmake/-/issues/25588
        so it doesn't try to enable *or* disable PIE at link-time and we can
        just rely on our toolchain defaults.
    
    Thanks to Arusekk for producing a minimal example and reporting it upstream
    to Wireshark, thanks to slyfox for analysing the interaction with LTO, thanks
    to Holger for the discussion around it and testing, and thanks to Eli for
    reviewing the commit message.
    
    Bug: https://bugs.gentoo.org/552440
    Bug: https://bugs.gentoo.org/754021
    Bug: https://bugs.gentoo.org/933110
    Bug: https://bugs.gentoo.org/941890
    Bug: https://gitlab.kitware.com/cmake/cmake/-/issues/15570
    Bug: https://gitlab.kitware.com/cmake/cmake/-/issues/25588
    Bug: https://gitlab.kitware.com/cmake/cmake/-/issues/23980
    Bug: https://gitlab.com/wireshark/wireshark/-/issues/17040
    Bug: https://bugreports.qt.io/browse/QTBUG-45755
    Bug: https://bugreports.qt.io/browse/QTBUG-47942
    Bug: https://gcc.gnu.org/PR65248
    Bug: https://gcc.gnu.org/PR65886
    Thanks-to: Arusekk <arek_koz@o2.pl>
    Thanks-to: Sergei Trofimovich <slyfox@gentoo.org>
    Thanks-to: Holger Hoffstätte <holger@applied-asynchrony.com>
    Thanks-to: Eli Schwartz <eschwartz@gentoo.org>
    Signed-off-by: Sam James <sam@gentoo.org>

 net-analyzer/wireshark/files/4.4.6-lto.patch       | 164 +++++++++++++++++++++
 ...hark-4.4.6.ebuild => wireshark-4.4.6-r1.ebuild} |  11 +-
 net-analyzer/wireshark/wireshark-9999.ebuild       |  11 +-
 3 files changed, 175 insertions(+), 11 deletions(-)
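
The ingredients described in the commit message (objects compiled with `-flto` and `-fPIC`, then a link step that passes `-fPIE`) can be written down as a small check over a compile command and a link command. This is a sketch only: `lto_enabled` and `has_lto_pic_pie_mismatch` are hypothetical helpers, the check looks only at flags spelled out on the command line, and it models neither GCC's `--enable-default-pie` default nor the option merging GCC performs at LTO time.

```python
import shlex


def lto_enabled(flags):
    """Return True if LTO is on after processing the flags left to right.

    The last of -flto / -flto=... / -fno-lto on the command line wins.
    """
    enabled = False
    for flag in flags:
        if flag == "-flto" or flag.startswith("-flto="):
            enabled = True
        elif flag == "-fno-lto":
            enabled = False
    return enabled


def has_lto_pic_pie_mismatch(compile_cmd, link_cmd):
    """Flag the combination from steps (1) and (2) of the commit message.

    Step (1): the object is compiled as LTO bytecode with -fPIC.
    Step (2): the link command passes -fPIE.
    """
    compile_flags = shlex.split(compile_cmd)
    link_flags = shlex.split(link_cmd)
    compiled_as_lto_pic = lto_enabled(compile_flags) and "-fPIC" in compile_flags
    linked_as_pie = "-fPIE" in link_flags
    return compiled_as_lto_pic and linked_as_pie


if __name__ == "__main__":
    print(has_lto_pic_pie_mismatch(
        "g++ -c -flto -fPIC qtlto.cc",
        "g++ -pie -fPIE qtlto.o -o qtlto",
    ))
```

With the commands from steps (1) and (2) the check reports a mismatch. Appending `-fno-lto` to the compile step, as in the workaround from the description, clears it, and so does dropping `-fPIE` from the link step, which is the direction the Wireshark fix takes.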
Comment 4 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-18 04:24:52 UTC
I want to let that fix soak a bit in Wireshark, then will propagate it to nextcloud-client.
Comment 5 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-18 04:30:31 UTC
On the upstream bug, cgzones says it works for him now (https://github.com/nextcloud/desktop/issues/2790#issuecomment-2507442295).

That seems to be because Debian now disables reduce_relocations: https://salsa.debian.org/qt-kde-team/qt6/qt6-base/-/blob/master/debian/rules?ref_type=heads#L69 since https://salsa.debian.org/qt-kde-team/qt6/qt6-base/-/commit/4b71aae2212e06853eef4af78a6ff5054b686d19.

Apparently done in https://bugs.debian.org/1059249.

I wonder whether that's down to a combination of GCC being built with --enable-default-pie (great, but it means some issues go unnoticed) and perhaps -fPIE also being passed in the dpkg build flags?
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2025-04-18 04:31:26 UTC
... and for Qt 5, Debian weren't passing that or disabling it: https://salsa.debian.org/qt-kde-team/qt/qtbase/-/blob/master/debian/rules?ref_type=heads.