Bug 937773 - gui-libs/egl-wayland-1.1.15: causes crashes under nvidia-drivers
Summary: gui-libs/egl-wayland-1.1.15: causes crashes under nvidia-drivers
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal normal
Assignee: Ionen Wolkens
URL: https://github.com/NVIDIA/egl-wayland...
Whiteboard:
Keywords: PATCH
Depends on:
Blocks:
 
Reported: 2024-08-11 17:55 UTC by John M. Harris, Jr.
Modified: 2024-08-23 05:13 UTC
CC List: 3 users

See Also:
Package list:
Runtime testing required: ---


Description John M. Harris, Jr. 2024-08-11 17:55:25 UTC
With gui-libs/egl-wayland-1.1.15, kmail crashes under nvidia-drivers with the following error:

kmail: ../egl-wayland-1.1.15/src/wayland-eglsurface.c:2562: wlEglDestroySurface: Assertion `wl_list_empty(&surface->ctx.streamImages)' failed.

Reverting to gui-libs/egl-wayland-1.1.13.1 resolved this.
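
For anyone wanting to stay on the working version in the meantime, a local mask is one way to pin it. This is only a minimal sketch and was not posted in this report: the helper name add_egl_wayland_mask and the target file name are made up for illustration, and the usage line assumes /etc/portage/package.mask is a directory.

```shell
# add_egl_wayland_mask FILE
# Append a mask for everything newer than 1.1.13.1 to FILE, unless the
# exact line is already present. (Hypothetical helper, for illustration.)
add_egl_wayland_mask() {
    atom='>gui-libs/egl-wayland-1.1.13.1'
    # grep -x matches whole lines, -F treats the atom as a fixed string
    grep -qxF -e "$atom" "$1" 2>/dev/null || printf '%s\n' "$atom" >> "$1"
}

# Usage (as root), then let portage pick the highest unmasked version:
#   add_egl_wayland_mask /etc/portage/package.mask/egl-wayland
#   emerge --ask --oneshot gui-libs/egl-wayland
```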
Comment 1 Alexandre Ferreira 2024-08-11 21:36:59 UTC
Same here. I am using nvidia-drivers-560.31.02 with the following configuration:

[ebuild   R   *] x11-drivers/nvidia-drivers-560.31.02:0/560::gentoo  USE="X kernel-open modules static-libs strip tools wayland -dist-kernel -modules-compress -modules-sign -persistenced -powerd" ABI_X86="32 (64)" 0 KiB
Comment 2 Ionen Wolkens gentoo-dev 2024-08-11 22:45:06 UTC
Alternatively, you could downgrade nvidia-drivers: explicit sync only gets enabled when *unkeyworded beta* nvidia-drivers-560 is combined with >=egl-wayland-1.1.14.

You can also force-disable explicit sync with __NV_DISABLE_EXPLICIT_SYNC=1.

If it still happens with the 550 or 555 branch, then something else may be happening.

Ideally should report this to egl-wayland upstream w/ backtraces. As for here, will debate what to do about this when nvidia-drivers-560 is due to be keyworded (aka non-beta) if issues are still persisting may either mask, or set __NV_DISABLE_EXPLICIT_SYNC=1 in env.d.

May possibly be a continuation of https://github.com/NVIDIA/egl-wayland/issues/111 albeit 1.1.15 was supposed to have fixed a few of these.
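
The two ways of setting the variable mentioned above (per application, or system-wide through env.d) could look roughly like the sketch below. Only the variable name __NV_DISABLE_EXPLICIT_SYNC comes from this report; the function names and the env.d file name 99nvidia-explicit-sync are assumptions for illustration, and nothing here confirms that the variable actually avoids the crash.

```shell
# run_without_explicit_sync COMMAND [ARGS...]
# Start a single program with explicit sync force-disabled for that
# process only. (Hypothetical wrapper, for illustration.)
run_without_explicit_sync() {
    __NV_DISABLE_EXPLICIT_SYNC=1 "$@"
}

# write_envd_entry [ENVD_DIR]
# Write an env.d file so the variable is set for every login session.
# ENVD_DIR defaults to /etc/env.d; the file name is an assumption.
write_envd_entry() {
    envd_dir=${1:-/etc/env.d}
    printf '%s\n' '__NV_DISABLE_EXPLICIT_SYNC=1' > "$envd_dir/99nvidia-explicit-sync"
}

# Usage:
#   run_without_explicit_sync kmail
#   write_envd_entry && env-update    # as root, then log in again
```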
Comment 3 Ionen Wolkens gentoo-dev 2024-08-11 23:07:27 UTC
(In reply to Ionen Wolkens from comment #2)
> Ideally should report this to egl-wayland upstream w/ backtraces. As for
> here, will debate what to do about this when nvidia-drivers-560 is due to be
> keyworded (aka non-beta) if issues are still persisting may either mask, or
> set __NV_DISABLE_EXPLICIT_SYNC=1 in env.d.
Actually, maybe will do latter now. Could one of you confirm whether setting __NV_DISABLE_EXPLICIT_SYNC=1 prevents crashes when using 560+1.1.15? Not sure if need to set it before start the session or just before starting kmail is enough.

This may also be a qtwayland bug. Explicit sync has been exposing a few issues that aren't directly related to egl-wayland/nvidia (e.g. firefox fixed one), albeit until it gets sorted out it may be better to keep it disabled.
Comment 4 Ionen Wolkens gentoo-dev 2024-08-11 23:27:18 UTC
(In reply to Ionen Wolkens from comment #3)
> (In reply to Ionen Wolkens from comment #2)
> > Ideally should report this to egl-wayland upstream w/ backtraces. As for
> > here, will debate what to do about this when nvidia-drivers-560 is due to be
> > keyworded (aka non-beta) if issues are still persisting may either mask, or
> > set __NV_DISABLE_EXPLICIT_SYNC=1 in env.d.
> Actually, maybe will do latter now. Could one of you confirm whether setting
> __NV_DISABLE_EXPLICIT_SYNC=1 prevents crashes when using 560+1.1.15? Not
> sure if need to set it before start the session or just before starting
> kmail is enough.
Might consider USE=experimental-explicit-sync to easily re-enable and leave it off by default.
Comment 5 John M. Harris, Jr. 2024-08-12 00:48:57 UTC
(In reply to Ionen Wolkens from comment #3)
> Actually, maybe will do latter now. Could one of you confirm whether setting
> __NV_DISABLE_EXPLICIT_SYNC=1 prevents crashes when using 560+1.1.15? Not
> sure if need to set it before start the session or just before starting
> kmail is enough.

I upgraded to 1.1.15 again, with nvidia-drivers-560. Didn't seem to work either way.

I can only reproduce this in applications using QtWebEngine, so this very well may be a qtwayland bug.
Comment 6 Ionen Wolkens gentoo-dev 2024-08-12 01:22:11 UTC
(In reply to John M. Harris, Jr. from comment #5)
> (In reply to Ionen Wolkens from comment #3)
> > Actually, maybe will do latter now. Could one of you confirm whether setting
> > __NV_DISABLE_EXPLICIT_SYNC=1 prevents crashes when using 560+1.1.15? Not
> > sure if need to set it before start the session or just before starting
> > kmail is enough.
> 
> I upgraded to 1.1.15 again, with nvidia-drivers-560. Didn't seem to work
> either way.
By "either way" you mean whether __NV_DISABLE_EXPLICIT_SYNC=1 is set or not, or something else?

That aside, may mask it given I'm reading about other issues like VRAM leak too.
Comment 7 Larry the Git Cow gentoo-dev 2024-08-12 01:36:43 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=8a285dc079f9abbbb18f897551760135ec2cbf3a

commit 8a285dc079f9abbbb18f897551760135ec2cbf3a
Author:     Ionen Wolkens <ionen@gentoo.org>
AuthorDate: 2024-08-12 01:30:52 +0000
Commit:     Ionen Wolkens <ionen@gentoo.org>
CommitDate: 2024-08-12 01:36:21 +0000

    profiles: mask >=gui-libs/egl-wayland-1.1.14 again
    
    Problems with firefox that led to masking 1.1.14 seem
    solved with recent firefox+1.1.15, but that doesn't
    seem to be the end of issues, let's keep it masked for
    a while still.
    
    Alternatively could try __NV_DISABLE_EXPLICIT_SYNC=1
    but it is currently unclear if this really solves all
    issues.
    
    Bug: https://bugs.gentoo.org/937773
    Signed-off-by: Ionen Wolkens <ionen@gentoo.org>

 profiles/package.mask | 10 ++++++++++
 1 file changed, 10 insertions(+)
Comment 8 John M. Harris, Jr. 2024-08-12 07:02:08 UTC
(In reply to Ionen Wolkens from comment #6)
> (In reply to John M. Harris, Jr. from comment #5)
> > (In reply to Ionen Wolkens from comment #3)
> > > Actually, maybe will do latter now. Could one of you confirm whether setting
> > > __NV_DISABLE_EXPLICIT_SYNC=1 prevents crashes when using 560+1.1.15? Not
> > > sure if need to set it before start the session or just before starting
> > > kmail is enough.
> > 
> > I upgraded to 1.1.15 again, with nvidia-drivers-560. Didn't seem to work
> > either way.
> By "either way" you mean whether __NV_DISABLE_EXPLICIT_SYNC=1 is set or not,
> or something else?
> 
> That aside, may mask it given I'm reading about other issues like VRAM leak
> too.

I tried setting the environment variable both before starting the application and before starting my session. Neither one worked.
Comment 9 Ionen Wolkens gentoo-dev 2024-08-13 05:05:53 UTC
(In reply to John M. Harris, Jr. from comment #8)
> I tried both setting the environmental variable before starting the
> application and before starting my session. Neither one worked.
I see, thanks, that's too bad.
Comment 10 Alexandre Ferreira 2024-08-14 15:11:17 UTC
The patch at https://github.com/NVIDIA/egl-wayland/pull/131/commits/90c5bdbb8d0552c31830eab9f187a1381f73fdd4.patch fixes the problem for me.
Comment 11 Ionen Wolkens gentoo-dev 2024-08-14 15:42:25 UTC
Good to know for anyone who wants to try it locally. At this point will probably leave it masked and wait for 1.1.16 before trying to unmask again, rather than do backports.
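
One possible way to try the patch locally is portage's user patches directory, which ebuilds apply during the prepare phase. The sketch below is an illustration rather than something from this report: the helper name install_user_patch and the local file name fix.patch are made up, and the version-specific directory name assumes the patch is meant for 1.1.15 only.

```shell
# install_user_patch PATCH_FILE [PATCHES_ROOT]
# Copy a downloaded patch into the user patches directory for
# gui-libs/egl-wayland-1.1.15. PATCHES_ROOT defaults to /etc/portage/patches.
# (Hypothetical helper, for illustration.)
install_user_patch() {
    patch_file=$1
    patches_root=${2:-/etc/portage/patches}
    dest="$patches_root/gui-libs/egl-wayland-1.1.15"
    mkdir -p "$dest" && cp -- "$patch_file" "$dest/"
}

# Usage (as root), with the patch from comment 10 saved as fix.patch:
#   install_user_patch fix.patch
#   emerge --ask --oneshot =gui-libs/egl-wayland-1.1.15
```

Since 1.1.15 is masked, the version would also need to be unmasked locally before the rebuild.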
Comment 12 Larry the Git Cow gentoo-dev 2024-08-23 05:13:59 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=60cb92ff4fedcd9a3dd08794daa863afcf7e1d10

commit 60cb92ff4fedcd9a3dd08794daa863afcf7e1d10
Author:     Ionen Wolkens <ionen@gentoo.org>
AuthorDate: 2024-08-23 05:09:57 +0000
Commit:     Ionen Wolkens <ionen@gentoo.org>
CommitDate: 2024-08-23 05:13:18 +0000

    gui-libs/egl-wayland: add 1.1.16
    
    Albeit not lifting the mask yet, been going back&forth with
    this too much and it is likely that there is other issues
    (plus official nvidia-drivers still ships with 1.1.13.1).
    
    Feel free to unmask if want to try.
    
    Closes: https://bugs.gentoo.org/937773
    Signed-off-by: Ionen Wolkens <ionen@gentoo.org>

 gui-libs/egl-wayland/Manifest                  |  1 +
 gui-libs/egl-wayland/egl-wayland-1.1.16.ebuild | 42 ++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)
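
For those who want to try 1.1.16 despite the mask, a local unmask entry is the usual route. A minimal sketch, assuming /etc/portage/package.unmask is a directory; the helper name unmask_egl_wayland and the file name are hypothetical, and depending on keywords an accept_keywords entry may be needed as well.

```shell
# unmask_egl_wayland FILE
# Append an unmask entry for exactly version 1.1.16 to FILE, creating the
# parent directory if needed. (Hypothetical helper, for illustration.)
unmask_egl_wayland() {
    mkdir -p "$(dirname "$1")" || return 1
    printf '%s\n' '=gui-libs/egl-wayland-1.1.16' >> "$1"
}

# Usage (as root):
#   unmask_egl_wayland /etc/portage/package.unmask/egl-wayland
#   emerge --ask --oneshot gui-libs/egl-wayland
```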