Summary: | app-admin/rasdaemon-0.6.8-r1 crashes on startup | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Gabriele Svelto <gabriele.svelto> |
Component: | Current packages | Assignee: | Gentoo's Team for Core System packages <base-system> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | prometheanfire, sam, vidra.jonas |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: | https://github.com/mchehab/rasdaemon/issues/77, https://github.com/mchehab/rasdaemon/pull/93, https://bugs.debian.org/1054152, https://bugs.gentoo.org/show_bug.cgi?id=922061 | ||
Whiteboard: | | ||
Package list: | | Runtime testing required: | --- |
Attachments: | `emerge --info` output, Output of a rasdaemon-0.8.0 crash under Valgrind | ||
Description
Gabriele Svelto
2023-01-09 10:20:17 UTC
Are you sure it's crashing in that thread and not another one?

Could you share the output of "bt full"?

(In reply to Sam James from comment #1)
> Are you sure it's crashing in that thread and not another one?
>
> Could you share the output of "bt full"?

(+ emerge --info please)

Created attachment 848018 [details]
`emerge --info` output
I've double-checked and this is indeed the crashing thread. This is the output of `bt full`:

    #0  ___pthread_mutex_lock (mutex=0x7473656d6974202c) at pthread_mutex_lock.c:80
            type = <optimized out>
            __PRETTY_FUNCTION__ = "___pthread_mutex_lock"
            id = <optimized out>
    #1  0x00007ffff7ed379c in sqlite3_finalize (pStmt=0x7fffa400e9b0) at sqlite3.c:87444
            v = 0x7fffa400e9b0
            db = 0x7fffa400f310
            rc = <optimized out>
    #2  0x00005555555687f2 in ras_mc_event_closedb (cpu=17, ras=<optimized out>) at ras-record.c:923
            rc = <optimized out>
            db = 0x7fffa4001c40
            priv = 0x7fffa4001bf0
            __func__ = "ras_mc_event_closedb"
    #3  0x0000555555564698 in handle_ras_events_cpu (priv=0x5555555c4cf0) at ras-events.c:608
            fd = 38
            kbuf = 0x7fffa4001b80
            page = 0x7fffa4000b70
            pipe_raw = "per_cpu/cpu17/trace_pipe_raw", '\000' <repeats 4067 times>
            pdata = <optimized out>
    #4  0x00007ffff7ce337a in start_thread (arg=<optimized out>) at pthread_create.c:442
            ret = <optimized out>
            pd = <optimized out>
            out = <optimized out>
            unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140737488347696, -2861193018973524210, 140736204891840, 2, 140737350873264, 140736196501504, 2861024794797764366, 2861175029158997774}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
            not_first_call = <optimized out>
    #5  0x00007ffff7d6422c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
    No locals.

Also this:

    (gdb) p $_siginfo._sifields._sigfault.si_addr
    $7 = (void *) 0x0

Looks like a NULL pointer access.

An extra bit of information: I was wrong about the crash not presenting itself when compiling the package with -O0. It still happens, it just takes a while longer.

This issue might be timing-dependent and it doesn't look like it's specific to Gentoo. I've tried with a plain build of the upstream sources and I can still reproduce it. I'll bring this into the bug tracker for the upstream package.

Thanks, please throw a link here when you do.
My guess is https://github.com/mchehab/rasdaemon/issues/77.

The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=c5bc82ad10a33da634522bae36d22966485ffbb3

    commit c5bc82ad10a33da634522bae36d22966485ffbb3
    Author:     Sam James <sam@gentoo.org>
    AuthorDate: 2023-02-19 18:37:38 +0000
    Commit:     Sam James <sam@gentoo.org>
    CommitDate: 2023-02-19 18:37:47 +0000

        app-admin/rasdaemon: add 0.8.0

        Closes: https://bugs.gentoo.org/890286
        Signed-off-by: Sam James <sam@gentoo.org>

     app-admin/rasdaemon/Manifest                       |  1 +
     .../files/rasdaemon-0.8.0-bashisms-configure.patch | 40 +++++++++++
     app-admin/rasdaemon/rasdaemon-0.8.0.ebuild         | 83 ++++++++++++++++++++++
     3 files changed, 124 insertions(+)

Sorry, I mixed up libtracefs/libtraceevent. The new version uses an unbundled, much newer copy of libtraceevent.

As for your bug, see https://github.com/mchehab/rasdaemon/issues/77#issuecomment-1399202752.

I'm getting the same bug (identical stack trace) on app-admin/rasdaemon-0.8.0 as well, with an underlying configuration listed as problematic in the linked bug (AMD CPU with _SC_NPROCESSORS_CONF != _SC_NPROCESSORS_ONLN).

I'm attaching a log from running rasdaemon under Valgrind; it seems to come down to a use-after-free bug. I'll comment on the GitHub issue as well.

Created attachment 858929 [details]
Output of a rasdaemon-0.8.0 crash under Valgrind
Might also be related to https://github.com/mchehab/rasdaemon/pull/93.

The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=758c24a6578bad541a188f0fe513906515dd1bda

    commit 758c24a6578bad541a188f0fe513906515dd1bda
    Author:     Sam James <sam@gentoo.org>
    AuthorDate: 2023-12-29 00:22:14 +0000
    Commit:     Sam James <sam@gentoo.org>
    CommitDate: 2023-12-29 00:22:14 +0000

        app-admin/rasdaemon: backport crash for online vs. configured CPUs

        Closes: https://bugs.gentoo.org/890286
        Signed-off-by: Sam James <sam@gentoo.org>

     ...on-0.8.0-check-online-cpus-not-configured.patch |  40 +++++
     ...rasdaemon-0.8.0-table-create-offline-cpus.patch | 179 +++++++++++++++++++++
     app-admin/rasdaemon/rasdaemon-0.8.0-r2.ebuild      |  87 ++++++++++
     3 files changed, 306 insertions(+)

Fingers crossed that does it. Let me know if it doesn't though...

(Also, sorry I didn't backport that *far* sooner. I was hoping for a new release with it and completely forgot.)

(In reply to Sam James from comment #14)
> Fingers crossed that does it. Let me know if it doesn't though...

I just tested the new version (rasdaemon-0.8.0-r2) and it works for me, so it seems the bug is fixed. Thank you!

It's working fine on my box too, thank you!