Bug 932307 - sci-geosciences/gpsd gpsmon general protection fault [...] libncurses.so.6.4[7f651d6bb000+1f000]
Summary: sci-geosciences/gpsd gpsmon general protection fault [...] libncurses.so.6.4[...
Status: CONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal minor
Assignee: Sci-geo Project
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2024-05-20 14:57 UTC by János Tóth F.
Modified: 2024-06-28 08:57 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Description János Tóth F. 2024-05-20 14:57:56 UTC
After some recent changes I made:

- there was a profile switch from 17 to 23 (followed by a split-usr -> merged-usr transition)
- I switched from -O2 to -O3 (globally)
- I enabled LTO (globally, in all CFLAGS and via USE flags)

1: gpsd partially stops working after some hours.
rc-service gpsd status
tells me it's running (didn't crash) but Chrony loses the SOCK interface (last seen several hours in the past).

2: gpsmon instantly crashes with a segfault and prints this to dmesg:
traps: gpsmon[26821] general protection fault ip:7f95b4b87985 sp:7ffc765bfe30 error:0 in libncurses.so.6.4[7f95b4b79000+1f000]
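
For anyone triaging a line like the one above, the faulting address can be turned into an offset inside the reported library mapping by subtracting the bracketed start address from `ip`. The following is only a minimal sketch: the regular expression is an assumption based on this one dmesg line, and `fault_offset` is a hypothetical helper name. The result is relative to the mapped region the kernel reports, so resolving it to a source line would still need the mapping's file offset and debug symbols, which this sketch does not attempt.

```python
import re

# Pattern written against the single "traps:" line quoted in this report;
# other kernels or architectures may format the line differently.
TRAP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+)\s+sp:[0-9a-f]+\s+error:[0-9a-f]+\s+in\s+"
    r"(?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (library, offset into the reported mapping), or None."""
    m = TRAP_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    # Reject lines where ip does not fall inside the reported mapping.
    if not 0 <= offset < size:
        return None
    return m.group("lib"), offset


line = ("traps: gpsmon[26821] general protection fault ip:7f95b4b87985 "
        "sp:7ffc765bfe30 error:0 in libncurses.so.6.4[7f95b4b79000+1f000]")
lib, off = fault_offset(line)
print(lib, hex(off))  # libncurses.so.6.4 0xe985
```

The offset only confirms that the fault is inside the ncurses mapping; a backtrace is still needed to see which caller passed in the bad pointer.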

cgps always works but I only have the GPS module for Chrony time sync, so I don't care about that.

My current workaround is restarting gpsd every hour with cron, but this doesn't fix gpsmon, only gpsd and Chrony's SOCK readings from gpsd.

I can't tell if this is a gpsd issue, an ncurses issue, or a consequence of globally enabled -O3 and LTO. I tried compiling gpsd and ncurses selectively with -O2 and without LTO (through Portage environment exceptions) but things stayed the same, which makes me think this is not an -O3 and/or LTO issue but an ncurses or maybe a gpsd issue.
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-05-20 15:02:36 UTC
Could you try to grab a backtrace from gpsmon crashing please? (See https://wiki.gentoo.org/wiki/Debugging).

It will also help, in the worst case if we get stuck, if you can try to reproduce it in a clean stage3 and tell us the minimal number of changes needed to make it happen.
Comment 2 Greg Kubaryk 2024-05-21 01:39:25 UTC
I'm marking this as confirmed because gpsmon crashes here as well; it's not -O3 or LTO. Issues with gpsmon are immediate and repeatable, and while I do occasionally restart gpsd, it's nowhere near as often as the original reporter.
Comment 3 Greg Kubaryk 2024-05-21 02:20:48 UTC
Program received signal SIGSEGV, Segmentation fault.
cannot_delete (win=0x5555556436a0) at /var/tmp/portage/sys-libs/ncurses-6.4_p20240414/work/ncurses-6.4/ncurses/base/lib_delwin.c:61
warning: 61	/var/tmp/portage/sys-libs/ncurses-6.4_p20240414/work/ncurses-6.4/ncurses/base/lib_delwin.c: No such file or directory
(gdb) bt
#0  cannot_delete (win=0x5555556436a0)
    at /var/tmp/portage/sys-libs/ncurses-6.4_p20240414/work/ncurses-6.4/ncurses/base/lib_delwin.c:61
#1  delwin (win=0x5555556436a0) at /var/tmp/portage/sys-libs/ncurses-6.4_p20240414/work/ncurses-6.4/ncurses/base/lib_delwin.c:83
#2  0x000055555555c0eb in switch_type (devtype=<optimized out>) at gpsd-3.25/gpsmon/gpsmon.c:521
#3  0x000055555555c46a in select_packet_monitor (device=0x5555555fb440 <session>) at gpsd-3.25/gpsmon/gpsmon.c:571
#4  gpsmon_hook (device=device@entry=0x5555555fb440 <session>, changed=changed@entry=78533923717582390)
    at gpsd-3.25/gpsmon/gpsmon.c:818
#5  0x000055555558dbcb in gpsd_multipoll (data_ready=<optimized out>, device=device@entry=0x5555555fb440 <session>, 
    handler=handler@entry=0x55555555c2a0 <gpsmon_hook>, reawake_time=reawake_time@entry=0) at gpsd-3.25/gpsd/libgpsd_core.c:2078
#6  0x0000555555559fca in main (argc=<optimized out>, argv=<optimized out>) at gpsd-3.25/gpsmon/gpsmon.c:1464
(gdb)
Comment 4 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-05-22 00:05:31 UTC
Can you try valgrind, and ASan+UBSan after that?
Comment 5 Greg Kubaryk 2024-05-22 01:25:15 UTC
$ valgrind -s gpsmon
==31557== Memcheck, a memory error detector
==31557== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==31557== Using Valgrind-3.23.0 and LibVEX; rerun with -h for copyright info
==31557== Command: gpsmon
==31557== 
gpsmon: assertion failure, probable I/O error
==31557== 
==31557== HEAP SUMMARY:
==31557==     in use at exit: 135,745 bytes in 545 blocks
==31557==   total heap usage: 667 allocs, 122 frees, 209,410 bytes allocated
==31557== 
==31557== LEAK SUMMARY:
==31557==    definitely lost: 0 bytes in 0 blocks
==31557==    indirectly lost: 0 bytes in 0 blocks
==31557==      possibly lost: 402 bytes in 6 blocks
==31557==    still reachable: 135,343 bytes in 539 blocks
==31557==         suppressed: 0 bytes in 0 blocks
==31557== Rerun with --leak-check=full to see details of leaked memory
==31557== 
==31557== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
==31557== 
==31557== 1 errors in context 1 of 2:
==31557== Invalid read of size 8
==31557==    at 0x49A2909: _nc_screen_of (lib_data.c:297)
==31557==    by 0x497872C: cannot_delete (lib_delwin.c:58)
==31557==    by 0x497872C: delwin (lib_delwin.c:83)
==31557==    by 0x1100EA: switch_type (gpsmon.c:521)
==31557==    by 0x110469: select_packet_monitor (gpsmon.c:571)
==31557==    by 0x110469: gpsmon_hook (gpsmon.c:818)
==31557==    by 0x141BCA: gpsd_multipoll (libgpsd_core.c:2078)
==31557==    by 0x10DFC9: main (gpsmon.c:1464)
==31557==  Address 0x4bf4b18 is 8 bytes inside a block of size 104 free'd
==31557==    at 0x4847B8F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==31557==    by 0x497FBB4: _nc_freewin (lib_newwin.c:124)
==31557==    by 0x1100D2: switch_type (gpsmon.c:517)
==31557==    by 0x110469: select_packet_monitor (gpsmon.c:571)
==31557==    by 0x110469: gpsmon_hook (gpsmon.c:818)
==31557==    by 0x141BCA: gpsd_multipoll (libgpsd_core.c:2078)
==31557==    by 0x10DFC9: main (gpsmon.c:1464)
==31557==  Block was alloc'd at
==31557==    at 0x484C393: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==31557==    by 0x497FCC5: _nc_makenew_sp (lib_newwin.c:287)
==31557==    by 0x497FE85: newwin_sp (lib_newwin.c:161)
==31557==    by 0x110106: switch_type (gpsmon.c:522)
==31557==    by 0x110469: select_packet_monitor (gpsmon.c:571)
==31557==    by 0x110469: gpsmon_hook (gpsmon.c:818)
==31557==    by 0x141BCA: gpsd_multipoll (libgpsd_core.c:2078)
==31557==    by 0x10DFC9: main (gpsmon.c:1464)
==31557== 
==31557== 
==31557== 1 errors in context 2 of 2:
==31557== Invalid read of size 1
==31557==    at 0x497870E: cannot_delete (lib_delwin.c:53)
==31557==    by 0x497870E: delwin (lib_delwin.c:83)
==31557==    by 0x1100EA: switch_type (gpsmon.c:521)
==31557==    by 0x110469: select_packet_monitor (gpsmon.c:571)
==31557==    by 0x110469: gpsmon_hook (gpsmon.c:818)
==31557==    by 0x141BCA: gpsd_multipoll (libgpsd_core.c:2078)
==31557==    by 0x10DFC9: main (gpsmon.c:1464)
==31557==  Address 0x4bf4b2c is 28 bytes inside a block of size 104 free'd
==31557==    at 0x4847B8F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==31557==    by 0x497FBB4: _nc_freewin (lib_newwin.c:124)
==31557==    by 0x1100D2: switch_type (gpsmon.c:517)
==31557==    by 0x110469: select_packet_monitor (gpsmon.c:571)
==31557==    by 0x110469: gpsmon_hook (gpsmon.c:818)
==31557==    by 0x141BCA: gpsd_multipoll (libgpsd_core.c:2078)
==31557==    by 0x10DFC9: main (gpsmon.c:1464)
==31557==  Block was alloc'd at
==31557==    at 0x484C393: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==31557==    by 0x497FCC5: _nc_makenew_sp (lib_newwin.c:287)
==31557==    by 0x497FE85: newwin_sp (lib_newwin.c:161)
==31557==    by 0x110106: switch_type (gpsmon.c:522)
==31557==    by 0x110469: select_packet_monitor (gpsmon.c:571)
==31557==    by 0x110469: gpsmon_hook (gpsmon.c:818)
==31557==    by 0x141BCA: gpsd_multipoll (libgpsd_core.c:2078)
==31557==    by 0x10DFC9: main (gpsmon.c:1464)
==31557== 
==31557== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
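
The two contexts above share the same shape: an invalid read, the stack that freed the block, and the stack that allocated it. As a rough sketch of how to condense that, the hypothetical helper below (`summarize` and `FRAME_RE` are illustrative names, not part of valgrind or gpsd) picks the first frame from a given source file in each of the three stacks. The sample is an abbreviated excerpt of context 1 from this log.

```python
import re

# Matches frames such as "by 0x1100EA: switch_type (gpsmon.c:521)".
# Frames without file:line, e.g. "free (in /usr/...so)", do not match.
FRAME_RE = re.compile(r"(?:at|by) 0x[0-9A-F]+: (\S+) \(([^:()]+):(\d+)\)")


def summarize(log, source_file):
    """For each Memcheck invalid access, record the first line number in
    source_file seen in the access, free and allocation stacks."""
    reports, current, section = [], None, None
    for raw in log.splitlines():
        text = re.sub(r"^==\d+==\s*", "", raw)  # drop the "==PID==" prefix
        if text.startswith(("Invalid read", "Invalid write")):
            current = {"error": text}
            reports.append(current)
            section = "access"
        elif current is None:
            continue
        elif "free'd" in text:
            section = "freed"
        elif text.startswith("Block was alloc'd"):
            section = "allocated"
        else:
            m = FRAME_RE.match(text)
            if m and m.group(2) == source_file and section not in current:
                current[section] = int(m.group(3))
    return reports


sample = """\
==31557== Invalid read of size 8
==31557==    at 0x49A2909: _nc_screen_of (lib_data.c:297)
==31557==    by 0x497872C: delwin (lib_delwin.c:83)
==31557==    by 0x1100EA: switch_type (gpsmon.c:521)
==31557==  Address 0x4bf4b18 is 8 bytes inside a block of size 104 free'd
==31557==    at 0x4847B8F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==31557==    by 0x497FBB4: _nc_freewin (lib_newwin.c:124)
==31557==    by 0x1100D2: switch_type (gpsmon.c:517)
==31557==  Block was alloc'd at
==31557==    at 0x484C393: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==31557==    by 0x497FCC5: _nc_makenew_sp (lib_newwin.c:287)
==31557==    by 0x110106: switch_type (gpsmon.c:522)
"""

for report in summarize(sample, "gpsmon.c"):
    print(report)
```

For this excerpt the summary is access at gpsmon.c:521, free at gpsmon.c:517, allocation at gpsmon.c:522. Read that way, the log suggests switch_type frees a window at line 517 and then hands a pointer to the same block to delwin again at line 521, the block having been allocated at line 522 on an earlier pass. That is an interpretation of the trace, not something checked against the gpsmon source.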
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-05-24 01:21:41 UTC
Maybe Gary can help wrt the gpsd/gpsmon issue (possibly just in terms of what is to blame)...
Comment 7 Gary E. Miller 2024-05-24 03:58:46 UTC
I see you are using version 3.25.  That is known not to work with gcc 14.

That problem is fixed in git head, and it will need a release soon.

gpsd has never supported -O3 or LTO.

gpsmon is deprecated and unmaintained.  gpsmon was a tool for developers, never
intended for end users.  I suggest you not use gpsmon at all.  Users think it is a gpsd client, but it is not; gpsmon bypasses gpsd.

I'll look more Friday.

Can you try git head?
Comment 8 Gary E. Miller 2024-05-24 06:33:48 UTC
I think this commit fixed the gpsmon double free:

commit bc840b0d3ba65d3d8fe2b7faeadd5af5ed2b5e60
Author: Boian Bonev <bbonev@ipacct.com>
Date:   Fri Nov 10 22:13:26 2023 +0000

Someone could backport the patch, but it is best to just not use gpsmon.
Comment 9 János Tóth F. 2024-05-24 12:37:27 UTC
(In reply to Gary E. Miller from comment #8)
> Someone could backport the patch, but best to just not use gpsmon.

OK, but chronyc keeps losing the SOCK source in roughly 24 hours without gpsd crashing (cgps works). I thought these two issues might be linked because they started happening at the same time.
I am using GCC 14.1.1 (from the Gentoo source repo).
Comment 10 Gary E. Miller 2024-05-24 20:25:03 UTC
> OK, but chronyc keeps losing the SOCK source in roughly 24 hours without gpsd crashing (cgps works).

gpsd and gpsmon share no code.  gpsmon is not a client of gpsd.  Two almost
unrelated programs.  So this bug is about two unrelated things.

The title of this is "gpsmon general protection fault", not "Chronyc losing".

Problems like your chronyd problem usually fall into one or more of:

1) gpsd misconfiguration.

2) GNSS receiver/antenna issues

3) flakey serial ports.

4) using unsupported build options: -O3, LTO, etc.

Those are best handled outside this issue, which is about a known GPF.

You may file a new Gentoo issue, but likely Gentoo has no way to help you with it.

I would suggest you either take your problem to #gpsd on Libera.Chat, file
an issue at https://gitlab.com/gpsd/gpsd/, or ask for help on the gpsd email list:  https://lists.nongnu.org/mailman/listinfo/gpsd-users

You will then be asked to submit the output of gpsdebuginfo, run as root.  You can get a copy here:  https://gpsd.io/gpsdebuginfo

Taking the issue elsewhere will reduce the noise from this directed at the wonderful Gentoo maintainers and place it where it belongs, on the gpsd maintainers.  If we find the ebuild needs to be changed to prevent using unsupported options like -O3, then we can create a new issue for that here.