647704 – dev-db/mariadb[galera] with sys-apps/iproute2-4.14.1 - ss[19047]: segfault at 0 ip 00005636dda9fe48 sp 00007ffe065c6070 error 4 in ss[5636dda94000+1c000]

Status:	RESOLVED UPSTREAM

Alias:	None

Product:	Gentoo Linux
Classification:	Unclassified
Component:	Current packages
Hardware:	All Linux

Importance:	Normal normal
Assignee:	Gentoo's Team for Core System packages

Reported:	2018-02-15 09:36 UTC by Tomáš Mózes
Modified:	2019-09-25 14:08 UTC

Description Tomáš Mózes 2018-02-15 09:36:20 UTC

After upgrading from iproute2-4.4.0 to iproute2-4.14.1 (both -r2 and -r4), my MariaDB Galera cluster started to act strangely: it took a bit longer than usual to start.

The kernel log had the following entries:
Feb 15 08:46:38 db3a kernel: [6668513.173873] ss[18846]: segfault at 0 ip 000055ece8827e48 sp 00007fffb69b9160 error 4 in ss[55ece881c000+1c000]
Feb 15 08:46:41 db3a kernel: [6668516.692326] show_signal_msg: 16 callbacks suppressed
Feb 15 08:46:41 db3a kernel: [6668516.692335] ss[19047]: segfault at 0 ip 00005636dda9fe48 sp 00007ffe065c6070 error 4 in ss[5636dda94000+1c000]
Feb 15 08:46:41 db3a kernel: [6668516.899094] ss[19061]: segfault at 0 ip 0000560019d09e48 sp 00007ffd6e6abd60 error 4 in ss[560019cfe000+1c000]
Feb 15 08:46:42 db3a kernel: [6668517.103536] ss[19070]: segfault at 0 ip 00005560c544be48 sp 00007ffec55c6690 error 4 in ss[5560c5440000+1c000]
Feb 15 08:46:42 db3a kernel: [6668517.308507] ss[19079]: segfault at 0 ip 000055ee89eb5e48 sp 00007ffce4549200 error 4 in ss[55ee89eaa000+1c000]
Feb 15 08:46:42 db3a kernel: [6668517.516238] ss[19085]: segfault at 0 ip 0000562d0efbae48 sp 00007ffdce303ca0 error 4 in ss[562d0efaf000+1c000]
Feb 15 08:46:42 db3a kernel: [6668517.721847] ss[19107]: segfault at 0 ip 0000560ce68a2e48 sp 00007ffcee6aa8c0 error 4 in ss[560ce6897000+1c000]
Feb 15 08:46:42 db3a kernel: [6668517.926715] ss[19110]: segfault at 0 ip 00005564af34ae48 sp 00007fff40df18f0 error 4 in ss[5564af33f000+1c000]
Feb 15 08:46:43 db3a kernel: [6668518.133456] ss[19122]: segfault at 0 ip 00005639d1354e48 sp 00007ffe1c568700 error 4 in ss[5639d1349000+1c000]
Feb 15 08:46:43 db3a kernel: [6668518.339973] ss[19137]: segfault at 0 ip 000055b9956f9e48 sp 00007ffc57cf6ee0 error 4 in ss[55b9956ee000+1c000]
Feb 15 08:46:43 db3a kernel: [6668518.548122] ss[19149]: segfault at 0 ip 000055eb059c7e48 sp 00007ffcc5172560 error 4 in ss[55eb059bc000+1c000]
Feb 15 08:46:47 db3a kernel: [6668522.057396] show_signal_msg: 16 callbacks suppressed
Feb 15 08:46:47 db3a kernel: [6668522.057404] ss[19396]: segfault at 0 ip 000055d88c8bbe48 sp 00007fff412fef50 error 4 in ss[55d88c8b0000+1c000]

Either reverting to version 4.4.0 or upgrading to 4.15.0 solved the problem.

There are a few ss crashes mentioned in the 4.15.0 changelog (https://lkml.org/lkml/2018/1/29/357). I know this is not a mariadb bug; I am reporting it in case anybody runs into the same problem.

Comment 1 Thomas Deutschmann (RETIRED) gentoo-dev 2018-02-15 13:23:55 UTC

Could you compile with debug symbols and provide a core dump or backtrace?

Comment 2 Tomáš Mózes 2018-02-16 14:53:03 UTC

With this upstream patch it does not segfault (it's in the 4.15.0 release):

commit ebbb219c924ccedbc59e209d40b77d5dbeecd7cd
Author: Antonio Quartulli <a@unstable.cc>
Date:   Sun Jan 7 02:31:50 2018 +0800

    ss: fix NULL pointer access when parsing unix sockets with oldformat
    
    When parsing and printing the unix sockets in unix_show(),
    if the oldformat is detected, the peer_name member of the sockstat
    object is left uninitialized (NULL).
    For this reason, if a filter has been specified on the command line,
    a strcmp() will crash when trying to access it.
    
    Avoid crash by checking that peer_name is not NULL before
    passing it to strcmp().
    
    Cc: Stefano Brivio <sbrivio@redhat.com>
    Cc: Stephen Hemminger <stephen@networkplumber.org>
    Signed-off-by: Antonio Quartulli <a@unstable.cc>
    Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
    Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

diff --git a/misc/ss.c b/misc/ss.c
index b35859dc..29a25070 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -3711,7 +3711,10 @@ static int unix_show(struct filter *f)
                        };
 
                        memcpy(st.local.data, &u->name, sizeof(u->name));
-                       if (strcmp(u->peer_name, "*"))
+                       /* when parsing the old format rport is set to 0 and
+                        * therefore peer_name remains NULL
+                        */
+                       if (u->peer_name && strcmp(u->peer_name, "*"))
                                memcpy(st.remote.data, &u->peer_name,
                                       sizeof(u->peer_name));
                        if (run_ssfilter(f->f, &st) == 0) {

Comment 3 Thomas Deutschmann (RETIRED) gentoo-dev 2019-09-25 14:08:06 UTC

Closing this one: the patch is in >=iproute2-4.15, and the oldest version in the repository is 4.19.