
Bug 543668

Summary: sys-libs/uclibc-0.9.33.2-r14 segfault in dlclose()
Product: Gentoo Linux Reporter: David Flogeras <dflogeras2>
Component: [OLD] Core system Assignee: Anthony Basile <blueness>
Status: RESOLVED OBSOLETE    
Severity: normal CC: embedded
Priority: Normal    
Version: unspecified   
Hardware: All   
OS: Linux   
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 570544    
Attachments: Workaround for segfault
valgrind log from segfault

Description David Flogeras 2015-03-18 01:18:55 UTC
When running app-admin/syslog-ng-3.6.2 on a uClibc based raspberry pi, I get a reliable segfault when syslog is unloading its plugins (through glib's g_module_close).  I'll attach a backtrace as a comment.

Reproducible: Always
Comment 1 David Flogeras 2015-03-18 01:31:58 UTC
I rebuilt syslog-ng, glib, and uClibc with debugging symbols and stepped through with gdb.  If I let it simply crash, the stack is an unusable mess.  This is the stack trace directly before the crash.  It is the second library that is being unloaded (/usr/lib/syslog-ng/libsyslog-ng-crypto.so).  The first one (/usr/lib/syslog-ng/libdbparser.so) unloaded without event.  If I step into _dl_run_fini_array(tpnt); on line 843 it segfaults.

#0  do_dlclose (vhandle=0x3e680, need_fini=1) at ldso/libdl/libdl.c:843
#1  0xb6c73f38 in dlclose (vhandle=0x3e680) at ldso/libdl/libdl.c:1063
#2  0xb6ceb2a8 in _g_module_close (handle=0x3e680, is_unref=1) at /var/tmp/portage/dev-libs/glib-2.40.2/work/glib-2.40.2/gmodule/gmodule-dl.c:136
#3  0xb6cec224 in g_module_close (module=0x46240) at /var/tmp/portage/dev-libs/glib-2.40.2/work/glib-2.40.2/gmodule/gmodule.c:759
#4  0xb6f7ac0c in plugin_load_candidate_modules (cfg=0x19010) at lib/plugin.c:465
#5  0xb6f4fe48 in cfg_load_candidate_modules (self=0x19010) at lib/cfg.c:389
#6  0xb6f530f0 in cfg_lexer_lex (self=0x21f10, yylval=0xbeffd238, yylloc=0xbeffd25c) at lib/cfg-lexer.c:897
#7  0xb6f548a0 in main_lex (yylval=0xbeffd238, yylloc=0xbeffd25c, lexer=0x21f10) at lib/cfg-parser.c:173
#8  0xb6f90e9c in main_parse (lexer=0x21f10, dummy=0xbefff274, arg=0x0) at lib/cfg-grammar.c:3041
#9  0xb6f4eca0 in cfg_parser_parse (self=0xb6fedb08 <main_parser>, lexer=0x21f10, instance=0xbefff274, arg=0x0) at ./lib/cfg-parser.h:83
#10 0xb6f4fd80 in cfg_run_parser (self=0x19010, lexer=0x21f10, parser=0xb6fedb08 <main_parser>, result=0xbefff274, arg=0x0) at lib/cfg.c:371
#11 0xb6f5002c in cfg_read_config (self=0x19010, fname=0x17830 "/etc/syslog-ng/syslog-ng.conf", syntax_only=1, preprocess_into=0x0) at lib/cfg.c:443
#12 0xb6f71400 in main_loop_read_and_init_config () at lib/mainloop.c:445
#13 0x00009650 in main (argc=1, argv=0xbefff4d4) at syslog-ng/main.c:248


It is calling the dtor of syslog-ng-3.6.2/lib/crypto.c, which in turn uses _dl_linux_resolve, and eventually crashes inside _dl_find_hash (uClibc-0.9.33.2/ldso/ldso/arm/elfinterp.c:72).  I don't pretend to understand the inner workings.  Here is the backtrace at the segfault:

#0  _dl_lookup_sysv_hash (type_class=<optimized out>, undef_name=<optimized out>, hash=252420148, symtab=0xb6b62f38, tpnt=0x3e540) at ldso/ldso/dl-hash.c:260
#1  _dl_find_hash (name=name@entry=0xb6b551f8 "crypto_deinit", scope=<optimized out>, mytpnt=0x44890, type_class=type_class@entry=1, sym_ref=sym_ref@entry=0x0) at ldso/ldso/dl-hash.c:339
#2  0xb6ff22fc in _dl_linux_resolver (tpnt=<optimized out>, reloc_entry=<optimized out>) at ldso/ldso/arm/elfinterp.c:72
#3  0xb6ff6584 in _dl_linux_resolve () at ldso/ldso/arm/resolve.S:126
#4  0xb6ff6584 in _dl_linux_resolve () at ldso/ldso/arm/resolve.S:126
#5  0xb6ff6584 in _dl_linux_resolve () at ldso/ldso/arm/resolve.S:126
.....
Comment 2 David Flogeras 2015-03-18 01:47:20 UTC
The following compiles and runs without issue on the same system:

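#include <dlfcn.h>
#include <stdio.h>

int main(void) {
  /* Load the plugin, resolving all of its symbols immediately. */
  void* v = dlopen( "/usr/lib/syslog-ng/libsyslog-ng-crypto.so", RTLD_NOW );
  if( ! v ) {
    /* dlopen() and dlclose() report errors through dlerror(), not errno. */
    fprintf( stderr, "failed to open: %s\n", dlerror() );
    return 1;
  }
  /* Unload it again; on the affected system this call is where the crash occurs. */
  if( 0 != dlclose( v )) {
    fprintf( stderr, "%s\n", dlerror() );
    return 1;
  }
  return 0;
}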


compiled with:

gcc test.c -ldl -ggdb -lglib-2.0 -levtlog
Comment 3 David Flogeras 2015-03-18 14:53:27 UTC
It appears that at the time of the segfault, in ldso/ldso/dl-hash.c:296, it is walking the linked list "scope". For each entry in scope, it walks another list, loop_scope->r_list.

It walked over the first two elements of the outer loop, calling _dl_lookup_sysv_hash() (line 339) each time through.  On the third iteration, the 0th element of loop_scope->r_list has libname = "" which is pretty suspicious.

I'm not sure if the memory got clobbered, and valgrind does not support armv6j.
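
For anyone following along without the uClibc sources open, here is a rough sketch of the nested walk described above. The struct layouts, field names other than r_list and libname, and the function find_symbol() are all made up for illustration; this is not the real code from dl-hash.c. The sketch only counts entries whose libname is empty or missing. Whether an empty libname really indicates clobbered memory is not established here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the loader's bookkeeping.
 * These are NOT the real uClibc definitions. */
struct module {
    const char *libname;      /* path of the loaded object */
    const char **symbols;     /* NULL-terminated list of exported names */
};

struct scope_elem {
    struct module **r_list;   /* modules searched in this scope */
    size_t r_nlist;           /* number of entries in r_list */
    struct scope_elem *next;  /* next scope in the chain */
};

/* Return 1 if the module exports `name`, else 0. */
static int module_exports(const struct module *m, const char *name)
{
    for (const char **s = m->symbols; s != NULL && *s != NULL; ++s)
        if (strcmp(*s, name) == 0)
            return 1;
    return 0;
}

/* Outer loop: every scope. Inner loop: every module in that scope's
 * r_list. Returns the first module that exports `name`, or NULL.
 * Entries with a missing or empty libname are counted in *suspicious
 * and skipped instead of being dereferenced further. */
const struct module *find_symbol(const struct scope_elem *scope,
                                 const char *name, size_t *suspicious)
{
    assert(name != NULL && suspicious != NULL);
    *suspicious = 0;

    for (const struct scope_elem *sc = scope; sc != NULL; sc = sc->next) {
        for (size_t i = 0; i < sc->r_nlist; ++i) {
            const struct module *m = sc->r_list[i];
            if (m == NULL || m->libname == NULL || m->libname[0] == '\0') {
                ++*suspicious;
                continue;
            }
            if (module_exports(m, name))
                return m;
        }
    }
    return NULL;
}
```

A check like the one in the inner loop could be dropped into a debug build as a tripwire, to see whether the odd entry is already present before the first dlclose() or only appears afterwards.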
Comment 4 David Flogeras 2015-03-18 20:23:40 UTC
This seems unrelated to ARM which is good.  Here's how you can reproduce in the comfort of your own home.

1 Boot a VM (I used VirtualBox) with a Gentoo livecd; YMMV.
2 Partition, and unpack the latest amd64-uclibc-vanilla stage3.  No need to install a kernel since you can just chroot in.  Do that now.
3 Install some dev tools; I put [c]gdb valgrind vim strace etc.
4 Edit make.conf to use FEATURES="nostrip" and CFLAGS="-ggdb -fno-omit-frame-pointer -pipe" (no optimization).
5 emerge/re-emerge syslog-ng and its deps.
6 Using ebuild, prepare the sources for uclibc, but reconfigure it and turn on debugging information in its menuconfig.  Also turn on the debugging info related to dlopen.
7 Merge the modified debug version of uclibc.

Since uclibc doesn't seem to support elfutils, you cannot use splitdebug/installsources.  Instead just unpack the pertinent sources in /var/tmp/portage: 'ebuild /usr/portage/CAT/PKG/PKG.ebuild prepare'.  I did this for uclibc, glib, and syslog-ng.

Run syslog-ng in the debugger: start with "[c]gdb syslog-ng", then run it inside gdb with "r -s -f /etc/syslog-ng/syslog-ng.conf" (this is how rc invokes it).  It will segfault in _dl_lookup_gnu_hash().
Comment 5 David Flogeras 2015-03-19 13:26:21 UTC
Created attachment 399244 [details, diff]
Workaround for segfault

blueness showed me how to work around this issue.  Here's a patch that removes the call to unmap the mapped region.  This is NOT a fix, but may help someone to diagnose.
Comment 6 David Flogeras 2015-03-19 13:27:51 UTC
Created attachment 399246 [details]
valgrind log from segfault

Here's a log showing the invalid memory accesses prior to segfaulting.  This is using glib-2.42.2 uclibc-0.9.33.2-r14 and syslog-ng-3.6.2 on amd64-uclibc-vanilla.
Comment 7 Anthony Basile gentoo-dev 2015-03-20 18:15:23 UTC
(In reply to David Flogeras from comment #5)
> Created attachment 399244 [details, diff]
> Workaround for segfault
> 
> blueness showed me how to work around this issue.  Here's a patch that
> removes the call to unmap the mapped region.  This is NOT a fix, but may
> help someone to diagnose.

Actually it is a fix because POSIX does not mandate that dlclose() actually do the unmappings.  It may, but it need not.
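
To see what a given libc actually does here, a small probe along these lines may help. It is only a sketch: the function name resident_after_dlclose() is made up, and it relies on RTLD_NOLOAD, which is an extension rather than part of POSIX, so it assumes the libc under test provides that flag.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Returns 1 if `path` is still resident after dlclose(), 0 if it was
 * unloaded, and -1 if it could not be opened or closed at all.
 * Hypothetical helper, not part of any libc. */
int resident_after_dlclose(const char *path)
{
    assert(path != NULL);

    void *h = dlopen(path, RTLD_NOW);
    if (!h) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }
    if (dlclose(h) != 0) {
        fprintf(stderr, "dlclose: %s\n", dlerror());
        return -1;
    }

    /* RTLD_NOLOAD (non-POSIX extension, assumed available) hands back
     * a handle only if the object is already loaded; it never loads. */
    void *probe = dlopen(path, RTLD_NOLOAD | RTLD_NOW);
    if (!probe)
        return 0;

    dlclose(probe);   /* drop the reference taken by the probe */
    return 1;
}
```

Either a 0 or a 1 result is consistent with the point above that dlclose() may unmap but need not. A result of 1 can also simply mean that something else in the process still holds a reference to the library.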
Comment 8 Anthony Basile gentoo-dev 2015-03-20 19:22:40 UTC
@Dave.  Rich Felker suggested the bug might be in the dynamic linker and not in dlclose() per se.  Can you try this patch and see if it fixes things:

http://git.alpinelinux.org/cgit/aports/commit/main/libc0.9.32/uclibc-dlclose-fix.patch?h=2.7-stable&id=d36e402fae2b31ca2bf6eafbafa77d716ea99b15

Of course, undo my patch.
Comment 9 David Flogeras 2015-03-20 20:40:30 UTC
Rebuilt with just the above patch.  Segfaults in the same spot as originally.

Also, I should clarify comment #6.  The valgrind output happens when I comment out the call to unmap as well.  It just doesn't segfault.
Comment 10 David Flogeras 2015-03-27 19:54:14 UTC
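#include <dlfcn.h>
#include <stdio.h>

int main(void) {
  /* Load the plugin with lazy binding and make its symbols globally visible. */
  void* v = dlopen( "/usr/lib/syslog-ng/libdbparser.so", RTLD_LAZY | RTLD_GLOBAL );
  if( ! v ) {
    /* dlopen() and dlclose() report errors through dlerror(), not errno. */
    fprintf( stderr, "failed to open: %s\n", dlerror() );
    return 1;
  }
  /* Unload it again; this is the call that segfaults. */
  if( 0 != dlclose( v )) {
    fprintf( stderr, "%s\n", dlerror() );
    return 1;
  }
  return 0;
}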



This program causes it.  Compile as in comment #2.

Also, you have to revert the latest change to syslog-ng-3.6.2.ebuild (--with-embedded-crypto). I just put a copy of the old ebuild from CVS in an overlay.
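
The program in comment #2 and the one above differ in both the library and the dlopen() flags, so it is not yet clear which of the two matters. A helper like the following might narrow that down by running the same open/close cycle on one library with each flag set. The names open_close() and compare_binding_modes() are made up for this sketch.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Open `path` with the given dlopen() flags and close it again.
 * Returns 0 on success, -1 if either call reports an error.
 * Hypothetical helper for triage. */
int open_close(const char *path, int flags)
{
    assert(path != NULL);

    void *h = dlopen(path, flags);
    if (!h) {
        fprintf(stderr, "dlopen(%s): %s\n", path, dlerror());
        return -1;
    }
    if (dlclose(h) != 0) {
        fprintf(stderr, "dlclose(%s): %s\n", path, dlerror());
        return -1;
    }
    return 0;
}

/* Run the cycle once per flag set used in comments #2 and #10.
 * Bit 0 of the result is set if RTLD_NOW succeeded, bit 1 if
 * RTLD_LAZY | RTLD_GLOBAL succeeded. A segfault in either run
 * kills the process before anything is returned. */
int compare_binding_modes(const char *path)
{
    int result = 0;
    if (open_close(path, RTLD_NOW) == 0)
        result |= 1;
    if (open_close(path, RTLD_LAZY | RTLD_GLOBAL) == 0)
        result |= 2;
    return result;
}
```

Calling compare_binding_modes() on each of the two plugin paths from a debug build, under gdb, should show whether the crash follows the flags or the library. Note that the first cycle may leave loader state behind that affects the second, so running each flag set in a fresh process is the cleaner test.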
Comment 11 Anthony Basile gentoo-dev 2015-04-10 22:11:09 UTC
(In reply to David Flogeras from comment #10)

I tried several times and I'm not able to hit it with your example.  However, I use kvm.  I'm going to reproduce your steps in comment #4 precisely and see if I can hit it then.
Comment 12 Anthony Basile gentoo-dev 2018-10-14 12:17:16 UTC
sys-libs/uclibc has been removed from the tree, replaced by sys-libs/uclibc-ng.  If this is still a problem on uclibc-ng, please open a new bug.