I have upgraded my system to sys-apps/baselayout-1.12.4-r7, and it hangs my server after a random period of time. Daemons just stop responding and then refuse to stop or restart :-( Examples of the daemons that hang: postfix, amavisd-new, bind, mysql... the list goes on! :-(

Before you blame the daemons: when I downgrade to baselayout-1.11.15-r3, run etc-update and reboot, all is fine.

What can I do to help you guys troubleshoot this one? My emerge --info is as follows:

Portage 2.1-r2 (hardened/x86/2.6, gcc-3.3.6, glibc-2.3.6-r4, 2.6.16-hardened-r11 i686)
=================================================================
System uname: 2.6.16-hardened-r11 i686 Intel(R) Pentium(R) 4 CPU 2.80GHz
Gentoo Base System version 1.6.15
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.3 [disabled]
app-admin/eselect-compiler: [Not Present]
dev-lang/python: 2.3.5-r2, 2.4.3-r1
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache: 2.3
dev-util/confcache: [Not Present]
sys-apps/sandbox: 1.2.17
sys-devel/autoconf: 2.13, 2.59-r7
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2
sys-devel/binutils: 2.16.1-r3
sys-devel/gcc-config: 1.3.13-r3
sys-devel/libtool: 1.5.22
virtual/os-headers: 2.6.11-r5
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-march=pentium4 -O3 -pipe -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/bind"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/revdep-rebuild /etc/terminfo"
CXXFLAGS="-march=pentium4 -O3 -pipe -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks metadata-transfer sandbox sfperms strict"
GENTOO_MIRRORS="http://212.219.56.146/sites/www.ibiblio.org/gentoo/ ftp://212.219.56.152/sites/www.ibiblio.org/gentoo/ http://212.219.56.142/sites/www.ibiblio.org/gentoo/ http://212.219.56.162/sites/www.ibiblio.org/gentoo/ ftp://212.219.56.142/sites/www.ibiblio.org/gentoo/"
LC_ALL="C"
MAKEOPTS="-j4"
PKGDIR="/usr/portage//packages/x86/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude='/distfiles' --exclude='/local' --exclude='/packages'"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage/"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="apache2 apm berkdb bzip2 crypt curl curlwrappers dlloader doc gd gdbm gif gmp gpm hardened idn innodb jpeg libg++ libwww mysql ncurses nls nptl nptlonly pam pcre perl php pic png python readline session snmp ssl tcpd tetex tiff truetype userlocales winbind x86 xml xml2 xorg zlib elibc_glibc input_devices_mouse input_devices_keyboard kernel_linux userland_GNU"
Unset: CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LDFLAGS, LINGUAS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY
I'd like to chime in here too. I noticed boot problems just as the checkfs script was running. Apparently, when that script runs, the system stops booting and a segfault is reported from within rc and functions.sh. I am going to assume that start-stop-daemon is the culprit, but I can't be sure. I can't copy the screen output because I can't use gpm at that stage of boot. Init bombs out to a maintenance prompt and, interestingly, rebooting (pressing ctrl-d) at this point solves the problem and the next boot is fine. Downgrading to 1.11.15-r3 solved the problem. Now, don't get all hot and bothered, but this IS on a reiser4 partition, and the problem does NOT occur on reiserfs. However, other distros do not show this problem on the same partition. I can only assume that something has gone awry with this baselayout, and for me, at least, downgrading was the solution. At first I thought it was halt.sh not properly syncing or unmounting volumes, but that was not the case. Hope this information is useful. I know it's not too specific, but the good news is that for me, at least, it's fixed.
And what filesystem are you using, Richard Scott?
I'm using just the standard reiserfs. Nothing new or fancy on this server, as it's a production email server supporting over 300 people! I need to try and keep it stable ;-)
Just for grins, I emerged each of the unmasked 1.12-r? baselayout versions, and all of them had the problem I described. 1.11.15-r3 does not exhibit this problem. There are numerous differences between the two, so isolating the specific program that causes r4 to fail might be tough. I am leaning towards start-stop-daemon, though, since that's apparently where the crash occurs.
Created attachment 96874: screenshot of error during boot

This is slightly blurred, sorry, but the error shows two things that may help.
1) script rc, line 390: there is a segfault (2399) followed by all of the text from rc at that line with uninitialized vars.
2) script functions.sh, line 181: another segfault (2524) followed by splash "critical", again from the script itself.
This appears to occur after the modules script and during the checkfs script. Again, after pressing ctrl-d, the fs unmounts and the subsequent reboot proceeds normally. This occurs in all unmasked 1.12.x baselayouts.
your system/bash is screwed up then ... bash should never segfault
*** This bug has been marked as a duplicate of 144093 ***
(In reply to comment #6)
> your system/bash is screwed up then ... bash should never segfault

Not sure if it's bash or start-stop-daemon. Nonetheless, it is what it is. I still think it's an R4 issue, and this still does not explain why everything is fine on a reboot!
(In reply to comment #6)
> your system/bash is screwed up then ... bash should never segfault

If it's bash, then how come downgrading baselayout fixes it?
Frankly, I don't care.

Feel free to debug the issue and find the root cause; Gentoo devs aren't going to spend any time on it until you can show reiserfs4 isn't the problem.

Richard: are you seeing segfaults as well? You didn't really articulate the behavior you're seeing too well.
(In reply to comment #10)
> Frankly, I don't care.
>
> Feel free to debug the issue and find the root cause; Gentoo devs aren't going
> to spend any time on it until you can show reiserfs4 isn't the problem.
>
> Richard: are you seeing segfaults as well? You didn't really articulate the
> behavior you're seeing too well.

No, my system is NOT using reiserfs4, and it boots to a login prompt OK... it then slowly dies. After downgrading baselayout it's rock solid! I've tried upgrading baselayout a couple of times, and it then slowly dies again. This is definitely not a reiserfs4 issue! :-)
Richard: can you clone this bug, then, so that we can start fresh? This one has become cluttered with off-topic reiserfs4 discussion.
Cloned as: http://bugs.gentoo.org/show_bug.cgi?id=147622