After some update, openrc started to complain about 'not a valid runlevel', and services from the 'default' runlevel are not started.

I am using:
=sys-apps/sysvinit-2.99-r1 (-ibm -selinux -static)
=sys-apps/openrc-0.44.10 (ncurses netifrc pam unicode -audit -bash -debug -newnet -selinux -sysv-utils)

In /etc/inittab:
id:3:initdefault:
si::sysinit:/sbin/openrc sysinit
rc::bootwait:/sbin/openrc boot
l0u:0:wait:/sbin/telinit u
l0:0:wait:/sbin/openrc shutdown
l0s:0:wait:/sbin/halt.sh
l1:1:wait:/sbin/openrc single
l2:2:wait:/sbin/openrc nonetwork
l3:3:wait:/sbin/openrc default
l4:4:wait:/sbin/openrc default
l5:5:wait:/sbin/openrc default
l6u:6:wait:/sbin/telinit u
l6:6:wait:/sbin/openrc reboot
l6r:6:wait:/sbin/reboot -dkn
...

In /etc/rc.conf:
#rc_parallel="NO"
rc_logger="YES"
...

In /var/log/rc.log:
rc default logging started at Mon Mar 28 09:47:50 2022
 * q8V: not a valid runlevel
rc default logging stopped at Mon Mar 28 09:47:50 2022

The three characters seem to be uninitialized nonsense, e.g. on another boot:
rc default logging started at Mon Mar 28 16:41:22 2022
 * l*V: not a valid runlevel
rc default logging stopped at Mon Mar 28 16:41:22 2022

# runlevel
N 3
# cat /var/run/runlevel
3
# rc-status
Runlevel: shutdown
 killprocs [ stopped ]
 savecache [ stopped ]
 mount-ro [ stopped ]
Dynamic Runlevel: hotplugged
Dynamic Runlevel: needed/wanted
 dbus [ crashed ]
 cupsd [ crashed ]
Dynamic Runlevel: manual
 alsasound [ started ]
 fail2ban [ crashed ]
 sshd [ started ]
 fcron [ stopping ]

I tried to recompile openrc and all deps, but it didn't help.

Reproducible: Always

Steps to Reproduce:
1. Boot the system

Actual Results:
q8V: not a valid runlevel
Services from the default runlevel are not started

Expected Results:
Services from the default runlevel are started
# emerge --info Portage 3.0.30 (python 3.9.9-final-0, default/linux/amd64/17.1/desktop, gcc-11.2.1, glibc-2.34-r10, 5.15.26-gentoo x86_64) ================================================================= System uname: Linux-5.15.26-gentoo-x86_64-AMD_Athlon-tm-_64_X2_Dual_Core_Processor_6400+-with-glibc2.34 KiB Mem: 8159452 total, 6926880 free KiB Swap: 17407996 total, 17407996 free Timestamp of repository gentoo: Mon, 28 Mar 2022 12:15:01 +0000 Head commit of repository gentoo: 160868821cd22f8a3bc9b5f3a3f181da30cbeae2 sh bash 5.1_p16 ld GNU ld (Gentoo 2.37_p1 p2) 2.37 distcc 3.4 x86_64-pc-linux-gnu [disabled] ccache version 4.5.1 [enabled] app-misc/pax-utils: 1.3.3::gentoo app-shells/bash: 5.1_p16::gentoo dev-java/java-config: 2.3.1::gentoo dev-lang/perl: 5.34.0-r6::gentoo dev-lang/python: 2.7.18_p14::gentoo, 3.7.13::gentoo, 3.9.9-r1::gentoo, 3.10.2_p1::gentoo dev-lang/rust: 1.58.1::gentoo dev-util/ccache: 4.5.1::gentoo dev-util/cmake: 3.22.2::gentoo dev-util/meson: 0.60.3::gentoo sys-apps/baselayout: 2.7-r3::gentoo sys-apps/openrc: 0.44.10::gentoo sys-apps/sandbox: 2.29::gentoo sys-devel/autoconf: 2.13-r1::gentoo, 2.71-r1::gentoo sys-devel/automake: 1.13.4-r2::gentoo, 1.16.4::gentoo sys-devel/binutils: 2.37_p1-r2::gentoo sys-devel/binutils-config: 5.4.1::gentoo sys-devel/clang: 13.0.1::gentoo sys-devel/gcc: 11.2.1_p20220115::gentoo sys-devel/gcc-config: 2.5-r1::gentoo sys-devel/libtool: 2.4.6-r6::gentoo sys-devel/lld: 13.0.1::gentoo sys-devel/llvm: 13.0.1::gentoo sys-devel/make: 4.3::gentoo sys-kernel/linux-headers: 5.15-r3::gentoo (virtual/os-headers) sys-libs/glibc: 2.34-r10::gentoo Repositories: gentoo location: /usr/portage sync-type: rsync sync-uri: rsync://rsync.europe.gentoo.org/gentoo-portage priority: -1000 sync-rsync-verify-jobs: 1 sync-rsync-verify-max-age: 24 sync-rsync-verify-metamanifest: yes sync-rsync-extra-opts: x-portage location: /usr/local/portage masters: gentoo priority: 0 fedora location: /var/lib/layman/fedora masters: gentoo priority: 50 
steam-overlay location: /var/lib/layman/steam-overlay masters: gentoo priority: 50 Installed sets: @system ACCEPT_KEYWORDS="amd64" ACCEPT_LICENSE="*" CBUILD="x86_64-pc-linux-gnu" CC="gcc" CFLAGS="-O2 -march=athlon64 -mtune=athlon64 -pipe -fstack-protector" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php8.0/ext-active/ /etc/php/cgi-php8.0/ext-active/ /etc/php/cli-php8.0/ext-active/ /etc/php/fpm-php8.0/ext-active/ /etc/php/phpdbg-php8.0/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c" CXX="g++" CXXFLAGS="-O2 -march=athlon64 -mtune=athlon64 -pipe -fstack-protector" DISTDIR="/usr/portage/distfiles" ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR" FCFLAGS="-O2 -pipe" FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg-live ccache config-protect-if-modified distlocks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch parallel-install pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr" FFLAGS="-O2 -pipe" GENTOO_MIRRORS="http://distfiles.gentoo.org ftp://ftp.sh.cvut.cz/MIRRORS/gentoo/gentoo" LANG="cs_CZ.UTF-8" LDFLAGS="-Wl,-O1 -Wl,--as-needed" LINGUAS="cs en" MAKEOPTS="-j4 -l2.0" PKGDIR="/usr/portage/packages" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 
--exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git" PORTAGE_TMPDIR="/var/tmp" SHELL="/bin/bash" USE="3dnow 3dnowext 7zip X X509 a52 aac aalib acl acpi additions afterimage aio alsa amd64 amr amrnb amrwb apache2 apng artswrappersuid authfile auto-hinter bash-completion blender-game bluetooth branding bzip2 cairo ccache cdda cddb cdio cdr cdrom cdsound cgi chroot clamav clamd clamdtop clang cli cmdsubmenu crypt cscope cuda cups custom-cflags custom-optimization dbus declarative dedicated device-mapper dia directfb doc down-root dri dts dv dvb dvd dvdnav dvdr dynload elogind emerald enca encode exif extensions extra fat fbcon ffmpeg fftw flac flash fontconfig fortran freetts ftp fts3 fuse g3dvl gallium games gbm gd gdbm gdu geoip gif glitz glut gmp gold gpm graphics gstreamer gtk gudev gui harfbuzz hddtemp hpn humanities iconv icq icu ident iptv ipv6 irc jabber jadetex jamu java javafx javascript jit joystick jpeg kdrive kerberos kpathsea laptop lcms libglvnd libnotify libsamplerate libtirpc lirc lm_sensors logrotate logwatch lzma lzo mad mainmenuhooks math mbrola md5sum mikmod minizip mmxext mng mod mouse mozdevelop mp2 mp3 mp4 mpeg mpeg2 mpeg3 mplayer msn multilib multislot multiuser music mysql mysqli nas ncurses nls nptl nsplugin ntfs ntfsprogs nvidia nvram ogg opencl opengl openmp pam pango pcre pda pdf php pixbuf png policykit ppds pstricks publishers python qt5 rar rdesktop readline rss rtc rtsp samba sasl savedconfig science screen sdl seamonkey seccomp sensord setup setup-plugin sip sipim slang smime sound sounds sox spell split-usr srt sse3 ssl startup-notification stream submenu subtitles subversion suid svg sysfs syslog system-cairo system-icu system-jpeg system-sqlite tex4ht theora threads threadsafe tiff timercmd timerinfo tk truetype ttxtsubs udev udisks unicode unsupported upnp upower usb uvm v4l2 vcd vdpau vdr vim-syntax vim-with-x vlc vnc volctrl vorbis wav wifi wmf wxwidgets wxwindows x264 x265 xattr xcb xcomposite xetex xft 
xine xinerama xml xosd xplanet xpm xscreensaver xv xvid xvmc zip zlib" ABI_X86="64" ADA_TARGET="gnat_2020" APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_core authn_dbd authn_dbm authn_default authn_file authz_core authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident imagemap include info lbmethod_byrequests lbmethod_bytraffic lbmethod_bybusyness lbmethod_heartbeat log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif slotmem_shm so socache_shmcb speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="3dnow 3dnowext mmx mmxext sse sse2 sse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="evdev libinput" KERNEL="linux" L10N="cs en" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-4 php8-0" POSTGRES_TARGETS="postgres12 postgres13" PYTHON_SINGLE_TARGET="python3_9" PYTHON_TARGETS="python3_9" RUBY_TARGETS="ruby26 ruby27" SANE_BACKENDS="epson2 epson2" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account" Unset: ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, 
F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LEX, LFLAGS, LIBTOOL, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS
After keywording and updating to:
=sys-apps/sysvinit-3.01

# cat /var/log/rc.log
rc default logging started at Mon Mar 28 17:06:09 2022
 * Stopping watchdog ... [ ok ]
 * Saving random seed ... [ ok ]
 * Stopping shorewall6 ... [ ok ]
 * Stopping shorewall ... [ ok ]
 * Bringing down interface lo
 * Running postdown ...
 * Bringing down interface eth0
 * Stopping dhclient on eth0 ... [ ok ]
 * Running postdown ...
 * Stopping acpid ... [ ok ]
 * Stopping metalog ... [ ok ]
 * Setting hardware clock using the system clock [Local Time] ... [ ok ]
rc default logging stopped at Mon Mar 28 17:06:13 2022

# rc-status
Runlevel: shutdown
 killprocs [ started ]
 savecache [ started ]
 mount-ro [ started ]
Dynamic Runlevel: hotplugged
Dynamic Runlevel: needed/wanted
Dynamic Runlevel: manual
 sshd [ started ]
 fcron [ stopping ]

No idea why it is behaving in this crazy way.
sorry, are you saying it's fine with newer sysvinit?
(In reply to Sam James from comment #3) > sorry, are you saying it's fine with newer sysvinit? No it isn't, I should say it's even worse. Now it doesn't output 'not a valid runlevel' but it's running shutdown (without shutting down the machine), thus it shuts down the network interface which is really bad on a remote machine.
I will try rebuilding sysvinit and openrc without optimization flags. I currently have: =sys-libs/glibc-2.34-r10 =sys-devel/gcc-11.2.1_p20220115 The system was running for more than ten years without problem (with regular updates).
(In reply to Yarda from comment #5) > I will try rebuilding sysvinit and openrc without optimization flags. I > currently have: > =sys-libs/glibc-2.34-r10 > =sys-devel/gcc-11.2.1_p20220115 > > The system was running for more than ten years without problem (with regular > updates). Which flags are you using? Your --info was just -O2 which is fine?
(In reply to Sam James from comment #6)
> (In reply to Yarda from comment #5)
> > I will try rebuilding sysvinit and openrc without optimization flags. I
> > currently have:
> > =sys-libs/glibc-2.34-r10
> > =sys-devel/gcc-11.2.1_p20220115
> >
> > The system was running for more than ten years without problem (with regular
> > updates).
>
> Which flags are you using? Your --info was just -O2 which is fine?

I tried -O0, it didn't help. Also the update of sysvinit didn't change anything. It behaves inconsistently, i.e. sometimes it complains about a wrong runlevel, sometimes it doesn't. Sometimes it even boots correctly, but mostly it ends up in the rc-status 'shutdown' runlevel. It looks like some uninitialized memory.

I created a simple wrapper, /sbin/openrc2:
# cat /sbin/openrc2
#!/bin/bash
/bin/date >> /tmp/log
echo ."$@". >> /tmp/log
/sbin/openrc "$@"

I replaced /sbin/openrc with /sbin/openrc2 in the inittab and rebooted; the resulting file is attached. It's still in the shutdown runlevel:
# rc-status
Runlevel: shutdown
 killprocs [ stopped ]
 savecache [ stopped ]
 mount-ro [ stopped ]
Dynamic Runlevel: hotplugged
Dynamic Runlevel: needed/wanted
 dbus [ crashed ]
 cupsd [ crashed ]
Dynamic Runlevel: manual
 alsasound [ started ]
 fail2ban [ crashed ]
 clamd [ crashed ]
 sshd [ started ]
 fcron [ stopping ]

I am going to try replacing my inittab with the stock one, but I am unable to spot anything wrong there.
Created attachment 768082 [details] openrc2 wrapper log
Agreed. Would you be able to try building OpenRC with ASAN and then MSAN?

-fsanitize=address and -fsanitize=memory.

There's also UBSAN we can try.
(In reply to Sam James from comment #9)
> Agreed. Would you be able to try build OpenRC with ASAN and then MSAN?
>
> -fsanitize=address and -fsanitize=memory.
>
> There's also UBSAN we can try.

I thought the problem was in sysvinit, because it was calling openrc with garbage, so I compiled it with:
-fsanitize=address -lasan
-fsanitize=memory is not supported

But it didn't show anything, just:
Run /sbin/init as init process
WARNING: reading executable name failed with errno 2, some stack frames may not be symbolized
WARNING: ASan is ignoring requested __asan_handle_no_return: stack type: default top: ....
False positive error reports may follow

Regarding openrc, I wasn't able to compile it with ASan:
...
The Meson build system
Version: 0.60.3
Source dir: /var/tmp/portage/sys-apps/openrc-0.44.10/work/openrc-0.44.10
Build dir: /var/tmp/portage/sys-apps/openrc-0.44.10/work/openrc-0.44.10-build
Build type: native build
Project name: OpenRC
Project version: 0.44.10
==105==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.

But I think I should be able to run it through valgrind.
I wrapped openrc in valgrind for runlevel 3. inittab:
...
l3:3:wait:/sbin/openrc2 default
...

# cat /sbin/openrc2
#!/bin/bash
d=`/bin/date`
/usr/bin/valgrind --log-file="/tmp/$d" /sbin/openrc "$@"

# cat /tmp/Mon*
==3886== Memcheck, a memory error detector
==3886== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3886== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==3886== Command: /sbin/openrc default
==3886== Parent PID: 3884
==3886==
==3886== Warning: noted but unhandled ioctl 0x5441 with no size/direction hints.
==3886== This could cause spurious value errors to appear.
==3886== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3887==
==3887== HEAP SUMMARY:
==3887== in use at exit: 8,208 bytes in 2 blocks
==3887== total heap usage: 783 allocs, 781 frees, 390,253 bytes allocated
==3887==
==3887== LEAK SUMMARY:
==3887== definitely lost: 8,208 bytes in 2 blocks
==3887== indirectly lost: 0 bytes in 0 blocks
==3887== possibly lost: 0 bytes in 0 blocks
==3887== still reachable: 0 bytes in 0 blocks
==3887== suppressed: 0 bytes in 0 blocks
==3887== Rerun with --leak-check=full to see details of leaked memory
==3887==
==3887== For lists of detected and suppressed errors, rerun with: -s
==3887== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==3886==
==3886== HEAP SUMMARY:
==3886== in use at exit: 24,841 bytes in 13 blocks
==3886== total heap usage: 8,589 allocs, 8,576 frees, 5,762,486 bytes allocated
==3886==
==3886== LEAK SUMMARY:
==3886== definitely lost: 24,744 bytes in 7 blocks
==3886== indirectly lost: 97 bytes in 6 blocks
==3886== possibly lost: 0 bytes in 0 blocks
==3886== still reachable: 0 bytes in 0 blocks
==3886== suppressed: 0 bytes in 0 blocks
==3886== Rerun with --leak-check=full to see details of leaked memory
==3886==
==3886== For lists of detected and suppressed errors, rerun with: -s
==3886== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

# runlevel
N 3
# rc-status
Runlevel: shutdown
...

No idea why it is in the openrc runlevel 'shutdown', without networking, sshd and every service from the 'default' runlevel.
I noticed my wtmp was over 16 MB. I cleared and recreated it, but it didn't help.

I can recognize 3 states happening randomly:
1) upon entering runlevel 3 it complains about an invalid runlevel and ends in the 'shutdown' openrc runlevel
2) upon entering runlevel 3 it doesn't complain, but it ends in the 'shutdown' openrc runlevel
3) it seems it boots OK, ended in the 'default' openrc runlevel, services are running and runlevel returns "N 3", but it's probably unsure about the current runlevel, because the 'reboot' command shuts down the machine
(In reply to Yarda from comment #12)
> 3) it seems it boots OK, ended in the 'default' openrc runlevel, services
> are running and runlevel returns "N 3", but it's probably unsure about the
> current runlevel, because the 'reboot' command shuts down the machine

The shutdown instead of the reboot happening in 3) is probably unrelated to this problem. The original sysvinit inittab calls '/sbin/reboot -dkn', where -k is a downstream patched-in option for kexec which calls reboot(LINUX_REBOOT_CMD_KEXEC). From 'man 2 reboot':

LINUX_REBOOT_CMD_KEXEC
(RB_KEXEC, 0x45584543, since Linux 2.6.13). Execute a kernel that has been loaded earlier with kexec_load(2). This option is available only if the kernel was configured with CONFIG_KEXEC.

It seems the patch doesn't call kexec_load(). Although this is undocumented, I suppose it would boot the current kernel, but the main problem is that I don't have CONFIG_KEXEC in my kernel. I will try removing the -k from my inittab and I suppose the reboot will start working. I will test it when I get physically to the machine later today.
(In reply to Yarda from comment #11)
> # cat /tmp/Mon*
> ==3886== Memcheck, a memory error detector
> ==3886== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==3886== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
> ==3886== Command: /sbin/openrc default
> ==3886== Parent PID: 3884
> ==3886==
> ==3886== Warning: noted but unhandled ioctl 0x5441 with no size/direction
> hints.
> ==3886== This could cause spurious value errors to appear.
> ==3886== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a
> proper wrapper.

This is probably the TIOCGPTPEER ioctl. Newer kernels expose the TIOCGPTPEER ioctl to userspace, which allows safely allocating a file descriptor for a pty slave based solely on the master file descriptor. This shouldn't cause a false negative report.
(In reply to Yarda from comment #8)
> Created attachment 768082 [details]
> openrc2 wrapper log

The intermediary NULLs may be unrelated to this problem (bad IO cache flush etc.). I will try rebuilding glibc and downgrading gcc and openrc/sysvinit, then instrument debug points into openrc or boot it through the debugger (when I am onsite). The random nature of this problem complicates debugging.
(In reply to Yarda from comment #10)
> ==105==ASan runtime does not come first in initial library list; you should
> either link runtime to your application or manually preload it with
> LD_PRELOAD.
>

As a workaround, please temporarily compile with
FEATURES="-sandbox -usersandbox" emerge -v1 ...

It should work then.
https://youtu.be/_6sWR4JQRaw

Unfortunately, I didn't have a serial console. There are a lot of memory leaks which probably aren't related to this problem. Near the end (00:59) there is a heap overflow error. It's hard to spot, because the LCD is quite slow, but it seems to be happening in src/rc/rc.c:896 and src/librc/librc.c:487, probably in the snprintf. I guess it's because there is some garbage in the runlevel, and in:
snprintf(path, sizeof(path), "%s/%s", RC_RUNLEVELDIR, runlevel);
there should probably be sizeof(path) - 1 and the path should be explicitly initialized to 0 for the heap overflow not to happen, but that doesn't explain where the runlevel garbage is coming from. There are also some minor service start errors which shouldn't be related to this problem.

Regarding the kexec reboot, I verified that without the '-k' the reboot works OK. I will probably open an upstream kernel bug about it, because I think that if KEXEC is unsupported it should reboot, not shut down (I am pretty sure it worked this way some time ago).
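For reference on the snprintf guess above: standard C snprintf() writes at most the given size including the terminating NUL, and returns the length the full result would have had, so truncation can be detected from the return value. The following is a minimal sketch of that check, not OpenRC's code. The helper name build_runlevel_path and the SKETCH_RUNLEVELDIR value are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed stand-in for RC_RUNLEVELDIR; the real value comes from OpenRC. */
#define SKETCH_RUNLEVELDIR "/etc/runlevels"

/*
 * Hypothetical helper: format "<dir>/<runlevel>" into path.
 * Returns 0 on success, -1 if formatting failed or the result did not fit.
 * snprintf() never writes more than size bytes, NUL included.
 */
int build_runlevel_path(char *path, size_t size, const char *runlevel)
{
	int n;

	assert(path != NULL && size > 0 && runlevel != NULL);

	n = snprintf(path, size, "%s/%s", SKETCH_RUNLEVELDIR, runlevel);
	if (n < 0 || (size_t)n >= size)
		return -1; /* truncated: path holds a NUL-terminated prefix */
	return 0;
}
```

Note that this only bounds the write into path. If runlevel itself points at an unterminated or uninitialized buffer, the "%s" read can still run past the end of that buffer, which a size check on the destination cannot prevent.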
I got the text log, I will post it in the next comment.
Created attachment 768164 [details] Heap overflow error This is the heap overflow error, the memory leaks weren't logged (the reason is currently unknown to me).
The garbage gets in on src/rc.c:875: krunlevel = get_krunlevel();
I finally got it. The problem was caused by a misconfiguration (what else? :) which had been in place and working for more than 10 years :)

The machine has an SSD and HDDs; the SSD is used only for speed-critical dirs. The following was in the fstab:

...
/mnt/data/var /var none bind 0 0
/mnt/data/run /run none bind 0 0
...

And /mnt/data/run/openrc/krunlevel contained 'shutdown'. The garbage was probably read if the race was hit during the remount of the FS over the tmpfs.

Sorry for the noise.
(In reply to Yarda from comment #21)
> I finally got it, the problem was caused by misconfiguration (what else? :)
> which was in place and working for more than 10 years :)
>
> The machine has SSD and HDDs, the SSD is used only for speed critical dirs,
> the following was in the fstab:
>
> ...
> /mnt/data/var /var none bind 0 0
> /mnt/data/run /run none bind 0 0
> ...
>
> And in the /mnt/data/run/openrc/krunlevel was 'shutdown'. The garbage was
> probably read if the race was hit during the remount of the FS over the
> tmpfs.
>
> Sorry for noise.

Thanks for updating! FWIW, we really shouldn't ever overflow like that even if garbage is given.
(In reply to Sam James from comment #22)
> (In reply to Yarda from comment #21)
> > I finally got it, the problem was caused by misconfiguration (what else? :)
> > which was in place and working for more than 10 years :)
> >
> > The machine has SSD and HDDs, the SSD is used only for speed critical dirs,
> > the following was in the fstab:
> >
> > ...
> > /mnt/data/var /var none bind 0 0
> > /mnt/data/run /run none bind 0 0
> > ...
> >
> > And in the /mnt/data/run/openrc/krunlevel was 'shutdown'. The garbage was
> > probably read if the race was hit during the remount of the FS over the
> > tmpfs.
> >
> > Sorry for noise.
>
> Thanks for updating! FWIW, we really shouldn't ever overflow like that even
> if garbage is given.

NP, there is probably also a kernel (or maybe glibc) bug, because I think the read shouldn't return garbage upon bind remount. I will try to strip this down and I will probably also open an upstream kernel bug.
Please keep us updated either way, as I'm definitely interested. Sadly, I can't reproduce with my attempts so far to put junk in the krunlevel file, but I'll still see if the code looks right.
From the strace it doesn't seem to be a kernel bug:
...
write(1, "newlevel1: .default.\n", 21) = 21 # instrumented debug output showing content of the newlevel before get_krunlevel
rt_sigaction(SIGCHLD, {sa_handler=0x559791083bb0, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f35ecea2790}, NULL, 8) = 0
write(1, "krunlevel\n", 10) = 10 # instrumented debug output showing we reached the get_krunlevel
newfstatat(AT_FDCWD, "/run/openrc/krunlevel", {st_dev=makedev(0x8, 0x15), st_ino=35782679, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=1648943081 /* 2022-04-03T01:44:41.781000000+0200 */, st_atime_nsec=781000000, st_mtime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_mtime_nsec=604915136, st_ctime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_ctime_nsec=604915136}, 0) = 0
openat(AT_FDCWD, "/run/openrc/krunlevel", O_RDONLY) = 3
newfstatat(3, "", {st_dev=makedev(0x8, 0x15), st_ino=35782679, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=1648943081 /* 2022-04-03T01:44:41.781000000+0200 */, st_atime_nsec=781000000, st_mtime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_mtime_nsec=604915136, st_ctime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_ctime_nsec=604915136}, AT_EMPTY_PATH) = 0
read(3, "", 4096) = 0 # empty string read, not garbage
close(3) = 0
newfstatat(AT_FDCWD, "/run/openrc/krunlevel", {st_dev=makedev(0x8, 0x15), st_ino=35782679, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=1648943081 /* 2022-04-03T01:44:41.781000000+0200 */, st_atime_nsec=781000000, st_mtime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_mtime_nsec=604915136, st_ctime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_ctime_nsec=604915136}, 0) = 0
unlink("/run/openrc/krunlevel") = 0
write(1, "newlevel3: .\250v\264\313\222U.\n", 20) = 20 # instrumented debug output showing content of the newlevel after get_krunlevel and there is garbage now
...

Maybe it's a glibc bug? I am going to focus on the getline.
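On the getline angle: per POSIX, getline() returns -1 when it hits end-of-file without reading anything, which is what a zero-byte krunlevel file produces. The buffer may still have been allocated in that case, and its contents should not be treated as a valid string. The following is a minimal sketch of a reader that checks the return value. The helper name read_first_line is hypothetical and this is not OpenRC's get_krunlevel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: copy the first line of file (without the trailing
 * newline) into out. Returns 0 on success, -1 if the file cannot be opened
 * or has no data. On failure out is the empty string, never garbage.
 * A line longer than size - 1 bytes is silently truncated.
 */
int read_first_line(const char *file, char *out, size_t size)
{
	FILE *fp;
	char *line = NULL;
	size_t cap = 0;
	ssize_t len;

	assert(file != NULL && out != NULL && size > 0);
	out[0] = '\0';

	fp = fopen(file, "r");
	if (fp == NULL)
		return -1;

	len = getline(&line, &cap, fp);
	fclose(fp);
	if (len < 0) {
		/* EOF or error: line may be allocated but must not be used */
		free(line);
		return -1;
	}

	if (len > 0 && line[len - 1] == '\n')
		line[len - 1] = '\0';
	snprintf(out, size, "%s", line);
	free(line);
	return 0;
}
```

With an empty file this returns -1 and leaves out empty, so a caller can fall back to a default instead of passing whatever happens to be in the getline buffer further along.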