Bug 836325 - =sys-apps/openrc-0.44.10: q8V: not a valid runlevel
Summary: =sys-apps/openrc-0.44.10: q8V: not a valid runlevel
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Hosted Projects
Classification: Unclassified
Component: OpenRC
Hardware: All Linux
Importance: Normal normal
Assignee: OpenRC Team
Reported: 2022-03-28 14:50 UTC by Yarda
Modified: 2022-04-03 08:54 UTC
CC List: 2 users



Attachments
openrc2 wrapper log (log,119 bytes, text/plain)
2022-03-28 16:26 UTC, Yarda
Heap overflow error (rc.log,2.96 KB, text/x-log)
2022-03-29 18:33 UTC, Yarda

Description Yarda 2022-03-28 14:50:34 UTC
After some update, openrc started complaining about 'not a valid runlevel', and services from the 'default' runlevel are no longer started. I am using:
=sys-apps/sysvinit-2.99-r1 (-ibm -selinux -static)
=sys-apps/openrc-0.44.10 (ncurses netifrc pam unicode -audit -bash -debug -newnet -selinux -sysv-utils)

In /etc/inittab:
id:3:initdefault:
si::sysinit:/sbin/openrc sysinit 
rc::bootwait:/sbin/openrc boot 
l0u:0:wait:/sbin/telinit u                                                                                                             
l0:0:wait:/sbin/openrc shutdown                                                                                                        
l0s:0:wait:/sbin/halt.sh                                                                                                               
l1:1:wait:/sbin/openrc single                                                                                                          
l2:2:wait:/sbin/openrc nonetwork                                                                                                       
l3:3:wait:/sbin/openrc default                                                                                                         
l4:4:wait:/sbin/openrc default                                                                                                         
l5:5:wait:/sbin/openrc default                                                                                                         
l6u:6:wait:/sbin/telinit u                                                                                                             
l6:6:wait:/sbin/openrc reboot                                                                                                          
l6r:6:wait:/sbin/reboot -dkn                                                                                                           
...

In /etc/rc.conf:
#rc_parallel="NO"
rc_logger="YES"
...

In /var/log/rc.log:
rc default logging started at Mon Mar 28 09:47:50 2022                                                                                 
                                                                                                                                       
 * q8V: not a valid runlevel                                                                                                           
                                                                                                                                       
rc default logging stopped at Mon Mar 28 09:47:50 2022                                                                                 

The three characters seem to be uninitialized garbage; e.g., on another boot:
rc default logging started at Mon Mar 28 16:41:22 2022                                                                                 
                                                                                                                                       
 * l*V: not a valid runlevel                                                                                                           
                                                                                                                                       
rc default logging stopped at Mon Mar 28 16:41:22 2022                                                                                 

# runlevel
N 3

# cat /var/run/runlevel 
3

# rc-status
Runlevel: shutdown
 killprocs                                                                                                                [  stopped  ]
 savecache                                                                                                                [  stopped  ]
 mount-ro                                                                                                                 [  stopped  ]
Dynamic Runlevel: hotplugged
Dynamic Runlevel: needed/wanted
 dbus                                                                                                                     [  crashed  ]
 cupsd                                                                                                                    [  crashed  ]
Dynamic Runlevel: manual
 alsasound                                                                                                                [  started  ]
 fail2ban                                                                                                                 [  crashed  ]
 sshd                                                                                                                     [  started  ]
 fcron                                                                                                                    [ stopping  ]

I tried recompiling openrc and all its deps, but it didn't help.


Reproducible: Always

Steps to Reproduce:
1. Boot the system
Actual Results:  
q8V: not a valid runlevel 
Services from the default runlevel are not started

Expected Results:  
Services from the default runlevel are started
Comment 1 Yarda 2022-03-28 14:51:17 UTC
# emerge --info
Portage 3.0.30 (python 3.9.9-final-0, default/linux/amd64/17.1/desktop, gcc-11.2.1, glibc-2.34-r10, 5.15.26-gentoo x86_64)
=================================================================
System uname: Linux-5.15.26-gentoo-x86_64-AMD_Athlon-tm-_64_X2_Dual_Core_Processor_6400+-with-glibc2.34
KiB Mem:     8159452 total,   6926880 free
KiB Swap:   17407996 total,  17407996 free
Timestamp of repository gentoo: Mon, 28 Mar 2022 12:15:01 +0000
Head commit of repository gentoo: 160868821cd22f8a3bc9b5f3a3f181da30cbeae2
sh bash 5.1_p16
ld GNU ld (Gentoo 2.37_p1 p2) 2.37
distcc 3.4 x86_64-pc-linux-gnu [disabled]
ccache version 4.5.1 [enabled]
app-misc/pax-utils:        1.3.3::gentoo
app-shells/bash:           5.1_p16::gentoo
dev-java/java-config:      2.3.1::gentoo
dev-lang/perl:             5.34.0-r6::gentoo
dev-lang/python:           2.7.18_p14::gentoo, 3.7.13::gentoo, 3.9.9-r1::gentoo, 3.10.2_p1::gentoo
dev-lang/rust:             1.58.1::gentoo
dev-util/ccache:           4.5.1::gentoo
dev-util/cmake:            3.22.2::gentoo
dev-util/meson:            0.60.3::gentoo
sys-apps/baselayout:       2.7-r3::gentoo
sys-apps/openrc:           0.44.10::gentoo
sys-apps/sandbox:          2.29::gentoo
sys-devel/autoconf:        2.13-r1::gentoo, 2.71-r1::gentoo
sys-devel/automake:        1.13.4-r2::gentoo, 1.16.4::gentoo
sys-devel/binutils:        2.37_p1-r2::gentoo
sys-devel/binutils-config: 5.4.1::gentoo
sys-devel/clang:           13.0.1::gentoo
sys-devel/gcc:             11.2.1_p20220115::gentoo
sys-devel/gcc-config:      2.5-r1::gentoo
sys-devel/libtool:         2.4.6-r6::gentoo
sys-devel/lld:             13.0.1::gentoo
sys-devel/llvm:            13.0.1::gentoo
sys-devel/make:            4.3::gentoo
sys-kernel/linux-headers:  5.15-r3::gentoo (virtual/os-headers)
sys-libs/glibc:            2.34-r10::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.europe.gentoo.org/gentoo-portage
    priority: -1000
    sync-rsync-verify-jobs: 1
    sync-rsync-verify-max-age: 24
    sync-rsync-verify-metamanifest: yes
    sync-rsync-extra-opts: 

x-portage
    location: /usr/local/portage
    masters: gentoo
    priority: 0

fedora
    location: /var/lib/layman/fedora
    masters: gentoo
    priority: 50

steam-overlay
    location: /var/lib/layman/steam-overlay
    masters: gentoo
    priority: 50

Installed sets: @system
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="*"
CBUILD="x86_64-pc-linux-gnu"
CC="gcc"
CFLAGS="-O2 -march=athlon64 -mtune=athlon64 -pipe -fstack-protector"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php8.0/ext-active/ /etc/php/cgi-php8.0/ext-active/ /etc/php/cli-php8.0/ext-active/ /etc/php/fpm-php8.0/ext-active/ /etc/php/phpdbg-php8.0/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXX="g++"
CXXFLAGS="-O2 -march=athlon64 -mtune=athlon64 -pipe -fstack-protector"
DISTDIR="/usr/portage/distfiles"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg-live ccache config-protect-if-modified distlocks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch parallel-install pid-sandbox preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org ftp://ftp.sh.cvut.cz/MIRRORS/gentoo/gentoo"
LANG="cs_CZ.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="cs en"
MAKEOPTS="-j4 -l2.0"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
SHELL="/bin/bash"
USE="3dnow 3dnowext 7zip X X509 a52 aac aalib acl acpi additions afterimage aio alsa amd64 amr amrnb amrwb apache2 apng artswrappersuid authfile auto-hinter bash-completion blender-game bluetooth branding bzip2 cairo ccache cdda cddb cdio cdr cdrom cdsound cgi chroot clamav clamd clamdtop clang cli cmdsubmenu crypt cscope cuda cups custom-cflags custom-optimization dbus declarative dedicated device-mapper dia directfb doc down-root dri dts dv dvb dvd dvdnav dvdr dynload elogind emerald enca encode exif extensions extra fat fbcon ffmpeg fftw flac flash fontconfig fortran freetts ftp fts3 fuse g3dvl gallium games gbm gd gdbm gdu geoip gif glitz glut gmp gold gpm graphics gstreamer gtk gudev gui harfbuzz hddtemp hpn humanities iconv icq icu ident iptv ipv6 irc jabber jadetex jamu java javafx javascript jit joystick jpeg kdrive kerberos kpathsea laptop lcms libglvnd libnotify libsamplerate libtirpc lirc lm_sensors logrotate logwatch lzma lzo mad mainmenuhooks math mbrola md5sum mikmod minizip mmxext mng mod mouse mozdevelop mp2 mp3 mp4 mpeg mpeg2 mpeg3 mplayer msn multilib multislot multiuser music mysql mysqli nas ncurses nls nptl nsplugin ntfs ntfsprogs nvidia nvram ogg opencl opengl openmp pam pango pcre pda pdf php pixbuf png policykit ppds pstricks publishers python qt5 rar rdesktop readline rss rtc rtsp samba sasl savedconfig science screen sdl seamonkey seccomp sensord setup setup-plugin sip sipim slang smime sound sounds sox spell split-usr srt sse3 ssl startup-notification stream submenu subtitles subversion suid svg sysfs syslog system-cairo system-icu system-jpeg system-sqlite tex4ht theora threads threadsafe tiff timercmd timerinfo tk truetype ttxtsubs udev udisks unicode unsupported upnp upower usb uvm v4l2 vcd vdpau vdr vim-syntax vim-with-x vlc vnc volctrl vorbis wav wifi wmf wxwidgets wxwindows x264 x265 xattr xcb xcomposite xetex xft xine xinerama xml xosd xplanet xpm xscreensaver xv xvid xvmc zip zlib" ABI_X86="64" ADA_TARGET="gnat_2020" 
APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_core authn_dbd authn_dbm authn_default authn_file authz_core authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident imagemap include info lbmethod_byrequests lbmethod_bytraffic lbmethod_bybusyness lbmethod_heartbeat log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif slotmem_shm so socache_shmcb speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="3dnow 3dnowext mmx mmxext sse sse2 sse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="evdev libinput" KERNEL="linux" L10N="cs en" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php7-4 php8-0" POSTGRES_TARGETS="postgres12 postgres13" PYTHON_SINGLE_TARGET="python3_9" PYTHON_TARGETS="python3_9" RUBY_TARGETS="ruby26 ruby27" SANE_BACKENDS="epson2 epson2" USERLAND="GNU" VIDEO_CARDS="nvidia" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq proto steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  ADDR2LINE, AR, ARFLAGS, AS, ASFLAGS, CCLD, CONFIG_SHELL, CPP, CPPFLAGS, CTARGET, CXXFILT, ELFEDIT, EMERGE_DEFAULT_OPTS, EXTRA_ECONF, F77FLAGS, FC, GCOV, GPROF, INSTALL_MASK, LC_ALL, LD, LEX, LFLAGS, LIBTOOL, MAKE, MAKEFLAGS, NM, OBJCOPY, OBJDUMP, PORTAGE_BINHOST, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, RANLIB, READELF, RUSTFLAGS, SIZE, STRINGS, STRIP, YACC, YFLAGS
Comment 2 Yarda 2022-03-28 15:15:36 UTC
After keywording and updating to: =sys-apps/sysvinit-3.01

# cat /var/log/rc.log

rc default logging started at Mon Mar 28 17:06:09 2022
 
 * Stopping watchdog ...
 [ ok ]
 * Saving random seed ...
 [ ok ]
 * Stopping shorewall6 ...
 [ ok ]
 * Stopping shorewall ...
 [ ok ]
 * Bringing down interface lo
 *   Running postdown ...
 * Bringing down interface eth0
 *   Stopping dhclient on eth0 ...
 [ ok ]
 *   Running postdown ...
 * Stopping acpid ...
 [ ok ]
 * Stopping metalog ...
 [ ok ]
 * Setting hardware clock using the system clock [Local Time] ...
 [ ok ]
 
rc default logging stopped at Mon Mar 28 17:06:13 2022

# rc-status
Runlevel: shutdown
 killprocs                                                                                                                [  started  ]
 savecache                                                                                                                [  started  ]
 mount-ro                                                                                                                 [  started  ]
Dynamic Runlevel: hotplugged
Dynamic Runlevel: needed/wanted
Dynamic Runlevel: manual
 sshd                                                                                                                     [  started  ]
 fcron                                                                                                                    [ stopping  ]

No idea why it is behaving this crazy way.
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-03-28 15:23:13 UTC
sorry, are you saying it's fine with newer sysvinit?
Comment 4 Yarda 2022-03-28 15:27:33 UTC
(In reply to Sam James from comment #3)
> sorry, are you saying it's fine with newer sysvinit?

No, it isn't; if anything it's even worse. Now it doesn't print 'not a valid runlevel', but it runs the shutdown runlevel (without shutting down the machine), so it brings down the network interface, which is really bad on a remote machine.
Comment 5 Yarda 2022-03-28 15:40:41 UTC
I will try rebuilding sysvinit and openrc without optimization flags. I currently have:
=sys-libs/glibc-2.34-r10
=sys-devel/gcc-11.2.1_p20220115

The system was running for more than ten years without problem (with regular updates).
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-03-28 16:11:05 UTC
(In reply to Yarda from comment #5)
> I will try rebuilding sysvinit and openrc without optimization flags. I
> currently have:
> =sys-libs/glibc-2.34-r10
> =sys-devel/gcc-11.2.1_p20220115
> 
> The system was running for more than ten years without problem (with regular
> updates).

Which flags are you using? Your --info was just -O2 which is fine?
Comment 7 Yarda 2022-03-28 16:24:30 UTC
(In reply to Sam James from comment #6)
> (In reply to Yarda from comment #5)
> > I will try rebuilding sysvinit and openrc without optimization flags. I
> > currently have:
> > =sys-libs/glibc-2.34-r10
> > =sys-devel/gcc-11.2.1_p20220115
> > 
> > The system was running for more than ten years without problem (with regular
> > updates).
> 
> Which flags are you using? Your --info was just -O2 which is fine?

I tried -O0; it didn't help. The sysvinit update didn't change anything either. It behaves inconsistently: sometimes it complains about a wrong runlevel, sometimes it doesn't. Sometimes it even boots correctly, but mostly it ends up in the 'shutdown' runlevel according to rc-status. It looks like uninitialized memory. I created a simple wrapper, /sbin/openrc2:

# cat /sbin/openrc2
#!/bin/bash

/bin/date >> /tmp/log
echo ."$@". >> /tmp/log
/sbin/openrc "$@"

I replaced /sbin/openrc with /sbin/openrc2 in the inittab and rebooted; the resulting file is attached. It's still in the shutdown runlevel:
# rc-status
Runlevel: shutdown
 killprocs                                                                                                                [  stopped  ]
 savecache                                                                                                                [  stopped  ]
 mount-ro                                                                                                                 [  stopped  ]
Dynamic Runlevel: hotplugged
Dynamic Runlevel: needed/wanted
 dbus                                                                                                                     [  crashed  ]
 cupsd                                                                                                                    [  crashed  ]
Dynamic Runlevel: manual
 alsasound                                                                                                                [  started  ]
 fail2ban                                                                                                                 [  crashed  ]
 clamd                                                                                                                    [  crashed  ]
 sshd                                                                                                                     [  started  ]
 fcron                                                                                                                    [ stopping  ]

I am going to try replacing my inittab with the stock one, but I am unable to spot anything wrong in it.
Comment 8 Yarda 2022-03-28 16:26:02 UTC
Created attachment 768082 [details]
openrc2 wrapper log
Comment 9 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-03-28 16:30:22 UTC
Agreed. Would you be able to try building OpenRC with ASAN and then MSAN?

-fsanitize=address and -fsanitize=memory.

There's also UBSAN we can try.
Comment 10 Yarda 2022-03-28 17:36:53 UTC
(In reply to Sam James from comment #9)
> Agreed. Would you be able to try build OpenRC with ASAN and then MSAN?
> 
> -fsanitize=address and -fsanitize=memory.
> 
> There's also UBSAN we can try.

I thought the problem was in sysvinit, because it seemed to be calling openrc with garbage, so I compiled it with:

-fsanitize=address -lasan

-fsanitize=memory is not supported

But it didn't show anything, just:
Run /sbin/init as init process
WARNING: reading executable name failed with errno 2, some stack frames may not be symbolized
WARNING: ASan is ignoring requested __asan_handle_no_return: stack type: default top: ....
False positive error reports may follow

As for openrc, I wasn't able to compile it with ASan:
...
The Meson build system
Version: 0.60.3
Source dir: /var/tmp/portage/sys-apps/openrc-0.44.10/work/openrc-0.44.10
Build dir: /var/tmp/portage/sys-apps/openrc-0.44.10/work/openrc-0.44.10-build
Build type: native build
Project name: OpenRC
Project version: 0.44.10
==105==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.

But I think I should be able to run it through valgrind.
Comment 11 Yarda 2022-03-28 17:59:23 UTC
I wrapped openrc in valgrind for runlevel 3. In the inittab:
...
l3:3:wait:/sbin/openrc2 default
...

# cat /sbin/openrc2
#!/bin/bash

d=`/bin/date`
/usr/bin/valgrind --log-file="/tmp/$d" /sbin/openrc "$@"

# cat /tmp/Mon*
==3886== Memcheck, a memory error detector
==3886== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3886== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==3886== Command: /sbin/openrc default
==3886== Parent PID: 3884
==3886== 
==3886== Warning: noted but unhandled ioctl 0x5441 with no size/direction hints.
==3886==    This could cause spurious value errors to appear.
==3886==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper.
==3887== 
==3887== HEAP SUMMARY:
==3887==     in use at exit: 8,208 bytes in 2 blocks
==3887==   total heap usage: 783 allocs, 781 frees, 390,253 bytes allocated
==3887== 
==3887== LEAK SUMMARY:
==3887==    definitely lost: 8,208 bytes in 2 blocks
==3887==    indirectly lost: 0 bytes in 0 blocks
==3887==      possibly lost: 0 bytes in 0 blocks
==3887==    still reachable: 0 bytes in 0 blocks
==3887==         suppressed: 0 bytes in 0 blocks
==3887== Rerun with --leak-check=full to see details of leaked memory
==3887== 
==3887== For lists of detected and suppressed errors, rerun with: -s
==3887== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==3886== 
==3886== HEAP SUMMARY:
==3886==     in use at exit: 24,841 bytes in 13 blocks
==3886==   total heap usage: 8,589 allocs, 8,576 frees, 5,762,486 bytes allocated
==3886== 
==3886== LEAK SUMMARY:
==3886==    definitely lost: 24,744 bytes in 7 blocks
==3886==    indirectly lost: 97 bytes in 6 blocks
==3886==      possibly lost: 0 bytes in 0 blocks
==3886==    still reachable: 0 bytes in 0 blocks
==3886==         suppressed: 0 bytes in 0 blocks
==3886== Rerun with --leak-check=full to see details of leaked memory
==3886== 
==3886== For lists of detected and suppressed errors, rerun with: -s
==3886== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

# runlevel
N 3

# rc-status
Runlevel: shutdown
...

No idea why it is in the openrc runlevel 'shutdown', without networking, sshd, or any other service from the 'default' runlevel.
Comment 12 Yarda 2022-03-28 18:26:28 UTC
I noticed my wtmp was over 16 MB. I cleared and recreated it, but that didn't help. I can recognize 3 states happening randomly:
1) upon entering runlevel 3 it complains about an invalid runlevel and ends in the 'shutdown' openrc runlevel
2) upon entering runlevel 3 it doesn't complain, but it still ends in the 'shutdown' openrc runlevel
3) it seems to boot OK and ends in the 'default' openrc runlevel; services are running and runlevel returns "N 3", but it's probably unsure about the current runlevel, because the 'reboot' command shuts down the machine
Comment 13 Yarda 2022-03-28 23:41:50 UTC
(In reply to Yarda from comment #12)
> 3) it seems it boots OK, ended in the 'default' openrc runlevel, services
> are running and runlevel returns "N 3", but it's probably unsure about the
> current runlevel, because the 'reboot' command shutdowns the machine

The shutdown instead of the reboot in 3) is probably unrelated to this problem. The original sysvinit inittab calls '/sbin/reboot -dkn', where -k is a downstream patched-in option for kexec which calls reboot(LINUX_REBOOT_CMD_KEXEC). From 'man 2 reboot':

       LINUX_REBOOT_CMD_KEXEC
              (RB_KEXEC, 0x45584543, since Linux 2.6.13).  Execute a kernel that has been loaded earlier with kexec_load(2).   This
              option is available only if the kernel was configured with CONFIG_KEXEC.

It seems the patch doesn't call kexec_load(). Although it's undocumented, I suppose it would boot the current kernel, but the main problem is that I don't have CONFIG_KEXEC enabled in my kernel. I will try removing the -k from my inittab, and I expect the reboot will start working. I will test it when I physically get to the machine later today.
Comment 14 Yarda 2022-03-28 23:50:06 UTC
(In reply to Yarda from comment #11)
> # cat /tmp/Mon*
> ==3886== Memcheck, a memory error detector
> ==3886== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==3886== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
> ==3886== Command: /sbin/openrc default
> ==3886== Parent PID: 3884
> ==3886== 
> ==3886== Warning: noted but unhandled ioctl 0x5441 with no size/direction
> hints.
> ==3886==    This could cause spurious value errors to appear.
> ==3886==    See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a
> proper wrapper.

This is probably the TIOCGPTPEER ioctl. Newer kernels expose the TIOCGPTPEER ioctl to userspace, which allows safely allocating a file descriptor for a pty slave based solely on the master file descriptor.

This shouldn't cause a false negative report.
Comment 15 Yarda 2022-03-29 00:00:54 UTC
(In reply to Yarda from comment #8)
> Created attachment 768082 [details]
> openrc2 wrapper log

The intermediate NUL bytes may be unrelated to this problem (a bad I/O cache flush, etc.).

I will try rebuilding glibc and downgrading gcc and openrc/sysvinit, then instrumenting openrc with debug points or booting it under a debugger (when I am on site). The random nature of this problem complicates debugging.
Comment 16 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-03-29 06:06:45 UTC
(In reply to Yarda from comment #10)
> ==105==ASan runtime does not come first in initial library list; you should
> either link runtime to your application or manually preload it with
> LD_PRELOAD.
> 

As a workaround, please temporarily compile with FEATURES="-sandbox -usersandbox" emerge -v1 ...

It should work then.
Comment 17 Yarda 2022-03-29 18:23:02 UTC
https://youtu.be/_6sWR4JQRaw

Unfortunately, I didn't have a serial console. There are a lot of memory leaks, which probably aren't related to this problem. Near the end (00:59) there is a heap overflow error. It's hard to spot because the LCD is quite slow, but it seems to happen at src/rc/rc.c:896 and src/librc/librc.c:487, probably in the snprintf. I guess it's because there is some garbage in the runlevel, and in:

snprintf(path, sizeof(path), "%s/%s", RC_RUNLEVELDIR, runlevel);

there should probably be sizeof(path) - 1, and the path should be explicitly initialized to 0 so that the heap overflow cannot happen, but that doesn't explain where the runlevel garbage is coming from.

There are also some minor service start errors which shouldn't be related to this problem.

Regarding the kexec reboot, I verified that without the '-k' the reboot works OK. I will probably open an upstream kernel bug about it, because I think that if KEXEC is unsupported it should reboot, not shut down (I am pretty sure it worked this way some time ago).
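
On the snprintf point in this comment: in standard C, snprintf() writes at most the given size in bytes, including the terminating NUL, so passing sizeof(path) does not by itself overflow the destination. What it cannot protect against is a source string that is not NUL-terminated, or silent truncation of the result. The following is only a minimal sketch of a checked path builder; build_runlevel_path and RUNLEVEL_DIR are hypothetical names used for illustration and are not OpenRC's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for OpenRC's RC_RUNLEVELDIR; the real value may differ. */
#define RUNLEVEL_DIR "/etc/runlevels"

/*
 * Build "<RUNLEVEL_DIR>/<runlevel>" into path.
 * Returns 0 on success, or -1 if the arguments are unusable or the
 * result did not fit into the buffer (snprintf truncated it).
 */
int build_runlevel_path(char *path, size_t size, const char *runlevel)
{
    int n;

    assert(path != NULL);
    if (size == 0 || runlevel == NULL || runlevel[0] == '\0')
        return -1;

    /* snprintf writes at most size bytes, including the trailing NUL. */
    n = snprintf(path, size, "%s/%s", RUNLEVEL_DIR, runlevel);
    if (n < 0 || (size_t)n >= size)
        return -1; /* encoding error or truncated output */

    return 0;
}
```

A caller would treat -1 as "no usable runlevel path" rather than passing a truncated path on. Note that this only guards the destination buffer; it still assumes runlevel points to a valid NUL-terminated string.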
Comment 18 Yarda 2022-03-29 18:25:50 UTC
I got the text log, I will post it in the next comment.
Comment 19 Yarda 2022-03-29 18:33:27 UTC
Created attachment 768164 [details]
Heap overflow error

This is the heap overflow error; the memory leaks weren't logged (I don't currently know why).
Comment 20 Yarda 2022-03-29 19:41:47 UTC
The garbage gets in on src/rc.c:875:
krunlevel = get_krunlevel();
Comment 21 Yarda 2022-03-29 20:20:55 UTC
I finally got it: the problem was caused by a misconfiguration (what else? :) which had been in place and working for more than 10 years :)

The machine has an SSD and HDDs; the SSD is used only for speed-critical dirs. The following was in the fstab:

...
/mnt/data/var		/var		none		bind		0 0
/mnt/data/run		/run		none		bind		0 0
...

And /mnt/data/run/openrc/krunlevel contained 'shutdown'. The garbage was probably read when a race was hit during the remount of the FS over the tmpfs.

Sorry for the noise.
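
The fstab entries in this comment bind-mount a persistent directory onto /run, so state files such as /run/openrc/krunlevel can survive a reboot instead of starting out on an empty tmpfs. A minimal sketch of a check for that pattern follows, assuming the usual whitespace-separated fstab fields (device, mount point, type, options). The function name fstab_line_binds_onto is hypothetical and is not part of OpenRC or any library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if a single fstab line bind-mounts something onto target,
 * 0 otherwise. Only the first four whitespace-separated fields
 * (device, mount point, type, options) are inspected.
 */
int fstab_line_binds_onto(const char *line, const char *target)
{
    char spec[256], file[256], vfstype[64], mntops[256];
    const char *p;

    assert(line != NULL && target != NULL);

    /* Skip leading blanks, then ignore comments and empty lines. */
    line += strspn(line, " \t");
    if (*line == '#' || *line == '\n' || *line == '\0')
        return 0;

    if (sscanf(line, "%255s %255s %63s %255s", spec, file, vfstype, mntops) != 4)
        return 0;
    if (strcmp(file, target) != 0)
        return 0;

    /* Options are comma-separated; look for an exact "bind" or "rbind". */
    p = mntops;
    while (*p != '\0') {
        size_t len = strcspn(p, ",");

        if ((len == 4 && strncmp(p, "bind", 4) == 0) ||
            (len == 5 && strncmp(p, "rbind", 5) == 0))
            return 1;
        p += len;
        if (*p == ',')
            p++;
    }
    return 0;
}
```

Feeding each line of /etc/fstab to this function with the target "/run" would flag the /mnt/data/run entry quoted in this comment, but not the /var one.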
Comment 22 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-03-30 01:54:44 UTC
(In reply to Yarda from comment #21)
> I finally got it, the problem was caused by misconfiguration (what else? :)
> which was in place and working for more than 10 years :)
> 
> The machine has SSD and HDDs, the SSD is used only for speed critical dirs,
> the following was in the fstab:
> 
> ...
> /mnt/data/var		/var		none		bind		0 0
> /mnt/data/run		/run		none		bind		0 0
> ...
> 
> And in the /mnt/data/run/openrc/krunlevel was 'shutdown'. The garbage was
> probably read if the race was hit during the remount of the FS over the
> tmpfs.
> 
> Sorry for noise.

Thanks for updating! FWIW, we really shouldn't ever overflow like that even if garbage is given.
Comment 23 Yarda 2022-03-30 08:27:34 UTC
(In reply to Sam James from comment #22)
> (In reply to Yarda from comment #21)
> > I finally got it, the problem was caused by misconfiguration (what else? :)
> > which was in place and working for more than 10 years :)
> > 
> > The machine has SSD and HDDs, the SSD is used only for speed critical dirs,
> > the following was in the fstab:
> > 
> > ...
> > /mnt/data/var		/var		none		bind		0 0
> > /mnt/data/run		/run		none		bind		0 0
> > ...
> > 
> > And in the /mnt/data/run/openrc/krunlevel was 'shutdown'. The garbage was
> > probably read if the race was hit during the remount of the FS over the
> > tmpfs.
> > 
> > Sorry for noise.
> 
> Thanks for updating! FWIW, we really shouldn't ever overflow like that even
> if garbage is given.

NP. There is probably also a kernel (or maybe glibc) bug, because I think the read shouldn't return garbage upon a bind remount. I will try to strip this down, and I will probably also open an upstream kernel bug.
Comment 24 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-03-31 05:40:33 UTC
Please keep us updated either way, as I'm definitely interested.

Sadly, I can't reproduce with my attempts so far to put junk in the krunlevel file, but I'll still see if the code looks right.
Comment 25 Yarda 2022-04-03 08:54:03 UTC
From the strace it doesn't seem to be a kernel bug:
...
write(1, "newlevel1: .default.\n", 21)  = 21 # instrumented debug output showing content of the newlevel before get_krunlevel
rt_sigaction(SIGCHLD, {sa_handler=0x559791083bb0, sa_mask=[], sa_flags=SA_RESTORER, sa_restorer=0x7f35ecea2790}, NULL, 8) = 0
write(1, "krunlevel\n", 10)             = 10 # instrumented debug output showing we reached the get_krunlevel
newfstatat(AT_FDCWD, "/run/openrc/krunlevel", {st_dev=makedev(0x8, 0x15), st_ino=35782679, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=1648943081 /* 2022-04-03T01:44:41.781000000+0200 */, st_atime_nsec=781000000, st_mtime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_mtime_nsec=604915136, st_ctime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_ctime_nsec=604915136}, 0) = 0
openat(AT_FDCWD, "/run/openrc/krunlevel", O_RDONLY) = 3
newfstatat(3, "", {st_dev=makedev(0x8, 0x15), st_ino=35782679, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=1648943081 /* 2022-04-03T01:44:41.781000000+0200 */, st_atime_nsec=781000000, st_mtime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_mtime_nsec=604915136, st_ctime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_ctime_nsec=604915136}, AT_EMPTY_PATH) = 0
read(3, "", 4096)                       = 0 # empty string read, not garbage
close(3)                                = 0
newfstatat(AT_FDCWD, "/run/openrc/krunlevel", {st_dev=makedev(0x8, 0x15), st_ino=35782679, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=0, st_size=0, st_atime=1648943081 /* 2022-04-03T01:44:41.781000000+0200 */, st_atime_nsec=781000000, st_mtime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_mtime_nsec=604915136, st_ctime=1648943025 /* 2022-04-03T01:43:45.604915136+0200 */, st_ctime_nsec=604915136}, 0) = 0
unlink("/run/openrc/krunlevel")         = 0
write(1, "newlevel3: .\250v\264\313\222U.\n", 20) = 20 # instrumented debug output showing content of the newlevel after get_krunlevel and there is garbage now
...

Maybe it's a glibc bug? I am going to focus on getline next.
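
One hypothesis consistent with the strace above, though not confirmed anywhere in this thread, is that the caller uses the buffer handed to getline() without checking the return value. Per getline(3), the call returns -1 on end-of-file with nothing read, and a buffer may still have been allocated whose contents should not be relied on. The sketch below shows a defensive reader under that assumption; read_first_line is a hypothetical helper, not OpenRC's actual function.

```c
#define _POSIX_C_SOURCE 200809L /* for getline() and ssize_t */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Read the first line of the file at path, without the trailing newline.
 * Returns a malloc'd string that the caller must free, or NULL if the
 * file cannot be opened, is empty, or the read fails.
 */
char *read_first_line(const char *path)
{
    FILE *fp;
    char *line = NULL;
    size_t cap = 0;
    ssize_t len;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return NULL;

    len = getline(&line, &cap, fp);
    fclose(fp);

    if (len <= 0) {
        /* -1 means EOF or error: a buffer may have been allocated, but
         * its contents must not be used, so free it and report no line. */
        free(line);
        return NULL;
    }

    /* Strip the newline that getline keeps at the end of the line. */
    if (line[len - 1] == '\n')
        line[len - 1] = '\0';
    return line;
}
```

With a helper like this, an empty /run/openrc/krunlevel yields NULL instead of an arbitrary buffer, and the caller can fall back to the requested runlevel. The returned string should still be validated as a runlevel name before use.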