Bug 748984

Summary: app-containers/docker locks /dev/null, causes other apps to hang forever
Product: Gentoo Linux Reporter: Joakim Tjernlund <joakim.tjernlund>
Component: Current packages Assignee: William Hubbs <williamh>
Status: RESOLVED FIXED    
Severity: normal CC: andy.dalton, gyakovlev, jstein, kripton, melser_regs, sam
Priority: Normal    
Version: unspecified   
Hardware: All   
OS: Linux   
See Also: https://bugs.gentoo.org/show_bug.cgi?id=764110
Whiteboard:
Package list:
Runtime testing required: ---
Attachments: emerge --info (not of the original reporter)

Description Joakim Tjernlund 2020-10-14 14:27:12 UTC
ebuild --skip-manifest --debug /usr/portage/app-emulation/virtualbox-guest-additions/virtualbox-guest-additions-6.1.14a.ebuild prepare
....

+ pushd src/VBox/Additions
+ ebegin 'Extracting guest kernel module sources'
+ local 'msg=Extracting guest kernel module sources' dots spaces=
+ [[ -n '' ]]
+ msg='Extracting guest kernel module sources ...'
+ einfon 'Extracting guest kernel module sources ...'
+ __elog_base INFO 'Extracting guest kernel module sources ...'
+ local messagetype
+ '[' -z INFO -o -z /var/tmp/portage/app-emulation/virtualbox-guest-additions-6.1.14a/temp -o '!' -d /var/tmp/portage/app-emulation/virtualbox-guest-additions-6.1.14a/temp/logging ']'
+ case "${1}" in
+ messagetype=INFO
+ shift
+ echo -e 'Extracting guest kernel module sources ...'
+ read -r
+ echo 'INFO Extracting guest kernel module sources ...'
+ read -r
+ return 0
+ [[ yes != \y\e\s ]]
+ echo -ne ' * Extracting guest kernel module sources ...'
 * Extracting guest kernel module sources ...+ LAST_E_CMD=einfon
+ return 0
+ [[ yes == \y\e\s ]]
+ echo

+ LAST_E_LEN=45
+ LAST_E_CMD=ebegin
+ return 0
+ kmk GuestDrivers-src vboxguest-src vboxsf-src vboxvideo-src
----- hangs here ----
strace -p 7109
strace: Process 7109 attached
fcntl(1, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}^Cstrace: Process 7109 detached
 <detached ...>


In ebuild you have:
 kmk GuestDrivers-src vboxguest-src vboxsf-src vboxvideo-src >&/dev/null || die

If I remove the null redirect it does not hang anymore:
  kmk GuestDrivers-src vboxguest-src vboxsf-src vboxvideo-src || die

kernel 5.4.70 and 5.4.71
Comment 1 Joakim Tjernlund 2020-10-14 14:29:35 UTC
kmk from dev-util/kbuild-0.1.9998.3407
Comment 2 Joakim Tjernlund 2020-10-20 11:08:18 UTC
Ping? This is a real bother to deal with at each kernel upgrade.
Would you mind removing the /dev/null redirect?
Comment 3 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2020-10-20 16:55:47 UTC
I cannot reproduce this. Please post your

  emerge --info virtualbox-guest-additions

output to this bug.
Comment 4 Joakim Tjernlund 2020-10-20 19:41:26 UTC
emerge --info virtualbox-guest-additions
Portage 3.0.8 (python 3.7.9-final-0, default/linux/amd64/17.0/desktop, gcc-9.3.0, glibc-2.31-r6, 5.4.72-x86_64 x86_64)
=================================================================
                         System Settings
=================================================================
System uname: Linux-5.4.72-x86_64-x86_64-Intel_Xeon_E3-12xx_v2_-Ivy_Bridge,_IBRS-with-gentoo-64
KiB Mem:    16395788 total,    495748 free
KiB Swap:    4194300 total,   4194300 free
Timestamp of repository gentoo: Tue, 20 Oct 2020 14:45:01 +0000
Head commit of repository gentoo: 66118dc9fd89ce367c32928fefd979d0540f2150
sh bash 5.0_p18
ld GNU ld (Gentoo 2.34 p6) 2.34.0
distcc 3.3.3 x86_64-pc-linux-gnu [disabled]
ccache version 3.7.11 [disabled]
app-shells/bash:          5.0_p18::gentoo
dev-java/java-config:     2.3.1::gentoo
dev-lang/perl:            5.30.3::gentoo
dev-lang/python:          2.7.18-r4::gentoo, 3.7.9::gentoo
dev-util/ccache:          3.7.11::gentoo
dev-util/cmake:           3.17.4-r1::gentoo
dev-util/pkgconfig:       0.29.2::gentoo
sys-apps/baselayout:      2.7::gentoo
sys-apps/openrc:          0.42.1::gentoo
sys-apps/sandbox:         2.18::gentoo
sys-devel/autoconf:       2.13-r1::gentoo, 2.69-r5::gentoo
sys-devel/automake:       1.15.1-r2::gentoo, 1.16.1-r1::gentoo
sys-devel/binutils:       2.34-r2::gentoo
sys-devel/gcc:            8.4.0-r1::gentoo, 9.3.0-r1::gentoo
sys-devel/gcc-config:     2.3.2::gentoo
sys-devel/libtool:        2.4.6-r6::gentoo
sys-devel/make:           4.2.1-r4::gentoo
sys-kernel/linux-headers: 5.4-r1::gentoo (virtual/os-headers)
sys-libs/glibc:           2.31-r6::gentoo
Repositories:

tmv3-cross-overlay
    location: /var/lib/layman/tmv3-cross-overlay
    sync-type: laymansync
    sync-uri: git://git.transmode.se/tmv3-cross-overlay.git
    priority: 50

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://devsrv.transmode.se/portage
    priority: 100
    sync-rsync-verify-metamanifest: no
    sync-rsync-extra-opts: 
    sync-rsync-verify-max-age: 24
    sync-rsync-verify-jobs: 1

transmode
    location: /var/lib/layman/transmode
    sync-type: laymansync
    sync-uri: https://devsrv.transmode.se/svn/portage-overlay/stable
    masters: gentoo
    priority: 150

Installed sets: @cross-powerpc-toolchains, @infinera-desktop, @infinera-plasma
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA FraunhoferFDK ms-teams-pre dlj-1.1 Oracle-BCLA-JavaSE AdobeFlash-11.x google-chrome"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -Wno-deprecated-declarations -Wno-error"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/maven-bin-3.6/conf /usr/share/sddm/scripts/Xsession"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/openconnect/csd-wrapper.sh /etc/php/apache2-php7.4/ext-active/ /etc/php/cgi-php7.4/ext-active/ /etc/php/cli-php7.4/ext-active/ /etc/portage/package.keywords/zzzautounmask /etc/portage/package.unmask/zzzautounmask /etc/portage/package.use/zzzautounmask /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c /usr/share/dbus-1/services/org.kde.plasma.Notifications.service"
CXXFLAGS="-O2 -pipe -Wno-deprecated-declarations -Wno-error"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--verbose-conflicts --autounmask-write --autounmask  --autounmask-continue --complete-graph=y --with-bdeps=y --quiet --jobs=3 --keep-going --usepkg --usepkg-exclude 'sys-firmware/hpuefi-mod app-emulation/virtualbox-modules app-emulation/virtualbox-guest-additions'"
ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://devsrv.transmode.se/portage http://devsrv.transmode.se/portage/local http://ftp.df.lth.se/pub/gentoo"
INSTALL_MASK=" /usr/share/dbus-1/services/org.kde.plasma.Notifications.service  /usr/lib64/libnssckbi.so /usr/lib/libnssckbi.so  /usr/lib64/pkcs11/gnome-keyring-pkcs11.so /usr/lib/pkcs11/gnome-keyring-pkcs11.so "
LANG="en_GB.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="en sv en_GB sv_SE en_US"
MAKEOPTS="-s -j8"
PKGDIR="/usr/portage/packages"
PORTAGE_BINHOST="http://devsrv.transmode.se/portage/packages/gentoo64/"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="X a52 aac accessibility acl acpi ads alsa amd64 bash-completion berkdb bluetooth branding bzip2 cairo caja cdda cdr clang cli client crypt cscope cups cxx dbus device-mapper dhcpcd dri dri3 dts dvd dvdr elogind emacs emboss encode exif faac ffmpeg flac fontconfig fortran gbm gdbm gif glamor gles gles2 glib gold gpm gssapi gstreamer gtk gtk3 gui hwaccel iconv icu idn introspection ipv6 java javafx jpeg kerberos kvm lcms ldap libglvnd libnotify libtirpc lvm mad mate mmx mng modemmanager mp3 mp4 mpeg mtp multilib natspec ncurses networkmanager nls nptl nscd ogg opengl openmp openssl opus p2p pam pango pcre pdf pdfimport pidgin pm-utils png policykit postproc ppds pulseaudio python qemu qt5 readline resolvconf rpc samba sasl script sdl seccomp secure-delete smi smp sound spell spice split-usr sqlite srt sse sse2 sse3 sse4 sse4_1 ssl ssse3 startup-notification subversion svg sync-plugin-portage system-bootstrap system-cairo system-harfbuzz system-icu system-jpeg system-libvpx system-llvm system-sqlite tcpd threads tiff truetype udev udisks unicode upower usb usbredir v4l vaapi vala vdpau vim virt-network virtualbox vnc vorbis vpx vulkan webdav wxwidgets x264 xattr xcb xft xinerama xml xpm xv xvid zlib" ABI_X86="64" ADA_TARGET="gnat_2018" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df 
interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx f16c mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="efi-64 pc qemu" INPUT_DEVICES="evdev libinput synaptics" KERNEL="linux" L10N="en sv en-GB sv-SE en-US" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-5" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_7" PYTHON_TARGETS="python2_7 python3_7" QEMU_SOFTMMU_TARGETS="i386 x86_64 ppc arm aarch64" QEMU_USER_TARGETS="i386 x86_64 ppc arm aarch64" RUBY_TARGETS="ruby26" USERLAND="GNU" VIDEO_CARDS="intel i965 amdgpu vmware qxl fbdev vesa cirrus" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

=================================================================
                        Package Settings
=================================================================

app-emulation/virtualbox-guest-additions-6.1.14a::gentoo was built with the following:
USE="X" ABI_X86="(64)"
CFLAGS="-O2 -pipe -Wno-deprecated-declarations -Wno-error -fdebug-prefix-map=..=/var/tmp/portage/app-emulation/virtualbox-guest-additions-6.1.14a"
CXXFLAGS="-O2 -pipe -Wno-deprecated-declarations -Wno-error -fdebug-prefix-map=..=/var/tmp/portage/app-emulation/virtualbox-guest-additions-6.1.14a"
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
Comment 5 Joakim Tjernlund 2020-10-22 22:21:34 UTC
Any ideas ?
Comment 6 Joakim Tjernlund 2020-10-27 17:57:00 UTC
I guess you still have trouble reproducing this?

I still think you could remove the /dev/null redirect regardless, as
it hides potential build problems from the user.
Comment 7 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2020-10-27 21:17:32 UTC
(In reply to Joakim Tjernlund from comment #6)
> I guess you still have problems reproducing this ?
> 
> Still think you could rm the /dev/null part regardless as
> it hides potential build problems from the user.

Indeed, I still cannot reproduce it but saw another report about this in the Gentoo Forums.
I'd like to keep the "&>/dev/null" snippet if possible as the kmk command is way too verbose and just adds useless output to the build.log

Any more pointers to the real source of this problem are highly welcome.
Comment 8 Joakim Tjernlund 2020-10-27 21:31:01 UTC
(In reply to Lars Wendler (Polynomial-C) from comment #7)
> 
> Indeed, I still cannot reproduce it but saw another report about this in the
> Gentoo Forums.
> I'd like to keep the "&>/dev/null" snippet if possible as the kmk command is
> way too verbose and just adds useless output to the build.log
> 
> Any more pointers to the real source of this problem are highly welcome.

In case you haven't seen this already, kmk hangs on:
  fcntl(1, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}
i.e. stdout, which is /dev/null.
Kernel bug for /dev/null?
Comment 9 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2020-10-27 21:41:58 UTC
(In reply to Joakim Tjernlund from comment #8)
> (In reply to Lars Wendler (Polynomial-C) from comment #7)
> > 
> > Indeed, I still cannot reproduce it but saw another report about this in the
> > Gentoo Forums.
> > I'd like to keep the "&>/dev/null" snippet if possible as the kmk command is
> > way too verbose and just adds useless output to the build.log
> > 
> > Any more pointers to the real source of this problem are highly welcome.
> 
> In case you haven't seen this already, kmk hangs on:
>   fcntl(1, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}
> i.e. stdout, which is /dev/null.
> Kernel bug for /dev/null?

I doubt it. I'm also on glibc-2.31-r6 running kernel-5.4.72 and I don't have the hangs.
What are your permissions for /dev/null?
Comment 10 Joakim Tjernlund 2020-10-27 21:45:21 UTC
ls -l /dev/null 
crw-rw-rw- 1 root root 1, 3 Oct 22 14:14 /dev/null
Comment 11 Joakim Tjernlund 2020-10-27 21:46:10 UTC
(In reply to Lars Wendler (Polynomial-C) from comment #9)
> (In reply to Joakim Tjernlund from comment #8)
> > (In reply to Lars Wendler (Polynomial-C) from comment #7)
> > > 
> > > Indeed, I still cannot reproduce it but saw another report about this in the
> > > Gentoo Forums.
> > > I'd like to keep the "&>/dev/null" snippet if possible as the kmk command is
> > > way too verbose and just adds useless output to the build.log
> > > 
> > > Any more pointers to the real source of this problem are highly welcome.
> > 
> > In case you haven't seen this already, kmk hangs on:
> >   fcntl(1, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}
> > i.e. stdout, which is /dev/null.
> > Kernel bug for /dev/null?
> 
> I doubt it. I'm also on glibc-2.31-r6 running kernel-5.4.72 and I don't have
> the hangs.
> What are your permissions for /dev/null?

sandbox bug?
Comment 12 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2020-10-27 21:48:44 UTC
# qlist -CIve sandbox
sys-apps/sandbox-2.20
Comment 13 Joakim Tjernlund 2020-10-27 21:57:31 UTC
(In reply to Lars Wendler (Polynomial-C) from comment #12)
> # qlist -CIve sandbox
> sys-apps/sandbox-2.20

qlist -CIve sandbox
sys-apps/sandbox-2.18
Comment 14 Joakim Tjernlund 2020-10-27 22:03:14 UTC
(In reply to Joakim Tjernlund from comment #13)
> (In reply to Lars Wendler (Polynomial-C) from comment #12)
> > # qlist -CIve sandbox
> > sys-apps/sandbox-2.20
> 
> qlist -CIve sandbox
> sys-apps/sandbox-2.18

Upgrading to sys-apps/sandbox-2.20 did not help for me.
I do have some packages built with ABI_X86="32 64".
Comment 15 Joakim Tjernlund 2020-10-27 22:10:16 UTC
# > eix sys-apps/portage

[I] sys-apps/portage
     Available versions:  2.3.99-r2 3.0.4-r1^t 3.0.8^t{tbz2} **9999*l^t {apidoc build doc gentoo-dev +ipc +native-extensions +rsync-verify selinux test xattr KERNEL="linux" PYTHON_TARGETS="pypy3 python3_6 python3_7 python3_8 python3_9"}
     Installed versions:  3.0.8^t{tbz2}(12:48:25 21/09/20)(ipc native-extensions xattr -apidoc -build -doc -gentoo-dev -rsync-verify -selinux -test KERNEL="linux" PYTHON_TARGETS="python3_7 -pypy3 -python3_6 -python3_8 -python3_9")
Comment 16 Marc Elser 2020-10-28 06:15:57 UTC
Hi,

I'm the other person experiencing this bug. My Gentoo Forums post was just linked to this report; I wasn't aware of it before. My emerge also hangs in kmk.

Would you like me to post some package versions or anything else to compare with Joakim's system?
Comment 17 jannis 2020-10-28 11:45:12 UTC
Hi, I'm experiencing the same hang in a virtual machine I'm currently updating after 2 months. emerge --info will be attached
Comment 18 jannis 2020-10-28 11:48:48 UTC
Created attachment 668951
emerge --info (not of the original reporter)

emerge --info from the machine where I also experience this problem
Comment 19 Joakim Tjernlund 2020-10-28 21:33:16 UTC
(In reply to jannis from comment #17)
> Hi, I'm experiencing the same hang in a virtual machine I'm currently
> updating after 2 months. emerge --info will be attached

For me it is both VMs and physical machines, all built from the same gentoo img though.
Comment 20 Joakim Tjernlund 2020-10-28 21:47:56 UTC
Profile 17.0 vs. 17.1, maybe?
Jannis and I are still on 17.0.
Comment 21 Joakim Tjernlund 2020-10-28 22:40:08 UTC
Found something:
zcat /proc/config.gz | grep NULL
# CONFIG_BLK_DEV_NULL_BLK is not set

flipping that to on:
CONFIG_BLK_DEV_NULL_BLK=y
made kmk work

Over to you to explain :)
Comment 22 Andy Dalton 2020-10-28 23:07:19 UTC
I'm seeing this problem as well.  It looks like `kmk` is blocked trying
to lock `/dev/null` while something else is holding that lock:

```
$ sudo lslocks -u --output-all | grep kmk
kmk              1703  POSIX      WRITE* 0     0   0 /dev/null      -1


$ sudo lslocks -u --output-all | grep -- -1
kmk              1703  POSIX      WRITE* 0     0   0 /dev/null      -1
(undefined)        -1 OFDLCK      READ   0     0   0 /dev...
$
```

It looks like something holds an "Open File Description" lock on `/dev/null`.
From `man lslocks`:

> Note  that lslocks also lists OFD (Open File Description) locks, these
> locks are not associated with any process (PID is -1).  OFD locks are
> associated with the open file description on which they are acquired.
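
The same information that `lslocks` prints can be read from `/proc/locks`. Below is a rough sketch of a filter for OFD locks and blocked waiters, assuming the line layout I understand the kernel to use (ordinal, optional `->` for a waiter, lock class, ADVISORY/MANDATORY, READ/WRITE, PID, `major:minor:inode`, start, end). `lockEntry` and `parseLockLine` are hypothetical names introduced for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// lockEntry holds the /proc/locks fields that matter for this bug.
type lockEntry struct {
	Kind    string // "POSIX", "FLOCK" or "OFDLCK"
	Mode    string // "READ" or "WRITE"
	PID     string // "-1" for OFD locks, which belong to no process
	Inode   string // "major:minor:inode" of the locked file
	Blocked bool   // true when the line describes a waiter ("->")
}

// parseLockLine splits one /proc/locks line into a lockEntry.
// It returns false for lines that do not have the assumed layout.
func parseLockLine(line string) (lockEntry, bool) {
	fields := strings.Fields(line)
	// Waiters carry an extra "->" token right after the ordinal.
	blocked := len(fields) > 1 && fields[1] == "->"
	if blocked {
		fields = append(fields[:1], fields[2:]...)
	}
	if len(fields) < 6 {
		return lockEntry{}, false
	}
	return lockEntry{
		Kind:    fields[1],
		Mode:    fields[3],
		PID:     fields[4],
		Inode:   fields[5],
		Blocked: blocked,
	}, true
}

func main() {
	f, err := os.Open("/proc/locks")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		// Print only OFD locks and blocked waiters, the two sides of this hang.
		if e, ok := parseLockLine(sc.Text()); ok && (e.Kind == "OFDLCK" || e.Blocked) {
			fmt.Printf("%+v\n", e)
		}
	}
}
```

Matching the `Inode` field of a blocked entry against that of an `OFDLCK` entry shows which lock a waiter is stuck behind; the inode of /dev/null can be looked up with `stat /dev/null`.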
Comment 23 Andy Dalton 2020-10-28 23:22:59 UTC
After some additional searching, I found some posts where people were having trouble with dockerd creating OFDLCK locks in /dev.  Sure enough, if I stop the docker daemon, the OFDLCK lock goes away.  While dockerd is stopped, I can successfully emerge this package.

$ sudo emerge @module-rebuild
Calculating dependencies... done!
>>> Verifying ebuild manifests
>>> Emerging (1 of 1) app-emulation/virtualbox-guest-additions-6.0.24-r1::gentoo
>>> Installing (1 of 1) app-emulation/virtualbox-guest-additions-6.0.24-r1::gentoo
>>> Jobs: 1 of 1 complete                           Load avg: 2.58, 1.70, 0.73

 * Messages for package app-emulation/virtualbox-guest-additions-6.0.24-r1:
...

So that might be a work-around for others hitting this problem.
Comment 24 Andy Dalton 2020-10-28 23:36:59 UTC
I found an open bug against docker for this issue:

https://github.com/moby/moby/issues/31182

That said, that bug has been open since 2017 with little activity.
Comment 25 Joakim Tjernlund 2020-10-28 23:50:00 UTC
(In reply to Andy Dalton from comment #23)
> After some additional searching, I found some posts where people were having
> trouble with dockerd creating OFDLCK locks in /dev.  Sure enough, if I stop
> the docker daemon, the OFDLCK lock goes away.  While dockerd is stopped, I
> can successfully emerge this package.
> 
> $ sudo emerge @module-rebuild
> Calculating dependencies... done!
> >>> Verifying ebuild manifests
> >>> Emerging (1 of 1) app-emulation/virtualbox-guest-additions-6.0.24-r1::gentoo
> >>> Installing (1 of 1) app-emulation/virtualbox-guest-additions-6.0.24-r1::gentoo
> >>> Jobs: 1 of 1 complete                           Load avg: 2.58, 1.70, 0.73
> 
>  * Messages for package app-emulation/virtualbox-guest-additions-6.0.24-r1:
> ...
> 
> So that might be a work-around for others hitting this problem.

Right, my previous kernel config trick did not work on my VMs but
if I stop dockerd it works :)
Comment 26 jannis 2020-10-29 06:50:55 UTC
Same for me, stopping docker fixed the build.
Could we add some warning message prior to emerge in that ebuild?
Comment 27 Joakim Tjernlund 2020-10-29 09:29:32 UTC
(In reply to jannis from comment #26)
> Same for me, stopping docker fixed the build.
> Could we add some warning message prior to emerge in that ebuild?

Just remove the >&/dev/null in virtualbox-guest-additions for now. It
may be quite some time until docker is fixed.

One could also ask why kmk needs to lock stdout at all.
Comment 28 Joakim Tjernlund 2020-10-29 13:27:12 UTC
(In reply to Andy Dalton from comment #22)
> I'm seeing this problem as well.  It looks like `kmk` is blocked trying
> to lock `/dev/null` while something else is holding that lock:
> 
> ```
> $ sudo lslocks -u --output-all | grep kmk
> kmk              1703  POSIX      WRITE* 0     0   0 /dev/null      -1
> 
> 
> $ sudo lslocks -u --output-all | grep -- -1
> kmk              1703  POSIX      WRITE* 0     0   0 /dev/null      -1
> (undefined)        -1 OFDLCK      READ   0     0   0 /dev...

These locks have different modes (READ vs. WRITE). Should READ
locks block WRITE locks?
Comment 29 Andy Dalton 2020-10-29 13:57:06 UTC
(In reply to Joakim Tjernlund from comment #28)
> These locks are of different MODEs(READ vs WRITE). Should READ
> locks block WRITE locks?

Yes, a READ lock would allow other concurrent READs but would block WRITEs.  A WRITE lock would block both READs and WRITEs.
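
The compatibility rule described here fits in a few lines. This is only an illustrative model of the rule for two locks on overlapping byte ranges held by different owners, not kernel code; `lockMode` and `conflicts` are hypothetical names.

```go
package main

import "fmt"

// lockMode models the two advisory lock modes shown by lslocks.
type lockMode int

const (
	readLock lockMode = iota
	writeLock
)

func (m lockMode) String() string {
	if m == writeLock {
		return "WRITE"
	}
	return "READ"
}

// conflicts reports whether a requested lock has to wait for a lock that is
// already held on an overlapping range by a different owner. Only READ with
// READ is compatible; any pairing that involves WRITE conflicts.
func conflicts(held, requested lockMode) bool {
	return held == writeLock || requested == writeLock
}

func main() {
	modes := []lockMode{readLock, writeLock}
	for _, held := range modes {
		for _, req := range modes {
			fmt.Printf("held=%-5v requested=%-5v conflict=%v\n", held, req, conflicts(held, req))
		}
	}
}
```

In this bug the held lock is the OFD READ lock and the request is kmk's WRITE lock, so `conflicts(readLock, writeLock)` is the relevant case.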
Comment 30 Joakim Tjernlund 2020-10-29 14:32:16 UTC
(In reply to Andy Dalton from comment #29)
> (In reply to Joakim Tjernlund from comment #28)
> > These locks are of different MODEs(READ vs WRITE). Should READ
> > locks block WRITE locks?
> 
> Yes, a READ lock would allow other concurrent READs but would block WRITEs. 
> A WRITE lock would block both READs and WRITEs.

OK, but it is a bit strange that the problem started only recently; we have been building this for years with docker running.

Did virtualbox-guest-additions gain the >&/dev/null recently?
Comment 31 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2020-10-30 08:31:57 UTC
(In reply to Joakim Tjernlund from comment #30)
> (In reply to Andy Dalton from comment #29)
> > (In reply to Joakim Tjernlund from comment #28)
> > > These locks are of different MODEs(READ vs WRITE). Should READ
> > > locks block WRITE locks?
> > 
> > Yes, a READ lock would allow other concurrent READs but would block WRITEs. 
> > A WRITE lock would block both READs and WRITEs.
> 
> OK, but it is a bit strange that the problem started only recently; we have
> been building this for years with docker running.
> 
> Did virtualbox-guest-additions gain the >&/dev/null recently ?

No, this has been around since 

commit 426405cb9a0da711324777e15a3b6c78ffa3bc24
Author: Lars Wendler <polynomial-c@gentoo.org>
Date:   Wed May 8 16:11:30 2019

    app-emulation/virtualbox-guest-additions: Overhauled kernel mod build

    Package-Manager: Portage-2.3.66, Repoman-2.3.12
    Signed-off-by: Lars Wendler <polynomial-c@gentoo.org>


Perhaps it's a regression in dev-util/kbuild-0.1.9998.3407
Comment 32 Joakim Tjernlund 2020-10-30 11:30:14 UTC
(In reply to Lars Wendler (Polynomial-C) from comment #31)
> (In reply to Joakim Tjernlund from comment #30)
> > 
> > OK, but it is a bit strange that the problem started only recently; we have
> > been building this for years with docker running.
> > 
> > Did virtualbox-guest-additions gain the >&/dev/null recently ?
> 
> No, this has been around since 
> 
> commit 426405cb9a0da711324777e15a3b6c78ffa3bc24
> Author: Lars Wendler <polynomial-c@gentoo.org>
> Date:   Wed May 8 16:11:30 2019
> 
>     app-emulation/virtualbox-guest-additions: Overhauled kernel mod build
> 
>     Package-Manager: Portage-2.3.66, Repoman-2.3.12
>     Signed-off-by: Lars Wendler <polynomial-c@gentoo.org>
> 
> 
> Perhaps it's a regression in dev-util/kbuild-0.1.9998.3407

Already tested 0.1.9998.3149 , no change
Comment 33 Joakim Tjernlund 2020-10-31 14:54:29 UTC
Does anyone know Go? I suspect that the code below forgets to release
the lock after testing whether it is available:

func init() {
	// use open file descriptor locks if the system supports it
	getlk := syscall.Flock_t{Type: syscall.F_RDLCK}
	if err := syscall.FcntlFlock(0, F_OFD_GETLK, &getlk); err == nil {
		linuxTryLockFile = ofdTryLockFile
		linuxLockFile = ofdLockFile
	}
}
Comment 34 Joakim Tjernlund 2020-10-31 15:48:43 UTC
(In reply to Joakim Tjernlund from comment #33)
> Does anyone know Go? I suspect that the code below forgets to release
> the lock after testing whether it is available:
> 
> func init() {
> 	// use open file descriptor locks if the system supports it
> 	getlk := syscall.Flock_t{Type: syscall.F_RDLCK}
> 	if err := syscall.FcntlFlock(0, F_OFD_GETLK, &getlk); err == nil {
> 		linuxTryLockFile = ofdTryLockFile
> 		linuxLockFile = ofdLockFile
> 	}
> }

If I comment out the F_OFD_GETLK call and hardcode OFD locks, the error goes away.
I am not sure whether F_OFD_GETLK should actually create a lock here or just
report whether a lock is possible.
Comment 35 Joakim Tjernlund 2020-10-31 15:49:15 UTC
--- docker-ce/components/engine/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go	2020-10-31 16:33:59.855609583 +0100
+++ docker-ce/components/engine/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go	2020-10-31 16:34:13.607556963 +0100
@@ -48,11 +48,11 @@
 
 func init() {
 	// use open file descriptor locks if the system supports it
-	getlk := syscall.Flock_t{Type: syscall.F_RDLCK}
-	if err := syscall.FcntlFlock(0, F_OFD_GETLK, &getlk); err == nil {
+	//getlk := syscall.Flock_t{Type: syscall.F_RDLCK}
+	//if err := syscall.FcntlFlock(0, F_OFD_GETLK, &getlk); err == nil {
 		linuxTryLockFile = ofdTryLockFile
 		linuxLockFile = ofdLockFile
-	}
+	//}
 }
 
 func TryLockFile(path string, flag int, perm os.FileMode) (*LockedFile, error) {
Comment 36 Joakim Tjernlund 2020-11-01 20:58:37 UTC
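Actually the fix is much simpler. F_OFD_GETLK is defined with the same value as F_OFD_SETLK, so the probe sets a lock instead of querying for one: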
diff --git a/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go b/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go
index 939fea6238..004d35fa23 100644
--- a/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go
+++ b/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go
@@ -29,7 +29,7 @@ import (
 //
 // constants from /usr/include/bits/fcntl-linux.h
 const (
-       F_OFD_GETLK  = 37
+       F_OFD_GETLK  = 36
        F_OFD_SETLK  = 37
        F_OFD_SETLKW = 38
 )
Comment 37 Joakim Tjernlund 2020-11-04 11:11:14 UTC
A fix has been merged upstream; I guess it will be in the next dockerd release.
Comment 38 Joakim Tjernlund 2020-11-14 13:43:37 UTC
Fix for docker is in:
https://github.com/etcd-io/etcd/pull/12444
Comment 39 Joakim Tjernlund 2020-12-30 12:45:20 UTC
It would be better to add this fix to the Gentoo docker package, as docker seems slow to update its deps.
Comment 40 William Hubbs gentoo-dev 2021-01-06 00:35:26 UTC
docker-20.10.1 is now in the tree. Is this still an issue with that
version of docker?
Comment 41 Joakim Tjernlund 2021-01-06 02:10:51 UTC
cat ./vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go


// Copyright 2016 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// +build linux

package fileutil

import (
	"io"
	"os"
	"syscall"
)

// This used to call syscall.Flock() but that call fails with EBADF on NFS.
// An alternative is lockf() which works on NFS but that call lets a process lock
// the same file twice. Instead, use Linux's non-standard open file descriptor
// locks which will block if the process already holds the file lock.
//
// constants from /usr/include/bits/fcntl-linux.h
const (
	F_OFD_GETLK  = 37
	F_OFD_SETLK  = 37
	F_OFD_SETLKW = 38
)

Yes, F_OFD_GETLK still has the same value as F_OFD_SETLK.
Comment 42 William Hubbs gentoo-dev 2021-01-09 22:06:46 UTC
There is now a draft pr upstream for this; apparently they are having
issues with the new etcd not passing their ci.

https://github.com/moby/moby/pull/41791
Comment 43 Joakim Tjernlund 2021-01-09 22:25:49 UTC
(In reply to William Hubbs from comment #42)
> There is now a draft pr upstream for this; apparently they are having
> issues with the new etcd not passing their ci.
> 
> https://github.com/moby/moby/pull/41791

I know, but they are moving way too slowly; at this pace it will take another 3 years.
I don't think it is too much to ask that Gentoo add a one-line fix until they get there.
Comment 44 fiesh 2021-01-11 07:30:01 UTC
I'd like to add that this also happens in a chroot environment.  So this is not really a docker issue, and "fixing" it there would be the wrong place.

This makes updating our clients whose images are built in a chroot environment very annoying.  Please remove the redirect!
Comment 45 Joakim Tjernlund 2021-01-11 09:57:31 UTC
(In reply to fiesh from comment #44)
> I'd like to add that this also happens in a chroot environment.  So this is
> not really a docker issue, and "fixing" it there would be the wrong place.
> 
> This makes updating our clients whose images are built in a chroot
> environment very annoying.  Please remove the redirect!

This makes no sense to me. There is a problem with docker creating a permanent lock on /dev/null.
If you have a problem with chroot, that is another issue.
Comment 46 Lars Wendler (Polynomial-C) (RETIRED) gentoo-dev 2021-01-11 10:12:45 UTC
If something locks your /dev/null, this is a problem that needs to be addressed. I am not going to remove the "&>/dev/null" redirect from the app-emulation/virtualbox-guest-additions package and clutter the package's build logs with useless output just to circumvent a problem that should not exist in the first place. /dev/null should always be accessible.
Comment 47 Joakim Tjernlund 2021-01-12 12:28:03 UTC
(In reply to William Hubbs from comment #42)
> There is now a draft pr upstream for this; apparently they are having
> issues with the new etcd not passing their ci.
> 
> https://github.com/moby/moby/pull/41791

William, by now you have surely discovered that this issue is not moving forward
upstream. Please add the one-line fix to the Gentoo package.
Comment 48 Joakim Tjernlund 2021-03-08 21:01:39 UTC
Ping? Upstream isn't moving at all.
Comment 49 Joakim Tjernlund 2021-04-19 16:15:41 UTC
docker-20.10.6 just hit the tree, still has the bug. Sigh
Comment 50 Larry the Git Cow gentoo-dev 2021-04-22 07:48:31 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=a86d23a290bba9f7c9135c181579c350086f2599

commit a86d23a290bba9f7c9135c181579c350086f2599
Author:     Georgy Yakovlev <gyakovlev@gentoo.org>
AuthorDate: 2021-04-22 07:44:44 +0000
Commit:     Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2021-04-22 07:48:04 +0000

    app-emulation/docker: add /dev/null patch to 20.10.6
    
    Bug: https://bugs.gentoo.org/748984
    Package-Manager: Portage-3.0.18, Repoman-3.0.3
    Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>

 ...ker-20.10.6.ebuild => docker-20.10.6-r1.ebuild} |  7 ++++++
 .../docker/files/etcd-F_OFD_GETLK-fix.patch        | 28 ++++++++++++++++++++++
 2 files changed, 35 insertions(+)
Comment 51 Georgy Yakovlev archtester gentoo-dev 2021-04-22 07:49:53 UTC
I've added the patch without bumping etcd version, as it's rather simple.

fix is in docker-20.10.6-r1
Comment 52 Joakim Tjernlund 2021-04-22 10:06:03 UTC
(In reply to Georgy Yakovlev from comment #51)
> I've added the patch without bumping etcd version, as it's rather simple.
> 
> fix is in docker-20.10.6-r1

Thanks, upstream seems stuck on an old bundled etcd that does not receive
releases anymore.
Comment 53 Georgy Yakovlev archtester gentoo-dev 2021-06-14 00:40:39 UTC
still carrying patch in 20.10.7

but it's simple, so not a big deal.
Comment 54 Joakim Tjernlund 2022-05-17 15:07:41 UTC
Appears fixed in docker 20.10.16