Summary: | app-containers/docker locks /dev/null, causes other apps to hang forever | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Joakim Tjernlund <joakim.tjernlund> |
Component: | Current packages | Assignee: | William Hubbs <williamh> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | andy.dalton, gyakovlev, jstein, kripton, melser_regs, sam |
Priority: | Normal | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: | https://bugs.gentoo.org/show_bug.cgi?id=764110 | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: | emerge --info (not of the original reporter) |
Description
Joakim Tjernlund
2020-10-14 14:27:12 UTC
kmk from dev-util/kbuild-0.1.9998.3407 Ping? This is a real bother to deal with each kernel upgrade. Mind removing the /dev/null redirect ? I cannot reproduce this. Please post your emerge --info virtualbox-guest-additions output to this bug. emerge --info virtualbox-guest-additions Portage 3.0.8 (python 3.7.9-final-0, default/linux/amd64/17.0/desktop, gcc-9.3.0, glibc-2.31-r6, 5.4.72-x86_64 x86_64) ================================================================= System Settings ================================================================= System uname: Linux-5.4.72-x86_64-x86_64-Intel_Xeon_E3-12xx_v2_-Ivy_Bridge,_IBRS-with-gentoo-64 KiB Mem: 16395788 total, 495748 free KiB Swap: 4194300 total, 4194300 free Timestamp of repository gentoo: Tue, 20 Oct 2020 14:45:01 +0000 Head commit of repository gentoo: 66118dc9fd89ce367c32928fefd979d0540f2150 sh bash 5.0_p18 ld GNU ld (Gentoo 2.34 p6) 2.34.0 distcc 3.3.3 x86_64-pc-linux-gnu [disabled] ccache version 3.7.11 [disabled] app-shells/bash: 5.0_p18::gentoo dev-java/java-config: 2.3.1::gentoo dev-lang/perl: 5.30.3::gentoo dev-lang/python: 2.7.18-r4::gentoo, 3.7.9::gentoo dev-util/ccache: 3.7.11::gentoo dev-util/cmake: 3.17.4-r1::gentoo dev-util/pkgconfig: 0.29.2::gentoo sys-apps/baselayout: 2.7::gentoo sys-apps/openrc: 0.42.1::gentoo sys-apps/sandbox: 2.18::gentoo sys-devel/autoconf: 2.13-r1::gentoo, 2.69-r5::gentoo sys-devel/automake: 1.15.1-r2::gentoo, 1.16.1-r1::gentoo sys-devel/binutils: 2.34-r2::gentoo sys-devel/gcc: 8.4.0-r1::gentoo, 9.3.0-r1::gentoo sys-devel/gcc-config: 2.3.2::gentoo sys-devel/libtool: 2.4.6-r6::gentoo sys-devel/make: 4.2.1-r4::gentoo sys-kernel/linux-headers: 5.4-r1::gentoo (virtual/os-headers) sys-libs/glibc: 2.31-r6::gentoo Repositories: tmv3-cross-overlay location: /var/lib/layman/tmv3-cross-overlay sync-type: laymansync sync-uri: git://git.transmode.se/tmv3-cross-overlay.git priority: 50 gentoo location: /usr/portage sync-type: rsync sync-uri: rsync://devsrv.transmode.se/portage 
priority: 100 sync-rsync-verify-metamanifest: no sync-rsync-extra-opts: sync-rsync-verify-max-age: 24 sync-rsync-verify-jobs: 1 transmode location: /var/lib/layman/transmode sync-type: laymansync sync-uri: https://devsrv.transmode.se/svn/portage-overlay/stable masters: gentoo priority: 150 Installed sets: @cross-powerpc-toolchains, @infinera-desktop, @infinera-plasma ACCEPT_KEYWORDS="amd64" ACCEPT_LICENSE="* -@EULA FraunhoferFDK ms-teams-pre dlj-1.1 Oracle-BCLA-JavaSE AdobeFlash-11.x google-chrome" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-O2 -pipe -Wno-deprecated-declarations -Wno-error" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/lib64/libreoffice/program/sofficerc /usr/share/config /usr/share/gnupg/qualified.txt /usr/share/maven-bin-3.6/conf /usr/share/sddm/scripts/Xsession" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/openconnect/csd-wrapper.sh /etc/php/apache2-php7.4/ext-active/ /etc/php/cgi-php7.4/ext-active/ /etc/php/cli-php7.4/ext-active/ /etc/portage/package.keywords/zzzautounmask /etc/portage/package.unmask/zzzautounmask /etc/portage/package.use/zzzautounmask /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c /usr/share/dbus-1/services/org.kde.plasma.Notifications.service" CXXFLAGS="-O2 -pipe -Wno-deprecated-declarations -Wno-error" DISTDIR="/usr/portage/distfiles" EMERGE_DEFAULT_OPTS="--verbose-conflicts --autounmask-write --autounmask --autounmask-continue --complete-graph=y --with-bdeps=y --quiet --jobs=3 --keep-going --usepkg --usepkg-exclude 'sys-firmware/hpuefi-mod app-emulation/virtualbox-modules app-emulation/virtualbox-guest-additions'" ENV_UNSET="CARGO_HOME DBUS_SESSION_BUS_ADDRESS DISPLAY GOBIN GOPATH PERL5LIB PERL5OPT PERLPREFIX PERL_CORE PERL_MB_OPT PERL_MM_OPT XAUTHORITY XDG_CACHE_HOME XDG_CONFIG_HOME XDG_DATA_HOME XDG_RUNTIME_DIR" FCFLAGS="-O2 -pipe" 
FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs buildpkg config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr" FFLAGS="-O2 -pipe" GENTOO_MIRRORS="http://devsrv.transmode.se/portage http://devsrv.transmode.se/portage/local http://ftp.df.lth.se/pub/gentoo" INSTALL_MASK=" /usr/share/dbus-1/services/org.kde.plasma.Notifications.service /usr/lib64/libnssckbi.so /usr/lib/libnssckbi.so /usr/lib64/pkcs11/gnome-keyring-pkcs11.so /usr/lib/pkcs11/gnome-keyring-pkcs11.so " LANG="en_GB.UTF-8" LDFLAGS="-Wl,-O1 -Wl,--as-needed" LINGUAS="en sv en_GB sv_SE en_US" MAKEOPTS="-s -j8" PKGDIR="/usr/portage/packages" PORTAGE_BINHOST="http://devsrv.transmode.se/portage/packages/gentoo64/" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git" PORTAGE_TMPDIR="/var/tmp" USE="X a52 aac accessibility acl acpi ads alsa amd64 bash-completion berkdb bluetooth branding bzip2 cairo caja cdda cdr clang cli client crypt cscope cups cxx dbus device-mapper dhcpcd dri dri3 dts dvd dvdr elogind emacs emboss encode exif faac ffmpeg flac fontconfig fortran gbm gdbm gif glamor gles gles2 glib gold gpm gssapi gstreamer gtk gtk3 gui hwaccel iconv icu idn introspection ipv6 java javafx jpeg kerberos kvm lcms ldap libglvnd libnotify libtirpc lvm mad mate mmx mng modemmanager mp3 mp4 mpeg mtp multilib natspec ncurses networkmanager nls nptl nscd ogg opengl openmp openssl opus p2p pam pango pcre pdf pdfimport pidgin pm-utils png policykit postproc ppds pulseaudio python qemu qt5 readline resolvconf rpc samba sasl 
script sdl seccomp secure-delete smi smp sound spell spice split-usr sqlite srt sse sse2 sse3 sse4 sse4_1 ssl ssse3 startup-notification subversion svg sync-plugin-portage system-bootstrap system-cairo system-harfbuzz system-icu system-jpeg system-libvpx system-llvm system-sqlite tcpd threads tiff truetype udev udisks unicode upower usb usbredir v4l vaapi vala vdpau vim virt-network virtualbox vnc vorbis vpx vulkan webdav wxwidgets x264 xattr xcb xft xinerama xml xpm xv xvid zlib" ABI_X86="64" ADA_TARGET="gnat_2018" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="karbon sheets words" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="aes avx f16c mmx mmxext pclmul popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock greis isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" GRUB_PLATFORMS="efi-64 pc qemu" INPUT_DEVICES="evdev libinput synaptics" KERNEL="linux" L10N="en sv en-GB sv-SE en-US" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LUA_SINGLE_TARGET="lua5-1" LUA_TARGETS="lua5-1" OFFICE_IMPLEMENTATION="libreoffice" 
PHP_TARGETS="php5-5" POSTGRES_TARGETS="postgres10 postgres11" PYTHON_SINGLE_TARGET="python3_7" PYTHON_TARGETS="python2_7 python3_7" QEMU_SOFTMMU_TARGETS="i386 x86_64 ppc arm aarch64" QEMU_USER_TARGETS="i386 x86_64 ppc arm aarch64" RUBY_TARGETS="ruby26" USERLAND="GNU" VIDEO_CARDS="intel i965 amdgpu vmware qxl fbdev vesa cirrus" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" Unset: CC, CPPFLAGS, CTARGET, CXX, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS ================================================================= Package Settings ================================================================= app-emulation/virtualbox-guest-additions-6.1.14a::gentoo was built with the following: USE="X" ABI_X86="(64)" CFLAGS="-O2 -pipe -Wno-deprecated-declarations -Wno-error -fdebug-prefix-map=..=/var/tmp/portage/app-emulation/virtualbox-guest-additions-6.1.14a" CXXFLAGS="-O2 -pipe -Wno-deprecated-declarations -Wno-error -fdebug-prefix-map=..=/var/tmp/portage/app-emulation/virtualbox-guest-additions-6.1.14a" FEATURES="assume-digests binpkg-docompress binpkg-dostrip binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles ipc-sandbox merge-sync multilib-strict network-sandbox news parallel-fetch pkgdir-index-trusted preserve-libs protect-owned qa-unresolved-soname-deps sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr" Any ideas ? I guess you still have problems reproducing this ? Still think you could rm the /dev/null part regardless as it hides potential build problems from the user. (In reply to Joakim Tjernlund from comment #6) > I guess you still have problems reproducing this ? > > Still think you could rm the /dev/null part regardless as > it hides potential build problems from the user. 
Indeed, I still cannot reproduce it but saw another report about this in the Gentoo Forums.
I'd like to keep the "&>/dev/null" snippet if possible as the kmk command is way too verbose and just adds useless output to the build.log

Any more pointers to the real source of this problem are highly welcome.

(In reply to Lars Wendler (Polynomial-C) from comment #7)
>
> Indeed, I still cannot reproduce it but saw another report about this in the
> Gentoo Forums.
> I'd like to keep the "&>/dev/null" snippet if possible as the kmk command is
> way too verbose and just adds useless output to the build.log
>
> Any more pointers to the real source of this problem are highly welcome.

In case you haven't seen this already, kmk hangs on:
fcntl(1, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}
i.e. stdout, which is /dev/null.
Kernel bug for /dev/null?

(In reply to Joakim Tjernlund from comment #8)
> (In reply to Lars Wendler (Polynomial-C) from comment #7)
> >
> > Indeed, I still cannot reproduce it but saw another report about this in the
> > Gentoo Forums.
> > I'd like to keep the "&>/dev/null" snippet if possible as the kmk command is
> > way too verbose and just adds useless output to the build.log
> >
> > Any more pointers to the real source of this problem are highly welcome.
>
> In case you haven't seen this already, kmk hangs on:
> fcntl(1, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}
> i.e. stdout, which is /dev/null.
> Kernel bug for /dev/null?

I doubt it. I'm also on glibc-2.31-r6 running kernel-5.4.72 and I don't have the hangs.
What are your permissions for /dev/null?

ls -l /dev/null
crw-rw-rw- 1 root root 1, 3 Oct 22 14:14 /dev/null

(In reply to Lars Wendler (Polynomial-C) from comment #9)
> (In reply to Joakim Tjernlund from comment #8)
> > (In reply to Lars Wendler (Polynomial-C) from comment #7)
> > >
> > > Indeed, I still cannot reproduce it but saw another report about this in the
> > > Gentoo Forums.
> > > I'd like to keep the "&>/dev/null" snippet if possible as the kmk command is
> > > way too verbose and just adds useless output to the build.log
> > >
> > > Any more pointers to the real source of this problem are highly welcome.
> >
> > In case you haven't seen this already, kmk hangs on:
> > fcntl(1, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=1}
> > i.e. stdout, which is /dev/null.
> > Kernel bug for /dev/null?
>
> I doubt it. I'm also on glibc-2.31-r6 running kernel-5.4.72 and I don't have
> the hangs.
> What are your permissions for /dev/null?

sandbox bug?

# qlist -CIve sandbox
sys-apps/sandbox-2.20

(In reply to Lars Wendler (Polynomial-C) from comment #12)
> # qlist -CIve sandbox
> sys-apps/sandbox-2.20

qlist -CIve sandbox
sys-apps/sandbox-2.18

(In reply to Joakim Tjernlund from comment #13)
> (In reply to Lars Wendler (Polynomial-C) from comment #12)
> > # qlist -CIve sandbox
> > sys-apps/sandbox-2.20
>
> qlist -CIve sandbox
> sys-apps/sandbox-2.18

sys-apps/sandbox-2.20 did not help for me

I do have some pkgs X86_ABI="32 64"

# > eix sys-apps/portage
[I] sys-apps/portage
     Available versions: 2.3.99-r2 3.0.4-r1^t 3.0.8^t{tbz2} **9999*l^t {apidoc build doc gentoo-dev +ipc +native-extensions +rsync-verify selinux test xattr KERNEL="linux" PYTHON_TARGETS="pypy3 python3_6 python3_7 python3_8 python3_9"}
     Installed versions: 3.0.8^t{tbz2}(12:48:25 21/09/20)(ipc native-extensions xattr -apidoc -build -doc -gentoo-dev -rsync-verify -selinux -test KERNEL="linux" PYTHON_TARGETS="python3_7 -pypy3 -python3_6 -python3_8 -python3_9")

Hi, I'm the other person experiencing this bug. My forum entry in gentoo was just linked to this. Wasn't aware of it before.
Indeed, my emerge also hangs in kmk.
Would you like me to output some package versions or something to compare with Joakim's system?

Hi, I'm experiencing the same hang in a virtual machine I'm currently updating after 2 months. emerge --info will be attached

Created attachment 668951
emerge --info (not of the original reporter)
emerge --info from the machine where I also experience this problem
(In reply to jannis from comment #17)
> Hi, I'm experiencing the same hang in a virtual machine I'm currently
> updating after 2 months. emerge --info will be attached

For me it is both VMs and physical machines, all built from the same gentoo img though.

profile 17.0 vs 17.1 maybe? I and Jannis are still on 17.0

Found something:
zcat /proc/config.gz | grep NULL
# CONFIG_BLK_DEV_NULL_BLK is not set

flipping that to on:
CONFIG_BLK_DEV_NULL_BLK=y
made kmk work

Over to you to explain :)

I'm seeing this problem as well. It looks like `kmk` is blocked trying
to lock `/dev/null` while something else is holding that lock:
```
$ sudo lslocks -u --output-all | grep kmk
kmk 1703 POSIX WRITE* 0 0 0 /dev/null -1
$ sudo lslocks -u --output-all | grep -- -1
kmk 1703 POSIX WRITE* 0 0 0 /dev/null -1
(undefined) -1 OFDLCK READ 0 0 0 /dev...
$
```
It looks like something holds an "Open File Description" lock on `/dev/null`.
From `man lslocks`:
> Note that lslocks also lists OFD (Open File Description) locks, these
> locks are not associated with any process (PID is -1). OFD locks are
> associated with the open file description on which they are acquired.
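The situation in the `lslocks` output can be reproduced in miniature. The following is a minimal sketch, not code from kmk or dockerd: it uses a scratch file instead of `/dev/null`, and the names `whoBlocksWrite` and `demoLslocks` are hypothetical. One open file description takes an OFD read lock; a second one then asks, via `F_GETLK`, what would block a POSIX write lock on the first byte (the same kind of request kmk makes on stdout). It assumes Linux with OFD lock support.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// Linux command number for taking an open file description (OFD) lock
// without blocking (F_OFD_SETLK in /usr/include/bits/fcntl-linux.h).
const fOFDSetLk = 37

// whoBlocksWrite takes an OFD read lock on path through one open file
// description, then asks through a second one which lock would block a
// POSIX write lock on the first byte.
func whoBlocksWrite(path string) (isReadLock bool, pid int32, err error) {
	holder, err := os.OpenFile(path, os.O_RDWR, 0)
	if err != nil {
		return false, 0, err
	}
	defer holder.Close() // closing the description releases its OFD lock

	// Start 0 and Len 0 mean "from the beginning to end of file".
	rd := syscall.Flock_t{Type: syscall.F_RDLCK}
	if err := syscall.FcntlFlock(holder.Fd(), fOFDSetLk, &rd); err != nil {
		return false, 0, err
	}

	prober, err := os.OpenFile(path, os.O_RDWR, 0)
	if err != nil {
		return false, 0, err
	}
	defer prober.Close()

	// F_GETLK only reports a conflicting lock; it acquires nothing.
	q := syscall.Flock_t{Type: syscall.F_WRLCK, Start: 0, Len: 1}
	if err := syscall.FcntlFlock(prober.Fd(), syscall.F_GETLK, &q); err != nil {
		return false, 0, err
	}
	return q.Type == syscall.F_RDLCK, q.Pid, nil
}

// demoLslocks runs whoBlocksWrite against a scratch file.
func demoLslocks() (bool, int32, error) {
	f, err := os.CreateTemp("", "ofd-demo-")
	if err != nil {
		return false, 0, err
	}
	defer os.Remove(f.Name())
	defer f.Close()
	return whoBlocksWrite(f.Name())
}

func main() {
	isRead, pid, err := demoLslocks()
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
	fmt.Printf("blocked by read lock: %v, holder PID: %d\n", isRead, pid)
}
```

The reported holder PID is the same -1 that `lslocks` prints for the OFDLCK entry, which is why the lock cannot be traced to a process from that output alone.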
After some additional searching, I found some posts where people were having trouble with dockerd creating OFDLCK locks in /dev. Sure enough, if I stop the docker daemon, the OFDLCK lock goes away. While dockerd is stopped, I can successfully emerge this package.
$ sudo emerge @module-rebuild
Calculating dependencies... done!
>>> Verifying ebuild manifests
>>> Emerging (1 of 1) app-emulation/virtualbox-guest-additions-6.0.24-r1::gentoo
>>> Installing (1 of 1) app-emulation/virtualbox-guest-additions-6.0.24-r1::gentoo
>>> Jobs: 1 of 1 complete Load avg: 2.58, 1.70, 0.73
* Messages for package app-emulation/virtualbox-guest-additions-6.0.24-r1:
...
So that might be a work-around for others hitting this problem.
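To check for this condition before starting a build, one option is to scan `/proc/locks` for OFD locks on the inode of `/dev/null`. The sketch below is an illustration only: `ofdLocksOnInode` is a hypothetical helper, it assumes the `/proc/locks` line layout described in proc(5) (ordinal, lock class, ADVISORY or MANDATORY, READ or WRITE, PID, `major:minor:inode`, start, end), and it matches on the inode number alone, ignoring the device numbers.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
	"syscall"
)

// ofdLocksOnInode scans text in /proc/locks format and returns the lines
// that describe OFD locks on the given inode number.
func ofdLocksOnInode(procLocks string, inode uint64) []string {
	var hits []string
	sc := bufio.NewScanner(strings.NewReader(procLocks))
	for sc.Scan() {
		line := sc.Text()
		fields := strings.Fields(line)
		// Blocked waiters carry an extra "->" field after the ordinal.
		if len(fields) > 1 && fields[1] == "->" {
			fields = append(fields[:1], fields[2:]...)
		}
		if len(fields) < 8 || fields[1] != "OFDLCK" {
			continue
		}
		// fields[5] is "major:minor:inode"; the inode is decimal.
		parts := strings.Split(fields[5], ":")
		if len(parts) != 3 {
			continue
		}
		n, err := strconv.ParseUint(parts[2], 10, 64)
		if err == nil && n == inode {
			hits = append(hits, line)
		}
	}
	return hits
}

func main() {
	var st syscall.Stat_t
	if err := syscall.Stat("/dev/null", &st); err != nil {
		fmt.Fprintln(os.Stderr, "stat /dev/null:", err)
		os.Exit(1)
	}
	data, err := os.ReadFile("/proc/locks")
	if err != nil {
		fmt.Fprintln(os.Stderr, "read /proc/locks:", err)
		os.Exit(1)
	}
	for _, l := range ofdLocksOnInode(string(data), uint64(st.Ino)) {
		fmt.Println(l)
	}
}
```

If the program prints any line while dockerd is running and none after it is stopped, that is consistent with the behaviour described in this comment.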
I found an open bug against docker for this issue: https://github.com/moby/moby/issues/31182

That said, that bug has been open since 2017 with little activity.

(In reply to Andy Dalton from comment #23)
> After some additional searching, I found some posts where people were having
> trouble with dockerd creating OFDLCK locks in /dev. Sure enough, if I stop
> the docker daemon, the OFDLCK lock goes away. While dockerd is stopped, I
> can successfully emerge this package.
>
> $ sudo emerge @module-rebuild
> Calculating dependencies... done!
> >>> Verifying ebuild manifests
> >>> Emerging (1 of 1) app-emulation/virtualbox-guest-additions-6.0.24-r1::gentoo
> >>> Installing (1 of 1) app-emulation/virtualbox-guest-additions-6.0.24-r1::gentoo
> >>> Jobs: 1 of 1 complete Load avg: 2.58, 1.70, 0.73
>
> * Messages for package app-emulation/virtualbox-guest-additions-6.0.24-r1:
> ...
>
> So that might be a work-around for others hitting this problem.

Right, my previous kernel config trick did not work on my VMs but if I stop dockerd it works :)

Same for me, stopping docker fixed the build.
Could we add some warning message prior to emerge in that ebuild?

(In reply to jannis from comment #26)
> Same for me, stopping docker fixed the build.
> Could we add some warning message prior to emerge in that ebuild?

Just remove the >&/dev/null in virtualbox-guest-additions for now. It may be quite some time until docker is fixed.
One could ask why kmk needs to lock stdout too?

(In reply to Andy Dalton from comment #22)
> I'm seeing this problem as well. It looks like `kmk` is blocked trying
> to lock `/dev/null` while something else is holding that lock:
>
> ```
> $ sudo lslocks -u --output-all | grep kmk
> kmk 1703 POSIX WRITE* 0 0 0 /dev/null -1
>
>
> $ sudo lslocks -u --output-all | grep -- -1
> kmk 1703 POSIX WRITE* 0 0 0 /dev/null -1
> (undefined) -1 OFDLCK READ 0 0 0 /dev...
> ```

These locks are of different MODEs (READ vs WRITE). Should READ locks block WRITE locks?
(In reply to Joakim Tjernlund from comment #28)
> These locks are of different MODEs (READ vs WRITE). Should READ
> locks block WRITE locks?

Yes, a READ lock would allow other concurrent READs but would block WRITEs. A WRITE lock would block both READs and WRITEs.

(In reply to Andy Dalton from comment #29)
> (In reply to Joakim Tjernlund from comment #28)
> > These locks are of different MODEs (READ vs WRITE). Should READ
> > locks block WRITE locks?
>
> Yes, a READ lock would allow other concurrent READs but would block WRITEs.
> A WRITE lock would block both READs and WRITEs.

OK, but it is a bit strange that the problem started recently; we have been building this for years with docker running.

Did virtualbox-guest-additions gain the >&/dev/null recently?

(In reply to Joakim Tjernlund from comment #30)
> (In reply to Andy Dalton from comment #29)
> > (In reply to Joakim Tjernlund from comment #28)
> > > These locks are of different MODEs (READ vs WRITE). Should READ
> > > locks block WRITE locks?
> >
> > Yes, a READ lock would allow other concurrent READs but would block WRITEs.
> > A WRITE lock would block both READs and WRITEs.
>
> OK, but it is a bit strange that the problem started recently; we have been
> building this for years with docker running.
>
> Did virtualbox-guest-additions gain the >&/dev/null recently?

No, this has been around since

commit 426405cb9a0da711324777e15a3b6c78ffa3bc24
Author: Lars Wendler <polynomial-c@gentoo.org>
Date: Wed May 8 16:11:30 2019

    app-emulation/virtualbox-guest-additions: Overhauled kernel mod build

    Package-Manager: Portage-2.3.66, Repoman-2.3.12
    Signed-off-by: Lars Wendler <polynomial-c@gentoo.org>

Perhaps it's a regression in dev-util/kbuild-0.1.9998.3407

(In reply to Lars Wendler (Polynomial-C) from comment #31)
> (In reply to Joakim Tjernlund from comment #30)
> >
> > OK, but it is a bit strange that the problem started recently; we have been
> > building this for years with docker running.
> >
> > Did virtualbox-guest-additions gain the >&/dev/null recently?
>
> No, this has been around since
>
> commit 426405cb9a0da711324777e15a3b6c78ffa3bc24
> Author: Lars Wendler <polynomial-c@gentoo.org>
> Date: Wed May 8 16:11:30 2019
>
>     app-emulation/virtualbox-guest-additions: Overhauled kernel mod build
>
>     Package-Manager: Portage-2.3.66, Repoman-2.3.12
>     Signed-off-by: Lars Wendler <polynomial-c@gentoo.org>
>
>
> Perhaps it's a regression in dev-util/kbuild-0.1.9998.3407

Already tested 0.1.9998.3149, no change.

Does anyone know Go? I suspect that the code below forgets to release the lock after testing if it is available:

func init() {
	// use open file descriptor locks if the system supports it
	getlk := syscall.Flock_t{Type: syscall.F_RDLCK}
	if err := syscall.FcntlFlock(0, F_OFD_GETLK, &getlk); err == nil {
		linuxTryLockFile = ofdTryLockFile
		linuxLockFile = ofdLockFile
	}
}

(In reply to Joakim Tjernlund from comment #33)
> Does anyone know Go? I suspect that the code below forgets to release the
> lock after testing if it is available:
>
> func init() {
> 	// use open file descriptor locks if the system supports it
> 	getlk := syscall.Flock_t{Type: syscall.F_RDLCK}
> 	if err := syscall.FcntlFlock(0, F_OFD_GETLK, &getlk); err == nil {
> 		linuxTryLockFile = ofdTryLockFile
> 		linuxLockFile = ofdLockFile
> 	}
> }

If I comment out the F_OFD_GETLK call and hardcode OFD locks, the error goes away.
Not sure whether F_OFD_GETLK should actually create a lock here or just report if a lock is possible?
--- docker-ce/components/engine/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go 2020-10-31 16:33:59.855609583 +0100
+++ docker-ce/components/engine/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go 2020-10-31 16:34:13.607556963 +0100
@@ -48,11 +48,11 @@

 func init() {
 	// use open file descriptor locks if the system supports it
-	getlk := syscall.Flock_t{Type: syscall.F_RDLCK}
-	if err := syscall.FcntlFlock(0, F_OFD_GETLK, &getlk); err == nil {
+	//getlk := syscall.Flock_t{Type: syscall.F_RDLCK}
+	//if err := syscall.FcntlFlock(0, F_OFD_GETLK, &getlk); err == nil {
 		linuxTryLockFile = ofdTryLockFile
 		linuxLockFile = ofdLockFile
-	}
+	//}
 }

 func TryLockFile(path string, flag int, perm os.FileMode) (*LockedFile, error) {

Actually the fix is much simpler:

diff --git a/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go b/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go
index 939fea6238..004d35fa23 100644
--- a/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go
+++ b/vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go
@@ -29,7 +29,7 @@ import (
 //
 // constants from /usr/include/bits/fcntl-linux.h
 const (
-	F_OFD_GETLK = 37
+	F_OFD_GETLK = 36
 	F_OFD_SETLK = 37
 	F_OFD_SETLKW = 38
 )

A fix has been merged upstream; I guess it will be in the next dockerd release.

Fix for docker is in: https://github.com/etcd-io/etcd/pull/12444
Better to add this fix to docker directly, as docker seems slow to update its deps.

docker-20.10.1 is now in the tree. Is this still an issue with that version of docker?

cat ./vendor/github.com/coreos/etcd/pkg/fileutil/lock_linux.go
// Copyright 2016 The etcd Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// +build linux

package fileutil

import (
	"io"
	"os"
	"syscall"
)

// This used to call syscall.Flock() but that call fails with EBADF on NFS.
// An alternative is lockf() which works on NFS but that call lets a process lock
// the same file twice. Instead, use Linux's non-standard open file descriptor
// locks which will block if the process already holds the file lock.
//
// constants from /usr/include/bits/fcntl-linux.h
const (
	F_OFD_GETLK = 37
	F_OFD_SETLK = 37
	F_OFD_SETLKW = 38
)

Yes, F_OFD_GETLK is still the same as F_OFD_SETLK.

There is now a draft PR upstream for this; apparently they are having issues with the new etcd not passing their CI.

https://github.com/moby/moby/pull/41791

(In reply to William Hubbs from comment #42)
> There is now a draft PR upstream for this; apparently they are having
> issues with the new etcd not passing their CI.
>
> https://github.com/moby/moby/pull/41791

I know, but they are moving way too slowly; at this pace it will take another 3 years. I don't think it is too much to ask that Gentoo add a one-line fix until they get there.

I'd like to add that this also happens in a chroot environment. So this is not really a docker issue, and "fixing" it there would be the wrong place.

This makes updating our clients whose images are built in a chroot environment very annoying. Please remove the redirect!

(In reply to fiesh from comment #44)
> I'd like to add that this also happens in a chroot environment. So this is
> not really a docker issue, and "fixing" it there would be the wrong place.
>
> This makes updating our clients whose images are built in a chroot
> environment very annoying. Please remove the redirect!

This makes no sense to me. There is a problem with docker creating a forever lock on /dev/null. If you have a problem with chroot, that is another issue.

If something locks your /dev/null, this is a problem that needs to be addressed. I am not going to remove the "&>/dev/null" redirect from the app-emulation/virtualbox-guest-additions package, cluttering the package's build logs with useless output, just to circumvent a problem that should not exist in the first place. /dev/null should always be accessible.

(In reply to William Hubbs from comment #42)
> There is now a draft PR upstream for this; apparently they are having
> issues with the new etcd not passing their CI.
>
> https://github.com/moby/moby/pull/41791

William, by now you have surely discovered that this issue is not moving forward upstream. Please add the one-line fix to the Gentoo package.

Ping? Upstream isn't moving at all.

docker-20.10.6 just hit the tree, still has the bug. Sigh

The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=a86d23a290bba9f7c9135c181579c350086f2599

commit a86d23a290bba9f7c9135c181579c350086f2599
Author: Georgy Yakovlev <gyakovlev@gentoo.org>
AuthorDate: 2021-04-22 07:44:44 +0000
Commit: Georgy Yakovlev <gyakovlev@gentoo.org>
CommitDate: 2021-04-22 07:48:04 +0000

    app-emulation/docker: add /dev/null patch to 20.10.6

    Bug: https://bugs.gentoo.org/748984
    Package-Manager: Portage-3.0.18, Repoman-3.0.3
    Signed-off-by: Georgy Yakovlev <gyakovlev@gentoo.org>

 ...ker-20.10.6.ebuild => docker-20.10.6-r1.ebuild} | 7 ++++++
 .../docker/files/etcd-F_OFD_GETLK-fix.patch | 28 ++++++++++++++++++++++
 2 files changed, 35 insertions(+)

I've added the patch without bumping the etcd version, as it's rather simple.

The fix is in docker-20.10.6-r1.

(In reply to Georgy Yakovlev from comment #51)
> I've added the patch without bumping the etcd version, as it's rather simple.
>
> The fix is in docker-20.10.6-r1.

Thanks, upstream seems stuck on an old bundled etcd that does not receive releases anymore.

Still carrying the patch in 20.10.7, but it's simple, so not a big deal.

Appears fixed in docker 20.10.16
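The effect of the wrong constant can be shown with a small experiment. This is a sketch under stated assumptions, not the vendored etcd code: `leavesLock` is a hypothetical helper, it works on a scratch file rather than file descriptor 0, and it assumes Linux with OFD lock support. It sends the same zeroed `F_RDLCK` request that the vendored `init()` sends, once with command 36 and once with command 37, and then checks from a second open file description whether a write lock would be blocked.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// Values from /usr/include/bits/fcntl-linux.h, as in the corrected patch.
const (
	fOFDGetLk = 36 // F_OFD_GETLK: query only
	fOFDSetLk = 37 // F_OFD_SETLK: acquire or release without blocking
)

// leavesLock issues cmd with an F_RDLCK request on one open file description
// of a scratch file, then reports whether a write lock requested through a
// second description would be blocked afterwards.
func leavesLock(cmd int) (bool, error) {
	tmp, err := os.CreateTemp("", "getlk-demo-")
	if err != nil {
		return false, err
	}
	defer os.Remove(tmp.Name())
	defer tmp.Close()

	other, err := os.OpenFile(tmp.Name(), os.O_RDWR, 0)
	if err != nil {
		return false, err
	}
	defer other.Close()

	// Same request shape as the vendored init(): a zeroed Flock_t with F_RDLCK.
	probe := syscall.Flock_t{Type: syscall.F_RDLCK}
	if err := syscall.FcntlFlock(tmp.Fd(), cmd, &probe); err != nil {
		return false, err
	}

	// Ask, with the correct query command, what would block a write lock.
	q := syscall.Flock_t{Type: syscall.F_WRLCK}
	if err := syscall.FcntlFlock(other.Fd(), fOFDGetLk, &q); err != nil {
		return false, err
	}
	return q.Type != syscall.F_UNLCK, nil
}

func main() {
	for _, cmd := range []int{fOFDGetLk, fOFDSetLk} {
		held, err := leavesLock(cmd)
		if err != nil {
			fmt.Fprintln(os.Stderr, "error:", err)
			os.Exit(1)
		}
		fmt.Printf("cmd %d leaves a lock behind: %v\n", cmd, held)
	}
}
```

With 36 the call is a pure query and nothing is held afterwards; with 37 the "probe" is really `F_OFD_SETLK` and acquires a read lock that lasts as long as the open file description does. If dockerd's stdin is `/dev/null`, as the `lslocks` output in this bug suggests, that would explain the permanent OFDLCK READ entry that blocks kmk's write lock.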