Bug 611114 - sys-apps/portage-2.3.3: FEATURE="userpriv" causes portage to fail inside privileged LXC container
Status: RESOLVED WONTFIX
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Unclassified
Hardware: All Linux
Importance: Normal normal
Assignee: Portage team
Reported: 2017-02-27 14:20 UTC by i.Dark_Templar
Modified: 2017-03-08 08:22 UTC (History)

See Also:
Package list:
Runtime testing required: ---


Attachments
full-log-1.txt (file_611114.txt,4.57 KB, text/plain)
2017-02-27 14:22 UTC, i.Dark_Templar
full-log-2.txt (file_611114.txt,9.64 KB, text/plain)
2017-02-27 14:27 UTC, i.Dark_Templar
feature-userpriv-chown.patch (feature-userpriv-chown.patch,2.38 KB, patch)
2017-02-27 14:30 UTC, i.Dark_Templar
locks-chmod.patch (locks-chmod.patch,1.13 KB, patch)
2017-02-28 12:42 UTC, i.Dark_Templar
minimal test case (test_bug_611114.py,424 bytes, text/x-python)
2017-03-01 06:16 UTC, Zac Medico

Description i.Dark_Templar 2017-02-27 14:20:49 UTC
I set up an LXC Gentoo guest on a Gentoo host and tried to update the system, but portage failed at the unpack stage. I went through FEATURES and toggled them one by one until I discovered that disabling the "userpriv" feature fixes the issue.

Reproducible: Always

Steps to Reproduce:
1. emerge =app-emulation/lxc-2.0.7
2. lxc-create -n testbox -t gentoo
3. lxc-start -n testbox
4. lxc-attach -n testbox
5. Inside the container: FEATURES="userpriv" emerge -1v portage
Actual Results:  
emerge fails at the unpack stage

Expected Results:  
emerge should succeed

Emerge info on lxc gentoo guest:

testbox ~ # emerge --info
setlocale: unsupported locale setting
setlocale: unsupported locale setting
Portage 2.3.3 (python 3.4.5-final-0, default/linux/amd64/13.0, gcc-4.9.4, glibc-2.23-r3, 4.9.6-gentoo-r1.46 x86_64)
=================================================================
System uname: Linux-4.9.6-gentoo-r1.46-x86_64-Pentium-R-_Dual-Core_CPU_T4200_@_2.00GHz-with-gentoo-2.3
KiB Mem:     4049392 total,   3588296 free
KiB Swap:    4192252 total,   4192252 free
Timestamp of repository gentoo: Tue, 07 Feb 2017 00:45:01 +0000
sh bash 4.3_p48-r1
ld GNU ld (Gentoo 2.25.1 p1.1) 2.25.1
app-shells/bash:          4.3_p48-r1::gentoo
dev-lang/perl:            5.22.3_rc4::gentoo
dev-lang/python:          2.7.12::gentoo, 3.4.5::gentoo
dev-util/pkgconfig:       0.28-r2::gentoo
sys-apps/baselayout:      2.3::gentoo
sys-apps/openrc:          0.22.4::gentoo
sys-apps/sandbox:         2.10-r3::gentoo
sys-devel/autoconf:       2.69::gentoo
sys-devel/automake:       1.14.1::gentoo, 1.15::gentoo
sys-devel/binutils:       2.25.1-r1::gentoo
sys-devel/gcc:            4.9.4::gentoo
sys-devel/gcc-config:     1.7.3::gentoo
sys-devel/libtool:        2.4.6-r2::gentoo
sys-devel/make:           4.2.1::gentoo
sys-kernel/linux-headers: 4.4::gentoo (virtual/os-headers)
sys-libs/glibc:           2.23-r3::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000

ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/splash /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LANG="ru_RU.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --exclude=/.git"
PORTAGE_TMPDIR="/var/tmp"
USE="acl amd64 berkdb bindist bzip2 cli cracklib crypt cxx dri fortran gdbm iconv ipv6 modules multilib ncurses nls nptl openmp pam pcre readline seccomp session ssl tcpd unicode xattr zlib" ABI_X86="64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="authn_core authz_core socache_shmcb unixd actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx sse sse2" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock isync itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf skytraq superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="libinput keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6" PYTHON_SINGLE_TARGET="python2_7" PYTHON_TARGETS="python2_7 python3_4" RUBY_TARGETS="ruby21" USERLAND="GNU" VIDEO_CARDS="amdgpu fbdev intel nouveau radeon radeonsi vesa dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CC, CPPFLAGS, CTARGET, CXX, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, MAKEOPTS, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON
Comment 1 i.Dark_Templar 2017-02-27 14:22:08 UTC
Created attachment 465408 [details]
full-log-1.txt

This is the log of the failure. The important lines:

>>> Source unpacked in /var/tmp/portage/sys-apps/portage-2.3.3/work
Traceback (most recent call last):
  File "/var/tmp/portage/._portage_reinstall_.9vqnbny1/pym/portage/locks.py", line 152, in lockfile
    myfd = os.open(lockfilename, os.O_CREAT|os.O_RDWR, 0o660)
  File "/var/tmp/portage/._portage_reinstall_.9vqnbny1/pym/portage/__init__.py", line 250, in __call__
    rval = self._func(*wrapped_args, **wrapped_kwargs)
PermissionError: [Errno 13] Permission denied: b'/var/tmp/portage/sys-apps/.portage-2.3.3.portage_lockfile'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/tmp/portage/._portage_reinstall_.9vqnbny1/bin/ebuild-ipc.py", line 282, in <module>
    sys.exit(ebuild_ipc_main(sys.argv[1:]))
  File "/var/tmp/portage/._portage_reinstall_.9vqnbny1/bin/ebuild-ipc.py", line 279, in ebuild_ipc_main
    return ebuild_ipc.communicate(args)
  File "/var/tmp/portage/._portage_reinstall_.9vqnbny1/bin/ebuild-ipc.py", line 139, in communicate
    return self._communicate(args)
  File "/var/tmp/portage/._portage_reinstall_.9vqnbny1/bin/ebuild-ipc.py", line 245, in _communicate
    if not self._daemon_is_alive():
  File "/var/tmp/portage/._portage_reinstall_.9vqnbny1/bin/ebuild-ipc.py", line 124, in _daemon_is_alive
    wantnewlockfile=True, flags=os.O_NONBLOCK)
  File "/var/tmp/portage/._portage_reinstall_.9vqnbny1/pym/portage/locks.py", line 158, in lockfile
    raise PermissionDenied(func_call)
portage.exception.PermissionDenied: open('/var/tmp/portage/sys-apps/.portage-2.3.3.portage_lockfile')
Comment 2 i.Dark_Templar 2017-02-27 14:27:04 UTC
Created attachment 465410 [details]
full-log-2.txt

I've patched locks.py in order to add some more info:

Important lines:
lockfile /var/tmp/portage/sys-apps/.portage-2.3.3.portage_lockfile, mode 66, perms 660, uid 250, gid 250
-rw-r----- 1 root portage 0 Feb 27 17:12 /var/tmp/portage/sys-apps/.portage-2.3.3.portage_lockfile

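In short, portage, running as uid portage, tries to open and lock a file that is owned by root with mode 0640, so the group portage has no write permission and the open fails.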

Here's the patch:
--- /usr/lib/python3.4/site-packages/portage/locks.py.back       2017-02-27 17:32:31.843193592 +0300
+++ /usr/lib/python3.4/site-packages/portage/locks.py      2017-02-27 17:09:53.169179355 +0300
@@ -149,6 +149,9 @@
                old_mask = os.umask(000)
                try:
                        try:
+                               buf = "lockfile %s, mode %d, perms %o, uid %d, gid %d" % ( lockfilename, os.O_CREAT|os.O_RDWR, 0o660, os.getuid(), os.getgid())
+                               print(buf)
+                               os.system("ls -la %s" % lockfilename)
                                myfd = os.open(lockfilename, os.O_CREAT|os.O_RDWR, 0o660)
                        except OSError as e:
                                func_call = "open('%s')" % lockfilename
@@ -325,6 +328,9 @@
        else:
                raise InvalidData
 
+       buf = "unlockfile %s, uid %d, gid %d" % ( lockfilename, os.getuid(), os.getgid())
+       print(buf)
+
        if(myfd == HARDLINK_FD):
                unhardlink_lockfile(lockfilename, unlinkfile=unlinkfile)
                return True
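
A minimal sketch of why the open in this comment is refused, assuming classic Unix mode bits only (no ACLs and no capability checks). The function name `can_open_rdwr` is hypothetical and is not part of portage; the uid/gid value 250 and the root:portage ownership are taken from the log lines above, with the assumption that gid 250 is the portage group.

```python
import stat


def can_open_rdwr(file_uid, file_gid, file_mode, proc_uid, proc_gids):
    """Return True if plain mode bits would allow opening the file O_RDWR.

    Simplified model: exactly one permission class (owner, group, other)
    applies to a process, and uid 0 is treated as unrestricted.
    """
    if proc_uid == 0:
        return True
    if proc_uid == file_uid:
        needed = stat.S_IRUSR | stat.S_IWUSR
    elif file_gid in proc_gids:
        needed = stat.S_IRGRP | stat.S_IWGRP
    else:
        needed = stat.S_IROTH | stat.S_IWOTH
    return (file_mode & needed) == needed


if __name__ == "__main__":
    # Lock file as shown by ls: owner root (0), group portage (assumed 250), mode 0640.
    print(can_open_rdwr(0, 250, 0o640, proc_uid=250, proc_gids={250}))
    # Same file with the 0o660 mode that locks.py asked for.
    print(can_open_rdwr(0, 250, 0o660, proc_uid=250, proc_gids={250}))
```

Under this model, either restoring the group write bit or changing the owner to the unprivileged user makes the open succeed, which is why both kinds of workaround can appear to help.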
Comment 3 i.Dark_Templar 2017-02-27 14:30:19 UTC
Created attachment 465414 [details, diff]
feature-userpriv-chown.patch

I couldn't figure out why the lock isn't released by the time portage drops privileges, but I noticed that when the lock file is created, portage changes the file's group but leaves its owner intact. This patch is more of a hack and I'm not sure it's the correct fix for the issue, but it worked for me.
Comment 4 Zac Medico gentoo-dev 2017-02-27 16:06:59 UTC
What is the underlying filesystem type that you are using for /var/tmp/portage?
Comment 5 i.Dark_Templar 2017-02-27 16:42:52 UTC
It's ext3 in the guest. Here are the filesystem options in case they're relevant:

Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags:         signed_directory_hash 
Default mount options:    journal_data_ordered user_xattr acl

On the host system I'm using tmpfs. I disabled it and tried the same ext3 filesystem there, but the issue didn't reproduce for me (I made sure to disable my patch).
Comment 6 Zac Medico gentoo-dev 2017-02-27 20:20:02 UTC
Normally, it's possible to lock the file without changing the uid. It should not be necessary to change the uid, so I think there's something wrong with your configuration. We shouldn't change the code unless we have a very good explanation for the reasoning.
Comment 7 i.Dark_Templar 2017-02-28 12:14:22 UTC
Yes, I know that something is wrong, otherwise it would work. I can provide more info if you'd like to debug this issue. I think the issue might be caused by the lockfile permissions, the lockfile owner uid/gid (which my patch updates), or the lockfile not being released before privileges are dropped.

As you can see, the lockfile has permissions 0640 instead of 0660, with owner root and group portage.
If the owner is changed to portage, opening the lockfile works. If the permissions of the lockfile are changed to 0660, I guess it should work too.

As for the result of call:

myfd = os.open(lockfilename, os.O_CREAT|os.O_RDWR, 0o660)

I think there's some issue with umask interaction.

I've made a simple script and ran it on the host and the guest.
Here's the script:
#!/usr/bin/python

import os

os.umask(0o0002)
os.open("/tmp/testfile", os.O_CREAT | os.O_RDWR, 0o0660)
os.system("ls -la /tmp/testfile")


Here's the host:
linux ~ # rm /tmp/testfile
linux ~ # LC_ALL=C python /var/lib/lxc/testbox/rootfs/tmp/python-test.py 
-rw-rw---- 1 root root 0 Feb 28 15:03 /tmp/testfile

guest:
testbox ~ # rm /tmp/testfile 
testbox ~ # LC_ALL=C python /tmp/python-test.py 
-rw-r----- 1 root root 0 Feb 28 15:04 /tmp/testfile

Both host and guest have the default umask 0022 in /etc/login.defs.

I'll provide a patch that changes the permissions of lockfiles instead. It should fix the issue. I'm still not sure what causes the problem with umask, but it appears in LXC containers for me.
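
A minimal sketch of the comparison made by the script in this comment, assuming that without any other interference the created mode is the requested mode with the umask bits cleared. The helper names `expected_mode` and `create_and_compare` are hypothetical; the script uses a private temporary directory rather than /tmp/testfile so it does not touch existing files.

```python
import os
import stat
import tempfile


def expected_mode(requested, umask):
    """Mode that plain umask handling should produce for a new file."""
    return requested & ~umask & 0o777


def create_and_compare(directory, requested=0o660, umask=0o002):
    """Create a file in `directory` and return (expected, actual) modes."""
    path = os.path.join(directory, "testfile")
    old_mask = os.umask(umask)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_RDWR, requested)
    finally:
        os.umask(old_mask)  # always restore the caller's umask
    try:
        actual = stat.S_IMODE(os.fstat(fd).st_mode)
    finally:
        os.close(fd)
        os.unlink(path)
    return expected_mode(requested, umask), actual


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        expected, actual = create_and_compare(tmpdir)
        print("expected %o, actual %o" % (expected, actual))
```

If the two values differ, as they did in the guest output above (0640 where 0660 was expected), something other than the umask is limiting the mode of newly created files in that directory.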
Comment 8 i.Dark_Templar 2017-02-28 12:42:23 UTC
Created attachment 465512 [details, diff]
locks-chmod.patch

Another patch for the issue which worked for me. This one makes sure to correct file permissions for lock files in case of issues with umask.
Comment 9 Zac Medico gentoo-dev 2017-02-28 16:52:15 UTC
It seems that the 022 umask set by emerge is masking out the write bit of the 0o660 specified in the locks.py os.open call. I'm not sure if calling chmod in locks.py is the best solution. If we do that, then there should be a mode parameter to the lockfile function, so that the caller can control it.
Comment 10 Zac Medico gentoo-dev 2017-02-28 17:40:17 UTC
Actually, the umask is temporarily changed here:

   old_mask = os.umask(000)

So we need an explanation for why that's not working. Otherwise, the patch looks pretty reasonable.
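
A minimal sketch of the save-and-restore pattern around `os.umask` that the line above belongs to. The context manager `temporary_umask` is a hypothetical helper for illustration and is not taken from locks.py.

```python
import contextlib
import os


@contextlib.contextmanager
def temporary_umask(mask):
    """Set the process umask to `mask` for the duration of the block."""
    old_mask = os.umask(mask)  # os.umask returns the previous mask
    try:
        yield old_mask
    finally:
        os.umask(old_mask)  # restore even if the body raises


def open_lock(path, mode=0o660):
    """Open (creating if needed) a lock file with the umask cleared."""
    with temporary_umask(0):
        return os.open(path, os.O_CREAT | os.O_RDWR, mode)
```

With the umask cleared like this, a newly created file would normally receive exactly the requested mode, so a 0640 result points at something other than the umask.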
Comment 11 i.Dark_Templar 2017-02-28 18:07:47 UTC
Yes, the umask is temporarily changed before the lock file is created, but for some reason this has no effect inside an LXC container for me. I haven't figured out the reason yet.
Comment 12 Zac Medico gentoo-dev 2017-03-01 06:16:22 UTC
Created attachment 465620 [details]
minimal test case

Here's a minimal test case that hopefully you can use to reproduce the behavior. If that reproduces it, then you can use it to file an issue here:

    https://github.com/lxc/lxc/issues
Comment 13 i.Dark_Templar 2017-03-02 12:12:16 UTC
I've opened an issue on the LXC issue tracker: https://github.com/lxc/lxc/issues/1448
Comment 14 i.Dark_Templar 2017-03-08 08:07:00 UTC
I've finally found the root cause of the issue. I'll duplicate my comment from the LXC issue tracker here. It was related to ACLs. The box I was running had a default ACL on every directory, like this:

$ getfacl /
getfacl: Removing leading '/' from absolute path names
# file: .
# owner: root
# group: root
user::rwx
group::r-x
other::r-x
default:user::rwx
default:group::r-x
default:other::r-x

I compared this setup with my other boxes (and a VirtualBox test setup) and didn't find similar ACLs on them. After that I remounted the root filesystem with 'noacl' and the issue was gone. I didn't notice this issue on the host system because /tmp and /var/tmp/portage are mounted as tmpfs there.

I booted into a recovery image, ran 'setfacl -kR ...' on every filesystem since I didn't need these ACLs, booted back into the system, and confirmed that the issue had disappeared.

Now I know the missing reproduction step (NOTE: don't do this if you have ACLs set up on the host or in any guest whose filesystem is located under /var/lib/lxc, or save them first):
setfacl -dR --set=u::rwx,g::rx,o::rx /var/lib/lxc
To remove this ACL later, run (same NOTE as above):
setfacl -kR /var/lib/lxc

To test the issue on the host, use '/tmp' instead of '/var/lib/lxc' in the commands above.

I've fixed my setup. Please close the bug if you think portage shouldn't work around such issues in the filesystem setup and should just fail. Otherwise, the patch already attached allowed portage to work for me even with such a setup.
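
A minimal sketch of the arithmetic behind this root cause. It assumes the behavior described in acl(5): when the parent directory has a default ACL, the umask is ignored and the new file keeps only the permissions present in both the requested mode and the inherited entries. It also assumes a minimal default ACL with no mask or named entries, as in the getfacl output above. The helper names are hypothetical.

```python
def perm_bits(text):
    """Convert an 'rwx'-style triple such as 'r-x' to its 3-bit value."""
    return sum(bit for ch, bit in zip(text, (4, 2, 1)) if ch != "-")


def mode_under_default_acl(requested, user="rwx", group="r-x", other="r-x"):
    """Mode of a new file created in a directory with this default ACL."""
    inherited = (perm_bits(user) << 6) | (perm_bits(group) << 3) | perm_bits(other)
    # The umask plays no role here; only the inherited entries limit the mode.
    return requested & inherited


if __name__ == "__main__":
    # Requested mode from locks.py (0o660) against default u::rwx,g::r-x,o::r-x.
    print("%o" % mode_under_default_acl(0o660))
```

For the requested mode 0o660 this yields 0o640, matching the `-rw-r-----` lock file seen earlier, regardless of the umask that portage sets before the open call.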
Comment 15 Zac Medico gentoo-dev 2017-03-08 08:22:31 UTC
I'm glad you found the root cause. I suppose we could have portage try to detect interference from ACLs, but somebody interested in that would have to submit a patch.
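
A minimal sketch of what such detection could look like, offered only as an illustration and not as portage code: after creating a file with the umask cleared, compare the mode reported by `fstat` with the requested mode. The names `missing_bits` and `open_and_check` are hypothetical. If the file already existed, its mode reflects earlier history rather than the current call, so the check is meaningful mainly for freshly created files.

```python
import os
import stat
import tempfile


def missing_bits(requested, actual):
    """Permission bits that were requested but are absent from the file."""
    return requested & ~actual & 0o7777


def open_and_check(path, requested=0o660):
    """Open or create `path` with umask 0; return (fd, missing permission bits)."""
    old_mask = os.umask(0)
    try:
        fd = os.open(path, os.O_CREAT | os.O_RDWR, requested)
    finally:
        os.umask(old_mask)
    actual = stat.S_IMODE(os.fstat(fd).st_mode)
    return fd, missing_bits(requested, actual)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        fd, missing = open_and_check(os.path.join(tmpdir, "lockfile"))
        os.close(fd)
        if missing:
            print("mode was reduced by %o; a default ACL may be interfering" % missing)
        else:
            print("file was created with the requested mode")
```

A non-zero result with the umask set to 0 suggests that a default ACL or similar mechanism is restricting new files, which could be reported as a warning instead of failing later with PermissionDenied.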