Bug 586686 - sys-apps/portage[xattr] leaves vdb inconsistent when install-xattr fails to copy ACL entries
Status: UNCONFIRMED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core
Hardware: All Linux
Importance: Normal normal
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 193766
Reported: 2016-06-22 09:26 UTC by Duncan
Modified: 2023-08-24 20:58 UTC
CC List: 1 user

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info systemd (emerge.systemd.info,7.50 KB, text/plain)
2016-06-22 09:26 UTC, Duncan

Description Duncan 2016-06-22 09:26:57 UTC
Created attachment 438408 [details]
emerge --info systemd

systemd-230-r2 has at least two issues that have left its files mostly or all installed on my system, while -r1 remains recorded as installed.

a) Despite USE=-acl, a first attempt to install (or an install after having wiped the colliding files manually) produces this error:

--- /var/log/journal/remote/
!!! Failed to copy extended attributes. In order to avoid this error,
!!! set FEATURES="-xattr" in make.conf.
!!! copy /tmp/portage/sys-apps/systemd-230-r2/image/var/log/journal/remote/.keep_sys-apps_systemd-0 -> /var/log/journal/remote/.keep_sys-apps_systemd-0 failed.
!!! Filesystem containing file '/var/log/journal/remote/.keep_sys-apps_systemd-0#new' does not support extended attribute 'system.posix_acl_access

However, it wouldn't be _expected_ to support POSIX-ACL, since I have the acl USE flag toggled off, with the associated kernel config option for that filesystem type (CONFIG_BTRFS_FS_POSIX_ACL) also off.  But still it tries to set it and errors out when it can't. =:^(

b) The second problem is that the above error occurs *after* installing most or all of the files to the live filesystem, so -r2 is effectively merged, despite it erroring out, which keeps portage tracking -r1 as installed.

c) Of course this causes further problems when a second attempt is made: portage now raises file-collision alerts on /usr/share/doc/systemd-230-r2/*, because those files were installed to the live filesystem by the aborted merge but are not owned by any package, as the merge aborted before it recorded -r2 as installed.

This of course leaves the user puzzling over how systemd-230-r2 files can even be installed yet, since portage says systemd-230-r2 never merged and -r1 is still currently merged.

So two things need fixing:

1) The ebuild shouldn't attempt to set POSIX_ACLs when USE=-acl.

2) If there is a problem after the files are merged to the live filesystem, the ebuild needs to either roll back to the previous install or not abort with an error, so portage's installed-package database remains consistent with what's actually installed.  Leaving the files merged while aborting before the package is recorded as merged, so that portage still says the old version is merged, doesn't work.

For #2 I'd suggest that a test file with the appropriate attribute should be merged to the target filesystem first, before the other package files.  If that fails, delete that file and error out without merging the other files to the filesystem at all, thus avoiding an abort after merging the new version's files to the live filesystem and the resulting confusion.  (If the test attribute change works, don't forget to delete the test file in that case anyway, before going ahead with the normal live filesystem merge.)
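
The test-file idea above can be sketched roughly as follows. This is only an illustration, not portage code: the function name target_supports_xattr and the default attribute user.portage_probe are made up for the example, and it assumes a Linux system where Python exposes os.setxattr. Probing system.posix_acl_access itself would likely need a valid ACL payload rather than arbitrary bytes, so a plain user attribute is used here.

```python
import os
import tempfile


def target_supports_xattr(target_dir, name="user.portage_probe", value=b"1"):
    """Return True if a scratch file in target_dir accepts the attribute.

    Hypothetical helper for illustration; not part of portage.
    """
    if not hasattr(os, "setxattr"):
        # No xattr calls on this platform: treat it as a failed probe.
        return False
    # Create the probe file on the same filesystem as the merge target.
    fd, probe_path = tempfile.mkstemp(prefix=".xattr_probe_", dir=target_dir)
    os.close(fd)
    try:
        os.setxattr(probe_path, name, value)
        return True
    except OSError:
        # Any failure to set the attribute means the merge should not start.
        return False
    finally:
        # The probe file is removed whether or not the attribute was accepted.
        os.unlink(probe_path)
```

A caller would run the probe against the destination directory and refuse to begin copying package files when it returns False, so nothing lands on the live filesystem before the failure is known.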

(Setting FEATURES=-xattr in make.conf as suggested isn't appropriate, as normal filecaps are stored as extended attributes and I have it on so they get transferred properly.  But I don't need POSIX_ACLs nor do I have them enabled in the kernel, and I have USE=-acl set accordingly, so all ebuilds have to do is abide by that.)

sys-apps/portage-2.3.0_rc1-r1 and sys-apps/coreutils-8.25 are installed, if the first bug traces to them.  emerge --info systemd attached.

(As a temporary workaround I added an entry to /etc/portage/package.env that points ~sys-apps/systemd-230 at a file in /etc/portage/env/ that simply does FEATURES=-xattr for that specific package atom to address #1, and manually rmed the unowned -r2 doc files to address #2, and -r2 installed fine after that.  But that doesn't fix the problems, only mask the symptoms long enough to get the package installed.)

(systemd@ CCed in accordance with equery meta systemd's maintainer output, since I can't assign.)
Comment 1 Mike Gilbert gentoo-dev 2016-06-22 11:32:21 UTC
I am assigning this to the portage team since the major problem seems to be handling of a merge to a filesystem that does not have ACL support enabled. That should not leave the package in an inconsistent state.

As for the first issue you reported, that is an issue with upstream's build system, and there is an open pull request to resolve it.

https://github.com/systemd/systemd/pull/3366
Comment 2 Zac Medico gentoo-dev 2016-06-22 16:23:28 UTC
This is very close to bug 584760, except this failing attribute is system.posix_acl_access instead of user.pax.flags.
Comment 3 Duncan 2016-06-23 03:29:15 UTC
(In reply to Zac Medico from comment #2)
> This is very close to bug 584760, except this failing attribute is
> system.posix_acl_access instead of user.pax.flags.

Indeed.  It seems to be a different instance of the exact same bug.  If you wish to call it a dup, go ahead, but I'm guessing the fact that you didn't indicates you have reason not to.

One way to potentially "fix" the problem would be to do an automatic rollback of sorts.  For this, you'd probably need to consider the package installed and update the database accordingly, then depending on whether it replaced an earlier version of the same package or was a new install, either remerge the old version of the package (preferably checking to see if there's a binpkg first and using it if so, at least if FEATURES=binpkg so there likely is) normally, or simply unmerge the new package normally.

That way the database should agree with what's actually installed, and with the assumption that the old package can be remerged (which should be valid if it was a change in package behavior but might not be if it was a kernel option change or the like), the result should be as if the new version merge attempt wasn't tried at all.

Of course that leaves the problem of what to do if the old version doesn't merge successfully either.  I guess in that case you just pretend it did, update the database with what did install, and leave it at that.  At least then the database should be correct, even if the package as installed is ultimately broken.

Of course there should be quite some complaints printed during all this saying what exactly is going on, probably updating the final status to fail as well and canceling reverse dependency merges that would have been installed later, so the thing hopefully isn't taken as a successful merge.
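
The rollback sequence described above could be written down as a small decision function. This is a sketch under stated assumptions only: plan_recovery and its parameters are hypothetical names, it merely lists the steps in order, and it does not call any real portage API.

```python
def plan_recovery(new_cpv, old_cpv=None, old_binpkg_available=False):
    """Return the ordered recovery steps after a merge fails partway through.

    Hypothetical illustration of the proposal; not portage code.
    """
    # Step 1: make the database match the files already on disk.
    steps = [f"record {new_cpv} as installed"]
    if old_cpv is None:
        # Fresh install: nothing to go back to, so just remove the new files.
        steps.append(f"unmerge {new_cpv}")
    elif old_binpkg_available:
        # Upgrade with a binary package on hand: prefer it over rebuilding.
        steps.append(f"remerge {old_cpv} from binary package")
    else:
        steps.append(f"remerge {old_cpv} from source")
    # The overall result still counts as a failure.
    steps.append("report failure and cancel later reverse-dependency merges")
    return steps
```

For the case in this bug, plan_recovery("sys-apps/systemd-230-r2", "sys-apps/systemd-230-r1") would yield: record -r2 as installed, remerge -r1 from source, then report the failure.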

Is this practical at all, and if so, has anything like it been suggested for an EAPI?  Where has the discussion gone on it if so?  (A link to the discussion, or a list archive along with keywords to search for in the latter case, would be fine.)


The other solution, per-ebuild as I suggested in comment #0, would be a test-file install setup with the target xattribs.  If it works, assume the xattribs will be handled directly.  Delete the test file and abort the merge otherwise.  Perhaps handled by an eclass to centralize standard handling, with a variable to set with the attributes to check on the live filesystem testfile to see if the test is successful or not (missing the appropriate tools to do the check would be an auto fail and abort).  But of course this being a portage bug now, it's not really the appropriate place to discuss this potential solution in detail as it wouldn't be a portage fix.