Bug 551434

Summary: Portage should save all XATTRs into vdb
Product: Portage Development Reporter: Jason Zaman <perfinion>
Component: Enhancement/Feature Requests Assignee: Portage team <dev-portage>
Status: CONFIRMED ---    
Severity: normal CC: hardened
Priority: Normal    
Version: unspecified   
Hardware: All   
OS: Linux   
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 193766    

Description Jason Zaman gentoo-dev 2015-06-07 10:56:15 UTC
Before portage merges in a package, it already goes through all the binaries to generate the NEEDED.ELF.2 data.
At the same time, portage should also save all the xattrs for all files. This would enable tools to read out the xattrs and re-apply them on packages without having to re-emerge the whole package.

This is related to GLEP64, https://wiki.gentoo.org/wiki/GLEP:64

This would be useful in several cases. Some that come to mind:
1) fcaps and PaX XT marks can be re-applied later on. These can go missing if things are moved around incorrectly. Users who installed from a stage3 but did not specify --xattrs can also fix it without having to start over.
2) SELinux has a restorecon tool to fix SELinux labels; this would enable a similar restorepax tool to be made.

Lots of things use different xattrs, so it's probably better to save everything except a blacklist; that way, new xattrs added later will be supported automatically. The existing $PORTAGE_XATTR_EXCLUDE can be re-used as the blacklist.

Adding this as a new file in VDB would make backwards compatibility easy. I suggest a format with one file per line, starting with the filename, followed by all of its xattrs separated by spaces.

path/to/file1 security.capability=AAAAAAA user.pax.flags="Em" user.test1="foo"
path/to/file2 user.pax.flags="M" user.test2="ba\"aar"
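
To illustrate how a tool might read such a line back, here is a minimal Python sketch. The function name parse_line is hypothetical, and it assumes paths contain no whitespace and that values follow shell-like double-quote escaping, which is what the example lines above suggest. It does not handle octal escapes or distinguish quoted text from encoded values.

```python
import shlex


def parse_line(line):
    """Split one proposed VDB line into (path, {xattr_name: value}).

    Assumption: fields are space-separated and quoted values use
    backslash escapes for embedded double quotes.
    """
    path, *fields = shlex.split(line)
    attrs = {}
    for field in fields:
        # Split on the first "=" only; values may contain "=" (e.g. base64 padding).
        name, _sep, value = field.partition("=")
        attrs[name] = value
    return path, attrs


if __name__ == "__main__":
    print(parse_line('path/to/file2 user.pax.flags="M" user.test2="ba\\"aar"'))
```

Using shlex keeps the sketch short, but a real implementation would need an unambiguous rule for paths containing spaces.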


The following shell loop is close, although it also prints files that have no xattrs, which is unnecessary. It would be better done in Python directly:
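# Run from inside ${D} so the relative paths printed by find resolve for getfattr.
cd "${D}" || exit 1
find . -type f -print0 | while IFS= read -r -d '' FILE; do
  echo -n "${FILE#./}: "
  # Dump every xattr, drop the "# file:" header, and join the rest on one line.
  getfattr -d -m. -- "${FILE}" | sed '/^#/d' | tr '\n' ' '
  echo
done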
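
A rough Python equivalent of the loop might look like the sketch below. It is not Portage code: collect_xattrs is a hypothetical name, and the lister/getter parameters exist only so the function can be exercised on systems without xattr support. By default it uses os.listxattr and os.getxattr, which are available on Linux. Unlike the shell loop, it omits files that have no xattrs.

```python
import os


def collect_xattrs(root, lister=None, getter=None):
    """Return {relative_path: {xattr_name: raw_bytes}} for regular files under root."""
    lister = lister or os.listxattr
    getter = getter or os.getxattr
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            # Mirror `find -type f`: regular files only, no symlinks.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            try:
                attrs = {attr: getter(full, attr) for attr in lister(full)}
            except OSError:
                # Filesystem without xattr support, or the file vanished.
                continue
            if attrs:
                result[os.path.relpath(full, root)] = attrs
    return result
```

Called as collect_xattrs(image_dir), it returns raw bytes for each value, so the on-disk encoding can be decided separately.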
Comment 1 Jason Zaman gentoo-dev 2015-06-07 11:17:47 UTC
If the user has selected ELF PaX marks and not xattr marks, then this would not capture them. The ELF and xattr marks are supposed to be the same, so if the user does not have PAX_MARKINGS="XT" set, the marks will need to be read with readelf. They can then be stored as if they were xattrs, so we do not have to store them twice.
Comment 2 Arfrever Frehtes Taifersar Arahesis 2015-06-07 17:26:21 UTC
Format of values written in VDB needs to be discussed.

getfattr sometimes returns base-64-encoded string, while Python's os.getxattr() returns raw bytes.

$ getfattr -d -m- --absolute-names /bin/ping
# file: /bin/ping
security.capability=0sAQAAAgAgAAAAAAAAAAAAAAAAAAA=

$ python3.6 -c 'import os; print(os.getxattr("/bin/ping", "security.capability"))'
b'\x01\x00\x00\x02\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
$ python3.6 -c 'import base64, os; print(base64.b64encode(os.getxattr("/bin/ping", "security.capability")))'
b'AQAAAgAgAAAAAAAAAAAAAAAAAAA='
Comment 3 Jason Zaman gentoo-dev 2015-07-20 12:01:23 UTC
(In reply to Arfrever Frehtes Taifersar Arahesis from comment #2)
> Format of values written in VDB needs to be discussed.
> 
> getfattr sometimes returns base-64-encoded string, while Python's
> os.getxattr() returns raw bytes.

That's a good point.

From man getfattr:
-e en, --encoding=en
    Encode values after retrieving them. Valid values of en are "text",
    "hex", and "base64". Values encoded as text strings are enclosed in
    double quotes ("), while strings encoded as hexidecimal and base64 are
    prefixed with 0x and 0s, respectively.

From man setfattr:
-v value, --value=value
    Specifies the new value of the extended attribute. There are three
    methods available for encoding the value. If the given string is
    in double quotes, the inner string is treated as text. In that
    case, backslashes and double quotes have special meanings and need to
    be escaped by a preceding backslash. Any control characters can be
    encoded as a backslash followed by three digits as its ASCII code in
    octal. If the given string begins with 0x or 0X, it expresses a
    hexadecimal number. If the given string begins with 0s or 0S, base64
    encoding is expected. See also the --encoding option of getfattr(1).


I think this is a reasonable encoding for the values: if the value is printable, store it in double quotes; otherwise, store it as base64 with the 0s prefix. That keeps things like PaX marks easy to read.
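
A minimal Python sketch of that rule, following the getfattr/setfattr conventions quoted above. The function names are hypothetical, and "printable" is assumed here to mean ASCII bytes 0x20 through 0x7e; a real implementation might choose differently.

```python
import base64
import re


def encode_value(raw):
    """Encode raw xattr bytes as quoted text if printable, else as 0s-prefixed base64."""
    if all(0x20 <= byte <= 0x7E for byte in raw):
        text = raw.decode("ascii")
        # Backslashes and double quotes must be escaped inside quoted text.
        text = text.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + text + '"'
    return "0s" + base64.b64encode(raw).decode("ascii")


def decode_value(encoded):
    """Reverse encode_value; also accepts 0x hex and three-digit octal escapes."""
    prefix = encoded[:2]
    if prefix in ("0s", "0S"):
        return base64.b64decode(encoded[2:])
    if prefix in ("0x", "0X"):
        return bytes.fromhex(encoded[2:])
    if len(encoded) >= 2 and encoded[0] == '"' and encoded[-1] == '"':
        def unescape(match):
            token = match.group(1)
            # Three octal digits denote a byte value; anything else is a literal.
            return chr(int(token, 8)) if len(token) == 3 else token
        inner = re.sub(r"\\([0-3][0-7]{2}|.)", unescape, encoded[1:-1])
        return inner.encode("latin-1")
    return encoded.encode("ascii")
```

With this rule, a PaX mark such as b"Em" is stored as "Em", while the binary security.capability value from comment #2 is stored as 0sAQAAAgAgAAAAAAAAAAAAAAAAAAA=, keeping the VDB file plain ASCII.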

I have written a shell hook in /etc/portage/hooks/install/ that goes through all the files and echoes the xattrs (honoring PORTAGE_XATTR_EXCLUDE) into build-info/XATTRS. It works, but it's pretty ugly. I think writing it properly in Python as part of portage would be much better. Portage has _xattr_excluder in pym/portage/util/movefile.py, so the complicated part is already done.
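
For illustration only, the exclusion step could look roughly like this in Python. This is not the actual _xattr_excluder code. The function name filter_xattrs is hypothetical, and the sketch assumes PORTAGE_XATTR_EXCLUDE is a space-separated list of fnmatch-style patterns.

```python
from fnmatch import fnmatchcase


def filter_xattrs(attrs, exclude=""):
    """Drop xattrs whose names match any space-separated pattern in exclude."""
    patterns = exclude.split()
    return {
        name: value
        for name, value in attrs.items()
        if not any(fnmatchcase(name, pattern) for pattern in patterns)
    }
```

A caller would pass the value of PORTAGE_XATTR_EXCLUDE from the build environment as the exclude argument before writing the remaining attributes to build-info/XATTRS.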

A separate issue that I noticed: the fcaps.eclass methods are usually called from pkg_postinst(), which means the capabilities don't show up in image/ and do not exist when the scanning is done. Using the hook works great for PaX flags but not for caps :(. Setting caps may have to be moved to src_install or something similar instead?
Comment 4 SpanKY gentoo-dev 2015-11-24 23:20:06 UTC
(In reply to Jason Zaman from comment #3)

vdb format should be ASCII.  stuffing raw bytes is ugly and unfriendly.

the only reason fcaps is called from pkg_xxx is that there are no guarantees that xattrs will be transferred from $D to $ROOT.  if that is fixed, we should be able to adjust the callers.  we could even do this now by being smart: if in src_xxx, have fcaps save the requested ops in FILECAPS as well as try to apply them.  then the existing pkg hook will attempt to apply them if they don't exist already.