Bug 602986 - Sort portage /var/lib/portage/config file before writing to disc.
Summary: Sort portage /var/lib/portage/config file before writing to disc.
Status: UNCONFIRMED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Enhancement/Feature Requests
Hardware: All Linux
Importance: Normal normal
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-12-18 08:50 UTC by pitcha
Modified: 2019-09-02 18:13 UTC
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Sort dictionary before writing to disc (portage-sort-dictionary-before-write.patch,570 bytes, patch)
2016-12-18 08:50 UTC, pitcha

Description pitcha 2016-12-18 08:50:23 UTC
Created attachment 456552
Sort dictionary before writing to disc

I keep my configuration files under version control. Among the tracked files is Portage's /var/lib/portage/config (whether tracking it is a good idea is a different story, but currently I do).
Over time I noticed that Portage changes the file in a non-deterministic way, e.g. emerging openssl and then re-emerging it yielded different /var/lib/portage/config files.
The reason is that /var/lib/portage/config is just a Python dictionary written to disc, and dictionaries are unordered (pym/portage/util/__init__.py, line 577):

def writedict(mydict, myfilename, writekey=True):
        """Writes out a dict to a file; writekey=0 mode doesn't write out
        the key and assumes all values are strings, not lists.""" 
        lines = []
        if not writekey:
                for v in mydict.values():
                        lines.append(v + "\n")
        else:
                for k, v in mydict.items():
                        lines.append("%s %s\n" % (k, " ".join(v)))
        write_atomic(myfilename, "".join(lines))

I propose changing the implementation to write out the dictionary in alphabetical order (by key if writekey=True, or by value otherwise).
This makes the output deterministic across runs for the same dictionary contents.

This is a small change, just inserting sorted() in each of the two for-loops (patch also attached), e.g.

        [...]
        if not writekey:
                for v in sorted(mydict.values()):
                        lines.append(v + "\n")
        else:
                for k, v in sorted(mydict.items()):
                        lines.append("%s %s\n" % (k, " ".join(v)))
        [...]
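
To illustrate the effect without touching the real file, here is a minimal sketch. The helper name format_dict_lines is hypothetical and not part of Portage; it only mirrors the line-building part of writedict with sorted() added, and the sample paths and values are made up:

```python
def format_dict_lines(mydict, writekey=True):
    """Hypothetical helper: build the lines writedict would emit, sorted."""
    if not writekey:
        # Values are assumed to be plain strings in this mode.
        return [v + "\n" for v in sorted(mydict.values())]
    # Values are assumed to be lists of strings; keys are unique, so
    # sorting the items orders the output by key.
    return ["%s %s\n" % (k, " ".join(v)) for k, v in sorted(mydict.items())]


# Two dicts with the same contents but different insertion order
# (made-up sample data).
first = {"/etc/ssl/openssl.cnf": ["abc"], "/etc/conf.d/foo": ["def"]}
second = dict(reversed(list(first.items())))

# With sorting, both produce identical output.
assert format_dict_lines(first) == format_dict_lines(second)
```

With this, the same dictionary contents always serialize to the same text, regardless of the order in which the entries were inserted.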
Comment 1 Zac Medico gentoo-dev 2016-12-18 10:25:35 UTC
Sorting this file is a waste of CPU time, and really we should take every opportunity that we can to conserve resources.

I think we can eliminate /var/lib/portage/config entirely, and instead rely upon the md5 digest of the config file installed by the previous instance of the package (the md5 digests are available in /var/db/pkg/*/*/CONTENTS).
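
As a rough sketch of how those digests could be read: the code below assumes that regular files appear in CONTENTS as lines of the form "obj <path> <md5> <mtime>". The function name parse_contents_digests is hypothetical and is not Portage's own parser, and the digest and mtime in the sample are placeholders:

```python
def parse_contents_digests(contents_text):
    """Hypothetical helper: map each regular file path to its md5 digest."""
    digests = {}
    for line in contents_text.splitlines():
        if not line.startswith("obj "):
            continue  # skip dir, sym and any other entry types
        # Split from the right so that paths containing spaces stay intact.
        path, md5, _mtime = line[len("obj "):].rsplit(" ", 2)
        digests[path] = md5
    return digests


# Made-up CONTENTS excerpt; digest and mtime are placeholders.
sample = (
    "dir /etc/ssl\n"
    "obj /etc/ssl/openssl.cnf 0123456789abcdef0123456789abcdef 1482050000\n"
)
digests = parse_contents_digests(sample)
```

The digest recorded for the previously installed instance could then be compared against the file on disk, which is the role /var/lib/portage/config plays today.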
Comment 2 Zac Medico gentoo-dev 2019-09-02 18:13:36 UTC
(In reply to Zac Medico from comment #1)
> I think we can eliminate /var/lib/portage/config entirely, and instead rely
> upon the md5 digest of the config file installed by the previous instance
> of the package (the md5 digests are available in /var/db/pkg/*/*/CONTENTS).

Since file collisions are allowed for files under CONFIG_PROTECT, it's possible for multiple packages to "own" a particular config file, so /var/lib/portage/config currently provides disambiguation in this case.
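
The ambiguity can be sketched as follows. This is only an illustration under assumed data: the function name ambiguous_config_files, the package names and the digests are all hypothetical, and the input is assumed to be a mapping from package to the per-file md5 digests recorded for it:

```python
from collections import defaultdict


def ambiguous_config_files(digests_by_package):
    """Hypothetical helper: find paths recorded with differing digests."""
    owners = defaultdict(dict)
    for pkg, digests in digests_by_package.items():
        for path, md5 in digests.items():
            owners[path][pkg] = md5
    # A path is ambiguous when its owners disagree on the digest.
    return {
        path: by_pkg
        for path, by_pkg in sorted(owners.items())
        if len(set(by_pkg.values())) > 1
    }


# Made-up example: two packages both install /etc/foo.conf.
recorded = {
    "app-misc/foo-1.0": {"/etc/foo.conf": "aaaa", "/etc/only-foo.conf": "cccc"},
    "app-misc/bar-2.0": {"/etc/foo.conf": "bbbb"},
}
conflicts = ambiguous_config_files(recorded)
```

In such a case the per-package digests alone do not say which one describes the file currently on disk, so a single record per path is still needed to disambiguate.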