eix-update generates new database files, and then opens the existing path with truncate to replace it. If another process attempts to read the database while the write is happening, it sees a truncated or partially written file and fails with confusing errors. If the write fails (e.g. out of disk space), the on-disk database is also left corrupted.

Please fix this by writing new databases to a separate filename in the same directory (one convention is appending .$PID to the original filename), and moving it over the old file if the write was successful.

This manages to bite Gentoo infra's puppet repos a few times a week, because they rely heavily on eix for lookups, and portage is concurrently updated in a cronjob very often.
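The write-then-rename approach requested above could be sketched in C roughly as follows. This is only an illustration, not eix code: `write_file_atomic` is a hypothetical helper, and it assumes the process is allowed to create files in the database's directory. It uses mkstemp(3) instead of a .$PID suffix so that the temporary name is not predictable. Note that the replacement is a new inode created with mode 0600, so the old file's permissions, ACLs and xattrs are not carried over unless they are copied explicitly.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not part of eix): replace `path` with `len` bytes
 * from `data` so that readers see either the old or the new content.
 * Returns 0 on success, -1 on failure with errno set. */
int write_file_atomic(const char *path, const void *data, size_t len)
{
    char tmp[4096];
    const char *p = data;
    size_t left = len;
    int fd, n, saved;

    assert(path != NULL && (data != NULL || len == 0));

    /* Temporary file in the same directory, so rename() stays on one
     * filesystem. */
    n = snprintf(tmp, sizeof tmp, "%s.XXXXXX", path);
    if (n < 0 || (size_t)n >= sizeof tmp) {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = mkstemp(tmp); /* exclusive create, unpredictable suffix */
    if (fd < 0)
        return -1;

    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += w;
        left -= (size_t)w;
    }

    /* Flush data to disk before the new name becomes visible. */
    if (fsync(fd) != 0)
        goto fail;
    if (close(fd) != 0) {
        fd = -1;
        goto fail;
    }
    fd = -1;

    /* Atomic switch: the old file is untouched until this succeeds. */
    if (rename(tmp, path) != 0)
        goto fail;
    return 0;

fail:
    saved = errno;
    if (fd >= 0)
        close(fd);
    unlink(tmp); /* leave no partial file behind */
    errno = saved;
    return -1;
}
```

A caller would serialize the database into a buffer (or adapt the loop to stream it) and call `write_file_atomic(cachefile, buf, len)`; on any failure, such as a full disk, the previous database stays in place.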
(In reply to Robin Johnson from comment #0)
> eix-update [...] opens the existing path with truncate to replace

This is intentional: This behaviour ensures that the permissions given by the user to the database (including xattr, acl, or whatever) are preserved and that there are no problems if eix-update has no permissions to write into the corresponding directory.

> Please fix this by writing new databases to a separate filename
> in the same directory (one convention is appending .$PID on the original
> filename), and moving it over the new file if the write was successful.

Depending on the directory in which EIX_CACHEFILE resides, this might be a security risk (generating a predictable filename in a directory which is possibly writable by many people).

If you want this behaviour, you can simply call a wrapper script instead of eix-update (while the converse could not easily be obtained by a wrapper script if eix-update behaved differently), for instance as follows (untested):

#!/bin/sh
. eix-functions.sh
ReadFunctions
ReadVar EIX_CACHEFILE
if EIX_CACHEFILE=$EIX_CACHEFILE.$$ /usr/bin/eix-update "$@"
then
	mv -- "$EIX_CACHEFILE.$$" "$EIX_CACHEFILE" || \
		die "failed to move $EIX_CACHEFILE.$$"
else
	rm -f -- "$EIX_CACHEFILE.$$" || \
		die "failed to remove $EIX_CACHEFILE.$$"
	die "failed: eix-update $*"
fi
(In reply to Martin Väth from comment #1)
> ReadVar EIX_CACHEFILE

This should be

ReadVar EIX_CACHEFILE EIX_CACHEFILE /usr/bin/eix-update

(the second argument was simply forgotten, and by giving the third an endless loop is avoided if the wrapper script is e.g. /usr/local/bin/eix-update).

However, I repeat: The script is untested.
...and yet another small issue with the script I just realized: "die" would also call ReadVar, but without a third argument. Thus, one should set read_var_prg=/usr/bin/eix-update near the beginning of the script to use the full path throughout.
If you don't intend on atomically replacing, then can you please use flock(2) on the database, to prevent a concurrent open for reading while the process is busy writing the new content?

When opening for reading, do:

	if(flock(fd, LOCK_SH) != 0) {
		// handle err
	}
	...
	flock(fd, LOCK_UN);

When opening for writing, do:

	if(flock(fd, LOCK_EX) != 0) {
		// handle err
	}
	...
	flock(fd, LOCK_UN);

The lock is also released automatically if the process crashes or the descriptor is closed. flock will block as needed to prevent a read/write conflict. If you want non-blocking behaviour, OR the operation with LOCK_NB and check for EWOULDBLOCK.
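A slightly fuller sketch of that locking scheme, for illustration only: `open_locked` and `close_locked` are hypothetical helpers, not eix functions, and this is not necessarily how eix implements it. One detail matters for the truncating writer: if the file is opened with O_TRUNC, the truncation happens before the lock is taken and readers can still see an empty file, so this sketch opens without O_TRUNC and calls ftruncate(2) only once LOCK_EX is held.

```c
#define _DEFAULT_SOURCE 1
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: open `path` and take an advisory flock() on it.
 * writer != 0:   open read-write (creating if needed), take LOCK_EX,
 *                then truncate under the lock.
 * writer == 0:   open read-only and take LOCK_SH.
 * nonblock != 0: fail with EWOULDBLOCK instead of waiting for the lock.
 * Returns the locked descriptor, or -1 with errno set. */
int open_locked(const char *path, int writer, int nonblock)
{
    int fd, op, saved;

    assert(path != NULL);

    fd = writer ? open(path, O_RDWR | O_CREAT, 0644)
                : open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    op = (writer ? LOCK_EX : LOCK_SH) | (nonblock ? LOCK_NB : 0);
    while (flock(fd, op) != 0) {
        if (errno == EINTR)
            continue; /* interrupted while waiting, retry */
        goto fail;
    }

    /* Discard the old content only after readers are locked out. */
    if (writer && ftruncate(fd, 0) != 0)
        goto fail;
    return fd;

fail:
    saved = errno;
    close(fd); /* closing also drops any lock we hold */
    errno = saved;
    return -1;
}

/* Release the lock and close the descriptor. */
void close_locked(int fd)
{
    flock(fd, LOCK_UN);
    close(fd);
}
```

The locks are advisory, so this only helps if every reader and writer of the database takes the lock; a process that opens the file without calling flock() is not held back.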
Good idea. This is fixed in the eix GitHub master branch (>=eix-0.30.12).