Bug 28228 - patch for file-collision protection
Summary: patch for file-collision protection
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core
Hardware: All Linux
Importance: High enhancement
Assignee: Portage team
URL:
Whiteboard:
Keywords: InVCS
Duplicates: 2857 17437 39911
Depends on:
Blocks:
 
Reported: 2003-09-08 17:09 UTC by Marius Mauch (RETIRED)
Modified: 2004-08-16 10:45 UTC (History)
CC List: 6 users

See Also:
Package list:
Runtime testing required: ---


Attachments
patch to prevent file collisions (file-collision-protect.patch,2.52 KB, patch)
2003-09-08 17:10 UTC, Marius Mauch (RETIRED)
patch with improved performance and status messages (file-collision-protect.patch,3.61 KB, patch)
2003-09-09 04:08 UTC, Marius Mauch (RETIRED)
made the unmerging conditional on package contents (file-collision-protect.patch,3.70 KB, patch)
2003-09-10 23:05 UTC, Marius Mauch (RETIRED)
patch without lambda and popen() (file-collision-protect.patch,3.52 KB, patch)
2003-12-08 15:41 UTC, Marius Mauch (RETIRED)
fixed patch (file-collision.patch,3.71 KB, patch)
2003-12-09 13:09 UTC, Marius Mauch (RETIRED)

Description Marius Mauch (RETIRED) gentoo-dev 2003-09-08 17:09:37 UTC
The attached patch prevents Portage from merging packages that would overwrite
files belonging to other packages (like the current /bin/kill issue). It's just
a proof of concept for now and needs some more work.
BTW, it would benefit from python-2.3.
Comment 1 Marius Mauch (RETIRED) gentoo-dev 2003-09-08 17:10:52 UTC
Created attachment 17298 [details, diff]
patch to prevent file collisions

Patch against portage-2.0.49. Has some ugly parts in it.
Comment 2 Marius Mauch (RETIRED) gentoo-dev 2003-09-09 04:08:44 UTC
Created attachment 17320 [details, diff]
patch with improved performance and status messages

The old patch was painfully slow on big packages and didn't inform the user
about progress. I changed that so a status message is printed after every
1000 checked files.
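
A minimal sketch of the progress reporting described above, not the code from the attachment. The function name `check_files` and the `is_collision` and `report` parameters are hypothetical names chosen for illustration.

```python
def check_files(paths, is_collision, report=print, interval=1000):
    """Check every path and report progress after each `interval` files.

    `is_collision` is a caller-supplied predicate (an assumption of this
    sketch); `report` receives the status messages.
    """
    collisions = []
    for count, path in enumerate(paths, start=1):
        if is_collision(path):
            collisions.append(path)
        # Emit one status line per `interval` checked files.
        if count % interval == 0:
            report("%d files checked ..." % count)
    return collisions
```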
Comment 3 Nicholas Jones (RETIRED) gentoo-dev 2003-09-09 21:07:25 UTC
Invoking an unmerge is very dangerous.
What happens if python is the package with the collision?
Comment 4 Marius Mauch (RETIRED) gentoo-dev 2003-09-09 21:22:18 UTC
I know, and I really don't want that, but at this point the package is already merged in the db for some reason and won't get caught by future emerge runs. As I said, it's ugly and needs more work. Any suggestions on how to avoid the unmerge?
Comment 5 Marius Mauch (RETIRED) gentoo-dev 2003-09-10 23:05:30 UTC
Created attachment 17482 [details, diff]
made the unmerging conditional on package contents

OK, I disarmed the unmerge call by checking whether self.getcontents() is empty,
which should only be true if the package hasn't been merged to the filesystem.
Do you think it is safe enough that way?
Comment 6 Marius Mauch (RETIRED) gentoo-dev 2003-09-21 06:33:12 UTC
*** Bug 2857 has been marked as a duplicate of this bug. ***
Comment 7 Marius Mauch (RETIRED) gentoo-dev 2003-09-21 09:13:26 UTC
*** Bug 17437 has been marked as a duplicate of this bug. ***
Comment 8 Marius Mauch (RETIRED) gentoo-dev 2003-11-22 19:33:41 UTC
I leave it to you whether to make it the default or not.
Comment 9 Marius Mauch (RETIRED) gentoo-dev 2003-12-08 15:41:15 UTC
Created attachment 21898 [details, diff]
patch without lambda and popen()

This patch doesn't use list constructors or lambda functions, and replaces
the popen() call to find with a call to listdir().
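
As an illustration of the listdir() approach (a sketch only, not the code from the attachment), a directory tree can be enumerated in-process instead of shelling out to find. The function name `list_files` is hypothetical, and the sketch likewise avoids list comprehensions and lambdas.

```python
import os

def list_files(root):
    """Return every non-directory entry below `root` using os.listdir()."""
    result = []
    pending = [root]  # directories still to be scanned
    while pending:
        current = pending.pop()
        for name in os.listdir(current):
            path = os.path.join(current, name)
            # Descend into real directories; treat symlinks as plain entries.
            if os.path.isdir(path) and not os.path.islink(path):
                pending.append(path)
            else:
                result.append(path)
    return result
```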
Comment 10 Marius Mauch (RETIRED) gentoo-dev 2003-12-09 13:09:45 UTC
Created attachment 21942 [details, diff]
fixed patch

There were a few problems in the last patch; they should be fixed now.
Comment 11 Marius Mauch (RETIRED) gentoo-dev 2004-02-15 17:27:15 UTC
Does anything need to be done before this can be included?
Comment 12 Nicholas Jones (RETIRED) gentoo-dev 2004-04-11 15:46:31 UTC
*** Bug 39911 has been marked as a duplicate of this bug. ***
Comment 13 Brian Harring (RETIRED) gentoo-dev 2004-06-23 02:47:39 UTC
A few major issues with this patch (or rather, issues exposed by this patch).
Assuming I'm reading it correctly, if a file would be overwritten and isn't owned by any version of this package, it's marked as colliding, and the merge fails.
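
The rule as read above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: `find_collisions`, the `owners` mapping (path to the set of package names recorded as owning it) and the `exists` hook are hypothetical, and `new_files` is assumed to list regular files only, since directories are legitimately shared.

```python
import os

def find_collisions(new_files, owners, package, exists=os.path.lexists):
    """Return the files that would be overwritten but are not owned by
    any installed version of `package` (a versionless package name)."""
    collisions = []
    for path in new_files:
        if not exists(path):
            continue  # nothing on disk to overwrite
        if package in owners.get(path, set()):
            continue  # owned by another version of the same package
        collisions.append(path)
    return collisions
```

A non-empty result would mean the merge is refused. Note that a file missing from the ownership records counts as a collision, which is where the stage1 issue below comes from.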

OK, fair enough. The problem is that the current stage1 tarballs have portions of their /var/db/pkg/* entries removed, to force portage into thinking it has to re-emerge all but a handful of packages.

This feature can't be enabled until the stage1 issues have been fixed. As is, the stage1 method is a bit of a kludge, so it needs fixing anyway.
Comment 14 Cameron Blackwood 2004-06-23 09:38:04 UTC
If I could just randomly post some ideas for a short while (it's 2:35 am, so excuse this if it makes no sense ;)

Could portage borrow a page from rpm's behaviour when merging / upgrading files?

Something that dealt with colliding files and those annoying config files where only the CVS id changes would _so_ make Gentoo better. (duh, I hate those).

Now, I'm a newbie and I haven't looked at how emerge actually installs files, but I'm assuming that it does (or that it would be possible to make it) install into a chrooted tree _before_ installing them alongside all the live files, so you get the 'new' files without actually copying them in.

If that is the case, then you could run something like rpm's behaviour over them. I would suggest something like this:

  if the file already exists:
    if this is an upgrade:
      if the file _isn't_ in the old version:
        abort (1)
      else if the file _is_ in the old version:
        if the new file crc/size == the old file crc/size:
          leave it (it's the same file after all :)
        else if the old file crc + size matches the old version:
          overwrite the old version with the new (old file not changed) (2)
        else if the old file crc or size _doesn't_ match the old version:
          abort (user changed it already) (3)
  else if the file doesn't already exist:
    install the file (4)

(1) it belongs to something else: abort
(2) the file belongs to the old package and _hasn't_ been changed: overwrite it
(3) the file belongs to the old package and has been changed: abort (it might be an edited config file)
(4) it's a new file: install it.
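
The decision tree above could be written as runnable Python along these lines. This is only a sketch of the proposal, not how portage behaves: `Record`, `fingerprint` and `decide` are hypothetical names, and the case the pseudocode leaves open (the file exists but this is not an upgrade) is assumed here to abort.

```python
import zlib
from collections import namedtuple

# CRC and size of one file, as the proposal compares them.
Record = namedtuple("Record", ["crc", "size"])

def fingerprint(data):
    """Build a Record from file contents given as bytes."""
    return Record(zlib.crc32(data) & 0xFFFFFFFF, len(data))

def decide(on_disk, old_pkg, new_pkg, is_upgrade):
    """Return 'install', 'keep', 'overwrite' or 'abort'.

    on_disk: Record of the file currently on disk, or None if absent
    old_pkg: Record stored for the old version, or None if it did not ship it
    new_pkg: Record of the file in the new package
    """
    if on_disk is None:
        return "install"    # (4) new file
    if not is_upgrade:
        return "abort"      # assumption: not covered by the pseudocode
    if old_pkg is None:
        return "abort"      # (1) belongs to something else
    if new_pkg == on_disk:
        return "keep"       # identical file, nothing to do
    if on_disk == old_pkg:
        return "overwrite"  # (2) unchanged since the old version
    return "abort"          # (3) user changed it
```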

Personally, I'd rather see this happen on _all_ files (not just /etc config files). I don't want the system overwriting my files without warning. That's _bad_ in my book (and why I submitted my original bug report ;).

If you could compare the old file with its old package crc, then you could drastically reduce the amount of #&$*#$ing around that people need to do with /etc config files. If the user hasn't changed it, then why not overwrite it?

For example in my current /etc I have:

diff ._cfg0000_cupsd cupsd 
2,4c2,4
< # Copyright 1999-2004 Gentoo Technologies, Inc.
< # Distributed under the terms of the GNU General Public License v2
< # $Header: /home/cvsroot/gentoo-x86/net-print/cups/files/cupsd.rc6,v 1.14 2004/03/06 03:45:46 vapier Exp $
---
> # Copyright 1999-2003 Gentoo Technologies, Inc.
> # Distributed under the terms of the GNU General Public License, v2 or later
> # $Header: /home/cvsroot/gentoo-x86/net-print/cups/files/cupsd.rc6,v 1.13 2003/11/05 20:08:43 lanius Exp $

I didn't change this config file, so wouldn't it be nice if it just replaced it automatically? It hadn't been changed, so if I reinstalled the old version it should be recreated again exactly, so why not replace it?

There is enough info in the old package data and the new package data to work out that I didn't change it, so why not install over it?

Oh, and when I wrote 'abort' above, that doesn't mean that you couldn't have an addition to emerge that supported something like:

  emerge --list_blocking_files package_name

to list the path of the old file and the new file (in the chroot directory where it was installed) so the user could go through and work out what the differences were. (Just like they need to walk through /etc looking for ._cfg* files now).

Then, when the user was happy that the emerge should continue, they could run something like:

  emerge --force_install package_name

to force an install of the files from the chroot install directory to the actual file system (without recompiling ;).

All this assumes that ebuild/emerge can actually get a list/copy of the files that it's going to install before it actually copies them. If it can't, then just ignore me and point and call me a clueless newbie :) :) :).

[ Actually I really love Gentoo's portage idea. I'm very worried that it can overwrite files without warning, and the /etc config problems like the one above just annoy me, but other than those small problems you've got the best package system ever!!!! no more rpm dependency failure.. yaaaaaaaaay!!!! ]
Comment 15 Marius Mauch (RETIRED) gentoo-dev 2004-06-23 10:16:28 UTC
Brian: I don't intend to enable it by default; it's only meant for developer use at the moment (so we can fix most of the "broken" packages).
Cameron: This patch deals with your suggestion for general files; it doesn't affect CONFIG_PROTECTed files at all (that has to be changed in etc-update/dispatch-conf/...)
Comment 16 Brian Harring (RETIRED) gentoo-dev 2004-06-23 13:50:16 UTC
genone, take a look at http://dev.gentoo.org/~ferringb/contents_db.tar.bz2

The collision-protect patch is unable to catch files that already have a refcount greater than 1; some code I've been futzing with at the URI above is able to.

I've already identified around 70 misbehaving packages with it. At some point, I'd like to actually move to a centralized contents db, which is what this code is a start of.

Note the src is beta; it won't eat HDs, but may throw an exception...
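
To illustrate the refcount idea (a sketch only; the code in the tarball above is not reproduced here and may work differently), a central index from file path to owning packages makes already-shared files easy to find. The function names and the `contents` layout, a dict mapping package name to its file list, are assumptions.

```python
from collections import defaultdict

def build_owner_index(contents):
    """Map each file path to the set of packages that record owning it."""
    index = defaultdict(set)
    for package, paths in contents.items():
        for path in paths:
            index[path].add(package)
    return index

def shared_files(index):
    """Return paths whose owner count (refcount) is greater than 1."""
    return {path: sorted(pkgs) for path, pkgs in index.items() if len(pkgs) > 1}
```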
Comment 17 Brian Harring (RETIRED) gentoo-dev 2004-08-16 10:45:02 UTC
This has been released for a while...