Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 690484 - Detection of internal collisions (between files in separate directories in the installation image (${D}) corresponding to merged directories in the target filesystem (${ROOT}))
Summary: Detection of internal collisions (between files in separate directories in th...
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core (show other bugs)
Hardware: All All
: Normal normal (vote)
Assignee: Portage team
URL:
Whiteboard:
Keywords: InVCS
: 628652 (view as bug list)
Depends on:
Blocks: usrmerge-fixes 691278
  Show dependency tree
 
Reported: 2019-07-23 06:21 UTC by Arfrever Frehtes Taifersar Arahesis
Modified: 2019-11-03 22:39 UTC (History)
3 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
Patch (portage-internal_collisions_detection.patch,4.90 KB, patch)
2019-07-23 06:38 UTC, Arfrever Frehtes Taifersar Arahesis
Details | Diff
internal_collisions_test-0.ebuild (internal_collisions_test-0.ebuild,2.81 KB, text/plain)
2019-07-23 06:43 UTC, Arfrever Frehtes Taifersar Arahesis
Details
Patch (portage-internal_collisions_detection.patch,4.90 KB, patch)
2019-07-23 07:08 UTC, Arfrever Frehtes Taifersar Arahesis
Details | Diff
Patch (portage-internal_collisions_detection.patch,8.05 KB, patch)
2019-08-07 17:15 UTC, Arfrever Frehtes Taifersar Arahesis
Details | Diff
internal_collisions_test-0.ebuild (internal_collisions_test-0.ebuild,3.76 KB, text/plain)
2019-08-10 12:24 UTC, Arfrever Frehtes Taifersar Arahesis
Details
Patch (portage-internal_collisions_detection.patch,9.10 KB, patch)
2019-08-11 00:59 UTC, Arfrever Frehtes Taifersar Arahesis
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Arfrever Frehtes Taifersar Arahesis 2019-07-23 06:21:57 UTC
Portage should detect internal collisions (between files of the same package), which can occur when some directories in ${ROOT} have been replaced with symbolic links to other directories, while directories in ${D} are separate.

Example:
  /bin as real directory
  /sbin, /usr/bin and /usr/sbin as symbolic links to /bin
  A package which installs an executable in one of aforementioned directories and a symbolic link to this executable in another directory


Besides reporting internal collisions, Portage could automatically resolve selected subset of internal collisions when it is safe to do so.

The most common such situation is collision of a symbolic link with its target.
It is safe to drop such symbolic link and keep only its target.

(Some other situations are very rare or not observed to occur in practice, so I have not implemented their automatic resolution.)
Comment 1 Arfrever Frehtes Taifersar Arahesis 2019-07-23 06:38:13 UTC
Created attachment 584146 [details, diff]
Patch
Comment 2 Arfrever Frehtes Taifersar Arahesis 2019-07-23 06:43:53 UTC
Created attachment 584148 [details]
internal_collisions_test-0.ebuild

Ebuild for easy testing of various variants
Comment 3 Arfrever Frehtes Taifersar Arahesis 2019-07-23 06:50:38 UTC
Output WITHOUT patch:

--- /usr/
--- /usr/x1/
>>> /usr/x1/\ufffd
>>> /usr/x1/a
>>> /usr/x1/b
>>> /usr/x1/c -> /dev/zero
>>> /usr/x1/d -> /dev/zero
>>> /usr/x1/e
>>> /usr/x1/f
>>> /usr/x1/g
>>> /usr/x1/h
>>> /usr/x1/i
>>> /usr/x1/あいうえお
>>> /usr/x1/アイウエオ
--- /usr/x2/
>>> /usr/x2/\ufffd -> ../x1/\ufffd
>>> /usr/x2/a
>>> /usr/x2/b
>>> /usr/x2/c -> /dev/null
>>> /usr/x2/d -> /dev/zero
>>> /usr/x2/e -> ../x1/e
>>> /usr/x2/f -> /usr/x1/f
>>> /usr/x2/g -> /dev/random
>>> /usr/x2/h
>>> /usr/x2/i
>>> /usr/x2/あいうえお -> ../x1/あいうえお
>>> /usr/x2/アイウエオ
--- /usr/x3/
>>> /usr/x3/a
>>> /usr/x3/b
>>> /usr/x3/c -> /dev/full
>>> /usr/x3/d -> /dev/zero
>>> /usr/x3/g -> /dev/urandom
>>> /usr/x3/h -> ../x1/h
>>> /usr/x3/i -> /dev/mem
>>> /usr/x3/アイウエオ -> ../x1/アイウエオ
--- /usr/x4/
>>> /usr/x4/a
>>> /usr/x4/b
>>> /usr/x4/c -> /dev/random
>>> /usr/x4/d -> /dev/zero
>>> /usr/x4/g -> /dev/tty
>>> /usr/x4/i -> /dev/port
>>> /usr/x3/\ufffd -> /usr/x1/\ufffd
>>> /usr/x3/e -> ../bin/../x1/e
>>> /usr/x3/f -> /usr/bin/../x1/f
>>> /usr/x3/あいうえお -> /usr/x1/あいうえお
>>> /usr/x4/\ufffd -> /usr/bin/../x1/\ufffd
>>> /usr/x4/e -> ../bin/../sbin/../x1/e
>>> /usr/x4/f -> /usr/../usr/bin/../x1/f
>>> /usr/x4/h -> /usr/x2/h
>>> /usr/x4/あいうえお -> /usr/bin/../x1/あいうえお
>>> /usr/x4/アイウエオ -> /usr/bin/../x2/アイウエオ
>>> app-misc/internal_collisions_test-0 merged.


$ ls -Fhl /usr/x1
total 8.0K
lrwxrwxrwx 1 root root 21 Jul 23 08:40 '\ufffd' -> '/usr/bin/../x1/\ufffd'
-rw-r--r-- 1 root root  2 Jul 23 08:39  a
-rw-r--r-- 1 root root  2 Jul 23 08:39  b
lrwxrwxrwx 1 root root 11 Jul 23 08:39  c -> /dev/random
lrwxrwxrwx 1 root root  9 Jul 23 08:39  d -> /dev/zero
lrwxrwxrwx 1 root root 22 Jul 23 08:39  e -> ../bin/../sbin/../x1/e
lrwxrwxrwx 1 root root 23 Jul 23 08:39  f -> /usr/../usr/bin/../x1/f
lrwxrwxrwx 1 root root  8 Jul 23 08:39  g -> /dev/tty
lrwxrwxrwx 1 root root  9 Jul 23 08:39  h -> /usr/x2/h
lrwxrwxrwx 1 root root  9 Jul 23 08:39  i -> /dev/port
lrwxrwxrwx 1 root root 30 Jul 23 08:39  あいうえお -> /usr/bin/../x1/あいうえお
lrwxrwxrwx 1 root root 30 Jul 23 08:39  アイウエオ -> /usr/bin/../x2/アイウエオ
$ ls -Fhl /usr/x2
lrwxrwxrwx 1 root root 2 Jul 23 08:39 /usr/x2 -> x1/
$ ls -Fhl /usr/x3
lrwxrwxrwx 1 root root 2 Jul 23 08:39 /usr/x3 -> x1/
$ ls -Fhl /usr/x4
lrwxrwxrwx 1 root root 2 Jul 23 08:39 /usr/x4 -> x1/
$ cat /usr/x1/a
4
$ 


Above output shows various aberrations occurring when internal collisions are not detected early.
Comment 4 Arfrever Frehtes Taifersar Arahesis 2019-07-23 07:01:07 UTC
Output WITH patch:

 * Package 'app-misc/internal_collisions_test-0' has internal collisions
 * (between files in separate directories in the installation image
 * (${D}) corresponding to merged directories in the target filesystem
 * (${ROOT})).
 * 
 *      /usr/x1/\ufffd
 *              /usr/x1/\ufffd
 *              /usr/x2/\ufffd
 *              /usr/x3/\ufffd
 *              /usr/x4/\ufffd
 * 
 *      /usr/x1/a
 *              /usr/x1/a
 *              /usr/x2/a
 *              /usr/x3/a
 *              /usr/x4/a
 * 
 *      /usr/x1/b
 *              /usr/x1/b
 *              /usr/x2/b
 *              /usr/x3/b
 *              /usr/x4/b
 * 
 *      /usr/x1/c
 *              /usr/x1/c
 *              /usr/x2/c
 *              /usr/x3/c
 *              /usr/x4/c
 * 
 *      /usr/x1/d
 *              /usr/x1/d
 *              /usr/x2/d
 *              /usr/x3/d
 *              /usr/x4/d
 * 
 *      /usr/x1/g
 *              /usr/x1/g
 *              /usr/x2/g
 *              /usr/x3/g
 *              /usr/x4/g
 * 
 *      /usr/x1/h
 *              /usr/x1/h
 *              /usr/x2/h
 * 
 *      /usr/x1/i
 *              /usr/x1/i
 *              /usr/x2/i
 *              /usr/x3/i
 *              /usr/x4/i
 * 
 *      /usr/x1/アイウエオ
 *              /usr/x1/アイウエオ
 *              /usr/x2/アイウエオ
 * 
 * Package 'app-misc/internal_collisions_test-0' NOT merged due to
 * internal collisions. If necessary, refer to your elog messages for the
 * whole content of the above message.


$ ls -Fhl /var/tmp/portage/app-misc/internal_collisions_test-0/image/usr/x1
total 40K
-rw-r--r-- 1 root root 2 Jul 23 08:58 '\ufffd'
-rw-r--r-- 1 root root 2 Jul 23 08:58  a
-rw-r--r-- 1 root root 2 Jul 23 08:58  b
lrwxrwxrwx 1 root root 9 Jul 23 08:58  c -> /dev/zero
lrwxrwxrwx 1 root root 9 Jul 23 08:58  d -> /dev/zero
-rw-r--r-- 1 root root 2 Jul 23 08:58  e
-rw-r--r-- 1 root root 2 Jul 23 08:58  f
-rw-r--r-- 1 root root 2 Jul 23 08:58  g
-rw-r--r-- 1 root root 2 Jul 23 08:58  h
-rw-r--r-- 1 root root 2 Jul 23 08:58  i
-rw-r--r-- 1 root root 2 Jul 23 08:58  あいうえお
-rw-r--r-- 1 root root 2 Jul 23 08:58  アイウエオ
$ ls -Fhl /var/tmp/portage/app-misc/internal_collisions_test-0/image/usr/x2
total 20K
lrwxrwxrwx 1 root root  7 Jul 23 08:58 '\ufffd' -> '../x1/'$'\200'
-rw-r--r-- 1 root root  2 Jul 23 08:58  a
-rw-r--r-- 1 root root  2 Jul 23 08:58  b
lrwxrwxrwx 1 root root  9 Jul 23 08:58  c -> /dev/null
lrwxrwxrwx 1 root root  9 Jul 23 08:58  d -> /dev/zero
lrwxrwxrwx 1 root root 11 Jul 23 08:58  g -> /dev/random
-rw-r--r-- 1 root root  2 Jul 23 08:58  h
-rw-r--r-- 1 root root  2 Jul 23 08:58  i
-rw-r--r-- 1 root root  2 Jul 23 08:58  アイウエオ
$ ls -Fhl /var/tmp/portage/app-misc/internal_collisions_test-0/image/usr/x3
total 8.0K
lrwxrwxrwx 1 root root  9 Jul 23 08:58 '\ufffd' -> '/usr/x1/'$'\200'
-rw-r--r-- 1 root root  2 Jul 23 08:58  a
-rw-r--r-- 1 root root  2 Jul 23 08:58  b
lrwxrwxrwx 1 root root  9 Jul 23 08:58  c -> /dev/full
lrwxrwxrwx 1 root root  9 Jul 23 08:58  d -> /dev/zero
lrwxrwxrwx 1 root root 12 Jul 23 08:58  g -> /dev/urandom
lrwxrwxrwx 1 root root  8 Jul 23 08:58  i -> /dev/mem
$ ls -Fhl /var/tmp/portage/app-misc/internal_collisions_test-0/image/usr/x4
total 8.0K
lrwxrwxrwx 1 root root 16 Jul 23 08:58 '\ufffd' -> '/usr/bin/../x1/'$'\200'
-rw-r--r-- 1 root root  2 Jul 23 08:58  a
-rw-r--r-- 1 root root  2 Jul 23 08:58  b
lrwxrwxrwx 1 root root 11 Jul 23 08:58  c -> /dev/random
lrwxrwxrwx 1 root root  9 Jul 23 08:58  d -> /dev/zero
lrwxrwxrwx 1 root root  8 Jul 23 08:58  g -> /dev/tty
lrwxrwxrwx 1 root root  9 Jul 23 08:58  i -> /dev/port
$ 


Above output shows that all collisions for "e", "f" and "あいうえお", and half of collisions for "h" and "アイウエオ" have been automatically resolved.
(Automatic resolution involving non-UTF-8 filenames (e.g. "\x80" mangled to "\ufffd") does not work due to bug #690480.)
Comment 5 Arfrever Frehtes Taifersar Arahesis 2019-07-23 07:08:53 UTC
Created attachment 584152 [details, diff]
Patch

(Changed "." to ":" in error message)
Comment 6 Matthias Maier gentoo-dev 2019-07-23 09:54:33 UTC
I highly welcome this patch!

Handling internal collision (or more precisely guarding against internal collisions) as done by this patch is an essential feature that portage should be capable of doing (independent on anyones stance on "merged usr").
Comment 7 Zac Medico gentoo-dev 2019-08-01 20:55:54 UTC
(In reply to Arfrever Frehtes Taifersar Arahesis from comment #5)
> Created attachment 584152 [details, diff] [details, diff]
> Patch

It looks like if a symlink is added to real_relative_paths before its target, then the desired compare/delete will not occur (it only works if the target is added to real_relative_paths first).
Comment 8 Arfrever Frehtes Taifersar Arahesis 2019-08-01 22:40:17 UTC
(In reply to Zac Medico from comment #7)
> It looks like if a symlink is added to real_relative_paths before its
> target, then the desired compare/delete will not occur (it only works if the
> target is added to real_relative_paths first).

I also thought about this situation when writing this patch, and this situation actually never occurs, because the main loop in _collision_protect() firstly iterates on regular files before symlinks:

for i, (f, f_type) in enumerate(chain(
	((f, "reg") for f in file_list),
	((f, "sym") for f in symlink_list))):


file_list contains only regular files.
symlink_list contains only symlinks.
Comment 9 Zac Medico gentoo-dev 2019-08-01 22:53:29 UTC
(In reply to Arfrever Frehtes Taifersar Arahesis from comment #8)
> (In reply to Zac Medico from comment #7)
> > It looks like if a symlink is added to real_relative_paths before its
> > target, then the desired compare/delete will not occur (it only works if the
> > target is added to real_relative_paths first).
> 
> I also thought about this situation when writing this patch, and this
> situation actually never occurs, because the main loop in
> _collision_protect() firstly iterates on regular files before symlinks:
> 
> for i, (f, f_type) in enumerate(chain(
> 	((f, "reg") for f in file_list),
> 	((f, "sym") for f in symlink_list))):
> 
> 
> file_list contains only regular files.
> symlink_list contains only symlinks.

Ok, we can add a comment make the assumption more obvious.
Comment 10 Zac Medico gentoo-dev 2019-08-01 23:16:38 UTC
The collision detection is surely harmless, but the symlink collision resolution could be a violation of PMS. Maybe we should also resolve collisions between two files that are hardlinked to each other?

CCing pms@gentoo.org for the collision resolution logic.
Comment 11 Arfrever Frehtes Taifersar Arahesis 2019-08-02 01:11:44 UTC
(In reply to Zac Medico from comment #9)

I will add a comment at the top of the main loop.


(In reply to Zac Medico from comment #10)
> Maybe we should also resolve collisions between
> two files that are hardlinked to each other?

"Resolving" such collisions would probably mean not detecting them and doing nothing special.

Should both paths be registered in CONTENTS in VDB?

I suppose that it might be also appropriate to silently ignore in this way a collision between separate files (not hardlinked) in ${D} which match all of the following conditions:
- The same type
- The same owner, group, permissions
- The same extended attributes
- The same size
- The same contents
  (For symlinks: The same target)
Comment 12 Arfrever Frehtes Taifersar Arahesis 2019-08-07 17:15:01 UTC
Created attachment 586062 [details, diff]
Patch

Updated patch.
Collisions between identical files (including hardlinked files) are silently ignored.
Code for dropping of symbolic links colliding with their targets not included until potential further clarification.
Comment 13 Zac Medico gentoo-dev 2019-08-10 04:21:46 UTC
The latest patch does not attempt to perform collisions resolution, so it will not violate PMS.
Comment 14 Arfrever Frehtes Taifersar Arahesis 2019-08-10 12:24:47 UTC
Created attachment 586490 [details]
internal_collisions_test-0.ebuild

Updated ebuild for testing

Most internal collisions are not generated with USE="-non-ignorable_internal_collisions", which allows to see that internal collisions between identical files are ignored.
Comment 15 Zac Medico gentoo-dev 2019-08-10 20:52:43 UTC
(In reply to Arfrever Frehtes Taifersar Arahesis from comment #14)
> Created attachment 586490 [details]
> internal_collisions_test-0.ebuild

It looks like internal_collisions will only contain differences for adjacent files in the list? I guess that will suffice in practice, since it will show at least one set of differences.

For unicode safety, the compare_files function needs to use _unicode_encode for paths in lstat an open calls.

The time_type filtering is kind of ugly. Maybe remove the time comparisons from the compare_files function itself. Let's make compare_files a private function so that there's no guarantee of API stability.
Comment 16 Arfrever Frehtes Taifersar Arahesis 2019-08-11 00:59:18 UTC
Created attachment 586520 [details, diff]
Patch
Comment 17 Larry the Git Cow gentoo-dev 2019-08-11 19:12:55 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/proj/portage.git/commit/?id=0b7eda500a0dcb98a67f33bf9ef25b202b358986

commit 0b7eda500a0dcb98a67f33bf9ef25b202b358986
Author:     Arfrever Frehtes Taifersar Arahesis <Arfrever@Apache.Org>
AuthorDate: 2019-08-07 17:06:11 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2019-08-11 19:11:47 +0000

    dblink._collision_protect: Detect internal collisions.
    
    Implement detection of internal collisions (between files of the same package,
    located in separate directories in the installation image (${D}) corresponding
    to merged directories in the target filesystem (${ROOT})).
    
    This provides protection against overwriting some files when performing merging
    of files from ${D} to ${ROOT} in some filesystem layouts (such as /-merged layout
    or /usr-merged layout).
    
    Internal collisions between identical files are silently ignored.
    
    Bug: https://bugs.gentoo.org/690484
    Signed-off-by: Arfrever Frehtes Taifersar Arahesis <Arfrever@Apache.Org>
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 lib/portage/dbapi/vartree.py       |  55 ++++++++++++++++++--
 lib/portage/util/_compare_files.py | 103 +++++++++++++++++++++++++++++++++++++
 2 files changed, 154 insertions(+), 4 deletions(-)
Comment 18 Larry the Git Cow gentoo-dev 2019-08-19 05:06:33 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=08557524dc6c8eec3a366e43ab2587d2cdd8f133

commit 08557524dc6c8eec3a366e43ab2587d2cdd8f133
Author:     Zac Medico <zmedico@gentoo.org>
AuthorDate: 2019-08-19 04:24:07 +0000
Commit:     Zac Medico <zmedico@gentoo.org>
CommitDate: 2019-08-19 05:06:15 +0000

    sys-apps/portage: Bump to version 2.3.72
    
     #463952 glsa-check: install in /usr/bin
     #646090 preserve-libs: get dep graph from EROOT
     #690484 detect internal collisions for /usr merge
     #690786 repoman: support metadata/layout.conf restrict-allowed
     #691776 unpack: Unconditionally die if an unpacker returns an error
     #691638 Show get/setfattr stderr
     #692024 econf: Unconditionally die on error in EAPIs 0 to 3
     #692262 QA Notice: EXPORT_FUNCTIONS is called before inherit in
             kernel-2.eclass
     #692412 emerge IndexError for ambiguous package atom with pypy
    
    Bug: https://bugs.gentoo.org/691278
    Bug: https://bugs.gentoo.org/463952
    Bug: https://bugs.gentoo.org/646090
    Bug: https://bugs.gentoo.org/690484
    Bug: https://bugs.gentoo.org/690786
    Bug: https://bugs.gentoo.org/691776
    Bug: https://bugs.gentoo.org/691638
    Bug: https://bugs.gentoo.org/692024
    Bug: https://bugs.gentoo.org/692262
    Bug: https://bugs.gentoo.org/692412
    Package-Manager: Portage-2.3.71, Repoman-2.3.17
    Signed-off-by: Zac Medico <zmedico@gentoo.org>

 sys-apps/portage/Manifest              |   1 +
 sys-apps/portage/portage-2.3.72.ebuild | 264 +++++++++++++++++++++++++++++++++
 2 files changed, 265 insertions(+)
Comment 19 Zac Medico gentoo-dev 2019-11-03 22:39:57 UTC
*** Bug 628652 has been marked as a duplicate of this bug. ***