$ tar -tf awscli-1.25.56-1.gpkg.tar gpkg-1 app-admin/awscli-1.25.56/metadata.tar.bz2 app-admin/awscli-1.25.56/image.tar.bz2 Manifest Per the spec, it should actually be: awscli-1.25.56-1/gpkg-1 awscli-1.25.56-1/metadata.tar.bz2 awscli-1.25.56-1/image.tar.bz2 awscli-1.25.56-1/Manifest
It's actually designed to be like this. gpkg-1 is the first file without anything as the magic number, so the gpkg file should always begin with b"gpkg" bytes. Manifest is the package independent verification data, and is not bound to the package. It will be first checked before all other files, so it had to be in a constant place. Notice the package CPV could be unknown (like rebuild Index), so it cannot be put in somewhere like awscli-1.25.56-1/Manifest.
(In reply to Sheng Yu from comment #1) > It's actually designed to be like this. > > gpkg-1 is the first file without anything as the magic number, so the gpkg > file should always begin with b"gpkg" bytes. > > Manifest is the package independent verification data, and is not bound to > the package. It will be first checked before all other files, so it had to > be in a constant place. Notice the package CPV could be unknown (like > rebuild Index), so it cannot be put in somewhere like > awscli-1.25.56-1/Manifest. In that case, you have submitted the wrong text to the GLEP. Furthermore, your idea violates the original principles of the GPKG format (that everything ends up in a single predictable directory), so what you're suggesting wouldn't get my approval anyway.
(In reply to Michał Górny from comment #2) > (In reply to Sheng Yu from comment #1) > > It's actually designed to be like this. > > > > gpkg-1 is the first file without anything as the magic number, so the gpkg > > file should always begin with b"gpkg" bytes. > > > > Manifest is the package independent verification data, and is not bound to > > the package. It will be first checked before all other files, so it had to > > be in a constant place. Notice the package CPV could be unknown (like > > rebuild Index), so it cannot be put in somewhere like > > awscli-1.25.56-1/Manifest. > > In that case, you have submitted the wrong text to the GLEP. Furthermore, > your idea violates the original principles of the GPKG format (that > everything ends up in a single predictable directory), so what you're > suggesting wouldn't get my approval anyway. The request of identifier and magic number ``gpkg-1`` exists before I modified draft. So it not possible in a single directory for the beginning. Someone (not remember who) explicitly tells me that gpkg-1 should be the first file, so it can be detected with tools like `file`. https://gitweb.gentoo.org/data/glep.git/tree/glep-0078.rst?id=e22589cdba6b1f4f8b0eec072e2661b329520e8a For the basename, I was thought it was refer to the CPV. However, if you want to it match the package file name, that is problematic. For example: 1. Portage will first try create awscli-1.25.56.gpkg.tar.XXXXXX (or xpak), no build_id exists. So you will get awscli-1.25.56/gpkg-1 awscli-1.25.56/metadata.tar.bz2 awscli-1.25.56/image.tar.bz2 awscli-1.25.56/Manifest 2. This file will be renamed to awscli-1.25.56-1.gpkg.tar, but still no build_id in metadata. 3. Then update metadata will be called and add build_id to it. To match the filename, it has to rename all related old names to new names, as well as update the Manifest. 4. Update the Index file. And update metadata can be called multiple times for different reasons. Each time it has to check if needs to rename something which may or may not has build_id or the package is renamed to something else. The other problem is filename can be duplicate in different category, so without CPV, you can overwrite others when manually extract. The most obviously is acct-user / acct-group which often has the same version. There is also kind of security risk to put key files under variable paths.
(In reply to Sheng Yu from comment #3) > The request of identifier and magic number ``gpkg-1`` exists before I > modified draft. So it not possible in a single directory for the beginning. > Someone (not remember who) explicitly tells me that gpkg-1 should be the > first file, so it can be detected with tools like `file`. > > https://gitweb.gentoo.org/data/glep.git/tree/glep-0078. > rst?id=e22589cdba6b1f4f8b0eec072e2661b329520e8a The directory is not a problem for that, at least as long as it doesn't exceed USTAR's name field. I don't think we really need to account for the latter. It's just a matter of matching `.*/gpkg-1` regex in the first 100 bytes. > For the basename, I was thought it was refer to the CPV. However, if you > want to it match the package file name, that is problematic. For example: > > 1. Portage will first try create awscli-1.25.56.gpkg.tar.XXXXXX (or xpak), > no build_id exists. > So you will get > > awscli-1.25.56/gpkg-1 > awscli-1.25.56/metadata.tar.bz2 > awscli-1.25.56/image.tar.bz2 > awscli-1.25.56/Manifest > > 2. This file will be renamed to awscli-1.25.56-1.gpkg.tar, but still no > build_id in metadata. > 3. Then update metadata will be called and add build_id to it. To match the > filename, it has to rename all related old names to new names, as well as > update the Manifest. > 4. Update the Index file. > > And update metadata can be called multiple times for different reasons. Each > time it has to check if needs to rename something which may or may not has > build_id or the package is renamed to something else. Sounds like we need to change the generation process then. I don't really see a reason why Portage couldn't decide on build-id earlier and reserve the filename. > The other problem is filename can be duplicate in different category, so > without CPV, you can overwrite others when manually extract. > The most obviously is acct-user / acct-group which often has the same > version. That's not really a problem. > There is also kind of security risk to put key files under variable paths. Could you explain?
(In reply to Michał Górny from comment #4) > The directory is not a problem for that, at least as long as it doesn't > exceed USTAR's name field. I don't think we really need to account for the > latter. It's just a matter of matching `.*/gpkg-1` regex in the first 100 > bytes. What is the problem with actually defining gpkg-1 to be the first entry, named like that? It means it would always succeed, and make it much easier to identify a file as gpkg format. > > For the basename, I was thought it was refer to the CPV. However, if you > > want to it match the package file name, that is problematic. For example: `The archive contains a number of files, stored in a single directory whose name should match the basename of the package file.` Then the "package file", seems to be defined in `The gpkg package container is an uncompressed .tar achive whose filename should use ``.gpkg.tar`` suffix.` So, basename should be something like foo-1.3-1.gpkg.tar? Feels like an amendment to the GLEP is needed to specify which suffixes are to be removed.
(In reply to Fabian Groffen from comment #5) > (In reply to Michał Górny from comment #4) > > The directory is not a problem for that, at least as long as it doesn't > > exceed USTAR's name field. I don't think we really need to account for the > > latter. It's just a matter of matching `.*/gpkg-1` regex in the first 100 > > bytes. > > What is the problem with actually defining gpkg-1 to be the first entry, > named like that? It means it would always succeed, and make it much easier > to identify a file as gpkg format. The problem is that extracting such archive will create a top level 'gpkg-1' file which is against the principle of least surprise. > > > For the basename, I was thought it was refer to the CPV. However, if you > > > want to it match the package file name, that is problematic. For example: > > `The archive contains a number of files, stored in a single directory > whose name should match the basename of the package file.` > > Then the "package file", seems to be defined in > > `The gpkg package container is an uncompressed .tar achive whose filename > should use ``.gpkg.tar`` suffix.` > > So, basename should be something like foo-1.3-1.gpkg.tar? Feels like an > amendment to the GLEP is needed to specify which suffixes are to be removed. The intent was to strip both suffixes.
(In reply to Michał Górny from comment #4) > (In reply to Sheng Yu from comment #3) > > The request of identifier and magic number ``gpkg-1`` exists before I > > modified draft. So it not possible in a single directory for the beginning. > > Someone (not remember who) explicitly tells me that gpkg-1 should be the > > first file, so it can be detected with tools like `file`. > > > > https://gitweb.gentoo.org/data/glep.git/tree/glep-0078. > > rst?id=e22589cdba6b1f4f8b0eec072e2661b329520e8a > > The directory is not a problem for that, at least as long as it doesn't > exceed USTAR's name field. I don't think we really need to account for the > latter. It's just a matter of matching `.*/gpkg-1` regex in the first 100 > bytes. > > > For the basename, I was thought it was refer to the CPV. However, if you > > want to it match the package file name, that is problematic. For example: > > > > 1. Portage will first try create awscli-1.25.56.gpkg.tar.XXXXXX (or xpak), > > no build_id exists. > > So you will get > > > > awscli-1.25.56/gpkg-1 > > awscli-1.25.56/metadata.tar.bz2 > > awscli-1.25.56/image.tar.bz2 > > awscli-1.25.56/Manifest > > > > 2. This file will be renamed to awscli-1.25.56-1.gpkg.tar, but still no > > build_id in metadata. > > 3. Then update metadata will be called and add build_id to it. To match the > > filename, it has to rename all related old names to new names, as well as > > update the Manifest. > > 4. Update the Index file. > > > > And update metadata can be called multiple times for different reasons. Each > > time it has to check if needs to rename something which may or may not has > > build_id or the package is renamed to something else. > > Sounds like we need to change the generation process then. I don't really > see a reason why Portage couldn't decide on build-id earlier and reserve the > filename. Well, that is doable, but need to change a lot of logic. > > The other problem is filename can be duplicate in different category, so > > without CPV, you can overwrite others when manually extract. > > The most obviously is acct-user / acct-group which often has the same > > version. > > That's not really a problem. > > > There is also kind of security risk to put key files under variable paths. > > Could you explain? By putting Manifest in the inner directory, it is kind of giving up outer level integrity. Or more intuitively, that directory name doesn't matter at all. I can check that there can only be one directory and no other files are allowed. But there is no way to check what this gpkg file was like when it was created. So you can change awscli-1.25.56-1/ to AAAAA/ and it will pass the check. If you think this is okay, I can move them all in.
(In reply to Sheng Yu from comment #7) > (In reply to Michał Górny from comment #4) > > Sounds like we need to change the generation process then. I don't really > > see a reason why Portage couldn't decide on build-id earlier and reserve the > > filename. > > Well, that is doable, but need to change a lot of logic. Well, technically I don't think there's any real requirement that the build-id is sequential. Unless I'm mistaken, the only problem we care about is being able to have unique suffix for every binpkg generated. How about using some hash or timestamp, or even random suffix instead? At this point, I don't think you even really need to reserve it, as long as it's "random enough", i.e. unlikely to collide with prior binpackages. At this point, the highest build-id I have is -54, and that's for a package I've rebuilt (for testing) a lot. Normally build-ids are unlikely to exceed 10, except for live packages -- but we don't care about multi-instance there a lot. > By putting Manifest in the inner directory, it is kind of giving up outer > level integrity. Or more intuitively, that directory name doesn't matter at > all. I can check that there can only be one directory and no other files are > allowed. But there is no way to check what this gpkg file was like when it > was created. So you can change awscli-1.25.56-1/ to AAAAA/ and it will pass > the check. If you think this is okay, I can move them all in. Yes, that is fine. Per the GLEP, we don't really care how the directory is named, and the PM should be able to handle any directory name, valid or not. We only want to generate the correct top directory for user convenience but we must be handle to handle "malformed" packages.
Fix pending review / merge. https://github.com/gentoo/portage/pull/893
commit fa901a6510c4a5f72dec6aad86db6fe7efd6e7b3 (HEAD -> master, origin/master, origin/HEAD) Author: Sheng Yu <syu.os@protonmail.com> Date: Tue Sep 6 14:38:27 2022 -0400 Add BUILD_ID to metadata during binary packaging Also create placeholder for new multi-instance package before actually packaging to avoid race. Signed-off-by: Sheng Yu <syu.os@protonmail.com> Closes: https://github.com/gentoo/portage/pull/893 Signed-off-by: Michał Górny <mgorny@gentoo.org> commit 1d94992a2df2b5cc963c26c7978a899dc642deb1 Author: Sheng Yu <syu.os@protonmail.com> Date: Thu Sep 1 10:44:55 2022 -0400 Move all files into basename/DATA structure Signed-off-by: Sheng Yu <syu.os@protonmail.com> Signed-off-by: Michał Górny <mgorny@gentoo.org> commit 020feffb39cd6c4c68a4a81be6ccd2426b062e8d Author: Sheng Yu <syu.os@protonmail.com> Date: Fri Aug 19 16:40:59 2022 -0400 Add gpkg-sign tool to sign exist GPKG files. Signed-off-by: Sheng Yu <syu.os@protonmail.com> Signed-off-by: Michał Górny <mgorny@gentoo.org> commit 52411290c67535d94c7b20fa996ae7108014adfb Author: Sheng Yu <syu.os@protonmail.com> Date: Fri Aug 19 16:24:59 2022 -0400 Detect binary package format if not in database Signed-off-by: Sheng Yu <syu.os@protonmail.com> Signed-off-by: Michał Górny <mgorny@gentoo.org>
The bug has been closed via the following commit(s): https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=06bf8d495327ebb5718821a27622f1b7eb944943 commit 06bf8d495327ebb5718821a27622f1b7eb944943 Author: Sam James <sam@gentoo.org> AuthorDate: 2022-09-09 10:55:57 +0000 Commit: Sam James <sam@gentoo.org> CommitDate: 2022-09-09 10:55:57 +0000 sys-apps/portage: add 3.0.36 Closes: https://bugs.gentoo.org/865905 Closes: https://bugs.gentoo.org/864160 Closes: https://bugs.gentoo.org/866087 Signed-off-by: Sam James <sam@gentoo.org> sys-apps/portage/Manifest | 1 + sys-apps/portage/portage-3.0.36.ebuild | 273 +++++++++++++++++++++++++++++++++ 2 files changed, 274 insertions(+)