Bug#: 150031
Status: NEW
Assigned To: Portage team <dev-portage@gentoo.org>
Reporter: Arne Babenhauserheide <arne_bab@web.de>

Description:   Opened: 2006-10-04 01:40 0000
It might be useful if the useflags which are set for a package were appended to
the package name, so one could easily keep and test packages with different
useflag combinations. 

This would also make it possible to create repositories of binary packages, for
example for managing a set of office PCs which mostly have the same
configuration but differ in a couple of useflags (for example you could have a
"scan" PC and a "printer" PC, and both could get their packages from the same
repository). 

Architectural and compiler specific differences can easily be taken into
account by using a directory-structure.

------- Comment #1 From Marius Mauch (RETIRED) 2006-10-08 07:18:09 0000 -------
While I agree that we need more info about a binpackage in its name, simply
appending active use flags isn't the way to go.

------- Comment #2 From Arne Babenhauserheide 2006-10-10 06:58:50 0000 -------
What would be another way? 

------- Comment #3 From Jakub Moc (RETIRED) 2008-02-17 21:04:07 0000 -------
How about WONTFIX pending a more sane suggestion?

------- Comment #4 From Arne Babenhauserheide 2008-02-25 08:08:00 0000 -------
I just changed the summary, so that this bug now simply reads 

"Include more info about a binpkg". 

How this should be achieved doesn't seem to be clear yet, but as I understand
it, the three who answered in here agree that binpkgs should
somehow contain more information. 

------- Comment #5 From Arne Babenhauserheide 2008-02-25 08:17:06 0000 -------
One more idea: 

Store binpkgs in a directory which contains a file with information about their
associated useflags. 

Binpkgs which fit more than one directory, because their specific useflags
didn't change, could just be created in the first one and then hardlinked into
the later created directories. 

The directory name could just be the date when the useflags were last changed
(and binpkgs created). 

This would mean that portage would have to check on binpkg creation whether the
file with information in the most recent binpkg dir contains the current
useflags. 

If it does, it could just use the directory. 

If it doesn't, it would have to check inside the file for the useflags of each
package. If they aren't affected, it could simply hardlink the package into the
new directory. 
If they are affected, it simply wouldn't hardlink them in. 

That way there would always be a directory with current binpkgs which would
still look quite clean. 
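
The hardlinking step described above could be sketched roughly as follows. This is only an illustration: the function name `carry_over` is made up, and the two dictionaries stand in for the per-directory information file, whose format has not been defined in this bug.

```python
import os


def carry_over(old_dir, new_dir, old_use, new_use):
    """Hardlink every binpkg whose useflags did not change into new_dir.

    old_use / new_use map a binpkg file name to its useflag string; they are
    a hypothetical stand-in for the per-directory information file.
    Returns the list of file names that were linked.
    """
    os.makedirs(new_dir, exist_ok=True)
    linked = []
    for name, flags in sorted(old_use.items()):
        if new_use.get(name) != flags:
            continue  # affected by the useflag change: not linked, needs a rebuild
        target = os.path.join(new_dir, name)
        if not os.path.exists(target):
            os.link(os.path.join(old_dir, name), target)
        linked.append(name)
    return linked
```

Hardlinks only work within one filesystem, so both directories would have to live below the same PKGDIR mount.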

The problem I see with this approach is, though, that creating binpkgs for
several sets of different machines wouldn't be solved by this. 


One alternative I see would be, to take a hash of the useflags as directory
name and to set a symlink on the most recent directory for easy reference. 

What do you think? 

Is bugzilla the right place to discuss this? 

------- Comment #6 From Arne Babenhauserheide 2009-01-01 19:57:28 0000 -------
How about saving useflags in a directory structure like the following: 

machine_type/
        active_package_useflags/
                packages with same active useflags

        other_active_package_useflags/
                packages with these active useflags. 

If the length of the list of active useflags exceeds a certain threshold (e.g.
32 chars, the length of a sha1 hash in base32 encoding), it could be replaced
by a sha1 hash over the full list. 

To check if we have a binpackage for a certain package, we only need to get the
active useflags for the package (for example sorted alphabetically) and join
them to a string. If the string is longer than 32 chars, we then compute the
sha1 of the string in base32. 

Then we just check if the dir 
machine_type/use_string/
contains a binpackage we need. 

The full path to a binpackage could then look like this: 
/var/packages/x86_64-pc-linux-gnu/2BFXV4EC5B25XH6B7X7OXJ34GOCM2KM2/dev-lang/python-2.5.2-r8.tbz2

(2BFXV4EC5B25XH6B7X7OXJ34GOCM2KM2 is the uppercase base32 sha1 hash of "berkdb doc
examples expat gdbm ipv6 ncurses readline sqlite ssl threads tk", and I used
the CHOST for the machine type)
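
A minimal sketch of the lookup described above, in Python since that is what Portage is written in. The function names, the space separator and the UTF-8 encoding are assumptions made for illustration; nothing here is existing Portage code.

```python
import base64
import hashlib

HASH_LEN = 32  # length of a SHA-1 digest in Base32 (160 bits / 5 bits per char)


def use_string(active_flags, sep=" "):
    """Return the sorted flags joined by sep, or their Base32 SHA-1 if too long."""
    joined = sep.join(sorted(active_flags))
    if len(joined) <= HASH_LEN:
        return joined
    digest = hashlib.sha1(joined.encode("utf-8")).digest()
    return base64.b32encode(digest).decode("ascii")


def binpkg_path(pkgdir, machine_type, active_flags, category, filename):
    """Build pkgdir/machine_type/use_string/category/filename."""
    return "/".join([pkgdir, machine_type, use_string(active_flags), category, filename])
```

A package without any active useflag would give an empty directory name here, so a real implementation would need some placeholder for that case.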

Should the machine type also include the gcc version, the glib version or
similar? 

------- Comment #7 From Zac Medico 2009-01-01 22:02:11 0000 -------
The layout should be more like
$PKGDIR/$CATEGORY/$PN/$SLOTS_HASH/$USE_HASH/python-2.5.2-r8.tbz2 since we
should support an arbitrary number of "slots". Slottable variables include
things like CHOST, multilib ABI, python version, perl version, and any other
$LANGUAGE for which slotting makes sense. In addition to slotables, ideally the
new layout should account for subpackages as well. Subpackages will allow you
to have a binary package that's split into an arbitrary number of subpackages
that separate the package into parts that can be installed separately.
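
As a rough sketch of this layout, assuming both hashes are Base32 SHA-1 digests as proposed in comment #6 and that the slot variables are serialized as sorted `KEY=value` lines (the serialization and all function names are assumptions, not an agreed format):

```python
import base64
import hashlib
import posixpath


def b32_sha1(text):
    """Base32-encoded SHA-1 of a string (always 32 characters)."""
    return base64.b32encode(hashlib.sha1(text.encode("utf-8")).digest()).decode("ascii")


def slots_hash(slots):
    """Hash an arbitrary dict of slot variables, independent of insertion order."""
    canonical = "\n".join(f"{key}={slots[key]}" for key in sorted(slots))
    return b32_sha1(canonical)


def layout_path(pkgdir, category, pn, slots, use_flags, filename):
    """$PKGDIR/$CATEGORY/$PN/$SLOTS_HASH/$USE_HASH/filename"""
    use_hash = b32_sha1(" ".join(sorted(use_flags)))
    return posixpath.join(pkgdir, category, pn, slots_hash(slots), use_hash, filename)
```

Because the slot variables go through one hash, adding another variable later changes the hash value but not the directory depth.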

------- Comment #8 From Arne Babenhauserheide 2009-01-01 22:56:47 0000 -------
Do you have an idea how the binpackage can support subpackages? 

Could that be done via symlinks and install-information? 
Or just install information with the info which files from which binpackage are
needed? 

Do binpackages have to be safe or can they be identified via an ID file which
contains binpackage and script hashes (or something different) for safe
installation? 

But why do we need subpackage support? 

Every binpackage fits an ebuild, and ebuilds can be as small as needed. So
subpackage support can simply be done via meta-ebuilds or similar. 

We don't need packages which are smaller than ebuilds, I think. 

------- Comment #9 From Arne Babenhauserheide 2009-01-28 10:45:34 0000 -------
From a discussion in IRC (#gentoo-kde) we found one advantage of putting the
hashes into the filenames: 

$PKGDIR/$CATEGORY/$PN/python-2.5.2-r8-$SLOTS_HASH-$USE_HASH.tbz2

Advantages: 
* The files can just be shared without having to preserve the directory
structure. 

Also metadata could be added to the tail of the tar archives, so all data is
preserved even when the filenames get lost. 

------- Comment #10 From Arne Babenhauserheide 2009-01-28 11:46:08 0000 -------
We just found out that xpak already stores metadata in the binpackage, so we'd
only need to add the filename or path changes and Gentoo could use a
transparent binary layer :) 

(sounds a bit too hypeable, but I currently find no better word for that
feature ;) )

------- Comment #11 From Arne Babenhauserheide 2009-01-28 12:10:07 0000 -------
I just learned why subpackages would be nice: separate packages into sets for
different USE flags. 

But to me it looks like that would be a lot harder than just the simple changes
needed to allow for separate binpackages with different useflags and other
SLOTs. 

------- Comment #12 From Fpemud 2009-01-30 06:51:07 0000 -------
I'd like this problem to be solved too. 
I don't like to compile one package twice, it is time consuming.


As far as I know, a binpkg depends not only on USE flags, but also on CFLAGS,
CXXFLAGS, LDFLAGS etc.

I think the structure of /var/db/pkg is good for this.
/var/db/pkg is the database to track installed packages.

/var/db/pkg's structure:
  /var/db/pkg
    |----dev-util
          |----cvs-1.12.12-r4
                |----CBUILD
                |----CFLAGS
                |----CHOST
                |----CONTENTS
                |----COUNTER
                |----CXXFLAGS
                |----DEPEND
                |----EAPI
                |----FEATURES
                |----IUSE
                |----KEYWORDS
                |----LDFLAGS
                |----cvs-1.12.12-r4.ebuild
                ...

I think the binpkg repository's structure would be like:
  binpkg_repository
   |--dev-util
      |--cvs-1.12.12-r4
         |--binpkg1
            |--CBUILD                   // i686-pc-linux-gnu
            |--CHOST                    // i686-pc-linux-gnu
            |--CFLAGS                   // -O2 -march=i686 -pipe
            |--CXXFLAGS                 // 
            |--KEYWORDS                 // alpha ~amd64 ...
            |--IUSE                     // crypt doc emacs ...
            |--LDFLAGS                  // -Wl,-O1
            |--cvs-1.12.12-r4.tbz2      // binpkg file
         |--binpkg2
            |--CBUILD                   // i386-pc-linux-gnu
            |--CHOST                    // i386-pc-linux-gnu
            |--CFLAGS                   // -O2 -march=i386 -pipe
            |--CXXFLAGS                 // 
            |--KEYWORDS                 // alpha ~amd64 ...
            |--IUSE                     // crypt  ...
            |--LDFLAGS                  // -Wl,-O2
            |--cvs-1.12.12-r4.tbz2      // binpkg file

Notes:
1. The names "binpkg1" and "binpkg2" are to be discussed; I don't know yet what
they should be.
2. I don't know which of CHOST and CBUILD represents the destination
architecture. The destination arch should be stored in the binpkg repository;
we don't need the source arch.

------- Comment #13 From Fpemud 2009-01-30 07:01:25 0000 -------
When emerging a package, portage first checks whether the current settings are
the same as the settings in the "binpkg1" or "binpkg2" directory. 
If they are, portage uses the tbz2 file directly; otherwise portage compiles the
package, creates a new "binpkg3" dir, and puts the newly generated tbz2 file and
the current settings in it.
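
A small sketch of that lookup, assuming one plain-text file per setting as in the tree from comment #12. The function names and the `binpkgN` numbering are placeholders, since the directory names are still open.

```python
import os

# Settings compared per candidate directory; the list follows comment #12.
SETTING_FILES = ("CBUILD", "CHOST", "CFLAGS", "CXXFLAGS", "LDFLAGS", "IUSE")


def read_settings(directory):
    """Read the settings files that exist in one candidate directory."""
    settings = {}
    for name in SETTING_FILES:
        path = os.path.join(directory, name)
        if os.path.exists(path):
            with open(path) as handle:
                settings[name] = handle.read().strip()
    return settings


def find_or_create(package_dir, current):
    """Return (directory, reused) for the given settings dict."""
    wanted = {name: current[name].strip() for name in SETTING_FILES if name in current}
    os.makedirs(package_dir, exist_ok=True)
    existing = sorted(
        entry for entry in os.listdir(package_dir)
        if os.path.isdir(os.path.join(package_dir, entry))
    )
    for entry in existing:
        candidate = os.path.join(package_dir, entry)
        if read_settings(candidate) == wanted:
            return candidate, True  # the tbz2 in here can be used directly
    # No match: create the next directory and record the current settings.
    new_dir = os.path.join(package_dir, f"binpkg{len(existing) + 1}")
    os.makedirs(new_dir)
    for name, value in wanted.items():
        with open(os.path.join(new_dir, name), "w") as handle:
            handle.write(value + "\n")
    return new_dir, False
```

When `reused` is false, the caller would still have to compile the package and place the resulting tbz2 file in the returned directory.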

------- Comment #14 From Arne Babenhauserheide 2009-01-30 08:06:09 0000 -------
I think that binpackages should be easily shareable, and that wouldn't be the
case with all the single files floating around. 

But the files in /var/db/pkg show all the necessary information which has to be
included (except information which is in the ebuild. We emerge stuff via
ebuilds, so any information which is already in the ebuild can be left out). 

For this information we need to be able to do the following two separable
actions: 
- Check the environment for one given binpackage 
- Find a binpackage which fits our environment. 

The first can already be done via the xpak tail of the binpackage tar archives,
so there's no need to change anything for that. 

In the second one we don't need to be able to read the information. We just
need to be able to check whether it fits our system. To achieve that, we can just
store a hash of the environment in the filename or path of each binpackage. 

To optimize this, I would separate it into two parts: one which doesn't change
very often and can be used to identify one type of system (e.g. amd64 with
standard optimizations) and one which varies from user to user (e.g. USE
flags). (This very clean idea comes from Zac.)

The first part gives the "SLOT", the second part the active USE flags of the
package. 

To make it more human readable, I'd first turn the part (i.e. "SLOT" or USE
flags) into a string and only hash it if that string is longer than the hash
would be. 

As Hash I'd use sha1 encoded as Base32, uppercase, since sha1 is quite safe
against collisions and Base32 doesn't contain any characters which have special
meanings in package names. 

------- Comment #15 From Philipp Riegger 2009-03-24 22:42:07 0000 -------
I would prefer to have the USE-flags on a per-package basis, because they
really depend on a package. Therefore some hierarchy like
.../category/package/use-flags/package-version.tar.bz2 or
.../category/package/package-version.use-flags.tar.bz2 would be nice. I would
prefer the last one.

The USE-flags could be managed in the following way: Require a revision bump if
a package changes IUSE (might be hard if an eclass changes something, but
definitely doable). Sort the USE-flags and create a binary string with a 1 for
every enabled flag and a 0 for every non-enabled one. Well, this is not a
string, it's a binary number with possibly leading zeroes. Compress it into
hex or base64 or whatever and use it to indicate the USE-flags. For most
packages the number of USE-flags is small; for some with lots of USE-flags the
filename length would be increased by 1 for every 6 USE-flags. Should be ok, I
think.
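
The packing could look roughly like this. The 64-character alphabet below is just one possible choice and an assumption of this sketch (its "-" and "_" might clash with the package naming scheme); the two conversion functions are hypothetical names.

```python
# One character encodes six flags; the alphabet is an illustrative choice.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"


def pack_use(iuse, enabled):
    """Encode the enabled flags as one character per six flags of sorted IUSE."""
    flags = sorted(iuse)
    bits = [1 if flag in enabled else 0 for flag in flags]
    bits += [0] * (-len(bits) % 6)  # pad the last group with zeroes
    chars = []
    for start in range(0, len(bits), 6):
        value = 0
        for bit in bits[start:start + 6]:
            value = (value << 1) | bit
        chars.append(ALPHABET[value])
    return "".join(chars)


def unpack_use(iuse, packed):
    """Inverse of pack_use: return the set of enabled flags."""
    flags = sorted(iuse)
    bits = []
    for char in packed:
        value = ALPHABET.index(char)
        bits.extend((value >> shift) & 1 for shift in range(5, -1, -1))
    return {flag for flag, bit in zip(flags, bits) if bit}
```

Decoding needs the same IUSE list that was used for encoding, which is why the revision bump on IUSE changes matters.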

Another thing that might be needed is a special binary revision. If you have
package foo, which depends on lib bar, and a new version of the lib hits the
tree with a different ABI (or anything else that requires foo to be recompiled
but not changed), it would be nice to indicate that foo should be reinstalled by
increasing the binary revision.

------- Comment #16 From Arne Babenhauserheide 2009-03-25 08:42:20 0000 -------
The problem I see with a bitwise USE string is that it will never be human
readable. 

The binary revision is something else, because this kind of dependency doesn't
get hand-edited by the user, so it doesn't benefit from being user-readable. 


I think when including this we'd have three separate elements: 
* Local system settings: Active USE-flags. 
* Binary compatibility requirements: The necessary libs and dependencies. 
* Host SLOT: Hardware requirements. 

A package with wrong USE flags can be installed anyway, a package with wrong
binary requirements just won't work, though, so these should be kept separate. 

The Host SLOT signifies a category which fits for all packages created by a
certain computer and shouldn't change as long as no libraries change in a way
which affects the whole system (like the glibc major version), the user doesn't
play with CFLAGS, and similar (like installing a new CPU :) ). 


Does this separation sound useful, or did I overlook/misunderstand something? 


Besides: From what I see in a random binpackage (fretsonfire), the IUSE part of
the xpak looks like the right part for the USE flag info of the name, so we
could just use that directly without having to do much error-prone conversion. 

Remember that the binpackage name just needs to have an ID with which it can be
found; it doesn't need to include anything which can be found in the ebuild,
since anyone installing a binpackage should also have the corresponding ebuild. 

The requirement for each ID element of the name is, that it can be generated
directly by any Gentoo installation which only knows the ebuild and the already
installed libraries. 


Question: What do we do if we don't already have one of the binary
requirements? How can we then find the correct binpackage? 

Does the info contained in a binpackage suffice? What I see (in
openarena-0.8.1.tbz via vim) is: 


CXX: x86_64-pc-linux-gnu-g++

NEEDED: /usr/games/bin/openarena-ded libdl.so.2,libm.so.6,libc.so.6
/usr/games/bin/openarena
libSDL-1.2.so.0,libpthread.so.0,libGL.so.1,libvorbisfile.so.3,libvorbis.so.0,libogg.so.0,libdl.so.2,libm.so.6,libc.so.6

CFLAGS: -O2 -pipe -march=k8

NEEDED.ELF.2:
X86_64;/usr/games/bin/openarena-ded;;;libdl.so.2,libm.so.6,libc.so.6
X86_64;/usr/games/bin/openarena;;;libSDL-1.2.so.0,libpthread.so.0,libGL.so.1,libvorbisfile.so.3,libvorbis.so.0,libogg.so.0,libdl.so.2,libm.so.6,libc.so.6
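
To get from such a NEEDED.ELF.2 entry to a list of required libraries, something like the following sketch might be enough. It assumes, going only by the lines quoted above, that each line has five semicolon-separated fields with the architecture first, the object path second and the comma-separated library list last; the function name is made up.

```python
def parse_needed_elf2(text):
    """Map each object path to (arch, list of needed library names)."""
    result = {}
    for line in text.splitlines():
        fields = line.strip().split(";")
        if len(fields) < 5:
            continue  # header line such as "NEEDED.ELF.2:" or malformed entry
        arch, path, needed = fields[0], fields[1], fields[-1]
        result[path] = (arch, [lib for lib in needed.split(",") if lib])
    return result
```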


Also there seems to be some binary "DEPEND". I don't know what it contains,
though, so I can't judge if it would be needed for binary compatibility. 

Can we get these values from our own system (fast enough -> without building
the package ourselves), so we can just include a hash with which we can find a
fitting binpackage and the right libs for it? 

And if yes: Which of these values have to be included as hash in the package
name for binary compatibility, and which are systemwide, so they can be
included in the Host SLOT? 

------- Comment #17 From Philipp Riegger 2009-03-25 12:53:05 0000 -------
I'm not sure what you mean by "user readable" if you consider a hash over a
long string more readable than a packed representation of that string. The
latter could be converted to the user readable format, the former could not.

Regarding CFLAGS: Maybe it would be wise to leave out those CFLAGS that
don't do anything (at least concerning the binary). Something like
"-march=prescott -mmmx" and "-march=prescott" are the same, and "-pipe" only
changes the behaviour of gcc, not the created binary.
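
A tiny sketch of such a filter, assuming a hand-maintained set of flags that are known not to change the binary. Only "-pipe" is listed because it is the only one named here; detecting that "-mmmx" is implied by "-march=prescott" would need knowledge about the compiler and is not attempted.

```python
# Flags assumed not to influence the generated code (illustrative list).
NO_EFFECT = {"-pipe"}


def normalize_cflags(cflags):
    """Drop flags without effect on the binary, keeping the original order."""
    kept = [flag for flag in cflags.split() if flag not in NO_EFFECT]
    return " ".join(kept)
```

The order is kept on purpose, since a later flag can override an earlier one.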

I usually try to make gentoo behave like other binary distros when it comes to
binary packages. And they only have the version and this "binary revision" or
whatever and maybe the right dependencies. But to make it usable, we don't even
need dependencies and which LIBS are needed and all that stuff. Portage checks
that USE-flags and all that stuff matches and then the package with the biggest
binary revision is installed. That works without lots of magic and foo, and
before we discuss here more and more, we should maybe implement a simple
working system instead of trying to create a complicated, error-prone one.
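
The simple selection described above fits in a few lines. The `BinPkg` record and its `binrev` field are hypothetical, since no metadata format for the binary revision has been defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinPkg:
    version: str
    use: frozenset
    binrev: int  # hypothetical "binary revision" counter


def pick(candidates, version, wanted_use):
    """Return the matching package with the biggest binary revision, or None."""
    matching = [
        pkg for pkg in candidates
        if pkg.version == version and pkg.use == frozenset(wanted_use)
    ]
    return max(matching, key=lambda pkg: pkg.binrev, default=None)
```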

------- Comment #18 From Arne Babenhauserheide 2009-03-25 13:15:20 0000 -------
By user-readable I mean the default string representation of the active
useflags, which is only hashed if it gets longer than the hash (that's what
I proposed above). 

That way a package with three active USE flags just has these USE flags in the
package name, while a package with many active USE flags (like mplayer) has a
simple hash. 

The advantage here is that only the active USE flags are tracked - for most
packages simply no USE flag, for most others only one or two. 

For the Host SLOT I honestly don't care much, but I think that a hash over the
string will be far easier to write and especially to maintain. Also a hash has
a guaranteed length, which isn't true for a bitmask (though the difference in
length might make that difference fall away). 

I think that CFLAGS should be in, though, since some users have kinda crazy
settings in there which only work for their specific setup. 

The problem with binary dependencies is, that the ebuild might state that a lib
is compatible, since it is recognized in the configure process, but it can well
happen that you have to revdep-rebuild to make your programs work again after a
library update. 

If you do a binary installation, you can't just rebuild with the new lib, and
the two binary packages (one for old lib one for new lib) must not be mixed -
so they should have different names and the names should make it possible for
portage to decide which binary package to install. 

Gentoo isn't just a binary distro with a fixed set of packages which all use
the same libs. Different from Ubuntu and similar the basic libraries can get
updated while all the rest stays version-fixed. 

------- Comment #19 From Philipp Riegger 2009-03-25 13:34:16 0000 -------
Your hash-system has 2 disadvantages: Binary packages of the same source
package are scattered all over the tree (directory tree). Also, if you have the
hash, the name is unreadable for the user. For my system, you write 2
simple conversion functions and can have command line tools, web interfaces,
everything you need. It's all the same, no special cases. Also, I don't see any
advantage in only tracking active USE-flags. It makes a packed representation
harder.

CFLAGS are a story of their own, that's for sure. Maybe one should create an
assembler file with -fverbose-asm and record those CFLAGS and not the ones from
the make.conf.

Last about the library stuff. If you update a library and therefore need to
update an application, we proposed 2 ways: You want to save library
dependencies in the file (which might not be enough, since the ABI might change
without the filename, which really sucks) and then you have to calculate,
download several packages or the metadata of them, check which is usable,
install it. In my version the build system would create the new package for the
lib, check which application is broken (ok, lots of apps need to be installed
here) and then just build new packages for them. It would then increase the
binary revision by one and commit all the packages in 1 transaction to the
server. So, whenever you get the chance to install a lib that might break some
application, either you installed the app from a binary and therefore you get a
new binary package, because the binary version is higher, or you compiled the
package on your own and have to reinstall it on your own.

And, last but not least, there is still the preserved libs feature of portage
which can help here.

------- Comment #20 From Arne Babenhauserheide 2009-03-25 15:33:17 0000 -------
> In my version the build system would create the new package for the
> lib, check which application is broken (ok, lots of apps need to be installed
> here) and then just build new packages for them. It would then increase the
> binary revision by one and commit all the packages in 1 transaction to the
> server. 

How do you make that work with multiple unconnected servers? 

What happens if you and I rebuild at the same time and send the files to
different servers? 

And what do you do if you don't have all applications installed which depend on
the lib? 

I'd like this system to be robust enough to support community-built binpackages
(with some trust system to ensure that you can always find the responsible
person if something breaks). 


For the USE flags I doubt that you would get any real gain by doing USE flags
bitwise. From what "eix -I" and some grepping and seding tells me, most
packages have at least two USE flags, but none of them is active. So using only
the active USE flags has no cost here, using a bitmask takes one char. 

Also the simple advantage is, that people can look at the binpackages and *see*
with which USE flags they were built. 


Last to the hash system: There is one binpackage per installed ebuild, just as
in the current system. 
The only change is that the name of the binpackages gets two extra parts
appended: USE flags and a Host SLOT. (Three with binary compatibility)

So it's the current system + a way to find matching binpackages for different
configurations. The binpackages themselves already contain ways to check, if
they can be used, but we currently can't search for this information
efficiently. 

Did you read the whole discussion in this bug? 


I am currently leaning more towards adding the USE hash and SLOTs hash to the
filenames, since these could be shared more easily, but using a directory
structure is cleaner. 

------- Comment #21 From Philipp Riegger 2009-03-25 16:07:53 0000 -------
(In reply to comment #20)
> How do you make that work with multiple unconnected servers? 

With some kind of locking, it can work.

> What happens if you and I rebuild at the same time and send the files to
> different servers? 

Nothing happens, because the servers are different. If you want to sync them,
you have to think of something.

> And what do you do if you don't have all applications installed which depend on
> the lib? 

You figure out, if you usually build binpackages for them and if you do, you
install your latest binpackage and check it.

> I'd like this system to be robust enough to support community-built binpackages
> (with some trust system to ensure that you can always find the responsible
> person if something breaks). 

The thing is, I am discussing some kind of build server while you are discussing
a distributed system. Each has different applications, but creating a binary
package that only supports one would be stupid. Therefore I suggest adding the
binary revision and handling the packages built by your distributed system like
live ebuilds, with -b9999 or something like that. Those could be unmasked with a
FEATURE or some other flag in the make.conf. That way one can decide to only
use the packages built on the trustworthy server or also use the packages built
by the distributed system.

> For the USE flags I doubt that you would get any real gain by doing USE flags
> bitwise. From what "eix -I" and some grepping and seding tells me, most
> packages have at least two USE flags, but none of them is active. So using only
> the active USE flags has no cost here, using a bitmask takes one char. 
> 
> Also the simple advantage is, that people can look at the binpackages and *see*
> with which USE flags they were built. 

It's difficult to talk about cost here, since it does not really exist. We get
a problem if filenames and paths get too long to be supported by http, the
filesystem or whatever. This happens with none of our representations.

You propose a mixed representation based on the combination of use-flags and a
hash. I don't like it because it's not clean: Representation depends on what is
represented, binary packages for the same ebuild are scattered all over the
place. If a package gets a new USE-flag or loses one, the next binary package
might end up in a completely different directory.

How is that easily usable? How can a user find out if a new version is
available without the help of some tools? How much time and effort would it be
to delete binary packages because of security issues, license problems or other
reasons? How many different directories would you end up with and are they
supported on currently used file systems? Is the access fast enough to be
practical? 

> Last to the hash system: There is one binpackage per installed ebuild, just as
> in the current system. 
> The only change is that the name of the binpackages gets two extra parts
> appended: USE flags and a Host SLOT. (Three with binary compatibility)

How do you handle packages being built against different library versions? They
would end up with the same filename. And not every user can update every
library just to overcome a problem in the naming scheme.

> So it's the current system + a way to find matching binpackages for different
> configurations. The binpackages themselves already contain ways to check, if
> they can be used, but we currently can't search for this information
> efficiently. 

You could provide a cache file on the server. RPM uses some kind of index file. 

> Did you read the whole discussion in this bug? 

I guess so. I skipped uninteresting parts and parts where the same was told
again and again. If you think I miss a special comment, please tell me which it
is. One big problem with this discussion is, that it was scattered between
bugzilla and the soc mailing list. I tried to keep it separated, but it does not
seem to be possible. You refer to your distributed system before mentioning it
once in this bug.

> I am currently leaning more towards adding the USE hash and SLOTs hash to the
> filenames, since these could be shared more easily, but using a directory
> structure is cleaner. 

Other distributions use different directories for different SLOTS, how you call
them. I would stick to this. Gentoo also uses different directories for
different architectures. I would also prefer to have the USE-flag in the
filename, since that describes the package itself and not the "linux
distribution" (replace this with a better name, if you want) which it belongs
to.

------- Comment #22 From Arne Babenhauserheide 2009-03-25 17:11:37 0000 -------
> If you think I miss a special comment, please tell me which it is. 

It's not one post but the three different approaches: 
* SLOT and USE in directory structure
* SLOT in directory, USE in filename
* Both in filename. 

And the reasons for using readable USE flags in the filename. 

> One big problem with this discussion is, that it was scattered between
> bugzilla and the soc mailing list. 

I didn't know that there was discussion on this in the soc list. 

Could you give me a link? 

> I tried to keep it separated, but it does not
> seem to be possible. You refer to your distributed system before mentioning it
> once in this bug.

That is a long-term goal we talked about in IRC a few years ago and I seem to
have missed writing it here before... sorry for that. 

In short: It would be great if we had a way for users to get trusted binpackage
providers and for them to tell portage to use binpackages whenever possible and
to create and upload new ones, where the binpackages don't exist, yet. 

As soon as this works, it could be extended to a trusted p2p network in which
people simply share their binpackages and download those binpackages they need. 


The idea comes from the experience that sharing the distfiles in Gnutella led
to many people downloading them. 


> Other distributions use different directories for different SLOTS, how you call
> them. I would stick to this. Gentoo also uses different directories for
> different architectures. I would also prefer to have the USE-flag in the
> filename, since that describes the package itself and not the "linux
> distribution" (replace this with a better name, if you want) which it belongs
> to.

The name SLOT comes from zmedico, though I'd love to be able to claim that it
was my idea :) 

By using a Hash, the SLOT can contain many different host type definitions, and
the system won't have to change a bit when the included information changes -
it's still just a hash over stuff we look for. 

Using a SLOT dir for machine type and USE flags in the name sounds also good to
me. 

With USE flags in the filename the files don't get scattered, by the way. 

It would look similar to this (needs to be checked against package naming
scheme, if it's compatible): 

SLOT1/
  portage-2.2_rc25-epydoc.tbz
  portage-2.2_rc27-epydoc.tbz
  python-2.5.4-r2-GYFBQKRKWSM67J6GO63XUE3JPDIJUSRA.tbz
  ...
SLOT2/
  ...

------- Comment #23 From Philipp Riegger 2009-03-25 18:52:34 0000 -------
(In reply to comment #22)
> > One big problem with this discussion is, that it was scattered between
> > bugzilla and the soc mailing list. 
> 
> I didn't know that there was discussion on this in the soc list. 
> 
> Could you give me a link? 

I somehow had identified you with the person I'm writing with on gentoo-soc.
You can find the thread at <http://archives.gentoo.org/gentoo-soc/>.

------- Comment #24 From Philipp Riegger 2009-03-26 14:04:58 0000 -------
(In reply to comment #22)
> In short: It would be great if we had a way for users to get trusted binpackage
> providers and for them to tell portage to use binpackages whenever possible and
> to create and upload new ones, where the binpackages don't exist, yet. 
> 
> As soon as this works, it could be extended to a trusted p2p network in which
> people simply share their binpackages and download those binpackages they need. 

Ok, but this is beyond the scope of this bug. Here we should discuss things
that would make sense to be added to the binpackage as metadata which would
enable everything we want to do with it.

> By using a Hash, the SLOT can contain many different host type definitions, and
> the system won't have to change a bit when the included information changes -
> it's still just a hash over stuff we look for. 

The problem I see is that it will become hard to find out which binpackages
are available for your system. Maybe you don't care about USE-flags or about
one special CFLAG; then you might want to find not the same SLOT, but
almost the same one. Would that be possible?

------- Comment #25 From Arne Babenhauserheide 2009-03-26 15:23:08 0000 -------
> Maybe you don't care about USE-flags or about
> one special CFLAG; then you might want to find not the same SLOT, but
> almost the same one. Would that be possible?

If you don't care about USE flags, you just ignore the USE hash part of the
binpackages, so this is easy. 

But I see an advantage of your approach here: It would be easier to just
compare bitwise "does the package have the USE flags I need, I don't care about
additional capabilities". As much as I dislike losing readable USE flags in the
binpackage, this is a major advantage, since you can then do more complex
checks from the names without having to download the binpackages and checking
the xpak. 
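
The bitwise check meant here could be sketched like this, assuming one bit per flag in the order of the sorted IUSE list (both function names are made up for illustration):

```python
def use_mask(iuse, flags):
    """Bitmask with one bit per flag, in the order of sorted IUSE."""
    order = {flag: bit for bit, flag in enumerate(sorted(iuse))}
    mask = 0
    for flag in flags:
        mask |= 1 << order[flag]
    return mask


def satisfies(package_mask, needed_mask):
    """True if the package has at least all the needed flags enabled."""
    return (package_mask & needed_mask) == needed_mask
```

A package built with "doc ssl tk" would then satisfy a request for just "ssl", while one built with only "doc" would not.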

But at the same time doing this on the binpackage level kills dependency
tracking, since a new USE flag can imply a new dependency which is needed for
getting the binpackage to run. This means you'll have to recalculate
dependencies for every alternate USE flag combination, which is so expensive
that any string conversion or hash algorithm pales in comparison. 

Just saying "I don't care about USE flags" only works for mostly self-contained
packages; otherwise you need to enable the right USE flags. 

For CHOST and similar (the Host SLOT): How do you decide what is unimportant? 

Anything which isn't a Hash is in danger of becoming arbitrarily long if you
need to include more information, and if you leave stuff out, you take away the
option of using that information - and in fact force users to use packages with
non-fitting settings. 

With the Hash it is far easier to change the included information later on
without breaking earlier versions: Just change the information and hash again
and all resolution will still work, even for old versions (they will just not
see the new packages as being compatible, but they won't get false
information). 

The only thing we need is being able to find the files based on known strings. 

To allow for more complex comparisons with hashes we'd just have to hash each
combination of active USE flags. Since most packages have just two or three
USE flags there are only a few possible combinations. And since we'll have to
recalculate dependencies anyway if we use different USE flags, the cost of
doing multiple hashes is negligible in comparison. 
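
A sketch of that enumeration, reusing the "string if short, Base32 sha1 otherwise" ID from the earlier comments. The function names and the space separator are assumptions; the number of IDs doubles with every optional flag, so this only stays cheap for packages with few USE flags.

```python
import base64
import hashlib
from itertools import combinations


def use_id(flags):
    """Sorted flags as a string, or their Base32 SHA-1 if longer than 32 chars."""
    joined = " ".join(sorted(flags))
    if len(joined) <= 32:
        return joined
    return base64.b32encode(hashlib.sha1(joined.encode("utf-8")).digest()).decode("ascii")


def acceptable_ids(iuse, needed):
    """IDs of every flag combination that contains all the needed flags."""
    optional = sorted(set(iuse) - set(needed))
    ids = []
    for size in range(len(optional) + 1):
        for extra in combinations(optional, size):
            ids.append(use_id(set(needed) | set(extra)))
    return ids
```

Portage could then look for a binpackage under any of the returned IDs instead of only the exact one.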
