Bug 23747 - Portage improvement through binary packages p2p sharing.
Summary: Portage improvement through binary packages p2p sharing.
Status: RESOLVED WONTFIX
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core - External Interaction
Hardware: All Linux
Importance: High normal
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Depends on: 8468
Blocks:
Reported: 2003-06-30 03:11 UTC by Stefano Peluchetti
Modified: 2015-11-27 07:59 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Description Stefano Peluchetti 2003-06-30 03:11:55 UTC
I know that suggestions similar to this one have already been discussed many
times, but before you start complaining please read everything I have to say.
The Portage system is really powerful. I'm not using Gentoo because I think
that optimized packages for my system will be a lot faster than precompiled ones
(well, maybe :D ), but because:
1) the stable branch is really stable but not too outdated (see Debian)
2) I can configure my Gentoo installation exactly as I want (only the packages I
want, the way I want)
3) if I have to install packages like mplayer, or the latest loster, I can do
this with no trouble at all
The only drawback is the compile time. Yes, I can compile overnight, but
sometimes I really need to have a package installed immediately, or I simply
don't want to wait ages to have KDE compiled on my not too powerful laptop.
Finally, here is the suggestion:
The idea is to create a p2p network where Gentoo users can transparently share
their already compiled packages.
Consider one package "name_version" (let's say kdebase_3.1.2). A build of it can
differ from other already compiled builds of the same package and version only
in four ways:
I) optimizations used
II) USE flags used
III) versions of the libraries it is linked against (say, it may be compiled
against pam_0.73 or pam_0.74, as the dependency rule only states >= pam_0.73)
IV) gcc compiler used
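
The four distinguishing properties listed above could be folded into a single lookup key, so that two peers can tell whether they hold the same build variant. The following is only a minimal sketch under stated assumptions: the `BuildVariant` class, its field names, and the example values in the usage note are hypothetical and are not part of Portage.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildVariant:
    """Hypothetical record of what distinguishes two builds of one package."""
    package: str          # e.g. "kdebase"
    version: str          # e.g. "3.1.2"
    cflags: tuple         # I) optimizations, kept in order (order can matter)
    use_flags: frozenset  # II) USE flags, order-independent
    linked_deps: tuple    # III) (name, version) pairs of linked libraries
    gcc_version: str      # IV) compiler version

    def key(self) -> str:
        """Return a stable hex digest identifying this exact variant."""
        canonical = {
            "package": self.package,
            "version": self.version,
            "cflags": list(self.cflags),
            "use_flags": sorted(self.use_flags),
            "linked_deps": sorted(list(dep) for dep in self.linked_deps),
            "gcc_version": self.gcc_version,
        }
        blob = json.dumps(canonical, sort_keys=True).encode("utf-8")
        return hashlib.sha256(blob).hexdigest()
```

With such a key, kdebase 3.1.2 linked against pam 0.73 and the same package linked against pam 0.74 get different identifiers, while the order in which USE flags are written makes no difference.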
On point I, I think that a lot of Gentoo users will be happy to give up
exotic CFLAGS settings if they can get packages already compiled THE way
they want with "standard" optimizations like CFLAGS="-march=athlon-xp -O3 -pipe
-fomit-frame-pointer". Also, the Gentoo developer team is moving in the
direction of defining some "stable&standard" predefined CFLAGS settings for
various CPUs, as can be inferred from the "CFLAGS/cpuinfo collection project".
Point II should not be a big problem, as the number of possible combinations of
optional USE flags in a package is not that high. Also, it usually happens
that there are some "most used" USE flag combinations.
Point III is probably the major one. We have to distinguish between
kdebase_3.1.2 compiled against pam_0.73 and kdebase_3.1.2 compiled against
pam_0.74 (and also between different releases).
Point IV is not a problem, as the gcc version used in Gentoo doesn't change
frequently.
The whole system should be based on an existing p2p sharing protocol (maybe
adapted) that is efficient, fast, scalable, and also works for
firewalled users (there are a lot....) (am I asking too much? :D ).
The key point is the number of users adopting this feature (it would be nice to
have an entry in make.conf under FEATURES, like "binary_sharing"=yes), which
should be large enough to cover all the needed combinations. The good news is
that Gentoo's user base is currently growing with the popularity of the distro,
so I think that with this system a user shouldn't have any problem finding the
packages they need with the right characteristics for popular (and not only
popular) programs. KDE hell could end! :P
All the sharing has to be completely transparent to the user! So, if a user
enables this feature in make.conf, the following two things will happen:
A) for every ebuild compiled, two files will be generated under an arbitrary
directory:
	1) the binary package (ebuild with -k option)
	2) the description (.desc maybe?) file, with the same name as the binary
package, containing the three pieces of information needed (optimizations, USE
flags, version/revision of the packages the package depends upon).
This way the user actively contributes to the Gentoo sharing community in a
transparent way [an algorithm that deletes obsolete files should be added
later......]
B) for every ebuild to be compiled, Portage automatically queries the
p2p Gentoo community to check whether the requested package is already present
with the required characteristics. If so, it downloads and installs it.
An md5 check should be added later for security reasons.
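
Steps A and B could be sketched roughly as follows. This is an illustration under assumptions, not an existing Portage interface: the JSON layout of the `.desc` file, the function names, and the exact-match rule are all hypothetical. MD5 is used only because the proposal names it; it is no longer considered collision-resistant, so a real design would likely pick a stronger hash.

```python
import hashlib
import json
from pathlib import Path


def write_descriptor(binary_path, cflags, use_flags, dep_versions):
    """Step A: write a hypothetical '<binary>.desc' file next to the package."""
    binary_path = Path(binary_path)
    desc = {
        "cflags": list(cflags),
        "use_flags": sorted(use_flags),
        "dep_versions": dict(dep_versions),
        # Checksum of the binary package, used for the later download check.
        "md5": hashlib.md5(binary_path.read_bytes()).hexdigest(),
    }
    desc_path = binary_path.with_name(binary_path.name + ".desc")
    desc_path.write_text(json.dumps(desc, indent=2, sort_keys=True))
    return desc_path


def matches(desc, cflags, use_flags, dep_versions):
    """Step B: does a peer's descriptor have the required characteristics?"""
    return (
        desc["cflags"] == list(cflags)
        and set(desc["use_flags"]) == set(use_flags)
        and desc["dep_versions"] == dict(dep_versions)
    )


def verify_download(data, desc):
    """Check that downloaded bytes match the checksum in the descriptor."""
    return hashlib.md5(data).hexdigest() == desc["md5"]
```

Note that this check only compares the download against a checksum supplied by the same peer, so it detects transfer corruption but says nothing about whether the package itself is trustworthy.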
I know that a lot more should be added and discussed, but I need to know what
you think of this idea. I also have to add that I don't have the programming
experience needed to be a "first line developer" who directly modifies Portage
and the sharing protocol, but I want to contribute as far as I'm able to.
I really think that, as the user base is already very large, this project would
be successful.
This enhancement could greatly improve the Gentoo experience, given the reduced
package installation time, while preserving all of Portage's excellent features.
For the brave who dared to read this far, I thank you! (also sorry
for my bad English....)
I hope that this contribution will help the Gentoo community.
Bye

Stefano Peluchetti
Comment 1 Ben Wilkerson 2003-06-30 10:53:50 UTC
So what happens if a package is incorrectly compiled, where the compile succeeds but the package does not work? Is there some way to test before setting it up for sharing? We don't want a package that causes a segmentation fault when run to be distributed across the entire p2p network; that would cause chaos.
Comment 2 Stefano Peluchetti 2003-07-03 14:35:14 UTC
If a package compiles but segfaults when launched, then it would cause chaos anyway, as it should do the same on every computer that uses the same configuration/CPU.
It would be a fault of the ebuild, so there would be no change at all from what could happen right now.
Comment 3 Fred Van Andel (RETIRED) gentoo-dev 2003-10-11 23:36:23 UTC
From a security point of view this is a VERY bad idea.  

An md5 will only tell you if what is sent is the same as what is received.
 An md5 cannot tell if the executable has been altered or has been trojaned.
 Any p2p distribution system must have a centrally generated md5 that guarantees
that the peer-downloaded package matches the package on the mirrors.  It
is simply not feasible to do this for user-compiled binary packages because
there are too many possible combinations of USE flags, CPUs and compiler
options.

Under no circumstances should a system like this ever be implemented; it's
just too dangerous.
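
The distinction drawn in this comment, between a checksum supplied by the peer and one generated centrally, can be illustrated with a short sketch. The function names and the `trusted_manifest` dictionary are hypothetical; they only stand in for "whatever the peer claims" and "whatever the mirrors publish".

```python
import hashlib


def digest(data: bytes) -> str:
    """MD5 hex digest, as named in the discussion above."""
    return hashlib.md5(data).hexdigest()


def peer_check(data: bytes, peer_claimed_md5: str) -> bool:
    """Transfer integrity only: the peer controls both the data and the digest."""
    return digest(data) == peer_claimed_md5


def central_check(data: bytes, name: str, trusted_manifest: dict) -> bool:
    """Compare against a digest published by a trusted central source."""
    expected = trusted_manifest.get(name)
    return expected is not None and digest(data) == expected
```

A peer that ships a trojaned package together with that package's own checksum passes `peer_check` but fails `central_check`. The second check is only possible when a central digest exists for that exact build, which is the difficulty raised here for user-compiled variants.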



Comment 4 Nicholas Jones (RETIRED) gentoo-dev 2003-12-28 22:33:34 UTC
I am not a fan of a generalized p2p package sharing system
for portage. Supporting distfiles and GRP is a possibility,
but I will not support general community servers like this.
Comment 5 David Baird 2004-06-27 13:44:06 UTC
I am disappointed that the portage developers don't think p2p is a 
good idea.  The people at Zynot, http://zynot.org/, think otherwise
though.  I like the idea of p2p ebuilds and packages.  ...and when I
start getting super annoyed that Gentoo doesn't offer such a feature,
then I will probably start writing it myself!

Basically, it is possible to establish trust and eliminate bad content
in p2p networks -- but perhaps it is not trivial to do this.  Here are 
some of the problems to worry about:

 * Establishing trust

 * Establishing quality

 * Keeping track of many, many versions of the "same" packages and
   ebuilds (many people may have written their personal version of an
   ebuild for some program, say spread at http://spread.org/)

 * Tracking bugs

Here are just a few ideas:

 * Establish authorities to sign off on ebuilds and packages.
   Basically, if you trust the authority that did the signing, then
   you can trust the package that was signed.  You could even have a
   package be signed by many authorities to indicate stability, security,
   compatibility, etc.

 * Usage voting -- anytime someone emerges a particular ebuild/package,
   then it gets a vote (to give an idea of how many people use certain
   packages, and perhaps to locate popular versions of a package)
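
The two ideas above could be prototyped along these lines. This is a rough sketch under assumptions: real signing would need actual cryptographic signatures (OpenPGP, for example), which are out of scope here, so `signers` is assumed to be a list of authority names whose signatures have already been verified elsewhere. All names are hypothetical.

```python
from collections import Counter


def is_trusted(signers, trusted_authorities, required=1):
    """Accept a package if enough of its (already verified) signers are trusted."""
    return len(set(signers) & set(trusted_authorities)) >= required


class UsageVotes:
    """Counts one vote each time a given ebuild/package is emerged."""

    def __init__(self):
        self._votes = Counter()

    def record_emerge(self, package_id):
        self._votes[package_id] += 1

    def most_popular(self, n=1):
        """Return the n most emerged package ids with their vote counts."""
        return self._votes.most_common(n)
```

Raising `required` above 1 models the suggestion that a package may be signed by several authorities, each vouching for something different such as stability or security.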

Why would you want to consider p2p?  Well, try these reasons:

 * Portage is incomplete (i.e. missing lots of ebuilds for useful
   programs)!  I have to write my own ebuilds or install stuff in
   /usr/local all the time.  Wouldn't it be great if I could easily share
   my ebuilds so that other people didn't have to repeat my hard efforts?

 * In-house-software: perhaps some people have some software that they
   write in house and the rest of the world doesn't care about.  Why
   should these have to become part of the portage tree?  Of course you
   could use portage overlay, but if portage had some p2p functionality,
   it would be nice if it could handle this scenario.

 * University, corporate, or cluster installations could benefit from
   precompiled package p2p, or they could benefit from ebuilds 
   that are custom tweaked to support special requirements of the
   university/corporation