Summary: | [IDEA] Offload work by distributing trivial ebuild maintenance to users, introduce a simple stability voting system and have a core team approve them to the Portage tree. | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | George Shapovalov (RETIRED) <george> |
Component: | [OLD] Unspecified | Assignee: | George Shapovalov (RETIRED) <george> |
Status: | RESOLVED OBSOLETE | ||
Severity: | enhancement | CC: | ajohnson, bugs.gentoo.org, george, lisa, lostlogic, m.debruijne, mettlerd, mholzer, mkrainer, pauldv, rigo, tcunha, tomwij, vapier |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
URL: | http://www.its.caltech.edu/~georges/gentoo/epsp/proposal.html | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Bug Depends on: | |||
Bug Blocks: | 472746 | ||
Attachments: |
estatus.py - module providing ebuild status treatment
patch to portage.py to make use of estatus.py
sample Status file
sample Status file (update) |
Description
George Shapovalov (RETIRED)
Some neat ideas, and I get the idea of what you were trying to say. I think you summed up the problems well; still mulling potential solutions over in my head...

I have actually started some work on adding meat to the proposal. Please take a look at http://www.its.caltech.edu/~georges/gentoo/epsp/update1.html for an update. I have abandoned the idea of mangling the ebuild names (basically, adding a new part to them). Now the status information is supposed to reside in a separate file. This is generally a much more flexible approach, and easier to code as well :). Some code is supplied also. Please note that the code is at a *very* early stage, more in the proof-of-concept area (and not yet finished). I am submitting this now to contribute to the discussion of the relevant topic on the developer mailing list.

George

This is a code update. I completed the part responsible for stability levels in portage. Attached are a new version of estatus.py, the file that "does the job", and a patch to portage.py against cvs version 1.149 (committed May 16; there seem to have been no commits since then). In case the patch does not apply cleanly against your version of portage.py, the only significant chunk of additions should go to portagetree.domask.

Changes: I changed the StabilityLevel var to STABILITY_LEVEL to match the make.* style. This var should be set in make.globals and overridden in make.conf as usual. The proper setting should be "new" until we add Status files to all packages in the portage tree.

George

Created attachment 1086 [details]
estatus.py - module providing ebuild status treatment
Created attachment 1087 [details, diff]
patch to portage.py to make use of estatus.py
I am using exceptions in this code. It just makes no sense not to inherit estatus exceptions from PortageError! Thus this bug depends on #2328 and I mark the proper dependency here.

Created attachment 1088 [details]
sample Status file
To have the full set of relevant files, I include a sample Status file (it should
probably end up under /usr/portage).
This is great work! Just want to chalk my number up in favor of this getting included in the next MAJOR portage release!

Thanks a lot! But did you test the code? ;) Please, please! :) This is a really small chunk of code and a quick test. Just in case, I'll repeat here the instructions which I posted to the mailing lists.

The code consists of the estatus.py module, which does most of the work, and a patch to portage.py (portage.py.diff). Additionally, since the code uses exceptions for error handling and portage_exceptions.py has not been incorporated yet, you will need that module too (portage_exceptions.py is in #2328, on which this bug depends). portage.py resides in /usr/lib/python2.2/site-packages/; you should apply the supplied patch against it (it should apply cleanly against portage.py-1.149 in cvs and portage-1.9.12/13). The other two python modules should go into the same dir (or somewhere else where python can find them). The file Status.skel contains an explanation of the (very easy) Status file format.

After you have done this you are ready to use the new feature. A few words first, though. Since none of your packages has a Status file available, all your ebuilds will default to "new" status. The default setting for STABILITY_LEVEL is "approved", thus emerge will not see any ebuilds! To circumvent this you should set STABILITY_LEVEL="new" in your make.conf or make.globals (don't forget what you modify!). This should bring emerge functionality back to normal (all your ebuilds are "new" and you have instructed portage to allow all ebuilds that are "new" or more stable). Now you can create a Status file for some package and play with it by setting different combinations of the flags described in Status.skel (and in the comments to the code).

You can use this as an alternative to profiles and masking, and even to create a sub-distribution; however, the intended use is somewhat different.

George

leaving for weekend trip to MarCon but I managed to do a brief look at the code and some simple tests, looks nice...
it would be nice if the words used in the Status file matched more closely with what you say in make.conf, i.e.

perl-5.6.1-r4.ebuild confirmed new
perl-5.6.1-r3.ebuild confirmed approved

or something (just my 2 cents after 3 hours of sleep ;-))

Short note on the status switches. They are different because, well, they are :). The setting in make.conf is an "aggregate" ebuild stability level, one of: ["unstable","new","confirmed","appr-new","core-new","approved","core"]. This is in a sense an "abstracted" stability level setting. The Status file, on the contrary, contains more direct entries. At present it has two columns to reflect two categories of users: general users and core developers. This might be changed if compelling reasons for such a change appear. Consequently, the setting in each column follows a more direct and simple model: every group can mark a package as unstable, new (no), checked (yes) or core (this setting is reserved for developers at present, but it is trivial to modify the code to make it available in the second column).

How did I come up with yes/no? These follow naturally if you try to answer the question "was the ebuild checked by that group?", i.e. when filling in the following table:

ebuild_name Confirmed? Approved?
PV-PR yes no (for example)

It's just that yes/no seemed natural in that context. This is not a biggie; it can be changed to checked/new, for example (for a full set of [unstable,new,checked,core]). I would like to avoid using different identifiers for the same settings in different columns (i.e. confirmed vs approved; I used "checked" in both instead), as this would make the code more complex for no good reason. Please tell me what you think.

Also, I noticed you put the full ebuild name in your example. Does this seem more natural than the ${PVR} part only? (I like to *not* duplicate information if that can be avoided; it is much harder to miss something in case of a format change or some other modification.)
Again, that would be a trivial change to the code. Thanks for taking the time to check this out! ;)

George

Created attachment 1134 [details]
sample Status file (update)
updated Status.skel: added two lines of actual sample
(so it's 99% comments instead of 100% now :)).
Note: Status file will most often contain only a few descriptive lines,
generally no more than there are ebuilds in the package dir.
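
For readers following the thread, the STABILITY_LEVEL behaviour described in the comments above can be pictured as a simple threshold check over the aggregate levels. What follows is only a minimal Python sketch under stated assumptions, not the code from estatus.py: it assumes the list of aggregate levels quoted above is ordered from least to most stable, and the names `STABILITY_LEVELS`, `is_visible` and `visible_ebuilds` are hypothetical.

```python
# Aggregate levels as listed in the thread. ASSUMPTION: the list is ordered
# from least stable ("unstable") to most stable ("core").
STABILITY_LEVELS = [
    "unstable", "new", "confirmed", "appr-new", "core-new", "approved", "core",
]

# Per the thread, ebuilds without a Status entry default to "new".
DEFAULT_EBUILD_LEVEL = "new"


def is_visible(ebuild_level, stability_level="approved"):
    """Return True if an ebuild at ebuild_level meets the user's threshold.

    stability_level plays the role of STABILITY_LEVEL from make.conf; an
    ebuild is allowed when it is at that level or more stable.
    """
    rank = STABILITY_LEVELS.index  # raises ValueError on an unknown level
    return rank(ebuild_level) >= rank(stability_level)


def visible_ebuilds(statuses, stability_level="approved"):
    """Filter a {version: aggregate_level} mapping (hypothetical input shape)."""
    return [
        version
        for version, level in statuses.items()
        if is_visible(level or DEFAULT_EBUILD_LEVEL, stability_level)
    ]


if __name__ == "__main__":
    perl = {"5.6.1-r4": "new", "5.6.1-r3": "approved"}
    print(visible_ebuilds(perl, "approved"))
    print(visible_ebuilds(perl, "new"))
```

Under these assumptions, a tree in which every ebuild is "new" shows nothing at the default threshold of "approved" and everything once the threshold is lowered to "new", which matches the behaviour described in the testing instructions. How the two Status file columns combine into one aggregate level is not spelled out in the thread, so the sketch deliberately takes the aggregate level as given.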
Hi! As the original intention of this bug was simply to transfer testing of new packages from the gentoo core team to the users, I think this could be achieved much more simply. Why not have a second file like package.mask, e.g. package.new, that lists all new versions of an ebuild? Example content:

net-news/leafnode: 1.9.22 1.9.23
sys-devel/gcc: >=3.1

Now "emerge -u pkg" will treat these package versions the same way as if they were masked (i.e. ignore them), but the user can force installation with "emerge --test -u pkg". We can then encourage the user to "vote" for the ebuild (display a message when using --test) or to file a bug report if the ebuild fails. If there are enough positive votes, or no bug reports within a reasonable time, the package is simply removed from package.new, thus integrating it into the "stable branch".

-Markus-

The status file is a good idea, but I do see problems with old ebuilds. What does the system do when someone doesn't do emerge rsync --clean, but just emerge rsync, and has a "stable" ebuild lying around that has been removed from the status file? A solution I see could be to use a removal delay, that is, a version's approval status is only removed after a certain time (say half a year). Another solution would be to maintain a full list somewhere on the net and allow for on-demand checking of versions older than the oldest ebuild in the status file. (That way no big load should be created.)

It's still some time before the herds proposal gets settled and implemented; nonetheless, I am trying to keep this bug alive :). An interesting project came around, mentioned on the gentoo-dev mailing list. See the thread "Ebuild Janitor project" of May 11, 2003. It looks like it can fit nicely with the overall idea of this proposal (naturally, if it sustains; otherwise something like that can be set up when the time is right), so I am throwing in my remark so that it doesn't get lost.
In short, these are the forums set up for the discussion of user-submitted and user-maintained (until they get accepted) ebuilds. This should serve as a nice complement to gentoo-stats/stable, improving the overall level of user-submitted ebuilds. From the email:

"The way I see it, distributed ebuild processing requires:
1. some central depository with an automated submission system doing at least some basic checks (multiple access/stability levels are already in portage, but that might need some further ratification)
2. a strong voting/feedback system (think gentoo-stable/stats), looped back to ebuild level adjustment (some automatic, some manual).
3. (new!) a way for users to get feedback on their ebuilds and to discuss related stuff with each other. Apparently best done in user-land, by a project such as the one you initiated. Should really do a lot in terms of increasing the overall quality of submitted ebuilds."

George

A short note/update. The recent VCS discussion on -dev made me notice and read an overview of version control systems on O'Reilly. One of the systems, Aegis, seems to perfectly fit the bill for the "automatic processing of outside submissions" (well, apparently it was designed for a similar purpose), as it

"Manages automated tests, prevents check-ins that do not pass the previous tests, and requires developers to add new tests.
Manages reviews of code. Check-ins must pass the review of a reviewer to get into the main line of development."

Plus this (from the Aegis site) might indicate scalability (the review article did not address scalability at all): "Aegis supports large teams and large projects."

The review article is here: http://www.onlamp.com/pub/a/onlamp/2004/01/29/scm_overview.html
osnews comments (mentioning a few more systems) are here: http://www.osnews.com/comment.php?news_id=5858&limit=no
and the Aegis project is here: http://aegis.sourceforge.net/

This should be glep'd and posted to -dev, and let them hash it out.
Pretty much, modding portage to identify the 'status' of the ebuild: 'k, doable. The value of doing this, the resultant effects on QA, etc.? Well, not the portage devs' decision. So... bounce it to -dev, iron it out there (if they go for it), and then poke us with reqs if people agree to it. Re-open based upon the results from -dev, please. Marking it WONTFIX now though, since, like I said above, the changes required to the tree are a community decision, not portage-devs'.

WTF was it closed, while having a developer responsible for it, by a dev not really related to the bug? Oh, btw, this has been discussed on -dev many times, even a few times on -core. It's been relatively quiet lately (well, for about another half year at this point), but we are not really there yet anyway. Gentoo-stats are lacking, and no other voting system could be agreed upon in prior discussions; infra involvement is not clear yet either, although the rest of the requested features are more or less there, and with some tuning they should be usable for this purpose. But then, every half a year or so there is a blurb on -dev on how we or users could speed up the processing of new submissions/bug squashing, etc.; this bug is brought up and enough people seem to be interested in the idea.

So, in short, now does not seem like the right time to push on the thing, but I keep the bug as a reminder of the idea. This thing *will* go through redesign, discussions, glepping and further discussions when the time is right. Meanwhile, if the portage team does not want to see the bug, please just remove the alias from CC. Reopening the thing.

George

Removing portage as requested. When this gets approved (preferably as a GLEP) please re-add us for implementation.

Bug wranglers are not interested in such stuff. Take this to the mailing list.

Ugh, upstream is me, an active Gentoo developer, and I indicated that I would just like to hang on to this bug as a reminder.
Why not reassign it back to me, instead of throwing it out to bug-wranglers? Now I have to reopen and reassign it. I'll post a reply to comment #19 in the reassignment action.

George

In reply to comment #19. So, this circled around and finally got thrown out to bug-wranglers. What was the point in taking it from me in the first place? Well, admittedly I was almost tempted to have a fuzzy feeling when it got assigned to the portage team :), but (sadly) I of course knew nothing would come out of it at this time, since, as I indicated, we are way off from making any practical use of this idea. Meanwhile I would just like to hang on to this bug as a reminder. I added a tag to the subject to make the status of this bug (more) obvious.

> When this gets approved ( preferably as a GLEP ) please re-add us for
> implementation.

First this has to be redesigned. A lot has changed since the original idea and brainstorming.

George

Oh, forgot to add tags during all this bug bouncing. Adding now.

The original (updated) proposal has migrated to dev.gentoo.org (as is proper). Please see this link instead of the one listed in the original post: http://dev.gentoo.org/~george/epsp/proposal.html

George

(In reply to comment #2)
> http://www.its.caltech.edu/~georges/gentoo/epsp/update1.html

has relocated to http://dev.gentoo.org/~george/epsp/update1.html
In general, just substitute http://dev.gentoo.org/~george for http://www.its.caltech.edu/~georges/gentoo to get to the relevant files.

George

Do the problems stated in this proposal still apply today? Does it refer to actual practical problems we still have? I think not.

> Limited set of people are responsible for ebuild review. ...
> it breaks when the system becomes large and/or ebuild submission rate grows above some limit

Only a limited set of people have the necessary knowledge to bring an ebuild up to Portage standards; widening this scope to many more people is a bad idea, since they will suggest hacks and approve ebuilds that aren't up to Portage standards (we see this happen on #gentoo-dev-help, for instance). Furthermore, adding more non-knowledgeable people will also just stall it more; see The Mythical Man-Month about this. But we don't actually need to refer to such a thing: has Portage ever failed us in this way with the amount of ebuild requests we have had after this many years? I think not. If we're going to increase the rate at which ebuilds enter the Portage tree without increasing the rate at which we recruit Gentoo developers, this will just introduce breakage in the Portage tree; better to have a maintainable tree than to have to deal with all the unmaintainable stuff that enters it. I don't see this as a problem.

> That same limited set of people has an absolute authority over ebuild processing. ... However in order to alleviate previous problem and keep the same system structure it becomes necessary to involve more and more people into the core group and at some point it becomes unmanageable (as core developers have to be extensively "trained").

They are educated for this purpose, why not? In a similar vein, the council has absolute authority to vote on matters as well, because they are the most eligible to know what is good for Gentoo and what is not. This isn't really a problem statement: it broadly uses the terms "system structure" and "unmanageable" but doesn't go into the details; and seeing how well the training went, I don't think it takes too much effort.
However, the next sentence does make a better point, which is that "trained manpower" sounds like a bottleneck:

> Strict centrally controlled structure works nice for small system but does not scale when system grows.

But, for the large number of users compared to Gentoo developers, I think the sunrise project already solves this really well. It serves as mass training of users, producing ebuilds that are up to standards and can be merged into the Portage tree once they are reasonable; so, I don't see how this proposal still addresses a current, actual, practical problem.

Therefore, I suggest that this proposal be revised or this bug be closed; if nobody has a reasonable objection, I'll ping the ML in two weeks from now for input, and if there is again no reasonable objection we can then close the bug two weeks after that.

For those who wish to revise it, please clearly state...
1. ... which current actual practical problem this proposal is addressing.
2. ... why the sunrise, ebuild janitor and other projects don't solve this.
3. ... a practical abstract idea that could possibly resolve this.

Without a concrete proposal that reflects the current day, I don't see why we should keep this bug around for much longer...

this mess of random ideas is probably obsolete now and, for pending stuff, it should be reported in one bug per issue and assigned to the people willing to do the work |