| Summary: | RFE: Split gentoo-x86/profiles/ChangeLog - it's too big | ||
|---|---|---|---|
| Product: | Gentoo Infrastructure | Reporter: | Radosław Smogura <mail> |
| Component: | Git | Assignee: | Gentoo Infrastructure <infra-bugs> |
| Status: | RESOLVED OBSOLETE | ||
| Severity: | enhancement | CC: | mgorny, pacho, tools-portage, ulm |
| Priority: | Normal | ||
| Version: | unspecified | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Package list: | Runtime testing required: | --- | |
|
Description
Radosław Smogura
2011-04-03 17:35:24 UTC
Some statistics:
5111 entries in profiles/ChangeLog (as of today)
3725 (73%) package.mask
565 (11%) use.local.desc
166 ( 3%) updates/*
655 (13%) all others
use.local.desc is now generated from package metadata, and I wonder if the same could be done for package.mask? This would reduce the number of entries by a factor of 6.
I vaguely remember that someone had suggested this before, but I cannot find it any more.
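The "factor of 6" above can be checked against the listed counts. The short sketch below assumes the figure counts both package.mask and use.local.desc entries as no longer being logged; the comment does not say so explicitly, and the helper name `reduction_factor` is made up for illustration.

```python
# Entry counts as quoted in the statistics above.
counts = {
    "package.mask": 3725,
    "use.local.desc": 565,
    "updates/*": 166,
    "all others": 655,
}
total = sum(counts.values())


def reduction_factor(dropped):
    """Ratio of all entries to those left once the `dropped` categories are gone."""
    remaining = total - sum(counts[name] for name in dropped)
    return total / remaining


# Assumption: which categories stop producing ChangeLog entries.
only_mask = reduction_factor(["package.mask"])
both = reduction_factor(["package.mask", "use.local.desc"])
print(f"package.mask only: {only_mask:.1f}x, with use.local.desc: {both:.1f}x")
```

Under that assumption the ratio comes out at roughly 6, while dropping package.mask entries alone would give a factor closer to 4.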
Andreas did this on his own. I've asked him to post the script he used. Potentially, the infra team could run this @yearly (or something).

No script - manually... but it should not be too hard to add that to echangelog ...

(In reply to comment #3)
> No script - manually... but it should not be too hard to add that to echangelog
> ...
An angle that I didn't think of. @tools-portage: thoughts?

Need some details then.
What shall be the max. size?
Do we just strip it or e.g. gzip ChangeLog?

(In reply to comment #5)
> Need some details then.
> What shall be the max. size?
Result of the -dev ML discussion was 50k or 100k. Since I did not care, did not want to wait for a tool to be implemented, and did not want to split hundreds of files, I went for 100k.
> Do we just strip it or e.g. gzip ChangeLog?
Well... disk space is cheap. Update / sync time is not. I would vote for "only split".

Ok, does splitting include compression?
How many splits are allowed?
What if we reach the max. amount of split changelogs?

(In reply to comment #7)
> Ok, does splitting include compression?
Well, no, not per se.
> How many splits are allowed?
Result of the -dev discussion was "only possible split point is end of the year". Conveniently this results for profiles/ChangeLog in chunks of order 150k.
> What if we reach the max. amount of split changelogs?
I guess even with mixixfs the singularity will come first. :P

(In reply to comment #8)
> (In reply to comment #7)
> > Ok, does splitting include compression?
>
> Well, no, not per se.
So we should clarify that first.
> > How many splits are allowed?
>
> Result of the -dev discussion was "only possible split point is end of the year".
Why end of the year only?
What I actually meant is:
We'll have ChangeLog.1, .2, .3 etc. Where's the limit?
> Conveniently this results for profiles/ChangeLog in chunks of order 150k.
>
> > What if we reach the max. amount of split changelogs?
>
> I guess even with mixixfs the singularity will come first. :P
See above.

(In reply to comment #9)
> > > Ok, does splitting include compression?
> >
> > Well, no, not per se.
>
> So we should clarify that first.
I thought that binary files should be avoided in a VCS?
> Why end of the year only?
> What I actually meant is:
> We'll have ChangeLog.1, .2, .3 etc. Where's the limit?
If we split only at the end of the year, then we can name the files ChangeLog-2011 etc. I think this would be convenient a) for locating old entries and b) for removal of old ChangeLog files (in case we should decide on doing so). Alternatively, old ChangeLogs could be suppressed in CVS->rsync.
Currently profiles/ChangeLog has the largest growth rate of all ChangeLogs, about 150 kB/year. I don't see the need for splits more fine-grained than once per year.
1) http://archives.gentoo.org/gentoo-dev/msg_1bf04e7bc6fe0387702f5b5c1f23df10.xml
2) http://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-x86/profiles/

I have this vague feeling that we're running around in circles. Thus, I'll leave the rest of the running in circles to you.

(In reply to comment #10)
> (In reply to comment #9)
> > > > Ok, does splitting include compression?
> > >
> > > Well, no, not per se.
> >
> > So we should clarify that first.
>
> I thought that binary files should be avoided in a VCS?
In CVS -- maybe. But in git there's no problem with them :P.

Please don't compress them. I'll exclude them in cvs->rsync conversion, and keep only the last split one (so this year we'll have ChangeLog and ChangeLog-2011, and the other ones will be excluded). Split purely by year is fine.

(In reply to comment #13)
> ChangeLog-2011, and the other ones will be excluded).
That is easy, but why? They are static files with a one-time download cost.

Just for further safety. I think even if ChangeLog is split yearly, it should be named ChangeLog-20111231 (last date in file); this will add more flexibility and keep the naming convention in case the split policy changes to shorter periods or becomes size based.

(In reply to comment #15)
> Just for further safety. I think even if ChangeLog is split yearly, it should
> be named ChangeLog-20111231 (last date in file); this will add more
> flexibility and keep the naming convention in case the split policy changes
> to shorter periods or becomes size based.
Using the year only should be enough. Also parsing would be easier. I don't see any benefit from using month/day as well.

I guess ChangeLog handling will change completely anyway, as soon as we have git (if it ever happens :))

Closing. profiles/ChangeLog has been gone for quite some time. |
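For illustration, the year-based split discussed in this thread (current entries stay in ChangeLog, older ones move to ChangeLog-YYYY, no compression) could look roughly like the sketch below. It is not the procedure that was actually used, which was manual according to the comments. The entry header format ("DD Mon YYYY; Name <email> files:") is an assumption about how echangelog writes entries, and `split_by_year` and `ENTRY_HEADER` are hypothetical names.

```python
import re
from collections import defaultdict

# Assumed entry header, e.g.
#   "  03 Apr 2011; Jane Doe <jane@example.org> package.mask:"
ENTRY_HEADER = re.compile(r"^\s*\d{1,2} [A-Z][a-z]{2} (\d{4});")


def split_by_year(text, current_year):
    """Map output file names to content, splitting only at year boundaries.

    Entries dated in current_year (or later) stay in "ChangeLog";
    older entries go to "ChangeLog-<year>". Nothing is compressed.
    """
    chunks = defaultdict(list)
    target = "ChangeLog"  # lines before the first entry stay in the live file
    for line in text.splitlines(keepends=True):
        match = ENTRY_HEADER.match(line)
        if match:
            year = int(match.group(1))
            target = "ChangeLog" if year >= current_year else f"ChangeLog-{year}"
        # Lines without a header belong to the most recent entry seen.
        chunks[target].append(line)
    return {name: "".join(lines) for name, lines in chunks.items()}


sample = (
    "# ChangeLog for profile directory\n"
    "\n"
    "  03 Apr 2011; A Dev <a@example.org> package.mask:\n"
    "  Mask foo.\n"
    "\n"
    "  30 Dec 2010; B Dev <b@example.org> updates/4Q-2010:\n"
    "  Move bar.\n"
)
result = split_by_year(sample, current_year=2011)
print(sorted(result))
```

Naming the files by year only, as preferred in the thread, keeps the mapping from an entry's date to its file trivial; a date-suffixed name such as ChangeLog-20111231 would additionally require knowing the last entry date of each chunk.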