
Bug 361809

Summary: RFE: Split gentoo-x86/profiles/ChangeLog - it's too big
Product: Gentoo Infrastructure
Reporter: Radosław Smogura <mail>
Component: Git
Assignee: Gentoo Infrastructure <infra-bugs>
Status: RESOLVED OBSOLETE
Severity: enhancement
CC: mgorny, pacho, tools-portage, ulm
Priority: Normal
Version: unspecified
Hardware: All
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---

Description Radosław Smogura 2011-04-03 17:35:24 UTC
Currently the ChangeLog of profiles (profiles/ChangeLog) is about 700 kB, and with every change it must be downloaded in full. Maybe it would be better to split it into date-based portions like:
Changelog - current
Changelog_till_20110101
Changelog_till_(date)
For me the download takes 5 seconds.

Reproducible: Always



Expected Results:  
Faster synchronization with main tree.
Comment 1 Ulrich Müller gentoo-dev 2011-04-08 18:24:37 UTC
Some statistics:

   5111 entries in profiles/ChangeLog (as of today)
   3725 (73%) package.mask
    565 (11%) use.local.desc
    166 ( 3%) updates/*
    655 (13%) all others

use.local.desc is now generated from package metadata, and I wonder if the same could be done for package.mask? This would reduce the number of entries by a factor of 6.

I vaguely remember that someone had suggested this before, but I cannot find it any more.
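
The arithmetic behind the figures in this comment can be checked with a short sketch. The counts are taken verbatim from the table above; the variable names are only illustrative, and the "factor of 6" is read here as assuming that both package.mask and use.local.desc entries would no longer be recorded in profiles/ChangeLog.

```python
# Entry counts per category, copied from the statistics in comment 1.
counts = {
    "package.mask": 3725,
    "use.local.desc": 565,
    "updates/*": 166,
    "all others": 655,
}

total = sum(counts.values())

# Percentage share of each category, rounded to whole percent.
shares = {name: round(100 * n / total) for name, n in counts.items()}

# Assumption: entries for package.mask and use.local.desc both disappear
# from the ChangeLog once they are generated from package metadata.
remaining = total - counts["package.mask"] - counts["use.local.desc"]
reduction_factor = total / remaining

print(total, shares, remaining, round(reduction_factor, 1))
```

With these numbers, 821 of the 5111 entries would remain, which matches the stated reduction by roughly a factor of 6.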
Comment 2 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2012-01-12 18:28:45 UTC
Andreas did this on his own. I've asked him to post the script he used. Potentially, the infra team could run this @yearly. (or something)
Comment 3 Andreas K. Hüttel archtester gentoo-dev 2012-01-12 18:41:48 UTC
No script - manually... but it should not be too hard to add that to echangelog ...
Comment 4 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2012-01-12 18:49:53 UTC
(In reply to comment #3)
> No script - manually... but it should not be too hard to add that to echangelog
> ...

An angle that I didn't think of. @tools-portage: thoughts?
Comment 5 Christian Ruppert (idl0r) gentoo-dev 2012-01-13 19:58:36 UTC
Need some details then.
What shall be the max. size?
Do we just strip it or e.g. gzip ChangeLog?
Comment 6 Andreas K. Hüttel archtester gentoo-dev 2012-01-13 20:22:53 UTC
(In reply to comment #5)
> Need some details then.
> What shall be the max. size?

The result of the -dev ML discussion was 50k or 100k.

Since I did not care, did not want to wait for a tool to be implemented, and did not want to split hundreds of files, I went for 100k.

> Do we just strip it or e.g. gzip ChangeLog?

Well... disk space is cheap. Update / sync time not. 
I would vote for "only split".
Comment 7 Christian Ruppert (idl0r) gentoo-dev 2012-01-14 17:07:45 UTC
Ok, does splitting include compression?
How many splits are allowed?
What if we reach the max. amount of split changelogs?
Comment 8 Andreas K. Hüttel archtester gentoo-dev 2012-01-14 17:58:55 UTC
(In reply to comment #7)
> Ok, does splitting include compression?

Well, no, not per se.

> How many splits are allowed?

Result of the -dev discussion was "only possible split point is end of the year".

Conveniently this results for profiles/ChangeLog in chunks of order 150k.

> What if we reach the max. amount of split changelogs?

I guess even with minixfs the singularity will come first. :P
Comment 9 Christian Ruppert (idl0r) gentoo-dev 2012-01-14 19:40:03 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > Ok, does splitting include compression?
> 
> Well, no, not per se.
> 

So we should clarify that first.

> > How many splits are allowed?
> 
> Result of the -dev discussion was "only possible split point is end of the
> year".
> 

Why end of the year only?
What I actually meant is:
We'll have ChangeLog.1, .2, .3 etc. Where's the limit?

> Conveniently this results for profiles/ChangeLog in chunks of order 150k.
> 
> > What if we reach the max. amount of split changelogs?
> 
> I guess even with minixfs the singularity will come first. :P

See above.
Comment 10 Ulrich Müller gentoo-dev 2012-01-14 20:12:15 UTC
(In reply to comment #9)
> > > Ok, does splitting include compression?
> > 
> > Well, no, not per se.
> 
> So we should clarify that first.

I thought that binary files should be avoided in a VCS?

> Why end of the year only?
> What I actually meant is:
> We'll have ChangeLog.1, .2, .3 etc. Where's the limit?

If we split only at the end of the year, then we can name the files ChangeLog-2011 etc. I think this would be convenient a) for locating old entries and b) for removing old ChangeLog files (in case we should decide on doing so). Alternatively, old ChangeLogs could be suppressed in CVS->rsync.

Currently profiles/ChangeLog has the largest growth rate of all ChangeLogs, about 150 kB/year. I don't see the need for splits more fine-grained than once per year.
Comment 11 Andreas K. Hüttel archtester gentoo-dev 2012-01-14 20:17:28 UTC
1)
http://archives.gentoo.org/gentoo-dev/msg_1bf04e7bc6fe0387702f5b5c1f23df10.xml

2)
http://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo-x86/profiles/

I have this vague feeling that we're running around in circles. Thus, I'll leave the rest of the running in circles to you.
Comment 12 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2012-01-14 20:37:13 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > > > Ok, does splitting include compression?
> > > 
> > > Well, no, not per se.
> > 
> > So we should clarify that first.
> 
> I thought that binary files should be avoided in a VCS?

In CVS -- maybe. But in git there's no problem with them :P.
Comment 13 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2012-01-14 20:59:46 UTC
Please don't compress them. I'll exclude them in cvs->rsync conversion, and keep only the last split one (so this year we'll have ChangeLog and ChangeLog-2011, and the other ones will be excluded).
Split purely by year is fine.
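
A minimal sketch of the scheme discussed here (split purely by year, name the files ChangeLog-YYYY, and publish only the current file plus the most recent split one) could look like the following. It is not the tooling that was actually used; the split in the tree was done manually. The entry header pattern (`DD Mon YYYY; Name <email> files:`) and the function names `split_by_year` and `rsync_visible` are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed entry header format, e.g. "  14 Jan 2012; Dev <dev@example.org> package.mask:"
ENTRY_HEADER = re.compile(r"^\s*\d{1,2} [A-Z][a-z]{2} (\d{4});")


def split_by_year(text, current_year):
    """Split ChangeLog text into {filename: content}, one file per year.

    Entries of current_year stay in "ChangeLog"; older years go to
    "ChangeLog-YYYY". Lines before the first entry (the file header)
    are kept at the top of "ChangeLog".
    """
    preamble = []
    by_year = defaultdict(list)
    year = None
    for line in text.splitlines(keepends=True):
        match = ENTRY_HEADER.match(line)
        if match:
            year = int(match.group(1))
        if year is None:
            preamble.append(line)
        else:
            by_year[year].append(line)

    files = {}
    for entry_year, lines in by_year.items():
        name = "ChangeLog" if entry_year == current_year else f"ChangeLog-{entry_year}"
        files[name] = "".join(lines)
    files["ChangeLog"] = "".join(preamble) + files.get("ChangeLog", "")
    return files


def rsync_visible(files, current_year):
    """Names that would survive the exclusion described in comment 13:
    the current ChangeLog and only the most recent split file."""
    keep = {"ChangeLog", f"ChangeLog-{current_year - 1}"}
    return sorted(name for name in files if name in keep)


# Made-up sample data, not real profiles/ChangeLog content.
sample = (
    "# ChangeLog for profile directory\n"
    "\n"
    "  14 Jan 2012; Dev A <a@example.org> package.mask:\n"
    "  Mask foo.\n"
    "\n"
    "  31 Dec 2011; Dev B <b@example.org> use.mask:\n"
    "  Mask bar.\n"
    "\n"
    "  02 Mar 2010; Dev C <c@example.org> updates/1Q-2010:\n"
    "  Move baz.\n"
)
files = split_by_year(sample, 2012)
visible = rsync_visible(files, 2012)
```

Splitting only at year boundaries keeps the file names predictable, so the exclusion step needs nothing more than the current year to decide which files to publish.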
Comment 14 Jeremy Olexa (darkside) (RETIRED) archtester gentoo-dev Security 2012-01-16 03:21:10 UTC
(In reply to comment #13)
> ChangeLog-2011, and the other ones will be excluded).

That is easy, but why? They are static files with a one-time download cost.
Comment 15 Radosław Smogura 2012-01-16 13:16:46 UTC
Just for further safety: I think that even if the ChangeLog is split yearly, it should be named ChangeLog-20111231 (the last date in the file). This will add more flexibility and keep the naming convention in case the split policy changes to shorter periods or becomes size-based.
Comment 16 Christian Ruppert (idl0r) gentoo-dev 2012-01-16 16:57:18 UTC
(In reply to comment #15)
> Just for further safety: I think that even if the ChangeLog is split yearly,
> it should be named ChangeLog-20111231 (the last date in the file). This will
> add more flexibility and keep the naming convention in case the split policy
> changes to shorter periods or becomes size-based.

Using the year only should be enough. Also parsing would be easier. I don't see any benefit from using month/day as well. I guess ChangeLog handling will change completely anyway, as soon as we have git (when it ever happens :))
Comment 17 Ulrich Müller gentoo-dev 2017-11-03 07:09:31 UTC
Closing. profiles/ChangeLog has been gone for quite some time.