Bug 2567 - Keeping a sizelimit on /usr/portage/distfiles
Summary: Keeping a sizelimit on /usr/portage/distfiles
Status: RESOLVED WONTFIX
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Enhancement/Feature Requests
Hardware: x86 Linux
Importance: High enhancement
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Duplicates: 9858 9957 12907 13560 13765 14273 29422
Depends on:
Blocks:
 
Reported: 2002-05-08 01:36 UTC by Daniel Ahlberg (RETIRED)
Modified: 2005-04-18 09:02 UTC
CC List: 14 users

See Also:
Package list:
Runtime testing required: ---


Description Daniel Ahlberg (RETIRED) gentoo-dev 2002-05-08 01:36:53 UTC
It would be nice to be able to manage the size of the /usr/portage/distfiles 
directory. I was thinking of a variable setting in make.conf 
saying "DISTFILE_SPACE=500kb". When portage is about to download, it first 
checks the size of the distfiles directory. If the directory is over the limit, 
portage removes the oldest files until the size of the directory is 
below DISTFILE_SPACE.
Comment 1 Paul de Vrieze (RETIRED) gentoo-dev 2002-05-08 03:40:43 UTC
500kb is VERY limited of course, but your thoughts are OK (as long as it is
possible to turn off the behaviour)
Comment 2 Arcady Genkin (RETIRED) gentoo-dev 2002-05-16 01:31:37 UTC
I'd say that Portage should not be concerned with this.  This sounds like a task
for a nightly cron job.

p.s.  My ${DEITY}, 500K of distfile space is not going to get you very far. ;^)
Comment 3 Daniel Ahlberg (RETIRED) gentoo-dev 2002-05-16 02:47:54 UTC
I created a small bash script that could be run from a crontab. Not too pretty, 
but it should work.

#!/bin/bash   

DISTFILES_SPACE="20gb"

## convert $DISTFILES_SPACE to kilobytes, the unit reported by du -k
DISTFILES_SUFFIX=$(echo "$DISTFILES_SPACE" | sed 's/[0-9]//g')
DISTFILES_PREFIX=$(echo "$DISTFILES_SPACE" | sed 's/[a-z]//g')

if [ "$DISTFILES_SUFFIX" == "m" ] || [ "$DISTFILES_SUFFIX" == "mb" ]
then
  DISTFILES_SPACE2=$(( DISTFILES_PREFIX * 1024 ))
elif [ "$DISTFILES_SUFFIX" == "g" ] || [ "$DISTFILES_SUFFIX" == "gb" ]
then
  DISTFILES_SPACE2=$(( DISTFILES_PREFIX * 1024 * 1024 ))
elif [ "$DISTFILES_SUFFIX" == "b" ]
then
  ## integer division, so the result can be compared with -gt / -le
  DISTFILES_SPACE2=$(( DISTFILES_PREFIX / 1024 ))
fi

echo $DISTFILES_SPACE2

## current size of distfiles directory in kilobytes (-s prints one total only)

CURRENT_SIZE=$(du -sk /usr/portage/distfiles | awk '{print $1}')

if [ "$CURRENT_SIZE" -gt "$DISTFILES_SPACE2" ]
then
  ## make arrays of the file names and their sizes in kilobytes, oldest first
  ## (NR > 1 skips the "total" line that ls -l prints first)
  filename=( $(ls -1tr /usr/portage/distfiles) )
  file_size=( $(ls -ltr /usr/portage/distfiles | awk 'NR > 1 {print int($5 / 1024)}') )

  I=0
  ## subtract file sizes, oldest first, until the total is under the limit
  until [ "$CURRENT_SIZE" -le "$DISTFILES_SPACE2" ] || [ "$I" -ge "${#filename[@]}" ]
  do
    let CURRENT_SIZE=CURRENT_SIZE-${file_size[$I]}
    let I=I+1
  done
fi
Comment 4 Daniel Ahlberg (RETIRED) gentoo-dev 2002-05-16 02:51:36 UTC
Might work even better if it actually deleted files too :-(

add "rm -f /usr/portage/distfiles/${filename[0]}" after the "do" and before the two "let"
Comment 5 Daniel Ahlberg (RETIRED) gentoo-dev 2002-05-16 02:59:57 UTC
I must stop coding this early in the morning, change ${filename[0]} to 
${filename[$I]}
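
For illustration only, the idea from comments #3 to #5 (delete the oldest files until the directory fits the limit) can be sketched as a single function. The name prune_oldest and its two arguments are made up for this example and are not part of Portage. The sketch re-measures the directory with du after every deletion instead of keeping a running total, and it does not handle file names that contain newlines.

```shell
#!/bin/bash
# Sketch: remove the oldest regular files in DIR until DIR uses at most
# LIMIT_KB kilobytes. Usage: prune_oldest DIR LIMIT_KB (hypothetical helper).
prune_oldest() {
  dir=$1
  limit_kb=$2
  current_kb=$(du -sk "$dir" | awk '{print $1}')

  # ls -1tr lists entries oldest first, one per line.
  ls -1tr "$dir" | while IFS= read -r name
  do
    # Stop as soon as the directory is within the limit.
    if [ "$current_kb" -le "$limit_kb" ]; then
      break
    fi
    # Only delete regular files; skip subdirectories and anything else.
    if [ -f "$dir/$name" ]; then
      rm -f -- "$dir/$name"
      current_kb=$(du -sk "$dir" | awk '{print $1}')
    fi
  done
}
```

A nightly cron job could then call something like "prune_oldest /usr/portage/distfiles 20971520" for a 20 GB limit.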
Comment 6 Marcel Kunath 2002-07-13 19:56:37 UTC
Keeping a sizelimit?

I can see the reasons behind this since disks are limited in size but I figure
giving people a choice in distfiles mount points is a better solution to beating
the size problem. Read bug report 4950.

http://bugs.gentoo.org/show_bug.cgi?id=4950
Comment 7 mark newman 2002-07-27 05:21:30 UTC
I recently mentioned something similar on the gentoo-user list.  I suggested an emerge option to clean old files out of distfiles.  Many people felt strongly that this should only be an option.  It has also been suggested to me that the old files could be moved to a different directory so that, for example, they could be burnt to CD.  So how about moving all old files (checks similar to emerge clean would be needed) to /usr/portage/distfiles.old?  The user could then either delete that directory when required or move it elsewhere for safekeeping.  This could be done either on the command line, as I had thought, or tied to the size limit as suggested above.
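
A rough sketch of the "move old files aside" idea follows. It assumes that "old" simply means "not modified for a given number of days", which is much cruder than the version checks the comment asks for. The function name archive_old and its arguments are invented for this example.

```shell
#!/bin/bash
# Sketch: move regular files in SRC that have not been modified for more
# than DAYS days into DEST. Usage: archive_old SRC DEST DAYS (hypothetical).
archive_old() {
  src=$1
  dest=$2
  days=$3
  mkdir -p "$dest"
  # -maxdepth 1 keeps find from descending into subdirectories of SRC.
  find "$src" -maxdepth 1 -type f -mtime +"$days" -exec mv -- {} "$dest"/ \;
}
```

For example, "archive_old /usr/portage/distfiles /usr/portage/distfiles.old 180" would leave anything touched in the last 180 days in place.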
Comment 8 SpanKY gentoo-dev 2002-10-30 15:33:28 UTC
*** Bug 9957 has been marked as a duplicate of this bug. ***
Comment 9 Martin Holzer (RETIRED) gentoo-dev 2002-12-29 14:31:18 UTC
*** Bug 12907 has been marked as a duplicate of this bug. ***
Comment 10 SpanKY gentoo-dev 2003-01-08 13:40:31 UTC
*** Bug 9858 has been marked as a duplicate of this bug. ***
Comment 11 Sean P. Kane 2003-01-08 13:56:42 UTC
Read some of the descriptions in the "duplicate" bugs listed in this thread to 
get an idea of the multiple ways in which this unrestricted directory growth is 
a serious issue and why a cron job is simply not good enough. Bug 9858 
(http://bugs.gentoo.org/show_bug.cgi?id=9858) is a good example of 
a time when cron would not work.

Sean
Comment 12 Marcel Kunath 2003-01-08 15:25:22 UTC
That is why I always thought emerge should be adjustable to use multiple
distfiles directories and configurable in a way to allow NFS and Samba file shares

/usr/portage/distfiles-local
/usr/portage/distfiles-samba
/usr/portage/distfiles-nfs

Then people could adjust around the space problem. But people hated my idea.

Marcel
Comment 13 Martin Holzer (RETIRED) gentoo-dev 2003-01-09 10:26:55 UTC
*** Bug 13560 has been marked as a duplicate of this bug. ***
Comment 14 Sean P. Kane 2003-01-09 12:38:09 UTC
Just being able to point to network drives isn't a good solution either, since 
many people may not have network drives available to them. I still lean towards 
a solution like the one I mentioned in Bug 9858:

"There should be a way to tell emerge to empty these directories ( --cleandist
--cleantmp ) after it finishes with each package. It seems to only do this at 
the end of the whole emerge command currently. This way it would emerge bash, 
clean those files, and then emerge X11, clean those files, etc., etc."

This refers to both /usr/portage/distfiles and /var/tmp/portage.

Sean
Comment 15 Marcel Kunath 2003-01-09 15:21:56 UTC
I didn't say emerge didn't need a --clean option.

A --clean option is needed for people with limited disk space and single computers.

But people with networks and larger diskspace available and smart networks and
people who handle networks with 5+ Gentoo machines need to have nfs and samba
distfiles capabilities.

The reason why I am so vehement on this subject is to prevent Gentoo from
setting --clean as a DEFAULT. It would cause a lot more trouble for people who
like to keep their source because they pay their ISP for every MB. It's a
lot cheaper for me to buy an extra disk than it is to pay for downloading the
whole stuff 2 or 3 times. 

Marcel
Comment 16 Martin Holzer (RETIRED) gentoo-dev 2003-01-12 06:31:38 UTC
*** Bug 13765 has been marked as a duplicate of this bug. ***
Comment 17 SpanKY gentoo-dev 2003-01-22 10:02:47 UTC
*** Bug 14273 has been marked as a duplicate of this bug. ***
Comment 18 Duke 2003-01-22 22:32:11 UTC
In Bug 14273, I had suggested that a distfile should at least be cleaned out
when someone runs emerge clean.

For example, if I were upgrading a package, and cleaning out the old version,
the old distfile should be deleted as well.  Keep in mind that this wouldn't
apply to packages with an "-rX" tagged on the end (like foo-1.2.13-r6), because
r1 through r6 would all use the distfile foo-1.2.13.tar.gz.  However, if I did
an emerge update on foo, and it went from foo-1.2.13 to foo-1.3.12, then
foo-1.2.13.tar.gz would be deleted, as it has been replaced with
foo-1.3.12.tar.gz, the latest tarball of package foo.
The downfall to this scheme is when a gentoo package maintainer decides that a
package needs to be downgraded, due to bugs or otherwise.  In this case, a user
could specify to keep X latest versions of the packages in distfiles.  So set
SRC_HISTORY=2, and when emerge cleans, it would keep foo-1.2.13.tar.gz and
foo-1.3.12.tar.gz - the two latest tarballs of package foo.

In my distfiles dir, there are more packages than I care to count with 5 or more
versions sitting there.  I think it's safe to assume a package won't be
downgraded by 5 versions.  There really ought to be some way of keeping excess
version tarballs in check.

This would be a very effective scheme in many situations.  It'd be great for the
situation described in comment #15, for example.
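
As a sketch of the SRC_HISTORY idea, the function below lists the tarballs that fall outside the newest N versions of each package. It only prints candidates and deletes nothing. It assumes file names of the form name-version.ext with no spaces, splits them at the last "-" that is followed by a digit, and relies on the version sort (-V) of GNU sort. The name list_excess_versions is invented for this example.

```shell
#!/bin/bash
# Sketch: print the files in DIR that are older than the newest KEEP
# versions of each package. Usage: list_excess_versions DIR KEEP (hypothetical).
list_excess_versions() {
  dir=$1
  keep=$2
  ls -1 "$dir" |
    # Split "name-version.ext" into "name version.ext"; other names are dropped.
    sed -n -E 's/^(.*)-([0-9][^-]*)$/\1 \2/p' |
    # Group by name, newest version first within each group (GNU sort -V).
    sort -t ' ' -k1,1 -k2,2Vr |
    # Print every entry after the first KEEP entries of each name.
    awk -v keep="$keep" '{ if (++seen[$1] > keep) print $1 "-" $2 }'
}
```

With SRC_HISTORY=2 this would be called as "list_excess_versions /usr/portage/distfiles 2", and the output could be reviewed before anything is removed.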
Comment 19 Marius Mauch (RETIRED) gentoo-dev 2003-09-21 06:29:45 UTC
IMO this is a job that should be tackled with FETCH_COMMAND / RESUME_COMMAND:
- check size of DISTDIR
- if it is above a given limit either delete old files or fail (depending on user preference)
- start the normal fetch process after that
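
The three steps above can be sketched as a wrapper that runs before the real download command. This is an assumption about how such a hook might look, not an existing Portage feature: the function name checked_fetch, its arguments, and the "delete" and "fail" policy names are all invented here.

```shell
#!/bin/bash
# Sketch: check the size of DISTDIR, apply POLICY if it is over LIMIT_KB,
# then run the fetch command given as the remaining arguments.
# Usage: checked_fetch DISTDIR LIMIT_KB POLICY COMMAND [ARGS...] (hypothetical)
checked_fetch() {
  distdir=$1
  limit_kb=$2
  policy=$3
  shift 3

  used_kb=$(du -sk "$distdir" | awk '{print $1}')
  if [ "$used_kb" -gt "$limit_kb" ]; then
    case $policy in
      delete)
        # Remove the oldest regular files until the directory fits.
        ls -1tr "$distdir" | while IFS= read -r name
        do
          if [ "$(du -sk "$distdir" | awk '{print $1}')" -le "$limit_kb" ]; then
            break
          fi
          if [ -f "$distdir/$name" ]; then
            rm -f -- "$distdir/$name"
          fi
        done
        ;;
      fail)
        echo "checked_fetch: $distdir is over ${limit_kb} KB" >&2
        return 1
        ;;
      *)
        echo "checked_fetch: unknown policy '$policy'" >&2
        return 2
        ;;
    esac
  fi

  # Start the normal fetch process.
  "$@"
}
```

A script built around this function could be named in FETCH_COMMAND and RESUME_COMMAND in place of the plain download command, with the usual download command passed as the trailing arguments.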
Comment 20 Marius Mauch (RETIRED) gentoo-dev 2003-09-23 07:20:55 UTC
*** Bug 29422 has been marked as a duplicate of this bug. ***
Comment 21 Duke 2003-09-23 16:48:13 UTC
In addition to my comment #18: when "emerge sync" is run and portage is updated, are there not ebuilds from old versions of packages that get cleaned out?  When this happens, emerge should check for that version's tarball in distfiles.  If the ebuild is being removed, then there's certainly no need for the tarball.
Comment 22 Olav Kolbu 2003-09-25 01:28:42 UTC
Bugger, just missed the 500 day anniversary for this bug. Oh well, I'm sure we'll get another chance at 1000. Anyway, I'm with comment #19. Two simple constants telling the system how much space you want to set aside (0 could mean no restrictions) and what to do at the limit (halt emerge, or remove the oldest files until below the limit or empty). No system should be allowed to eat diskspace without having some scheme for capping growth; that much is pretty obvious if you've run a few systems before.

Btw, the 'remove the binary when removing the ebuild' theory won't work, there are plenty of ebuilds that reuse the binary. And not just the -r* ones. Most of the kernel-packages consist of the base kernel then patches for example.
Comment 23 Nicholas Jones (RETIRED) gentoo-dev 2004-05-18 07:05:43 UTC
I'd prefer this be a script that could possibly hook into
portage at some point rather than adding this logic.

Tools can take this if they want.
Comment 24 Richard Fujimoto 2005-04-18 09:02:19 UTC
Why work with a size limit which then forces portage to "choose" which distfile to remove?  What I propose is that each package look after everything, including its own distfiles.  From what I can see, distfiles aren't recorded when they're downloaded.

First, create a FEATURE called deldistfile, or something like that, which acts as a switch for this behaviour.

Secondly, an emerge would write the distfiles to a file in /var/db/pkg/foo-bar/pkgname/CONTENTS with the prefix DISTFILE, so an example would be DISTFILE-kdebase-3.4.0.tar.bz2 (obviously the distfile names are gotten from SRC_URI).

Then, using the same process that `qpkg -f /path/to/file` uses, determine whether or not the distfile (in the example: DISTFILE-kdebase-3.4.0.tar.bz2) is used by any other installed package and if not, delete the distfile.

So that way, if you have both a ck-sources-2.6.99 and gentoo-sources-2.6.99, only removing the ck-sources will still keep the vanilla kernel tarball (as distributed by kernel.org) but will remove the ck-patchset tarball.  Removing both would remove the vanilla kernel tarball and both patchset tarballs.

The benefit of this is that portage isn't selectively choosing tarballs to delete in an effort to get down to size.  Portage is aware of distfiles, and when they're no longer used by a package after `emerge -C [pkg]` they're removed.  One quick look in /usr/portage/distfiles for me shows a bunch of kde[...]-3.3* tarballs... I don't have kde 3.3.
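
The bookkeeping proposed in this comment can be sketched with one plain-text list of distfiles per installed package. This is an illustration under assumptions, not how Portage stores anything: the functions record_distfiles and release_distfiles, the per-package ".distfiles" list files, and the database directory layout are all invented here, and package names are assumed to contain no "/".

```shell
#!/bin/bash
# Sketch: remember which distfiles each package used, and on removal delete
# only the distfiles that no other recorded package still lists.

# Usage: record_distfiles DBDIR PKG FILE... (hypothetical)
record_distfiles() {
  dbdir=$1
  pkg=$2
  shift 2
  mkdir -p "$dbdir"
  # One distfile name per line.
  printf '%s\n' "$@" > "$dbdir/$pkg.distfiles"
}

# Usage: release_distfiles DBDIR DISTDIR PKG (hypothetical)
release_distfiles() {
  dbdir=$1
  distdir=$2
  pkg=$3
  list="$dbdir/$pkg.distfiles"
  [ -f "$list" ] || return 0

  # Rename the list first so the package being removed no longer counts
  # as a user of its own distfiles.
  mv "$list" "$list.removing"
  while IFS= read -r name
  do
    # Keep the file if any remaining list contains exactly this name.
    if ! grep -qxF -- "$name" "$dbdir"/*.distfiles 2>/dev/null; then
      rm -f -- "$distdir/$name"
    fi
  done < "$list.removing"
  rm -f "$list.removing"
}
```

In the kernel example, both source packages would record the shared vanilla tarball, so releasing only one of them removes just that package's patchset tarball.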