Bug 142579 - portage binary packages, option to use gzip or xz instead of bzip2
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Binary packages support
Hardware: All Linux
Importance: High enhancement with 2 votes
Assignee: Portage team
URL: http://archives.gentoo.org/gentoo-por...
Whiteboard:
Keywords: InVCS
Duplicates: 142581 166027 292302 380925
Depends on: 150031
Blocks: 627566
Reported: 2006-08-02 13:35 UTC by Ross Capdeville
Modified: 2018-02-01 23:47 UTC (History)
CC List: 18 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Patch to support compression option gzip in portage 2.1.2.2 (142579.patch,19.85 KB, patch)
2007-07-24 20:42 UTC, Ross Capdeville

Description Ross Capdeville 2006-08-02 13:35:31 UTC
I am writing an automated gentoo deployment system which will be used to deploy 100s of hosts. I have a seed machine which builds binary packages. The packages are transferred to the portage directory on the new host in the build process. Emerge is correctly using the packages on the local machine. 

The problem is: the installation process takes twice as long as it should. We are currently using Debian in our production environment, which installs in 5 1/2 minutes. The gentoo installation at first took over 3 times as long. I gzipped the stage3 and portage tarballs instead of bzip2, and with the local package cache, I was able to reduce the installation time to 10 minutes. This is still way too long. 

I suspect that the time it takes to install the approximately 300 MB of bzipped packages is causing the delay. BZIP2 is known to be 6-10 times slower than gzip. I would like to be able to create and use packages compressed with gzip. 
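The speed difference is easy to measure on your own hardware. The following is a rough sketch only: the payload is synthetic (an assumption, real package contents will compress differently), the helper name time_decompress is made up for illustration, and it uses Python's gzip and bz2 modules rather than the command-line tools.

```python
import bz2
import gzip
import os
import time


def time_decompress(compress, decompress, payload):
    """Return (compressed_size, seconds) for one decompression of payload."""
    blob = compress(payload)
    start = time.perf_counter()
    restored = decompress(blob)
    elapsed = time.perf_counter() - start
    assert restored == payload  # sanity check: the round trip must be lossless
    return len(blob), elapsed


# Synthetic payload: repetitive text followed by random bytes.
payload = (b"gentoo binary package payload\n" * 20000) + os.urandom(200000)

results = {
    "gzip": time_decompress(gzip.compress, gzip.decompress, payload),
    "bzip2": time_decompress(bz2.compress, bz2.decompress, payload),
}

for name, (size, seconds) in results.items():
    print(f"{name}: {size} bytes compressed, {seconds:.4f}s to decompress")
```

Running it against a real stage3 or package tarball instead of the synthetic payload would give numbers closer to the deployment case.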

I figured there would be an easy way to do this in the portage python source. I found that the xpak library handles appending gentoo specific data to the end of the archive. I figured there would be some variable controlling the program used to compress/decompress the archive. However, it seems bzip2 and the file extension is hardcoded in multiple places. 

What was the reasoning behind this decision to hardcode the compression utility? Is there an easy way to change it? Is there a more complicated way to change it? My company is possibly willing to do the programming if we had some guidance. Do you think this could be useful to others? 

Ross Capdeville 
Sr. Systems Engineer
Comment 1 Zac Medico gentoo-dev 2006-08-02 14:08:42 UTC
*** Bug 142581 has been marked as a duplicate of this bug. ***
Comment 2 Zac Medico gentoo-dev 2006-08-02 14:36:37 UTC
(In reply to comment #0)
> What was the reasoning behind this decision to hardcode the compression
> utility?

It's probably just shortsightedness.

> Is there an easy way to change it? Is there a more complicated way to
> change it? My company is possibly willing to do the programming if we had some
> guidance. Do you think this could be useful to others? 

There probably aren't very many code changes required to fix this.  We'd only have to fix any compression/decompression code as well as code that relies on the tbz2 extension.
Comment 3 Ross Capdeville 2006-08-02 14:50:14 UTC
Let's start with the files that would need to be changed. 

First, the mechanism to create packages should be able to use gzip or bzip2. 

Would you want to feed this off an environment variable, defaulting to bzip2? If not, how would we know what format to make/fetch packages in?

bin/misc-functions.sh          Function dyn_package()
pym/getbinpkg.py               Hardcoded .tbz2
pym/portage.py                 Hardcoded .tbz2, also spawns bzip2

xpak.py - may magically work after the above items are changed? 
Comment 4 Zac Medico gentoo-dev 2006-08-03 11:34:06 UTC
(In reply to comment #3)
> xpak.py - may magically work after the above items are changed? 

Yes, xpak does not need to be concerned with the compression because the xpak segment is appended to the tail of the file and is uncompressed.

Should portage only look for the extension corresponding to the currently selected compression type? That way it won't have to go scanning for both tbz2 and tgz at the same time (potentially finding two versions of the same package).  We can create a tool that converts packages from one type to another.
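To illustrate why a metadata segment appended to the tail does not care about the compressor, here is a toy sketch. It deliberately does NOT reproduce the real xpak layout: the TOYSTOP marker, the 4-byte length field, and the function names are all hypothetical. It only shows that a reader working backwards from the end of the file never touches the compressed part.

```python
import bz2
import gzip
import struct

MAGIC = b"TOYSTOP"  # hypothetical marker, not the real xpak magic


def build_package(compressed_archive: bytes, metadata: bytes) -> bytes:
    """Append uncompressed metadata, its length, and a marker to the archive."""
    return compressed_archive + metadata + struct.pack(">I", len(metadata)) + MAGIC


def read_metadata(package: bytes) -> bytes:
    """Recover the metadata by reading backwards from the end of the file."""
    if not package.endswith(MAGIC):
        raise ValueError("trailer marker not found")
    length_end = len(package) - len(MAGIC)
    (length,) = struct.unpack(">I", package[length_end - 4:length_end])
    return package[length_end - 4 - length:length_end - 4]


archive = b"pretend this is a tar stream" * 100
metadata = b"CATEGORY=sys-apps\nPF=example-1.0\n"

for compress in (gzip.compress, bz2.compress):
    package = build_package(compress(archive), metadata)
    # The reader never inspects the compressed bytes, so both formats work.
    assert read_metadata(package) == metadata
```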

Comment 5 SpanKY gentoo-dev 2006-11-11 01:37:38 UTC
do we want to allow people to select different compression types for everything or just have 1 (PORTAGE_COMPRESSION) ?  this would affect documentation as well ...
Comment 6 David Pyke 2007-04-30 01:24:18 UTC
*** Bug 166027 has been marked as a duplicate of this bug. ***
Comment 7 Ross Capdeville 2007-07-24 20:42:40 UTC
Created attachment 125913
Patch to support compression option gzip in portage 2.1.2.2

This patch, applied to the root directory tree of a Gentoo installation, patches portage to accept the PORTAGE_COMPRESS="gzip" option for packages. If this option is set, portage creates packages compressed with gzip and also installs packages compressed with gzip. Apply the patch in your / directory as root against portage 2.1.2.2 with the command:

patch -p1 < 142579.patch
Comment 8 SpanKY gentoo-dev 2008-01-18 00:52:34 UTC
erp, that doesnt really scale (and it has some whitespace damage).  what about the next guy who wants to use a different compression scheme ?  in fact, going by your original demands (pure decompression speed), i think using lzma would make installation times even faster.

some tips:
 - abstract the "binary package extension" into a function so everything can be centralized there
 - instead of parsing ${PORTAGE_COMPRESS} directly to figure out the extension, just use `ecompress --suffix`
 - instead of adding custom decompression routines, reuse the unpack() function

i'm pretty sure with these three fundamental changes, you would basically get complete flexibility.  want .tlzma ?  no problem, merely set:
PORTAGE_COMPRESS="lzma"
want .tbz2 ?  no problem, merely set:
PORTAGE_COMPRESS="bzip2"
want .tgz ?  no problem, merely set:
PORTAGE_COMPRESS="gzip"

i think you get the idea :)
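The first tip (one central place that knows the binary package extension) might look roughly like the sketch below. The function names and the mapping table are hypothetical, not portage code; the suffixes are the ones named in the comment above, and a real implementation would ask `ecompress --suffix` instead of keeping its own table.

```python
# Hypothetical mapping from PORTAGE_COMPRESS value to package suffix.
_EXTENSIONS = {
    "bzip2": ".tbz2",
    "gzip": ".tgz",
    "lzma": ".tlzma",
}


def binpkg_extension(portage_compress: str = "bzip2") -> str:
    """Return the binary package suffix for a PORTAGE_COMPRESS value."""
    program = portage_compress.strip() or "bzip2"  # empty value falls back to bzip2
    try:
        return _EXTENSIONS[program]
    except KeyError:
        raise ValueError(f"unsupported compressor: {program!r}") from None


def binpkg_filename(pf: str, portage_compress: str = "bzip2") -> str:
    """Build a package file name from a package name-version string."""
    return pf + binpkg_extension(portage_compress)


print(binpkg_filename("example-1.0", "gzip"))
```

With every caller going through one function like this, supporting another compressor means adding one table entry rather than hunting for hardcoded ".tbz2" strings.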
Comment 9 Paul Freeman 2010-10-13 10:48:34 UTC
has anyone made any further progress on this?
Comment 10 Zac Medico gentoo-dev 2010-10-13 15:13:46 UTC
A few months ago we had a discussion about ways to handle it in terms of PORTAGE_BINHOST support:

http://archives.gentoo.org/gentoo-portage-dev/msg_6ead086db61f438bfbac01c97d3da390.xml
Comment 11 Daniel Bumke 2012-06-15 09:00:44 UTC
Has there been any movement on this recently? It would be great to have the option to build .gz or .xz packages even just for local use (bzip2 is really slow and xz -2 or xz -3 outperforms it on compression time, compressed size, AND decompression time iirc).
Comment 12 Zac Medico gentoo-dev 2012-10-22 01:45:00 UTC
*** Bug 292302 has been marked as a duplicate of this bug. ***
Comment 13 Zac Medico gentoo-dev 2015-01-16 01:32:36 UTC
I have a forward-compatibility patch for gzip and xz compression in the following branch:

	https://github.com/zmedico/portage/tree/binpkg-multi-decompression

I've posted it for review here:

	http://thread.gmane.org/gmane.linux.gentoo.portage.devel/5074
Comment 14 Zac Medico gentoo-dev 2015-01-17 08:06:36 UTC
(In reply to Zac Medico from comment #13)
> I have a forward-compatibility patch for gzip and xz compression in the
> following branch:
> 
> 	https://github.com/zmedico/portage/tree/binpkg-multi-decompression

Decompression support is in the master branch now:

https://github.com/gentoo/portage/commit/6b3d262e6316073a2a3be81086c05891d970ae2a

I'm planning to add support for compression selection after support for a compression-independent file layout has been merged (see bug #150031).
Comment 15 Zac Medico gentoo-dev 2015-06-28 23:30:12 UTC
All that's left to do is to provide a way to select the compression type. Since portage-2.2.16, portage probes the compression type and maps that to a decompressor.
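Probing works because each format starts with well-known magic bytes. The sketch below shows the general idea only and is not the portage implementation: it maps the detected type to Python's standard library decompressors (an assumption made to keep the example self-contained), the function names are made up, and it expects a clean compressed stream with no trailing metadata segment.

```python
import bz2
import gzip
import lzma

# Leading magic bytes for each stream format.
_MAGIC = (
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
)

# Hypothetical mapping from detected type to a decompressor callable.
_DECOMPRESSORS = {
    "gzip": gzip.decompress,
    "bzip2": bz2.decompress,
    "xz": lzma.decompress,
}


def probe_compression(data: bytes):
    """Return the compression name for data, or None if it is unrecognized."""
    for magic, name in _MAGIC:
        if data.startswith(magic):
            return name
    return None


def decompress_any(data: bytes) -> bytes:
    """Decompress data using whichever decompressor matches its magic bytes."""
    name = probe_compression(data)
    if name is None:
        raise ValueError("unrecognized compression format")
    return _DECOMPRESSORS[name](data)


for compress in (gzip.compress, bz2.compress, lzma.compress):
    blob = compress(b"hello binpkg")
    print(probe_compression(blob), decompress_any(blob))
```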
Comment 16 marco 2016-03-07 11:16:23 UTC
(In reply to Zac Medico from comment #15)
> All that's left to do is to provide a way to select the compression type.
> Since portage-2.2.16, portage probes the compression type and maps that to a
> decompressor.

Is there any option to enable creating binpkgs with gzip or lzma?
Comment 17 Zac Medico gentoo-dev 2017-02-07 22:57:15 UTC
*** Bug 380925 has been marked as a duplicate of this bug. ***
Comment 18 marco 2017-05-02 07:41:44 UTC
Hi,
Is there any update on how to select the compression type of binpkgs?

Am I missing how to do that, or is the feature still in development?
Comment 19 Zac Medico gentoo-dev 2017-05-02 16:43:35 UTC
It's not implemented yet, but it's high on my todo list.
Comment 20 Zac Medico gentoo-dev 2017-07-30 23:08:51 UTC
In the master branch, we now have support to configure binary package
compression:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=cff2c0149142843316e1851c2e73bcec30f08471

The relevant variables are now called BINPKG_COMPRESS and BINPKG_COMPRESS_FLAGS:

https://gitweb.gentoo.org/proj/portage.git/commit/?id=2c7d38b9512609f6828cbba1066f2b3b2d9144bf
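As a naive illustration of how two such variables can be combined, the sketch below turns them into a command line. This is not how portage itself handles them: the function name is hypothetical, the value of BINPKG_COMPRESS is assumed to be usable directly as a program name, and the bzip2 default is an assumption based on bzip2 being the format discussed throughout this bug. See the commits above for the actual behavior.

```python
import shlex


def build_compress_command(settings: dict) -> list:
    """Build an argv list from BINPKG_COMPRESS and BINPKG_COMPRESS_FLAGS."""
    program = settings.get("BINPKG_COMPRESS", "bzip2")  # default is an assumption
    flags = settings.get("BINPKG_COMPRESS_FLAGS", "")
    # shlex.split honours shell-style quoting inside the flags string.
    return [program] + shlex.split(flags)


print(build_compress_command({"BINPKG_COMPRESS": "xz", "BINPKG_COMPRESS_FLAGS": "-2"}))
```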