Bug 593916 - Parallelism in compression/decompression.
Summary: Parallelism in compression/decompression.
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Enhancement/Feature Requests
Hardware: All Linux
Importance: Normal normal
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-16 01:23 UTC by Simon-Pierre Dubé
Modified: 2022-10-24 21:36 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Description Simon-Pierre Dubé 2016-09-16 01:23:44 UTC
We are in an era of multicore and multiprocessor systems. I thought it would be nice to make it easy to use alternatives like PBZIP2 with Portage, not only for manual compression but also for compressing distfiles, stage3s and such. Its output remains compatible with BZIP2 and is usually slightly less than 0.2% bigger, yet it can substantially speed up compression and decompression.
Comment 1 Brian Dolbec (RETIRED) gentoo-dev 2016-09-16 01:39:59 UTC
I already fully intend to add a small lib to portage that makes all those options fairly transparent and flexible.

See: https://github.com/dol-sen/pyDeComp

I am already using it in the catalyst master branch code.  I just need to make a few more changes for better BSD compatibility.
Comment 2 Zac Medico gentoo-dev 2016-09-16 01:40:47 UTC
You should be able to do it via PORTAGE_BZIP2_COMMAND (there's also PORTAGE_BUNZIP2_COMMAND if you want separate decompression options).
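
As a rough illustration of the suggestion above, a make.conf fragment might look like the following. The two variable names are the ones mentioned in this comment; the `pbzip2 -d` value for decompression is an assumption chosen for the example, not a tested recommendation.

```shell
# /etc/portage/make.conf (fragment, illustrative only)

# Use pbzip2 wherever Portage would run bzip2 for compression.
# pbzip2 output remains readable by plain bzip2.
PORTAGE_BZIP2_COMMAND="pbzip2"

# Optional: a separate command for decompression.
# The "-d" (decompress) value here is an assumption for this sketch.
PORTAGE_BUNZIP2_COMMAND="pbzip2 -d"
```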
Comment 3 Simon-Pierre Dubé 2016-09-16 02:28:09 UTC
(In reply to Brian Dolbec from comment #1)
> I already fully intend to add a small lib to portage that makes all those
> options fairly transparent and flexible.
> 
> See: https://github.com/dol-sen/pyDeComp
> 
> I am already using it in the catalyst master branch code.  I just need to
> make a few more changes for better BSD compatibility.

This is awesome! I just installed catalyst because I am planning on building some stage3s. I will install catalyst-9999 and try your lib this weekend!



(In reply to Zac Medico from comment #2)
> You should be able to do it via PORTAGE_BZIP2_COMMAND (there's also
> PORTAGE_BUNZIP2_COMMAND if you want separate decompression options).

I am already using PORTAGE_BZIP2_COMMAND="pbzip2" for my binpkgs and binhost, so I can quickly deploy Gentoo on many low-power devices and update production VMs without recompiling every time. That lets me use Gentoo everywhere. I would just like to see it better integrated into distfiles and such.

If you guys are interested, you can watch a video I made of compilations with DistCC on 42 cores total: https://www.youtube.com/watch?v=mrdm457sedw


Thanks for your answers!
Comment 4 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-10-18 00:17:39 UTC
commit d0fa3d28c5de6a0a34ac87de5e1a463adbe58405
Author: Sam James <sam@gentoo.org>
Date:   Thu Sep 29 07:37:19 2022 +0100

    binpkg: compress/decompress xz in parallel; compress zstd in parallel

    - As before, needs >= xz-5.3.3_alpha for parallel decompression.
    - zstd does not support parallel decompression.

    Closes: https://github.com/gentoo/portage/pull/918
    Signed-off-by: Sam James <sam@gentoo.org>

commit 04302e2d4d5fafd3ed2f2375473d6fe3a2a2faa8
Author: Sam James <sam@gentoo.org>
Date:   Thu Sep 29 07:37:19 2022 +0100

    bin: ecompress: zstd: pass -j from MAKEOPTS for parallel compression

    Signed-off-by: Sam James <sam@gentoo.org>

commit 9ae3ec1af0071354db3bf57bc5cdec963b056e77
Author: Sam James <sam@gentoo.org>
Date:   Thu Sep 29 07:37:19 2022 +0100

    bin: ecompress: xz: set -9, pass -j from MAKEOPTS for parallel compression

    - Pass -9 like we do for bzip2 & gzip.
    - Pass -j from MAKEOPTS for parallel compression.

    Signed-off-by: Sam James <sam@gentoo.org>

commit 48d107e5c1a103d59a053aebeefa9a5aac5c32ff
Author: Sam James <sam@gentoo.org>
Date:   Sat Sep 24 08:23:36 2022 +0100

    bin: pass -j from MAKEOPTS to xz for parallel decompression

    >= xz 5.3.3_alpha supports parallel decompression.

    Signed-off-by: Sam James <sam@gentoo.org>

With that, I think we're essentially done. I suppose we could add some automagic pigz + pbzip2/lbzip2 usage though.
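
For readers wondering what "pass -j from MAKEOPTS" amounts to, here is a minimal sketch. It is not Portage's actual code: the helper name `jobs_from_makeopts` is hypothetical, and it only understands the attached forms `-jN` and `--jobs=N`, falling back to 1 otherwise. The `-T` thread options of xz and zstd do exist; how Portage wires them up internally is not shown here.

```shell
# Hypothetical helper: print the job count found in a MAKEOPTS-style string.
# Runs in a subshell (note the parentheses) so its variables do not leak.
jobs_from_makeopts() (
    set -f                      # disable globbing while splitting words
    jobs=1                      # fallback when no usable -j value is found
    for word in $1; do
        case "$word" in
            -j*)      value=${word#-j} ;;
            --jobs=*) value=${word#--jobs=} ;;
            *)        continue ;;
        esac
        case "$value" in
            ''|*[!0-9]*) ;;     # empty or not a plain number: ignore it
            *) jobs=$value ;;   # the last valid occurrence wins
        esac
    done
    printf '%s\n' "$jobs"
)

# Show the kind of command lines the job count could feed into.
threads=$(jobs_from_makeopts "${MAKEOPTS:-}")
echo "xz -9 -T${threads}"
echo "zstd -T${threads}"
```

The space-separated form (`-j 8`) is deliberately left out to keep the sketch short; a real implementation would need to handle it.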
Comment 5 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-10-18 00:18:52 UTC
Let's call this done, actually, given that stage3s use (pi)xz, we use it for a bunch of distfiles now wherever possible, and Portage defaults to zstd for binpkg compression and will try to do that in parallel where possible.

Note that zstd does not support parallel decompression right now, unlike xz, but it's arguably fast enough that on most systems it shouldn't matter.
Comment 6 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-10-18 00:19:10 UTC
Note that I also made some eclass changes at the same time.