Bug 784710 - Remove SHA512 hash from Manifests
Summary: Remove SHA512 hash from Manifests
Status: RESOLVED WONTFIX
Alias: None
Product: Gentoo Council
Classification: Unclassified
Component: unspecified
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo Council
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-04-21 06:55 UTC by Michał Górny
Modified: 2021-08-08 19:07 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Description Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-04-21 06:55:20 UTC
# the current set went live on 2017-11-21, per 2017-11-12 Council meeting
# https://archives.gentoo.org/gentoo-dev/message/ba2e5d9666ebd7e1bff1143485a37856
manifest-hashes = BLAKE2B SHA512


3.5 years later, it might be reasonable to finally remove SHA512.  I don't have any strong arguments for or against it; I just see it as a cleanup move.
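
For context, here is a minimal sketch of what checking a distfile against both hashes in the current set could look like. It assumes the usual Manifest entry layout "DIST <name> <size> <HASH> <value> ..."; the function names and the sample file name are hypothetical, and this is not the code Portage or gemato actually uses.

```python
import hashlib

# Map Manifest hash names to hashlib names (only the two hashes in the current set).
HASHLIB_NAMES = {"BLAKE2B": "blake2b", "SHA512": "sha512"}


def parse_dist_line(line):
    """Split 'DIST <name> <size> <HASH> <value> ...' into (name, size, hashes)."""
    fields = line.split()
    # DIST + name + size + at least one (hash name, value) pair, so an odd count >= 5.
    if len(fields) < 5 or fields[0] != "DIST" or len(fields) % 2 != 1:
        raise ValueError("not a well-formed DIST entry")
    name, size = fields[1], int(fields[2])
    hashes = dict(zip(fields[3::2], fields[4::2]))
    return name, size, hashes


def verify(data, size, hashes):
    """Return True only if the size and every listed hash match."""
    if len(data) != size:
        return False
    for manifest_name, expected in hashes.items():
        digest = hashlib.new(HASHLIB_NAMES[manifest_name], data).hexdigest()
        if digest != expected:
            return False
    return True
```

Dropping SHA512 from manifest-hashes would leave a single (name, value) pair per entry, so the loop in verify() would run once per file instead of twice.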
Comment 1 Thomas Deutschmann (RETIRED) gentoo-dev 2021-04-21 11:42:54 UTC
I strongly object here: the idea of listing multiple hashes is to protect us in the event that there is a problem with one hash type, so we should always keep at least two competing hashes.
Comment 2 Ulrich Müller gentoo-dev 2021-04-21 11:54:48 UTC
Could we phase out SHA512 and add KECCAK instead, or is it too slow?
Comment 3 Ulrich Müller gentoo-dev 2021-04-23 08:18:53 UTC
(In reply to Ulrich Müller from comment #2)
> Could we phase out SHA512 and add KECCAK instead, or is it too slow?

Very primitive benchmark (on i7-8550U @ 1.80GHz):

$ time for (( i=0; i<100; i++ )); do b2sum /var/cache/distfiles/emacs-27.2.tar.xz >/dev/null; done
real    0m5.879s
user    0m5.203s
sys     0m0.674s

$ time for (( i=0; i<100; i++ )); do sha512sum /var/cache/distfiles/emacs-27.2.tar.xz >/dev/null; done
real    0m10.894s
user    0m10.170s
sys     0m0.723s

$ time for (( i=0; i<100; i++ )); do openssl dgst -blake2b512 /var/cache/distfiles/emacs-27.2.tar.xz >/dev/null; done
real    0m6.217s
user    0m5.450s
sys     0m0.766s

$ time for (( i=0; i<100; i++ )); do openssl dgst -sha512 /var/cache/distfiles/emacs-27.2.tar.xz >/dev/null; done
real    0m7.761s
user    0m6.933s
sys     0m0.828s

$ time for (( i=0; i<100; i++ )); do openssl dgst -sha3-512 /var/cache/distfiles/emacs-27.2.tar.xz >/dev/null; done
real    0m24.336s
user    0m23.491s
sys     0m0.846s

Taking the numbers from openssl, sha3 (keccak) is slower by a factor of 3 and 4 compared to sha2 and blake2, respectively.

Interestingly, openssl is some 40% faster than coreutils for sha2.
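
The factors can be recomputed from the "real" timings above; the short sketch below only does that arithmetic (the dictionary keys are labels chosen here for illustration).

```python
# Wall-clock ("real") seconds for 100 runs, copied from the timings above.
real = {
    "b2sum": 5.879,
    "sha512sum": 10.894,
    "openssl blake2b512": 6.217,
    "openssl sha512": 7.761,
    "openssl sha3-512": 24.336,
}


def ratio(slow, fast):
    """How many times longer `slow` took than `fast`."""
    return real[slow] / real[fast]


sha3_vs_sha2 = ratio("openssl sha3-512", "openssl sha512")
sha3_vs_blake2 = ratio("openssl sha3-512", "openssl blake2b512")
coreutils_vs_openssl = ratio("sha512sum", "openssl sha512")

print(f"sha3 vs sha2 (openssl):      {sha3_vs_sha2:.2f}x")
print(f"sha3 vs blake2 (openssl):    {sha3_vs_blake2:.2f}x")
print(f"sha512sum vs openssl sha512: {coreutils_vs_openssl:.2f}x")
```

This gives roughly 3.1x and 3.9x for sha3, and about 1.4x for coreutils versus openssl. Read that last figure as sha512sum taking about 40% longer, or equivalently openssl needing about 29% less time.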
Comment 4 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-04-23 08:46:14 UTC
(In reply to Thomas Deutschmann from comment #1)
> I strongly object here: the idea of listing multiple hashes is to protect
> us in the event that there is a problem with one hash type, so we
> should always keep at least two competing hashes.

IIRC the last discussion ended up with the conclusion that there's no real value from having multiple hashes, and using them like this in the past was pretty much cargo cult.  I think we're better than that.
Comment 5 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-04-23 08:49:01 UTC
(In reply to Ulrich Müller from comment #3)
> Taking the numbers from openssl, sha3 (keccak) is slower by a factor of 3
> and 4 compared to sha2 and blake2, respectively.
> 
> Interestingly, openssl is some 40% faster than coreutils for sha2.

I think what matters in practice is the Python implementation that Portage/pkgcore uses.  gemato includes a primitive benchmarking tool.

On Ryzen 5 3600, on tmpfs:

$ /home/mgorny/git/gemato/utils/benchmark.py chromium-91.0.4472.19.tar.xz blake2b sha512 sha3_512
['blake2b'] -> [1.4790698910001083, 1.4788789770009316, 1.4786044790016604, 1.4795997879991774, 1.480962300000101]
['sha512'] -> [1.6823269540000183, 1.683155593000265, 1.6835432360003324, 1.6823211499995523, 1.6972763199992187]
['sha3_512'] -> [5.392342687000564, 5.385123105001185, 5.393110663000698, 5.4156869710004685, 5.429470488999868]


That said, I think Python's blake2b impl is probably suffering from some performance problems (again?) ;-/.
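
For anyone who wants to repeat this kind of measurement without a gemato checkout, a rough stand-in using only hashlib and timeit might look like the sketch below. It is not gemato's benchmark.py; the function names and the zero-filled buffer are assumptions made for illustration, and a real distfile read into memory would be a better input.

```python
import hashlib
import timeit


def time_hash(name, data, repeat=5):
    """Return `repeat` wall-clock timings (seconds) for hashing `data` once."""
    return timeit.repeat(
        lambda: hashlib.new(name, data).digest(), number=1, repeat=repeat
    )


def benchmark(data, names=("blake2b", "sha512", "sha3_512")):
    """Map each hash name to its best (minimum) timing over the repeats."""
    return {name: min(time_hash(name, data)) for name in names}


# 4 MiB of zero bytes stands in for a distfile (assumption for this sketch).
data = bytes(4 * 1024 * 1024)
for name, best in benchmark(data).items():
    print(f"{name}: {best:.4f}s")
```

Taking the minimum over several repeats reduces the influence of other activity on the machine, which matters given how much results can shift under load.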
Comment 6 Ulrich Müller gentoo-dev 2021-04-23 08:58:12 UTC
(In reply to Michał Górny from comment #5)
> (In reply to Ulrich Müller from comment #3)
> > Taking the numbers from openssl, sha3 (keccak) is slower by a factor of 3
> > and 4 compared to sha2 and blake2, respectively.
> > 
> > Interestingly, openssl is some 40% faster than coreutils for sha2.
> 
> I think what matters in practice is the Python implementation that
> Portage/pkgcore uses.  gemato includes a primitive benchmarking tool.
> 
> On Ryzen 5 3600, on tmpfs:
> 
> $ /home/mgorny/git/gemato/utils/benchmark.py chromium-91.0.4472.19.tar.xz
> blake2b sha512 sha3_512
> ['blake2b'] -> [1.4790698910001083, 1.4788789770009316, 1.4786044790016604,
> 1.4795997879991774, 1.480962300000101]
> ['sha512'] -> [1.6823269540000183, 1.683155593000265, 1.6835432360003324,
> 1.6823211499995523, 1.6972763199992187]
> ['sha3_512'] -> [5.392342687000564, 5.385123105001185, 5.393110663000698,
> 5.4156869710004685, 5.429470488999868]

That confirms (roughly) the factors from comment #3. sha3 is very slow.

> That said, I think Python's blake2b impl is probably suffering from some
> performance problems (again?) ;-/.

Needs to be rewritten in Rust, I guess. :p
Comment 7 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-04-23 09:11:16 UTC
According to [1], BLAKE2b should be around 30% faster than SHA2-512.  Of course, it's possible that SHA2 has become faster since then, or maybe the sha_ni CPU extensions are helping here.  Still, I'll take a few minutes to investigate whether Python is choosing the optimal variant for me.

[1] https://www.blake2.net/
Comment 8 Ulrich Müller gentoo-dev 2021-04-23 09:20:34 UTC
(In reply to Michał Górny from comment #7)
> [...] or maybe sha_ni CPU extensions are helping here.

No sha_ni here (Kaby Lake processor), but I can confirm that it's less than 30% difference:

$ utils/benchmark.py /var/cache/distfiles/emacs-27.2.tar.xz blake2b sha512 sha3_512
['blake2b'] -> [0.05584478902164847, 0.05420189001597464, 0.054973421967588365, 0.05634020798606798, 0.05651942198164761]
['sha512'] -> [0.0730064709787257, 0.07224606099771336, 0.07204092695610598, 0.07199707702966407, 0.07207291800295934]
['sha3_512'] -> [0.2509028369677253, 0.2516804229817353, 0.2471757759922184, 0.2528015899588354, 0.25678251701174304]
Comment 9 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-04-23 09:27:41 UTC
Curiously enough, I get the expected results when the system is under load (niced):

$ /home/mgorny/git/gemato/utils/benchmark.py /tmp/dist/chromium-91.0.4472.19.tar.xz blake2b sha512
['blake2b'] -> [ 1.679 1.652 1.688 1.703 1.656 ] -> min: 1.652
['sha512'] -> [ 2.687 2.748 2.741 2.78 2.67 ] -> min: 2.67

Without load:

$ /home/mgorny/git/gemato/utils/benchmark.py /tmp/dist/chromium-91.0.4472.19.tar.xz blake2b sha512
['blake2b'] -> [ 1.475 1.475 1.482 1.478 1.475 ] -> min: 1.475
['sha512'] -> [ 1.677 1.677 1.676 1.676 1.676 ] -> min: 1.676
Comment 10 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-04-23 09:38:41 UTC
Ok, there doesn't seem to be anything wrong inside Python.  I've tested all the variants and:

- AVX and SSE4.1 variants are the fastest (for some reason SSE4.1 seems to be a tiny bit faster than AVX but that might be measurement error)

- SSSE3 is slightly slower

- reference (C) implementation is slightly slower than SSSE3

- SSE2 is awfully slow (as expected, Python is disabling it entirely)
Comment 11 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2021-05-03 05:48:42 UTC
The discussion to date seems to revolve around performance, not functionality.

Something that was already partially lost in dropping earlier digests was the ability to trivially compare our digests with upstream digests. Dropping SHA512 will limit that further, because extremely few upstreams ship BLAKE2 digests today, let alone SHA-3/Keccak.

Even the latest OpenSSL announcements are only SHA1 + SHA256
https://mta.openssl.org/pipermail/openssl-announce/2021-April/000200.html

Similarly OpenSSH (with the interesting tweak that they use base64 SHA256 rather than hex).

What external functionality are we likely to lose by removing the SHA512 hash?
I'm not so worried about the Manifest sizes in this; I care more about being able to quickly ascertain that the hash in Gentoo is the same hash announced by upstream.
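
To illustrate the base64 wrinkle mentioned above: comparing a locally computed hex SHA256 with an upstream digest published in base64 only needs a re-encoding step. The sketch below assumes the upstream value is plain base64 of the raw digest bytes, possibly without padding; the function names are hypothetical.

```python
import base64
import hashlib


def b64_digest_to_hex(b64_digest):
    """Convert a base64-encoded digest to lowercase hex."""
    # Tolerate digests published without trailing '=' padding.
    padded = b64_digest + "=" * (-len(b64_digest) % 4)
    return base64.b64decode(padded).hex()


def matches_upstream(data, upstream_b64_sha256):
    """Compare the SHA256 of `data` against an upstream base64 digest."""
    return hashlib.sha256(data).hexdigest() == b64_digest_to_hex(upstream_b64_sha256)
```

The same conversion works for any digest length, but it only helps when both sides use the same algorithm, which is the concern raised here.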
Comment 12 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2021-08-08 19:07:55 UTC
Closing per negative mailing list feedback.