Bug 638428 - sys-apps/portage-2.3.16 segfaults due to dev-python/pyblake2-1.0.0
Summary: sys-apps/portage-2.3.16 segfaults due to dev-python/pyblake2-1.0.0
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core (show other bugs)
Hardware: All Linux
Importance: Normal major (vote)
Assignee: Michał Górny
URL:
Whiteboard:
Keywords:
Duplicates: 638978 (view as bug list)
Depends on:
Blocks:
 
Reported: 2017-11-22 06:56 UTC by jy6x2b32pie9
Modified: 2017-12-11 23:59 UTC (History)
10 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info (emerge--info.txt,5.87 KB, text/plain)
2017-11-22 06:56 UTC, jy6x2b32pie9
build.log (with clang settings) (build.log,24.68 KB, text/x-log)
2017-11-24 11:41 UTC, M. B.
build.log (with gcc settings) (build.log,25.12 KB, text/x-log)
2017-11-24 11:46 UTC, M. B.

Description jy6x2b32pie9 2017-11-22 06:56:44 UTC
Created attachment 505600 [details]
emerge --info

Relevant errors:

zsh: segmentation fault  emerge -1 portage
traps: emerge[17903] general protection ip:7f73c81d4ec6 sp:7ffcab4f2668 error:0 in pyblake2.cpython-35m-x86_64-linux-gnu.so[7f73c81d0000+8000]

Portage works once the .so file is manually removed.
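
The kernel "traps:" line above carries enough information to locate the fault inside the shared object: the instruction pointer (ip) and the base and size of the mapping in square brackets. The following is a minimal Python sketch that extracts the offset of the faulting instruction from the start of that mapping. The regular expression and the function name `fault_offset` are illustrative assumptions based only on the dmesg excerpts in this report, and the sketch expects the message on a single line (some excerpts further down are wrapped over two dmesg lines and would need joining first).

```python
import re

# Pattern for the kernel "traps:" line quoted in this report. The field layout
# is inferred from the dmesg excerpts here, not from a kernel specification.
TRAP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+)\s+sp:(?P<sp>[0-9a-f]+)\s+error:\S+"
    r"\s+in\s+(?P<obj>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(trap_line):
    """Return (object name, offset of the faulting ip from the mapping base)."""
    m = TRAP_RE.search(trap_line)
    if m is None:
        raise ValueError("not a recognised traps: line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    if not 0 <= offset < size:
        raise ValueError("ip lies outside the reported mapping")
    return m.group("obj"), offset


line = ("traps: emerge[17903] general protection ip:7f73c81d4ec6 "
        "sp:7ffcab4f2668 error:0 in "
        "pyblake2.cpython-35m-x86_64-linux-gnu.so[7f73c81d0000+8000]")
obj, off = fault_offset(line)
print(obj, hex(off))  # pyblake2.cpython-35m-x86_64-linux-gnu.so 0x4ec6
```

The offset is relative to the mapped region reported by the kernel, so it is only a starting point for matching against a disassembly of the module.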
Comment 1 jy6x2b32pie9 2017-11-22 07:16:12 UTC
On second thought, this may be due to gcc-7.2; I will test later.
Comment 2 Dan Goodliffe 2017-11-22 08:17:28 UTC
I can confirm I've been getting this problem... but only on some of my boxes.

[18855.000330] traps: emerge[6060] general protection ip:7f1a52d4215f sp:7ffcd7169900 error:0
[18855.000333]  in pyblake2.cpython-34m.so[7f1a52d3d000+8000]

For me at least... the following *appears* to "fix" the problem.

mv /usr/lib64/python3.4/site-packages/pyblake2.cpython-34m.so /var/tmp/
emerge -1a pyblake2
Comment 3 Dan Goodliffe 2017-11-22 08:30:58 UTC
Apologies, minor correction... got over excited... that only fixed one of my installs.
Removing pyblake2.cpython-34m.so certainly makes things work, but reinstalling brought the problem back again on one box, but not the others.
Comment 4 Tomáš Cícha 2017-11-22 08:47:27 UTC
Any chance you have -O3 in CFLAGS as I do? :)

I have a Skylake CPU running ~amd64 system with gcc 7.2

Compiling dev-python/pyblake2 with -O2 solved the problem and it no longer segfaults.
Comment 5 jy6x2b32pie9 2017-11-22 08:58:11 UTC
CFLAGS="-march=native -O2 -ftree-vectorize -pipe", Kaby Lake

Rebuild does nothing to help. Rebuild with clang-5 does nothing to help.

So, at least it's not gcc-7.2
Comment 6 jy6x2b32pie9 2017-11-22 09:01:34 UTC
It appears that -ftree-vectorize is to blame. Without it, pyblake2 works.
Comment 7 Nikos Chantziaras 2017-11-22 11:23:57 UTC
I'm on portage 2.3.15 and I have the same issue. I can't upgrade to 2.3.16 or downgrade to 2.3.14:

# emerge -1 portage
>>> Emerging (1 of 1) sys-apps/portage-2.3.16::gentoo
Segmentation fault

$ dmesg

[ 5964.956736] traps: emerge[15270] general protection ip:7f356f93de4c sp:7ffd2c8d46c8 error:0
[ 5964.956740]  in pyblake2.cpython-34m.so[7f356f939000+8000]
Comment 8 Nikos Chantziaras 2017-11-22 11:29:38 UTC
Yep, rebuilding dev-python/pyblake2 without -ftree-vectorize fixes it.
Comment 9 johannes.walcher 2017-11-22 12:37:26 UTC
I fixed it by building dev-python/pyblake2 with -O2 instead of -O3
Comment 10 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2017-11-22 12:41:03 UTC
I'd appreciate some help debugging it.

For a start, does the following segv for you:

$ git clone https://github.com/dchest/pyblake2
$ cd pyblake2
$ python -m venv _venv
$ . _venv/bin/activate
$ export CFLAGS='${your-cflags-that-break-stuff}'
$ pip install -U .
$ python test/test.py

Because I feel like it should segv for me, and it doesn't. So it may also be -march-related.
Comment 11 Larry the Git Cow gentoo-dev 2017-11-22 13:21:51 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=5dddb56946409beae23b9cfa513f81bcb77c531c

commit 5dddb56946409beae23b9cfa513f81bcb77c531c
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2017-11-22 13:21:05 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2017-11-22 13:21:45 +0000

    dev-python/pyblake2: Try to disable -ftree-vectorize to avoid segv
    
    Bug: https://bugs.gentoo.org/638428

 dev-python/pyblake2/pyblake2-1.0.0.ebuild | 5 +++++
 1 file changed, 5 insertions(+)
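
For readers unsure what "disabling -ftree-vectorize" amounts to in practice: GCC generally honours the last of several conflicting -f/-fno- options, so a user-supplied -ftree-vectorize can be neutralised by placing -fno-tree-vectorize after it. The Python sketch below only illustrates that ordering on a flag string. The helper name `with_vectorizer_disabled` is hypothetical, and the thread does not show how the ebuild change itself is implemented.

```python
import shlex


def with_vectorizer_disabled(cflags):
    """Return cflags with -fno-tree-vectorize appended as the final word."""
    # Drop any earlier copy so the flag appears exactly once, at the end.
    flags = [f for f in shlex.split(cflags) if f != "-fno-tree-vectorize"]
    flags.append("-fno-tree-vectorize")
    return " ".join(flags)


# CFLAGS reported as failing in comment 5 of this bug.
print(with_vectorizer_disabled("-march=native -O2 -ftree-vectorize -pipe"))
# -march=native -O2 -ftree-vectorize -pipe -fno-tree-vectorize
```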
Comment 12 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2017-11-22 13:23:40 UTC
Since I can't reproduce it, please let me know if that commit solved it with CFLAGS that were failing previously.
Comment 13 jy6x2b32pie9 2017-11-22 14:06:59 UTC
(In reply to Michał Górny from comment #12)
> Since I can't reproduce it, please let me know if that commit solved it with
> CFLAGS that were failing previously.

It appears that your commit is a solution. At least, it fixes the problem for me.

I have also opened an issue upstream: https://github.com/dchest/pyblake2/issues/13
Comment 14 jy6x2b32pie9 2017-11-22 14:54:06 UTC
(In reply to Michał Górny from comment #12)
> Since I can't reproduce it, please let me know if that commit solved it with
> CFLAGS that were failing previously.

Upstream pushed an equivalent commit and bumped the version, so updating to 1.0.1 would make your patch redundant.
Comment 15 Larry the Git Cow gentoo-dev 2017-11-22 16:49:03 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=4d5f5debfda38d14b37b25a1758be56a5e28f8b4

commit 4d5f5debfda38d14b37b25a1758be56a5e28f8b4
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2017-11-22 16:42:28 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2017-11-22 16:48:56 +0000

    dev-python/pyblake2: Bump to 1.0.1, with upstream -fno-tree-vectorize
    
    Bug: https://bugs.gentoo.org/638428

 dev-python/pyblake2/Manifest                                         | 2 +-
 dev-python/pyblake2/{pyblake2-1.0.0.ebuild => pyblake2-1.0.1.ebuild} | 5 -----
 2 files changed, 1 insertion(+), 6 deletions(-)
Comment 16 Zac Medico gentoo-dev 2017-11-22 16:55:35 UTC
As a temporary workaround, set PORTAGE_CHECKSUM_FILTER="* -blake2b" in /etc/portage/make.conf.
Comment 17 Larry the Git Cow gentoo-dev 2017-11-22 17:29:19 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=1a8386e80340387722af8bb40ce3c412cff0259c

commit 1a8386e80340387722af8bb40ce3c412cff0259c
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2017-11-22 17:24:42 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2017-11-22 17:29:15 +0000

    dev-python/pyblake2: Backport -fno-tree-vectorize to stable version
    
    Bug: https://bugs.gentoo.org/638428

 .../pyblake2/{pyblake2-0.9.3.ebuild => pyblake2-0.9.3-r1.ebuild}     | 5 +++++
 1 file changed, 5 insertions(+)
Comment 18 Wojciech Myrda 2017-11-22 18:21:55 UTC
(In reply to Zac Medico from comment #16)
> For temporary workaround, set PORTAGE_CHECKSUM_FILTER="* -blake2b" in
> /etc/portage/make.conf.

Running the command below got portage back on track for me.

PORTAGE_CHECKSUM_FILTER="* -blake2b" emerge -1 pyblake2

Thanks
Comment 19 Cynede gentoo-dev 2017-11-24 08:20:37 UTC
Getting the same segfaults with clang-5 and pyblake2-1.0.1.
Comment 20 M. B. 2017-11-24 11:41:23 UTC
Created attachment 506258 [details]
build.log (with clang settings)

As suggested on https://wiki.gentoo.org/wiki/Clang I am using separate config files to differentiate between GCC (7.2.0) and Clang (5.0.0).

Currently the segfaults occur every time a BLAKE2B hash gets computed, if pyblake2 was compiled with Clang. With GCC it works.

In particular, when fetching a source-file, portage simply reports
 * Fetch failed for 'app-admin/rsyslog-8.29.0'
instead of exposing the segfault.
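
Because the crash can be hidden behind a generic failure message, it may help to run the hash in a separate process and inspect how that process ended. The sketch below is one way to do this in Python; the function name `probe_blake2b` and the child script are assumptions made for illustration and are not part of Portage. On POSIX systems a negative return code from `subprocess.call` means the child was terminated by a signal, for example 11 for SIGSEGV.

```python
import subprocess
import sys

# Child program: hash a few bytes with BLAKE2b. hashlib.blake2b exists in
# Python 3.6+; on older interpreters the child falls back to pyblake2.
CHILD = """
try:
    from hashlib import blake2b
except ImportError:
    from pyblake2 import blake2b
print(blake2b(b"probe").hexdigest())
"""


def probe_blake2b(python=sys.executable):
    """Run the hash in a child process and classify how the child ended."""
    rc = subprocess.call([python, "-c", CHILD],
                         stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL)
    if rc < 0:
        # Negative return code: the child died from signal number -rc.
        return "killed by signal %d" % -rc
    if rc != 0:
        return "failed with exit status %d" % rc
    return "ok"


print(probe_blake2b())
```

Passing a different interpreter path as `python` allows the same probe to be pointed at each installed Python version in turn.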



As my current setup might be considered volatile, here are the values used for Clang and GCC:

# Clang configuration as shown in the attached build.log
CC="clang"
CXX="clang++"
CFLAGS="-flto=thin -march=broadwell -O2 -pipe"
CXXFLAGS="-flto=thin ${CFLAGS}"
LDFLAGS="-Wl,-O2 -Wl,--as-needed"
AR="llvm-ar"
NM="llvm-nm"
RANLIB="llvm-ranlib"
Comment 21 M. B. 2017-11-24 11:46:09 UTC
Created attachment 506260 [details]
build.log (with gcc settings)

# GCC configuration as shown in the attached build.log
CC="gcc"
CXX="g++"
CFLAGS="-flto=thin -mabm -frecord-gcc-switches ${CFLAGS}"
CXXFLAGS="-flto=thin -mabm -frecord-gcc-switches ${CXXFLAGS}"
FFLAGS="${FFLAGS} -frecord-gcc-switches"
FCFLAGS="${FCFLAGS} -frecord-gcc-switches"
AR="ar"
NM="nm"
RANLIB="ranlib"
Comment 22 Fabian Groffen gentoo-dev 2017-11-24 12:16:29 UTC
Try using -march=ivybridge or dropping to -O1 with clang for pyblake2; your issue sounds like https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=fc0ba0fc7c93d56aeb421fbb8154fd57e7695623
Comment 23 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2017-11-24 14:38:52 UTC
Ok, I have an additional request to the people who can reproduce this. Could you please try building python:3.6 (with 'breaking' CFLAGS), unmerging pyblake2 and testing Portage via python3.6? It has a built-in BLAKE2 implementation, and I'd like to check if it's affected.
Comment 24 jy6x2b32pie9 2017-11-24 15:06:09 UTC
(In reply to Michał Górny from comment #23)
> Ok, I have an additional request to the people who can reproduce this. Could
> you please try building python:3.6 (with 'breaking' CFLAGS), unmerging
> pyblake2 and testing Portage via python3.6? It has a built-in BLAKE2
> implementation, and I'd like to check if it's affected.

It does not appear affected:

~ # cd /usr/portage/sys-apps/portage
portage # PYTHON_TARGETS="python3_6" ebuild portage-2.3.16.ebuild unpack
 * portage-2.3.16.tar.bz2 BLAKE2B SHA512 size ;-) ... [ ok ]
 * checking ebuild checksums ;-) ... [ ok ]
 * checking auxfile checksums ;-) ... [ ok ]
 * checking miscfile checksums ;-) ... [ ok ]
>>> Unpacking source...
>>> Unpacking portage-2.3.16.tar.bz2 to /var/tmp/portage/sys-apps/portage-2.3.16/work
>>> Source unpacked in /var/tmp/portage/sys-apps/portage-2.3.16/work
portage # emerge -aC pyblake2                                            
 * This action can remove important packages! In order to be safer, use
 * `emerge -pv --depclean <atom>` to check for reverse dependencies before
 * removing packages.

>>> These are the packages that would be unmerged:

--- Couldn't find 'pyblake2' to unmerge.

>>> No packages selected for removal by unmerge
portage # PYTHON_TARGETS="python3_6" ebuild portage-2.3.16.ebuild compile
...
>>> Source compiled.
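
A successful unpack shows that hashing did not crash. A more direct check of the built-in implementation is a known-answer test. The Python sketch below compares `hashlib.blake2b` (available from Python 3.6) against the widely published BLAKE2b-512 digest of the empty message. The function name `builtin_blake2b_ok` is a hypothetical helper, and a pass only shows that this one input hashes correctly under the current build.

```python
import hashlib
import sys

# BLAKE2b-512 digest of the empty message, a widely published test vector.
EMPTY_BLAKE2B = (
    "786a02f742015903c6c6fd852552d272912f4740e15847618a86e217f71f5419"
    "d25e1031afee585313896444934eb04b903a685b1448b755d56f701afe9be2ce"
)


def builtin_blake2b_ok():
    """True if hashlib's own BLAKE2b reproduces the known digest."""
    if not hasattr(hashlib, "blake2b"):
        return False  # interpreter older than 3.6, no built-in BLAKE2
    return hashlib.blake2b(b"").hexdigest() == EMPTY_BLAKE2B


print(sys.version_info[:2], builtin_blake2b_ok())
```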
Comment 25 Larry the Git Cow gentoo-dev 2017-11-24 17:54:03 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=95bff9cb86a241fcbb6702350b8472aca78ac3f0

commit 95bff9cb86a241fcbb6702350b8472aca78ac3f0
Author:     Michał Górny <mgorny@gentoo.org>
AuthorDate: 2017-11-24 17:49:43 +0000
Commit:     Michał Górny <mgorny@gentoo.org>
CommitDate: 2017-11-24 17:53:57 +0000

    dev-python/pyblake2: Bump to 1.1.0 (with impl from CPython)
    
    Update the package to the new release that features implementation
    copied from CPython git. This will hopefully solve all the optimization
    problems reported.
    
    Bug: https://bugs.gentoo.org/show_bug.cgi?id=638428

 dev-python/pyblake2/Manifest              |  1 +
 dev-python/pyblake2/pyblake2-1.1.0.ebuild | 20 ++++++++++++++++++++
 2 files changed, 21 insertions(+)
Comment 26 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2017-11-24 17:56:00 UTC
Please test 1.1.0 now, in all failing environments. Hopefully reusing the code from CPython solves all the issues.
Comment 27 jy6x2b32pie9 2017-11-24 18:20:42 UTC
v1.1.0 passes the test shown in this Gentoo bug on my machine. Thank you.

Now, three to five other people need to test it. Waiting.
Comment 28 Psi 2017-11-24 19:43:13 UTC
dev-python/pyblake2-1.1.0 fixes this issue for me.  Nice to run into a problem and immediately discover it's been resolved an hour ago.  Thank you everyone for your work on this!

* Upgraded pyblake2-1.0.0 -> pyblake2-1.1.0 (all tests pass)
* Now able to upgrade portage-2.3.14 -> portage-2.3.16 (all tests pass)

FWIW, I use gcc-6.4, and have "-O2 -ftree-vectorize -fno-tree-loop-vectorize" in my CFLAGS, so -ftree-slp-vectorize is probably the trigger on gcc.  Think I'll just avoid the whole -ftree-vectorize family entirely from now on.  Doesn't do much without PGO anyway.

Thanks again!
Comment 29 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2017-11-24 20:15:50 UTC
Thanks for testing. Could someone with appropriate hardware also test with clang, please? Fabian, could you test if the OSX issue is resolved as well?
Comment 30 M. B. 2017-11-25 05:10:55 UTC
@Fabian: tl;dr: this issue cannot be resolved by changing the flags as you suggested.

In particular, I
- changed to -march=ivybridge, kept -O2
- changed to -O1, kept -march=broadwell
- changed to -march=ivybridge -O1

All variations exhibited previous behavior.

@Michał
I confirmed that portage-2.3.16 is able to emerge 2 packages (which have BLAKE2b hashes and previously failed for me) successfully, without pyblake2 installed and with FEATURES=python3_6.
So it seems the built-in blake2 implementation works fine. *thumbs up*
Same is true for pyblake2 with clang and my usual use-flags. Works like a charm as well.

Thanks.
Comment 31 Fabian Groffen gentoo-dev 2017-11-25 08:52:55 UTC
@mgorny: the issue was resolved for me after the change of feature flags (I suppose it disabled stuff for me) hence I removed the fugly workaround code immediately afterwards.
Comment 32 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2017-11-25 09:06:37 UTC
Ok, thanks. We'll reopen if anybody else hits this.
Comment 33 Mike Gilbert gentoo-dev 2017-11-27 17:25:44 UTC
*** Bug 638978 has been marked as a duplicate of this bug. ***
Comment 34 Perfect Gentleman 2017-12-01 09:36:26 UTC
(In reply to M. B. from comment #21)
> Created attachment 506260 [details]
> build.log (with gcc settings)
> 
> # GCC configuration as shown in the attached build.log
> CC="gcc"
> CXX="g++"
> CFLAGS="-flto=thin -mabm -frecord-gcc-switches ${CFLAGS}"
> CXXFLAGS="-flto=thin -mabm -frecord-gcc-switches ${CXXFLAGS}"
> FFLAGS="${FFLAGS} -frecord-gcc-switches"
> FCFLAGS="${FCFLAGS} -frecord-gcc-switches"
> AR="ar"
> NM="nm"
> RANLIB="ranlib"

waat? -flto=thin with GCC. rly? when it became applicable?
Comment 35 Lyall Pearce 2017-12-11 10:01:30 UTC
amd64 compiled with clang
I found that deleting/renaming /usr/lib64/python3.4/site-packages/pyblake2.cpython-34m.so fixed the core dump issue.

Unmerging pyblake2 also fixes it, until your next world update, at which time it will be re-installed.

FYI: A gdb stack trace...

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff3e1a58c in blake2b_init_param ()
   from /usr/lib64/python3.4/site-packages/pyblake2.cpython-34m.so
(gdb) where
#0  0x00007ffff3e1a58c in blake2b_init_param ()
   from /usr/lib64/python3.4/site-packages/pyblake2.cpython-34m.so
#1  0x00007ffff3e19bd5 in ?? ()
   from /usr/lib64/python3.4/site-packages/pyblake2.cpython-34m.so
#2  0x000000305491899a in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#3  0x000000305491b70e in ?? () from /usr/lib64/libpython3.4m.so.1.0
#4  0x00000030549184cb in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#5  0x0000003054912ba7 in PyEval_EvalCodeEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#6  0x000000305491b786 in ?? () from /usr/lib64/libpython3.4m.so.1.0
#7  0x00000030549184cb in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#8  0x0000003054912ba7 in PyEval_EvalCodeEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#9  0x000000305491b786 in ?? () from /usr/lib64/libpython3.4m.so.1.0
#10 0x00000030549184cb in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#11 0x0000003054912ba7 in PyEval_EvalCodeEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#12 0x000000305491b786 in ?? () from /usr/lib64/libpython3.4m.so.1.0
#13 0x00000030549184cb in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#14 0x0000003054912ba7 in PyEval_EvalCodeEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#15 0x000000305491b786 in ?? () from /usr/lib64/libpython3.4m.so.1.0
#16 0x00000030549184cb in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#17 0x0000003054912ba7 in PyEval_EvalCodeEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#18 0x000000305491b786 in ?? () from /usr/lib64/libpython3.4m.so.1.0
#19 0x00000030549184cb in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#20 0x000000305491b70e in ?? () from /usr/lib64/libpython3.4m.so.1.0
#21 0x00000030549184cb in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#22 0x0000003054912ba7 in PyEval_EvalCodeEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#23 0x000000305491b786 in ?? () from /usr/lib64/libpython3.4m.so.1.0
#24 0x00000030549184cb in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#25 0x0000003054912ba7 in PyEval_EvalCodeEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#26 0x000000305491b786 in ?? () from /usr/lib64/libpython3.4m.so.1.0
#27 0x00000030549184cb in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#28 0x0000003054912ba7 in PyEval_EvalCodeEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#29 0x000000305491b786 in ?? () from /usr/lib64/libpython3.4m.so.1.0
#30 0x00000030549184cb in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#31 0x0000003054912ba7 in PyEval_EvalCodeEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#32 0x000000305491b786 in ?? () from /usr/lib64/libpython3.4m.so.1.0
#33 0x00000030549184cb in PyEval_EvalFrameEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#34 0x0000003054912ba7 in PyEval_EvalCodeEx ()
   from /usr/lib64/libpython3.4m.so.1.0
#35 0x0000003054911ec5 in PyEval_EvalCode ()
   from /usr/lib64/libpython3.4m.so.1.0
#36 0x000000305493e70f in PyRun_FileExFlags ()
   from /usr/lib64/libpython3.4m.so.1.0
#37 0x000000305493dc0e in PyRun_SimpleFileExFlags ()
   from /usr/lib64/libpython3.4m.so.1.0
#38 0x00000030549556ac in Py_Main () from /usr/lib64/libpython3.4m.so.1.0
#39 0x0000000000400a86 in main ()
(gdb)
Comment 36 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2017-12-11 12:46:07 UTC
For a start, please explicitly specify pyblake2 version.
Comment 37 Lyall Pearce 2017-12-11 23:59:47 UTC
(In reply to Michał Górny from comment #36)
> For a start, please explicitly specify pyblake2 version.

It was pyblake2-0.9.3-r1, but I keyworded it ~amd64, which brings it up to 1.1.0.

1.1.0 did not core dump.

More info is available in a forum post https://forums.gentoo.org/viewtopic-t-1073600.html