
Bug 266642

Summary: dev-libs/libgcrypt-1.4.4 compiled with -O3 leads to "test encryption failed" error in x11-plugins/enigmail and app-crypt/gnupg
Product: Gentoo Linux Reporter: Guenther Brunthaler <gb_about_gnu>
Component: [OLD] Library Assignee: Crypto team [DISABLED] <crypto+disabled>
Status: RESOLVED WORKSFORME    
Severity: normal CC: esigra, olechrt
Priority: High    
Version: unspecified   
Hardware: AMD64   
OS: Linux   
URL: https://bugzilla.novell.com/show_bug.cgi?id=443693
Whiteboard:
Package list:
Runtime testing required: ---
Bug Depends on:    
Bug Blocks: 915000    
Attachments: Working Portage override script with -O3 *and* -fno-strict-aliasing
Non-working Portage override script using -O3 only

Description Guenther Brunthaler 2009-04-18 11:54:24 UTC
After compiling =dev-libs/libgcrypt-1.4.4 with =sys-devel/gcc-4.3.2-r3 and -O3 on amd64, AES-encryption routines in libgcrypt fail to work and report a "test encryption failed" error instead.

This directly affects packages using libgcrypt, such as GnuPG and EnigMail.

Reproducible: Always

Steps to Reproduce:
1. Build =dev-libs/libgcrypt-1.4.4 on amd64 with gcc-4.3.2 and -O3 enabled
2. Try to use GnuPG in a case where it uses AES encryption


Actual Results:  
"Test encryption failed" error message

Expected Results:  
Should work without an error message

As the link in the URL above implies, the problem is caused by unclean code in libgcrypt which has problems with one of the -O3 optimizations in gcc-4.3.2 (at least on amd64).

In order to make the problem go away, there are 2 solutions:

1.) Replace any -O3 or higher in CFLAGS with -O2 when building.

2.) [preferred] Add the -fno-strict-aliasing option at the end of the CFLAGS to be used. This allows -O3 to work without problems as well.
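
The two workarounds can be sketched as plain transformations of a flag string. This is a minimal illustration in POSIX shell, not anything taken from the libgcrypt ebuild: the helper names downgrade_o3 and append_no_strict_aliasing are made up here, and the sample flags are the CFLAGS from this report with -O3 swapped in. Inside an ebuild, the flag-o-matic helpers (append-flags, replace-flags) would presumably be the usual route.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Workaround 1: turn -O3 (or any higher numeric level) into -O2.
# The argument is split on whitespace, so it must not contain glob characters.
downgrade_o3() {
    out=
    for flag in $1; do
        case $flag in
            -O[3-9]) flag=-O2 ;;
        esac
        out="${out:+$out }$flag"
    done
    printf '%s\n' "$out"
}

# Workaround 2: append -fno-strict-aliasing unless it is already present.
append_no_strict_aliasing() {
    case " $1 " in
        *" -fno-strict-aliasing "*) printf '%s\n' "$1" ;;
        *) printf '%s\n' "$1 -fno-strict-aliasing" ;;
    esac
}

# Sample flag string (assumption: the reporter's flags with -O3 instead of -O2).
flags="-march=k8 -O3 -DNDEBUG -pipe -fno-stack-check"
downgrade_o3 "$flags"
append_no_strict_aliasing "$flags"
```

Either resulting string could then be used as CFLAGS for the libgcrypt build only; other packages do not need it.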
Comment 1 Guenther Brunthaler 2009-04-18 11:56:31 UTC
Portage 2.1.6.7 (default/linux/amd64/2008.0/desktop, gcc-4.3.2, glibc-2.8_p20080602-r1, 2.6.27-gentoo-r8-xquad-9.27 x86_64)
=================================================================
System uname: Linux-2.6.27-gentoo-r8-xquad-9.27-x86_64-AMD_Phenom-tm-_9600_Quad-Core_Processor-with-glibc2.2.5
Timestamp of tree: Thu, 16 Apr 2009 11:45:01 +0000
distcc 2.18.3 x86_64-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.4 [enabled]
app-shells/bash:     3.2_p39
dev-java/java-config: 2.1.7
dev-lang/python:     2.5.2-r7
dev-python/pycrypto: 2.0.1-r8
dev-util/ccache:     2.4-r7
dev-util/cmake:      2.6.2-r1
sys-apps/baselayout: 1.12.11.1
sys-apps/sandbox:    1.6-r2
sys-devel/autoconf:  2.13, 2.63
sys-devel/automake:  1.5, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.2
sys-devel/binutils:  2.18-r3
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.27-r2
ACCEPT_KEYWORDS="amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=k8 -O2 -DNDEBUG -pipe -fno-stack-check"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/local/etc /usr/share/X11/xkb /usr/share/config /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/host-variants/ /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-march=k8 -O2 -DNDEBUG -pipe -fno-stack-check"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--nospinner --with-bdeps=y"
FEATURES="ccache distlocks fixpackages notitles prelink protect-owned sandbox sfperms strict unmerge-orphans userfetch userpriv usersandbox"
GENTOO_MIRRORS="http://lug.mtu.edu/gentoo/ ftp://gentoo.mirrors.tds.net/gentoo http://gentoo.mirrors.tds.net/gentoo http://gentoo.chem.wisc.edu/gentoo/ http://gentoo-euetib.upc.es/mirror/gentoo/ ftp://gentoo.chem.wisc.edu/gentoo/ ftp://distro.ibiblio.org/pub/linux/distributions/gentoo/ ftp://gentoo.in.th/ http://ftp.twaren.net/Linux/Gentoo/ ftp://ftp.twaren.net/Linux/Gentoo/"
LANG="de_AT.utf8"
LDFLAGS="-Wl,-O1"
LINGUAS="de"
MAKEOPTS="-j5"
PKGDIR="/usr/portage/packages"
PORTAGE_COMPRESS="lzma"
PORTAGE_COMPRESS_FLAGS="-9"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/portage/local/layman/mscgen /usr/portage/local/layman/xworld /usr/portage/local/layman/simplux /usr/portage/local/layman/xworld_attic /usr/portage/local/layman/xworld_serviced /usr/portage/local/layman/xworld_hotfixes /usr/portage/local/layman/xworld_thirdparty /usr/portage/local/overlay"
SYNC="rsync://rsync.de.gentoo.org/gentoo-portage"
USE="3dnow 3dnowext X a52 aac aalib acpi alsa amd64 apache2 arts aspell audiofile bash-completion berkdb branding bzip2 cairo caps cddb cdr cleartype cli cracklib crypt css cups curl custom-cflags custom-cxxflags dbus dri dts dv dvd dvdr dvdread ecc emboss encode evo exif expat ffmpeg fftw firefox flac foomaticdb fortran freetype ftp fuse gd gdbm gif gimp glade glut gmp gphoto2 gpm gtk gtk2 hal iconv idea ieee1394 imagemagick imlib isdnlog jack java6 javascript jbig jp2 jpeg jpeg2k kde kdeenablefinal kdehiddenvisibility kdexdeltas kipi kpathsea lame lcms ldap libcaca libclamav libnotify libsamplerate logrotate lzma lzo mad matroska midi mikmod mmap mmx mmxext mng mp3 mpeg mudflap mule multilib musepack musicbrainz ncurses nls nodrm nptl nptlonly nsplugin oav ocamlopt odbc offensive ofx ogg openal opengl openmp pam pcre pdf perl pic png ppds pppd pulseaudio python qt qt3 qt3support qt4 quicktime readline reflection samba sasl screen sdl session sharedmem slang smartcard sndfile sox speex spell spl sqlite sse sse2 sse3 sse4a ssl startup-notification svg sysfs tcltk tetex theora threads threadsafe tiff tk truetype unicode usb userlocales utf8 vcd vde vorbis wxwindows x264 xft xml xorg xosd xpm xrandr xscreensaver xsl xulrunner xv xvid xvmc zlib" ALSA_CARDS="emu10k1" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="evdev joystick keyboard mouse void" KERNEL="linux" LCD_DEVICES="bayrad cfontz 
cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="de" USERLAND="GNU" VIDEO_CARDS="dummy radeon v4l vesa vga"
Unset:  CPPFLAGS, CTARGET, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_RSYNC_EXTRA_OPTS

Comment 2 Guenther Brunthaler 2009-04-18 12:01:53 UTC
Just in case I did not express myself clearly in the previous posting: The -O2 or -fno-strict-aliasing is only required for building libgcrypt.

EnigMail or GnuPG build without any problems using -O3 as is; there is no requirement for adding -fno-strict-aliasing to those ebuilds.
Comment 3 Arfrever Frehtes Taifersar Arahesis (RETIRED) gentoo-dev 2009-05-01 13:41:34 UTC
I have installed dev-libs/libgcrypt-1.4.4 with -O3, but I'm not sure how to reproduce this bug. Please write precise steps to reproduce this bug.
Comment 4 Guenther Brunthaler 2009-05-01 14:22:31 UTC
(In reply to comment #3)
> I have installed dev-libs/libgcrypt-1.4.4 with -O3, but I'm not sure how to
> reproduce this bug.

This seems to be a compiler optimization issue, so it might only apply if exactly the same compiler version and build platform are used.

In my case, this would be amd64 and gcc-4.3.2.

The "-march"-setting might also have some influence - I was using "-march=k8" at the time of the compilation.

The origin of the problem seems to be that gcc-4.3 interprets "strict aliasing" more *strictly* than the (obviously) lazily written code in libgcrypt can digest. And -fstrict-aliasing is enabled by default at -O2 and above, which includes -O3.

> Please write precise steps to reproduce this bug.

I did nothing special other than emerging the package (that is, actually I did it indirectly by updating gnupg) with "heavy optimization"-flag overrides in place.

For your reference, I will post my overrides script as an attachment, which has to go to /etc/portage/env/<category>/<package> in order to affect a package.

However, setting the modified variables directly in the command line or in /etc/make.conf should have the same effect.

You thus might try:

$ CFLAGS="-march=k8 -O3 -DNDEBUG -fomit-frame-pointer -fno-stack-check" CHOST=x86_64-pc-linux-gnu ebuild `equery which =libgcrypt-1.4.4` clean compile

(which should not work) and

$ CFLAGS="-march=k8 -O3 -DNDEBUG -fno-omit-frame-pointer -fno-stack-check -fno-strict-aliasing" CHOST=x86_64-pc-linux-gnu ebuild `equery which =libgcrypt-1.4.4` clean compile

which should work.
Comment 5 Guenther Brunthaler 2009-05-01 14:28:31 UTC
Created attachment 190033 [details]
Working Portage override script with -O3 *and* -fno-strict-aliasing

Save this script as

/etc/portage/env/dev-libs/libgcrypt

in order to make libgcrypt compile without problems.

This is the Portage env-override script I use which *does* work.

It includes -fno-strict-aliasing in addition to -O3.

Note that my normal global CFLAGS only use -O2, because by default I prefer short code over fast code, as most packages in the tree are not performance-critical.
Comment 6 Arfrever Frehtes Taifersar Arahesis (RETIRED) gentoo-dev 2009-05-01 14:29:07 UTC
(In reply to comment #4)
> > Please write precise steps to reproduce this bug.
> 
> I did nothing special other than emerging the package (that is, actually I
> did it indirectly by updating gnupg) with "heavy optimization"-flag overrides
> in place.

How to reproduce "Test encryption failed" error message?
Comment 7 Guenther Brunthaler 2009-05-01 14:31:40 UTC
Created attachment 190036 [details]
Non-working Portage override script using -O3 only

If you save *this* script as /etc/portage/env/dev-libs/libgcrypt, you will encounter the effect described in this bug report, provided you use the same basic CFLAGS (in /etc/make.conf or in the command line) as well:

CFLAGS="-march=k8 -O2 -DNDEBUG -pipe -fno-stack-check"
Comment 8 Guenther Brunthaler 2009-05-01 14:37:56 UTC
(In reply to comment #5)
> Created an attachment (id=190033) [edit]

Using this environment override script can also be considered to be the hotfix for this bug.

However, the -fno-strict-aliasing should really go into the ebuild rather than coming from a hotfix.
Comment 9 Guenther Brunthaler 2009-05-01 14:43:54 UTC
(In reply to comment #6)
> How to reproduce "Test encryption failed" error message?

Oh, that's simple: Just use gnupg to decrypt any encrypted message using AES as the actual encryption (that is, the RSA/DSA-encrypted session key will encrypt the actual message contents with AES), and the error message will be displayed.
Comment 10 Guenther Brunthaler 2009-05-01 15:33:46 UTC
What the ...? I am no longer able to reproduce the bug myself!!! ;-)

I just reinstalled the library with the old, non-working setting, in order to be able to provide you with a better test case, such as

$ echo test | gpg --batch -c --cipher-algo=aes --passphrase=unsecure > /dev/null

BUT... suddenly it all works even without -fno-strict-aliasing! ;-)

I am really clueless how this would be possible - I even made sure gpg-agent was killed before running the test.

Currently, I have no idea how this should be possible - but the bug is well known at least in the Novell bug tracker, as you can see in Comment #16 on
https://bugzilla.novell.com/show_bug.cgi?id=443693

I did encounter exactly the same problems as described there - and now I'm not; not any longer.

I even cleared the compiler cache to be sure - but still no joy.

As I am no longer able to reproduce the bug myself, we might close it with resolution WORKSFORME if you prefer.

However, the bug is still hidden somewhere, and might re-surface sooner or later... as the Novell bugtracker reports, the only correct resolution would be to add -fno-strict-aliasing to the package's CFLAGS (it had been resolved as FIXED in the Novell bug tracker then).

Is it possible the new GCC-4.3 code generator might incorporate Monte Carlo based heuristic optimization algorithms?

That might explain why the generated code might not always behave the same.

OTOH, I never have heard about such code-generation strategies in GCC-4.

Anyway, it really puzzles me why the library suddenly works!

And, BTW: I have also encountered the very same bug on an x86 installation in the meantime - it therefore seems not to be an amd64-specific issue but rather exclusively a gcc-4.3.2-specific issue.

But as long as it works... at least now for the moment... ;-)
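
The one-line test above can be extended into a small check over the AES variants. This is only a sketch under assumptions: the function name check_cipher and the GPG variable are made up for illustration, and the gpg invocation is the one from this comment. Newer GnuPG releases (2.1 and later) may additionally need --pinentry-mode loopback for --passphrase to take effect, which is not added here.

```shell
#!/bin/sh
# GPG names the binary to test; defaulting to "gpg" is an assumption.
GPG=${GPG:-gpg}

# Symmetrically encrypt a short string with the given cipher and report
# whether the command succeeded. The output of gpg itself is discarded.
check_cipher() {
    if echo test | "$GPG" --batch -c --cipher-algo="$1" \
            --passphrase=unsecure >/dev/null 2>&1
    then
        echo "$1: ok"
    else
        echo "$1: FAILED"
    fi
}

# Only run the checks if the binary is actually available.
if command -v "$GPG" >/dev/null 2>&1; then
    for algo in AES AES192 AES256; do
        check_cipher "$algo"
    done
else
    echo "$GPG not found; nothing tested"
fi
```

It is worth killing gpg-agent and rebuilding libgcrypt before each run, as described above, so that a stale library or cache does not mask the result.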
Comment 11 Guenther Brunthaler 2009-05-01 16:07:06 UTC
In addition to no longer being able to reproduce the bug on amd64, the issue has now miraculously disappeared on x86 as well.

I still have no idea how this would be possible, but the powers-that-be seem to have decided that the bug suddenly has to disappear without actually changing anything.

Anyway, the impossible is good enough for me as long as it actually *works*; no matter how.

I am therefore closing the bug with resolution WORKSFORME and want to apologize to the Crypto Team for the effort the bug report has created!

I can only say that it *had* been reproducible at the time the bug was reported; but no longer is for reasons unclear to me.
Comment 12 Ole Christian Tvedt 2009-06-26 00:05:35 UTC
I am getting this exact same problem, except on a -march=nocona. Adding -fno-strict-aliasing to my CFLAGS solved the problem.

I first noticed this problem in Mutt, where I got the following on startup:
Connecting to [host_name]...AES-128 test encryption failed.
gnutls_handshake: A TLS fatal alert has been received.(Bad record MAC)

I'd be happy to dig into the code if someone could give me a tip what to look for. What is this aliasing problem?
Comment 13 Guenther Brunthaler 2009-06-26 00:32:31 UTC
(In reply to comment #12)
> I'd be happy to dig into the code if someone could give me a tip what to look
> for. What is this aliasing problem?

Look here for an explanation

http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html

But I have doubts it will be easy to remove the problem from the code without thoroughly scrutinizing it.

Generally, if code needs -fno-strict-aliasing in order to function properly, it is an indicator for either poor code quality or for hackish, non-portable code which relies on behavior that the ANSI-C standard leaves undefined, such as accessing an object through a pointer of an incompatible type. (In other words, behavior which every new compiler version could handle differently without contradicting the language standard.)

Clean, portable code will never require "-fno-strict-aliasing".