Summary: | dev-libs/libgcrypt-1.4.4 compiled with -O3 leads to "test encryption failed" error in x11-plugins/enigmail and app-crypt/gnupg | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Guenther Brunthaler <gb_about_gnu> |
Component: | [OLD] Library | Assignee: | Crypto team [DISABLED] <crypto+disabled> |
Status: | RESOLVED WORKSFORME | ||
Severity: | normal | CC: | esigra, olechrt |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | AMD64 | ||
OS: | Linux | ||
URL: | https://bugzilla.novell.com/show_bug.cgi?id=443693 | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Bug Depends on: | |||
Bug Blocks: | 915000 | ||
Attachments: |
Working Portage override script with -O3 *and* -fno-strict-aliasing
Non-working Portage override script using -O3 only |
Description
Guenther Brunthaler
2009-04-18 11:54:24 UTC
Portage 2.1.6.7 (default/linux/amd64/2008.0/desktop, gcc-4.3.2, glibc-2.8_p20080602-r1, 2.6.27-gentoo-r8-xquad-9.27 x86_64) ================================================================= System uname: Linux-2.6.27-gentoo-r8-xquad-9.27-x86_64-AMD_Phenom-tm-_9600_Quad-Core_Processor-with-glibc2.2.5 Timestamp of tree: Thu, 16 Apr 2009 11:45:01 +0000 distcc 2.18.3 x86_64-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled] ccache version 2.4 [enabled] app-shells/bash: 3.2_p39 dev-java/java-config: 2.1.7 dev-lang/python: 2.5.2-r7 dev-python/pycrypto: 2.0.1-r8 dev-util/ccache: 2.4-r7 dev-util/cmake: 2.6.2-r1 sys-apps/baselayout: 1.12.11.1 sys-apps/sandbox: 1.6-r2 sys-devel/autoconf: 2.13, 2.63 sys-devel/automake: 1.5, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.2 sys-devel/binutils: 2.18-r3 sys-devel/gcc-config: 1.4.0-r4 sys-devel/libtool: 1.5.26 virtual/os-headers: 2.6.27-r2 ACCEPT_KEYWORDS="amd64" CBUILD="x86_64-pc-linux-gnu" CFLAGS="-march=k8 -O2 -DNDEBUG -pipe -fno-stack-check" CHOST="x86_64-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/local/etc /usr/share/X11/xkb /usr/share/config /var/lib/hsqldb" CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/host-variants/ /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c /etc/udev/rules.d" CXXFLAGS="-march=k8 -O2 -DNDEBUG -pipe -fno-stack-check" DISTDIR="/usr/portage/distfiles" EMERGE_DEFAULT_OPTS="--nospinner --with-bdeps=y" FEATURES="ccache distlocks fixpackages notitles prelink protect-owned sandbox sfperms strict unmerge-orphans userfetch userpriv usersandbox" GENTOO_MIRRORS="http://lug.mtu.edu/gentoo/ ftp://gentoo.mirrors.tds.net/gentoo http://gentoo.mirrors.tds.net/gentoo http://gentoo.chem.wisc.edu/gentoo/ http://gentoo-euetib.upc.es/mirror/gentoo/ ftp://gentoo.chem.wisc.edu/gentoo/ 
ftp://distro.ibiblio.org/pub/linux/distributions/gentoo/ ftp://gentoo.in.th/ http://ftp.twaren.net/Linux/Gentoo/ ftp://ftp.twaren.net/Linux/Gentoo/" LANG="de_AT.utf8" LDFLAGS="-Wl,-O1" LINGUAS="de" MAKEOPTS="-j5" PKGDIR="/usr/portage/packages" PORTAGE_COMPRESS="lzma" PORTAGE_COMPRESS_FLAGS="-9" PORTAGE_CONFIGROOT="/" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="/usr/portage/local/layman/mscgen /usr/portage/local/layman/xworld /usr/portage/local/layman/simplux /usr/portage/local/layman/xworld_attic /usr/portage/local/layman/xworld_serviced /usr/portage/local/layman/xworld_hotfixes /usr/portage/local/layman/xworld_thirdparty /usr/portage/local/overlay" SYNC="rsync://rsync.de.gentoo.org/gentoo-portage" USE="3dnow 3dnowext X a52 aac aalib acpi alsa amd64 apache2 arts aspell audiofile bash-completion berkdb branding bzip2 cairo caps cddb cdr cleartype cli cracklib crypt css cups curl custom-cflags custom-cxxflags dbus dri dts dv dvd dvdr dvdread ecc emboss encode evo exif expat ffmpeg fftw firefox flac foomaticdb fortran freetype ftp fuse gd gdbm gif gimp glade glut gmp gphoto2 gpm gtk gtk2 hal iconv idea ieee1394 imagemagick imlib isdnlog jack java6 javascript jbig jp2 jpeg jpeg2k kde kdeenablefinal kdehiddenvisibility kdexdeltas kipi kpathsea lame lcms ldap libcaca libclamav libnotify libsamplerate logrotate lzma lzo mad matroska midi mikmod mmap mmx mmxext mng mp3 mpeg mudflap mule multilib musepack musicbrainz ncurses nls nodrm nptl nptlonly nsplugin oav ocamlopt odbc offensive ofx ogg openal opengl openmp pam pcre pdf perl pic png ppds pppd pulseaudio python qt qt3 qt3support qt4 quicktime readline reflection samba sasl screen sdl session sharedmem slang smartcard sndfile sox speex spell spl sqlite sse sse2 sse3 sse4a ssl startup-notification svg sysfs 
tcltk tetex theora threads threadsafe tiff tk truetype unicode usb userlocales utf8 vcd vde vorbis wxwindows x264 xft xml xorg xosd xpm xrandr xscreensaver xsl xulrunner xv xvid xvmc zlib" ALSA_CARDS="emu10k1" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="evdev joystick keyboard mouse void" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="de" USERLAND="GNU" VIDEO_CARDS="dummy radeon v4l vesa vga" Unset: CPPFLAGS, CTARGET, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_RSYNC_EXTRA_OPTS
Just in case I did not express myself clearly in the previous posting: The -O2 or -fno-strict-aliasing is only required for building libgcrypt. EnigMail or GnuPG build without any problems using -O3 as is; there is no requirement for adding -fno-strict-aliasing to those ebuilds.
I have installed dev-libs/libgcrypt-1.4.4 with -O3, but I'm not sure how to reproduce this bug. Please write precise steps to reproduce this bug.
(In reply to comment #3)
> I have installed dev-libs/libgcrypt-1.4.4 with -O3, but I'm not sure how to
> reproduce this bug.
This seems to be a compiler optimization issue, so it might only apply if exactly the same compiler version and build platform are used. In my case, this would be amd64 and gcc-4.3.2. The "-march" setting might also have some influence - I was using "-march=k8" at the time of the compilation.
The origin of the problem seems to be that gcc-4.3 applies "strict aliasing" rules *more strictly* than the (obviously) lazily written code in libgcrypt can tolerate. And -fstrict-aliasing is enabled by default with -O3.
> Please write precise steps to reproduce this bug.
I did nothing special other than emerging the package (actually, I did it indirectly by updating gnupg) with "heavy optimization" flag overrides in place.
For your reference, I will post my overrides script as an attachment; it has to go to /etc/portage/env/<category>/<package> in order to affect a package. However, setting the modified variables directly on the command line or in /etc/make.conf should have the same effect. You thus might try:
$ CFLAGS="-march=k8 -O3 -DNDEBUG -fomit-frame-pointer -fno-stack-check" CHOST=x86_64-pc-linux-gnu ebuild `equery which =libgcrypt-1.4.4` clean compile
(which should not work) and
$ CFLAGS="-march=k8 -O3 -DNDEBUG -fno-omit-frame-pointer -fno-stack-check -fno-strict-aliasing" CHOST=x86_64-pc-linux-gnu ebuild `equery which =libgcrypt-1.4.4` clean compile
which should work.
Created attachment 190033 [details]
Working Portage override script with -O3 *and* -fno-strict-aliasing
Save this script as
/etc/portage/env/dev-libs/libgcrypt
in order to make libgcrypt compile without problems.
This is the Portage env-override script I use which *does* work.
It includes -fno-strict-aliasing in addition to -O3.
Note that my normal global CFLAGS only use -O2, because by default I prefer short code over fast code, as most packages in the tree are not performance-critical.
(In reply to comment #4)
> > Please write precise steps to reproduce this bug.
>
> I did nothing special other than emerging the package (actually, I did it
> indirectly by updating gnupg) with "heavy optimization" flag overrides in
> place.
How can the "Test encryption failed" error message be reproduced?
Created attachment 190036 [details]
Non-working Portage override script using -O3 only
If you save *this* script as /etc/portage/env/dev-libs/libgcrypt, you will encounter the effect described in this bug report, provided you use the same basic CFLAGS (in /etc/make.conf or in the command line) as well:
CFLAGS="-march=k8 -O2 -DNDEBUG -pipe -fno-stack-check"
(In reply to comment #5)
> Created an attachment (id=190033)
Using this environment override script can also be considered the hotfix for this bug. However, -fno-strict-aliasing should really go into the ebuild rather than coming from a hotfix.
(In reply to comment #6)
> How to reproduce "Test encryption failed" error message?
Oh, that's simple: Just use gnupg to decrypt any encrypted message using AES as the actual encryption (that is, the RSA/DSA-encrypted session key will encrypt the actual message contents with AES), and the message will be displayed.
What the ...? I am no longer able to reproduce the bug myself!!! ;-)
I just reinstalled the library with the old, non-working setting, in order to be able to provide you with a better test case, such as
$ echo test | gpg --batch -c --cipher-algo=aes --passphrase=unsecure > /dev/null
BUT... suddenly it all works even without -fno-strict-aliasing! ;-)
I am really clueless how this would be possible - I even made sure gpg-agent was killed before running the test.
Currently, I have no idea how this should be possible - but the bug is well known at least in the Novell bug tracker, as you can see in Comment #16 on https://bugzilla.novell.com/show_bug.cgi?id=443693
I did encounter exactly the same problems as described there - and now I don't; not any longer. I even cleared the compiler cache to be sure - but still no joy.
As I am no longer able to reproduce the bug myself, we might close it with resolution WORKSFORME if you prefer. However, the bug is still hidden somewhere, and might resurface sooner or later... as the Novell bug tracker reports, the only correct resolution would be to add -fno-strict-aliasing to the package's CFLAGS (it had been resolved as FIXED in the Novell bug tracker then).
Is it possible that the new GCC-4.3 code generator incorporates Monte Carlo based heuristic optimization algorithms? That might explain why the generated code might not always behave the same. OTOH, I have never heard about such code-generation strategies in GCC-4. Anyway, it really puzzles me why the library suddenly works!
And, BTW: I have also encountered the very same bug on an x86 installation in the meantime - it therefore seems not to be an amd64-specific issue but rather exclusively a gcc-4.3.2-specific issue. But as long as it works... at least now for the moment... ;-)
In addition to no longer being able to reproduce the bug on amd64, the issue has now miraculously disappeared on x86 as well. I still have no idea how this would be possible, but the powers-that-be seem to have decided that the bug suddenly has to disappear without actually changing anything. Anyway, the impossible is good enough for me as long as it actually *works*; no matter how.
I am therefore closing the bug with resolution WORKSFORME and want to apologize to the Crypto Team for the effort the bug report has created! I can only say that it *had* been reproducible at the time the bug was reported, but no longer is, for reasons unclear to me.
I am getting this exact same problem, except on a -march=nocona. Adding -fno-strict-aliasing to my CFLAGS solved the problem. I first noticed this problem in Mutt, where I got the following on startup:
Connecting to [host_name]...AES-128 test encryption failed. gnutls_handshake: A TLS fatal alert has been received.(Bad record MAC)
I'd be happy to dig into the code if someone could give me a tip what to look for. What is this aliasing problem?
(In reply to comment #12)
> I'd be happy to dig into the code if someone could give me a tip what to look
> for. What is this aliasing problem?
Look here for an explanation: http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html
But I have doubts it will be easy to remove the problem from the code without thoroughly scrutinizing it.
Generally, if code needs -fno-strict-aliasing in order to function properly, it is an indicator of either poor code quality or hackish, non-portable code which relies on behavior that the C standard leaves undefined, such as accessing an object through a pointer of an incompatible type. (In other words, behavior which every new compiler version could handle differently without contradicting the language standard.) Clean, portable code will never require "-fno-strict-aliasing". |
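To make the aliasing question from comment #12 concrete, here is a minimal sketch in C. It is a hypothetical illustration, not code taken from libgcrypt, and the thread does not identify which construct in libgcrypt actually misbehaves. The function names `load_word_punned`, `load_word_safe` and `xor_block_safe` are invented for this example. The first function shows the kind of pointer cast that strict aliasing allows the optimizer to miscompile; the other two show the usual well-defined replacement using `memcpy`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical example, NOT libgcrypt code.
 * Reads 4 bytes of a byte buffer through a uint32_t pointer. If the
 * storage was not created as uint32_t, this violates the aliasing rules
 * (and may be misaligned), so the behavior is undefined. With
 * -fstrict-aliasing the compiler may reorder or drop such accesses. */
uint32_t load_word_punned(const unsigned char *buf)
{
    return *(const uint32_t *)buf;
}

/* Well-defined alternative: memcpy copies the object representation,
 * and compilers typically turn it into a single load. */
uint32_t load_word_safe(const unsigned char *buf)
{
    uint32_t w;
    memcpy(&w, buf, sizeof w);
    return w;
}

/* XOR a 16-byte block into dst one 32-bit word at a time, without
 * casting the byte pointers to a wider type. */
void xor_block_safe(unsigned char *dst, const unsigned char *src)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < 16; i += sizeof(uint32_t)) {
        uint32_t a, b;
        memcpy(&a, dst + i, sizeof a);
        memcpy(&b, src + i, sizeof b);
        a ^= b;
        memcpy(dst + i, &a, sizeof a);
    }
}
```

Code written in the style of `load_word_punned` may appear to work at one optimization level or compiler version and fail at another, which would be consistent with the on-and-off behavior reported in this thread. Building with -fno-strict-aliasing makes the compiler treat such casts conservatively; rewriting them along the lines of the `memcpy` variants removes the dependency on that flag.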