Bug 545492 - =sys-kernel/hardened-sources-3.19.3: crash: PAX: size overflow detected in function async_copy_data.isra.38
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Hardened
Hardware: AMD64 Linux
Importance: Normal normal
Assignee: The Gentoo Linux Hardened Team
Reported: 2015-04-04 10:26 UTC by jack_mort
Modified: 2015-05-10 14:24 UTC (History)

Attachments
Stack trace + debug from raid5.c (file_545492.txt,10.44 KB, text/plain)
2015-04-06 15:36 UTC, jack_mort
Stack trace from 3.19.3-r1 (file_545492.txt,1.95 KB, text/plain)
2015-04-11 08:32 UTC, jack_mort

Description jack_mort 2015-04-04 10:26:58 UTC
I have been getting crashes since the 3.19 series, and today I was able to capture the crash log on 3.19.3. The kernel boots fine, but after a few minutes it throws a size overflow error and can no longer access my RAID array.
A hard reboot is then required.

[avril 4 11:38] PAX: size overflow detected in function async_copy_data.isra.38 drivers/md/raid5.c:946 cicus.1056_137 min, count: 60
[  +0,000012] CPU: 0 PID: 2210 Comm: md127_raid5 Tainted: G           O   3.19.3-hardened #1
[  +0,000004] Hardware name: MSI MS-7592/G41M-P33 Combo(MS-7592), BIOS V32.12 09/13/2013
[  +0,000003]  2e62af56a4220b55 ffffffffa011f51e 0000000000000000 ffffffffa011f51e
[  +0,000007]  ffffffff81609dfc ffffffffa011f61e ffffffff8114a055 0000000000080000
[  +0,000006]  00000000dedfac08 ffff8800c6b17180 ffff8800c6b175f0 0000000000000002
[  +0,000006] Call Trace:
[  +0,000030]  [<ffffffffa011f51e>] ? raid5_exit+0x51e/0x2bd8 [raid456]
[  +0,000011]  [<ffffffffa011f51e>] ? raid5_exit+0x51e/0x2bd8 [raid456]
[  +0,000008]  [<ffffffff81609dfc>] ? dump_stack+0x40/0x56
[  +0,000010]  [<ffffffffa011f61e>] ? raid5_exit+0x61e/0x2bd8 [raid456]
[  +0,000007]  [<ffffffff8114a055>] ? report_size_overflow+0x35/0x40
[  +0,000011]  [<ffffffffa0117ca5>] ? async_copy_data.isra.38+0x405/0x470 [raid456]
[  +0,000011]  [<ffffffffa00f8141>] ? async_xor+0x141/0x180 [async_xor]
[  +0,000010]  [<ffffffffa01183e3>] ? raid_run_ops+0x6d3/0xfa0 [raid456]
[  +0,000010]  [<ffffffffa0115670>] ? release_stripe+0x100/0x100 [raid456]
[  +0,000010]  [<ffffffffa011bdd8>] ? handle_stripe+0xbf8/0x2170 [raid456]
[  +0,000007]  [<ffffffff8109be15>] ? sched_clock_local+0x15/0x80
[  +0,000006]  [<ffffffff8109c068>] ? sched_clock_cpu+0x88/0xb0
[  +0,000006]  [<ffffffff810a30ab>] ? pick_next_task_fair+0x33b/0x480
[  +0,000010]  [<ffffffffa011d4ae>] ? handle_active_stripes.isra.39+0x15e/0x3d0 [raid456]
[  +0,000010]  [<ffffffffa011dace>] ? raid5d+0x30e/0x4d0 [raid456]
[  +0,000015]  [<ffffffffa00a5c29>] ? md_thread+0x139/0x140 [md_mod]
[  +0,000006]  [<ffffffff810a7de0>] ? wait_woken+0xa0/0xa0
[  +0,000012]  [<ffffffffa00a5af0>] ? md_start_sync+0xf0/0xf0 [md_mod]
[  +0,000007]  [<ffffffff810905ff>] ? kthread+0xdf/0x100
[  +0,000005]  [<ffffffff81090520>] ? kthread_create_on_node+0x170/0x170
[  +0,000007]  [<ffffffff8160f219>] ? ret_from_fork+0x49/0x80
[  +0,000006]  [<ffffffff81090520>] ? kthread_create_on_node+0x170/0x170

Reproducible: Always

Steps to Reproduce:
1. Build hardened sources 3.19.3 with raid support
2. Reboot with new kernel
3. Wait a few minutes
Actual Results:  
The RAID array crashes and becomes unusable. I had to switch back to the 3.18.9 kernel.


Portage 2.2.18 (python 3.4.3-final-0, hardened/linux/amd64, gcc-4.9.2-vanilla, glibc-2.20-r2, 3.19.3-hardened x86_64)
=================================================================
System uname: Linux-3.18.9-hardened-x86_64-Genuine_Intel-R-_CPU_2160_@_1.80GHz-with-gentoo-2.2
KiB Mem:     4007456 total,   1187884 free
KiB Swap:     153596 total,    153596 free
Timestamp of repository gentoo: Fri, 03 Apr 2015 07:30:01 +0000
sh bash 4.3_p33-r2
ld GNU ld (Gentoo 2.25 p1.0) 2.25
ccache version 3.2.1 [enabled]
app-shells/bash:          4.3_p33-r2::gentoo
dev-java/java-config:     2.2.0::gentoo
dev-lang/perl:            5.20.2::gentoo
dev-lang/python:          2.7.9-r2::gentoo, 3.4.3::gentoo
dev-util/ccache:          3.2.1-r1::gentoo
dev-util/cmake:           3.1.0::gentoo
dev-util/pkgconfig:       0.28-r2::gentoo
sys-apps/baselayout:      2.2::gentoo
sys-apps/openrc:          0.13.11::gentoo
sys-apps/sandbox:         2.6-r1::gentoo
sys-devel/autoconf:       2.69-r1::gentoo
sys-devel/automake:       1.14.1::gentoo, 1.15::gentoo
sys-devel/binutils:       2.25::gentoo
sys-devel/gcc:            4.9.2::gentoo
sys-devel/gcc-config:     1.8::gentoo
sys-devel/libtool:        2.4.6-r1::gentoo
sys-devel/make:           4.1-r1::gentoo
sys-kernel/linux-headers: 3.19::gentoo (virtual/os-headers)
sys-libs/glibc:           2.20-r2::gentoo
Repositories:

gentoo
    location: /usr/portage
    sync-type: rsync
    sync-uri: rsync://rsync.gentoo.org/gentoo-portage
    priority: -1000

x-portage
    location: /usr/local/portage
    masters: gentoo
    priority: 0

ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA dlj-1.1 Oracle-BCLA-JavaSE"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt /var/bind"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/php/apache2-php5.6/ext-active/ /etc/php/cgi-php5.6/ext-active/ /etc/php/cli-php5.6/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FCFLAGS="-O2 -pipe"
FEATURES="assume-digests binpkg-logs buildsyspkg ccache config-protect-if-modified distlocks ebuild-locks fixlafiles merge-sync news parallel-fetch preserve-libs protect-owned sandbox sfperms strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync xattr"
FFLAGS="-O2 -pipe"
GENTOO_MIRRORS="http://ftp.belnet.be/mirror/rsync.gentoo.org/gentoo/ http://gentoo.inode.at/ http://mirrors.sec.informatik.tu-darmstadt.de/gentoo"
LANG="fr_FR.utf8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --omit-dir-times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
USE="acl acpi amd64 apache2 bzip2 caps ccache cli cracklib crypt cxx device-mapper gnutls hardened iconv ipv6 jpeg justify libav logrotate mmx mmxext modules multilib mysql mysqli ncurses nls nptl openmp pam pax_kernel pcre php png readline samba session smp snmp sse sse2 ssl syslog tcpd threads truetype unicode urandom usb vhosts xattr xinetd xml xtpax zlib" ABI_X86="32 64" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" APACHE2_MODULES="access actions alias auth_basic auth_digest authn_anon authn_core authn_dbd authn_dbm authn_default authn_file authz_core authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi compat dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident imagemap include info lbmethod_byrequests lbmethod_bytraffic lbmethod_bybusyness lbmethod_heartbeat log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif slotmem_shm so socache_shmcb speling status unique_id unixd userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump author" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" CPU_FLAGS_X86="mmx mmxext sse sse2 sse3 ssse3" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ublox ubx" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIBREOFFICE_EXTENSIONS="presenter-console presenter-minimizer" LINGUAS="fr" OFFICE_IMPLEMENTATION="libreoffice" PHP_TARGETS="php5-6" PYTHON_SINGLE_TARGET="python3_4" 
PYTHON_TARGETS="python2_7 python3_4" RUBY_TARGETS="ruby19 ruby20" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga nouveau nv r128 radeon savage sis tdfx trident vesa via vmware dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
USE_PYTHON="2.7 3.4"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LC_ALL, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 PaX Team 2015-04-04 18:29:13 UTC
please see https://bugs.gentoo.org/show_bug.cgi?id=545192#c8 for further instructions to help us debug this.
Comment 2 jack_mort 2015-04-06 15:36:28 UTC
Created attachment 400694 [details]
Stack trace + debug from raid5.c

Here are the raid5.c.* files compiled with the EXTRA_CFLAGS: https://www.dropbox.com/s/jpdjc660uoyqvi0/raid5.c_star.tar.bz2?dl=0

Also attached is the stack trace with frame pointers enabled.
Comment 3 PaX Team 2015-04-07 18:33:56 UTC
can you give the following patch a try:

--- a/drivers/md/raid5.c      2015-03-08 00:50:15.749728367 +0100
+++ b/drivers/md/raid5.c  2015-04-07 20:32:12.559605536 +0200
@@ -950,14 +950,14 @@
        struct bio_vec bvl;
        struct bvec_iter iter;
        struct page *bio_page;
-       int page_offset;
+       s64 page_offset;
        struct async_submit_ctl submit;
        enum async_tx_flags flags = 0;

        if (bio->bi_iter.bi_sector >= sector)
-               page_offset = (signed)(bio->bi_iter.bi_sector - sector) * 512;
+               page_offset = (s64)(bio->bi_iter.bi_sector - sector) * 512;
        else
-               page_offset = (signed)(sector - bio->bi_iter.bi_sector) * -512;
+               page_offset = (s64)(sector - bio->bi_iter.bi_sector) * -512;

        if (frombio)
                flags |= ASYNC_TX_FENCE;
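
Editorial illustration (not part of the original thread): the patch above replaces an `int`-width byte offset with a 64-bit one. The sketch below is a minimal, standalone user-space C example of why the width matters. It is not kernel code and not a diagnosis of the reported crash. The names `sector_demo_t`, `page_offset_wide`, `page_offset_fits_int` and `demo_self_check` are hypothetical, and `sector_demo_t` assumes a 64-bit unsigned sector type.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's sector type (assumed 64-bit unsigned). */
typedef uint64_t sector_demo_t;

/*
 * Byte offset between two sectors, computed entirely in 64 bits,
 * mirroring the shape of the s64 variant in the patch.
 * Assumption: the sectors are close enough that the product fits in int64_t.
 */
static int64_t page_offset_wide(sector_demo_t bio_sector, sector_demo_t sector)
{
    if (bio_sector >= sector)
        return (int64_t)(bio_sector - sector) * 512;
    return (int64_t)(sector - bio_sector) * -512;
}

/* Returns 1 if the same offset would also be representable in a plain int. */
static int page_offset_fits_int(sector_demo_t bio_sector, sector_demo_t sector)
{
    int64_t wide = page_offset_wide(bio_sector, sector);
    return wide >= INT_MIN && wide <= INT_MAX;
}

/* Small self-check: a few sectors apart fits, 2^22 sectors ahead does not. */
static void demo_self_check(void)
{
    assert(page_offset_wide(10, 8) == 1024);
    assert(page_offset_wide(8, 10) == -1024);
    assert(page_offset_fits_int(10, 8) == 1);
    /* 2^22 sectors * 512 bytes = 2^31, one more than INT_MAX on 32-bit int. */
    assert(page_offset_fits_int((sector_demo_t)1 << 22, 0) == 0);
}
```

Calling `demo_self_check()` from a `main` exercises both cases. The wide computation stays exact for any sector distance that fits the stated assumption, whereas an `int` result is only valid while the byte offset stays within `INT_MIN..INT_MAX`.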
Comment 4 jack_mort 2015-04-08 20:44:25 UTC
I couldn't capture the error (I was on the console, not over ssh), but with the patch I got the same issue.
I can try to capture the error later if you need it.
Comment 5 PaX Team 2015-04-08 20:51:39 UTC
(In reply to jack_mort from comment #4)
> I couldn't catch the error (I wasn't under ssh, only console) but, with the
> patch I got the same issue. 

it can't be the same problem, since with the patch that place doesn't get instrumented anymore, but i can confirm it if you generate and upload a new set of raid5.c.* files.

> I can try to get the error later if you need.

yes please ;).
Comment 6 jack_mort 2015-04-11 08:32:55 UTC
Created attachment 401040 [details]
Stack trace from 3.19.3-r1

Hi,
I've attached a new stack trace from 3.19.3-r1, and here are the raid5.c* files:

https://www.dropbox.com/s/bnwiku9tyfziahy/raid5.c_star.tar.bz2?dl=0

It's hard to test because it puts a lot of stress on my RAID array: as soon as I get the infamous size overflow, RAID detects an error in the array and starts a rebuild :-/
Comment 7 PaX Team 2015-04-12 17:12:12 UTC
sorry for your troubles but gcc induced integer overflows are hard to handle... in any case, if you feel like doing one (hopefully) last test, can you revert the previous patch and apply this one instead please:

--- a/drivers/md/raid5.c  2015-03-18 15:21:50.408349253 +0100
+++ b/drivers/md/raid5.c  2015-04-12 18:02:35.037337098 +0200
@@ -950,23 +950,23 @@
        struct bio_vec bvl;
        struct bvec_iter iter;
        struct page *bio_page;
-       int page_offset;
+       long page_offset;
        struct async_submit_ctl submit;
        enum async_tx_flags flags = 0;

        if (bio->bi_iter.bi_sector >= sector)
-               page_offset = (signed)(bio->bi_iter.bi_sector - sector) * 512;
+               page_offset = (long)(bio->bi_iter.bi_sector - sector) * 512;
        else
-               page_offset = (signed)(sector - bio->bi_iter.bi_sector) * -512;
+               page_offset = (long)(sector - bio->bi_iter.bi_sector) * -512;

        if (frombio)
                flags |= ASYNC_TX_FENCE;
        init_async_submit(&submit, flags, tx, NULL, NULL, NULL);

        bio_for_each_segment(bvl, bio, iter) {
-               int len = bvl.bv_len;
-               int clen;
-               int b_offset = 0;
+               long len = bvl.bv_len;
+               long clen;
+               long b_offset = 0;

                if (page_offset < 0) {
                        b_offset = -page_offset;
Comment 8 jack_mort 2015-04-16 12:08:07 UTC
Sorry, I had no time to test the new patch. I'll try to test by the end of the week.
Comment 9 Anthony Basile gentoo-dev 2015-04-16 20:45:47 UTC
(In reply to jack_mort from comment #8)
> Sorry, I had no time to test the new patch. I'll try to test by the end of
> the week.

I yanked 3.19.3 from the tree.  Can you try with 3.19.4?
Comment 10 dB 2015-04-21 08:51:54 UTC
I also had this same problem on 3.19.3. I did not get a chance to try 3.19.4, but I don't seem to be experiencing the problem in 3.19.5, so I'm not sure where or if it really got fixed.
Comment 11 PaX Team 2015-04-21 09:43:59 UTC
(In reply to dB from comment #10)
> I also had this same problem on 3.19.3. I did not get a chance to try
> 3.19.4, but I don't seem to be experiencing the problem in 3.19.5, so I'm
> not sure where or if it really got fixed.

i already released the fix in a later patch for 3.19.3 so that would explain it ;). if others who experienced the issue can also confirm it then we can close this bug.
Comment 12 jack_mort 2015-04-21 19:04:05 UTC
I may have gotten something wrong, but for now I can't boot 3.19.5. I'm going back to 3.18.9 to rebuild a fresh 3.19.5 and give it another try.
Comment 13 jack_mort 2015-04-26 19:27:02 UTC
OK, I was able to run another test, and 3.19.5 booted fine, with no size overflow after a few hours of uptime!
It seems to be fixed for me, thanks for the support :-)
Comment 14 Anthony Basile gentoo-dev 2015-05-10 14:24:00 UTC
(In reply to jack_mort from comment #13)
> OK, I could run another test, and 3.19.5 booted fine... with no size
> overflow after a few hours uptime !
> It seems to be fixed for me, thanks for the support :-)

Thanks.