Bug 218907 - sys-kernel/gentoo-sources-2.6.25-r1 swapper: page allocation failure. order:3, mode:0x4020
Product: Gentoo Linux
Component: [OLD] Core system
Hardware: x86 Linux
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
Reported: 2008-04-22 16:28 UTC by Weedy
Modified: 2008-09-16 12:33 UTC
2.6.25 kern.log (60uaYq86.txt,248.13 KB, text/plain)
2008-04-22 16:34 UTC, Weedy
2.6.24 kern.log (lJ3GCm32.txt,32.98 KB, text/plain)
2008-04-22 16:41 UTC, Weedy

Description Weedy 2008-04-22 16:28:22 UTC
The box locks up (I think, networking dies for sure).

Reproducible: Always

Steps to Reproduce:
1. install gentoo-sources-2.6.25-r1
2. start apache/postfix/other crap most web servers run
3. wait
Actual Results:  
Apr 21 02:32:44 Chii swapper: page allocation failure. order:3, mode:0x4020
Apr 21 02:32:44 Chii Pid: 0, comm: swapper Not tainted 2.6.25-gentoo-r1 #1
Apr 21 02:32:44 Chii [<c013ad40>] __alloc_pages+0x29b/0x2ae
Apr 21 02:32:44 Chii [<c014eed4>] __slab_alloc+0x123/0x39b
Apr 21 02:32:44 Chii [<c014fc36>] __kmalloc_track_caller+0x63/0xb2
Apr 21 02:32:44 Chii [<c025632f>] ? __netdev_alloc_skb+0x17/0x34
Apr 21 02:32:44 Chii [<c025632f>] ? __netdev_alloc_skb+0x17/0x34
Apr 21 02:32:44 Chii [<c0255752>] __alloc_skb+0x4a/0x107
Apr 21 02:32:44 Chii [<c025632f>] __netdev_alloc_skb+0x17/0x34
Apr 21 02:32:44 Chii [<f8852124>] e100_rx_alloc_skb+0x1c/0x77 [e100]
Apr 21 02:32:44 Chii [<f8853972>] e100_poll+0x1ce/0x2bd [e100]
Apr 21 02:32:44 Chii [<c025b29b>] net_rx_action+0x69/0x136
Apr 21 02:32:44 Chii [<c011c984>] __do_softirq+0x38/0x7a
Apr 21 02:32:44 Chii [<c0104913>] do_softirq+0x3e/0x71
Apr 21 02:32:44 Chii [<c0134414>] ? handle_fasteoi_irq+0x0/0x7b
Apr 21 02:32:44 Chii [<c011c91d>] irq_exit+0x28/0x57
Apr 21 02:32:44 Chii [<c01049f5>] do_IRQ+0xaf/0xc4
Apr 21 02:32:44 Chii [<c01032c7>] common_interrupt+0x23/0x28
Apr 21 02:32:44 Chii [<c01fe8ed>] ? acpi_processor_idle+0x2a3/0x432
Apr 21 02:32:44 Chii [<c01fe64a>] ? acpi_processor_idle+0x0/0x432
Apr 21 02:32:44 Chii [<c0101712>] cpu_idle+0x4e/0x60
Apr 21 02:32:44 Chii [<c02beeab>] rest_init+0x43/0x45
Apr 21 02:32:44 Chii =======================
Apr 21 02:32:44 Chii Mem-info:
Apr 21 02:32:44 Chii DMA per-cpu:
Apr 21 02:32:44 Chii CPU    0: hi:    0, btch:   1 usd:   0
Apr 21 02:32:44 Chii Normal per-cpu:
Apr 21 02:32:44 Chii CPU    0: hi:  186, btch:  31 usd:  29
Apr 21 02:32:44 Chii HighMem per-cpu:
Apr 21 02:32:44 Chii CPU    0: hi:   18, btch:   3 usd:   0
Apr 21 02:32:44 Chii Active:144380 inactive:64390 dirty:11971 writeback:10 unstable:0
Apr 21 02:32:44 Chii free:24603 slab:7514 mapped:48289 pagetables:560 bounce:0
Apr 21 02:32:44 Chii DMA free:3520kB min:68kB low:84kB high:100kB active:7332kB inactive:56kB present:16256kB pages_scanned:0 all_unreclaimable? no
Apr 21 02:32:44 Chii lowmem_reserve[]: 0 873 936 936
Apr 21 02:32:44 Chii Normal free:94548kB min:3744kB low:4680kB high:5616kB active:513380kB inactive:250740kB present:894080kB pages_scanned:0 all_unreclaimable? no
Apr 21 02:32:44 Chii lowmem_reserve[]: 0 0 505 505
Apr 21 02:32:44 Chii HighMem free:344kB min:128kB low:192kB high:260kB active:56808kB inactive:6764kB present:64708kB pages_scanned:0 all_unreclaimable? no
Apr 21 02:32:44 Chii lowmem_reserve[]: 0 0 0 0
Apr 21 02:32:44 Chii DMA: 36*4kB 34*8kB 30*16kB 0*32kB 1*64kB 0*128kB 0*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 3520kB
Apr 21 02:32:44 Chii Normal: 9847*4kB 5317*8kB 777*16kB 0*32kB 1*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 94548kB
Apr 21 02:32:44 Chii HighMem: 38*4kB 10*8kB 1*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 344kB
Apr 21 02:32:44 Chii 132404 total pagecache pages
Apr 21 02:32:44 Chii Swap cache: add 5426, delete 4918, find 1281/1651
Apr 21 02:32:44 Chii Free swap  = 2139128kB
Apr 21 02:32:44 Chii Total swap = 2144668kB
Apr 21 02:32:44 Chii Free swap:       2139128kB
Apr 21 02:32:44 Chii 245680 pages of RAM
Apr 21 02:32:44 Chii 16304 pages of HIGHMEM
Apr 21 02:32:44 Chii 3088 reserved pages
Apr 21 02:32:44 Chii 190713 pages shared
Apr 21 02:32:44 Chii 508 pages swap cached
Apr 21 02:32:44 Chii 11971 pages dirty
Apr 21 02:32:44 Chii 10 pages writeback
Apr 21 02:32:44 Chii 48289 pages mapped
Apr 21 02:32:44 Chii 7514 pages slab
Apr 21 02:32:44 Chii 560 pages pagetables

Expected Results:  
It should not have died.

Portage 2.1.5_rc6 (hardened/x86/2.6, gcc-4.2.0, glibc-2.7-r2, 2.6.25-gentoo-r1 i686)
System uname: 2.6.25-gentoo-r1 i686 AMD Athlon(tm) 64 Processor 3800+
Timestamp of tree: Tue, 22 Apr 2008 01:16:01 +0000
ccache version 2.4 [enabled]
app-shells/bash:     3.2_p33
dev-lang/python:     2.4.4-r8, 2.5.1-r5
dev-python/pycrypto: 2.0.1-r6
dev-util/ccache:     2.4-r7
sys-apps/baselayout: 1.12.12
sys-devel/autoconf:  2.62
sys-devel/automake:  1.7.9-r1, 1.10.1
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.25-r1
CFLAGS="-march=native -Os -pipe -fweb -fgcse-after-reload"
CONFIG_PROTECT="/etc /var/bind"
CONFIG_PROTECT_MASK="/etc/env.d /etc/fonts/fonts.conf /etc/gconf /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/terminfo /etc/udev/rules.d"
CXXFLAGS="-march=native -Os -pipe -fweb -fgcse-after-reload"
FEATURES="candy ccache distlocks parallel-fetch sandbox sfperms strict unmerge-orphans userfetch userpriv usersandbox"
LDFLAGS="-Wl,-O1 -Wl,--enable-new-dtags -Wl,--hash-style=gnu -Wl,--sort-common -s"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTDIR_OVERLAY="/usr/local/portage /usr/local/overlays/toolchain/branches/pieworld /usr/portage/local/layman/php-testing /usr/portage/local/layman/webapps-experimental"
USE="3dnow 3dnowext acpi apache2 apic authdaemond bash-completion berkdb bzip2 ccache cgi chroot cjk cli cracklib crypt ctype curl fastbuild force-cgi-redirect fortran gd geoip glibc-omitfp gzip hardened hash iconv imagemagick imap jpeg jpeg2k libwww lm_sensors maildir mailwrapper mcrypt mhash midi mmx mmxext mpeg mtrr multislot mysql mysqli nls nptl nptlonly pam pcre pdo perl pic pie png pni readline sasl session simplexml snmp sockets spl sse sse2 ssl suexec suhosin tcpd threads tiff truetype unicode urandom vda vhost vhosts vim-syntax x86 xml xorg zlib"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol"
APACHE2_MODULES="actions alias auth_basic auth_digest authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif status unique_id usertrack version vhost_alias"
APACHE2_MPMS="worker"
ELIBC="glibc"
INPUT_DEVICES="mouse keyboard"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
USERLAND="GNU"
VIDEO_CARDS="apm ark chips cirrus cyrix dummy fbdev glint i128 i740 i810 imstt mach64 mga neomagic nsc nv r128 radeon rendition s3 s3virge savage siliconmotion sis sisusb tdfx tga trident tseng v4l vesa vga via vmware voodoo"
Comment 1 Weedy 2008-04-22 16:34:19 UTC
Created attachment 150609
2.6.25 kern.log
Comment 2 Weedy 2008-04-22 16:41:42 UTC
Created attachment 150610
2.6.24 kern.log

This is what killed the box back on 2.6.24; it might be related.
Comment 3 Kim Højgaard-Hansen 2008-04-22 17:47:13 UTC
Please make sure that it's reproducible before stating so; that was not what you told me on IRC :) Just state it here if it happens again.

Comment 4 Weedy 2008-05-14 14:31:14 UTC
Looks like it was a 4k stack issue. Sorry for the noise.
Comment 5 Duane Griffin 2008-05-14 14:36:12 UTC
OK, thanks for the update. Since I'd just finished typing this lot up :), you may want to address the TCP warnings in your logs anyway though:

The "page allocation failure" is caused by using jumbo frames with the e100 driver. It should be non-fatal (expected, even), but you may want to get rid of it anyway by changing the NIC's MTU to 1500.
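
As a rough sketch of the MTU change suggested here: the interface name `eth0` is an assumption (substitute whichever interface the e100 card is bound to), and `set_mtu_cmd` is a hypothetical helper that only prints the iproute2 command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# IFACE is an assumption; use the interface that belongs to the e100 NIC.
IFACE="${IFACE:-eth0}"

# Hypothetical helper: print the iproute2 command that sets the MTU.
# $1 = interface name, $2 = MTU (defaults to the standard Ethernet 1500).
set_mtu_cmd() {
    printf 'ip link set dev %s mtu %s\n' "$1" "${2:-1500}"
}

# Dry run: show the command. Run the printed line as root to apply it.
set_mtu_cmd "$IFACE" 1500
```

The current value can be checked beforehand with `cat /sys/class/net/eth0/mtu` (same interface-name assumption).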

The warnings from your 2.6.24 log are issued when your kernel detects other machines with broken TCP implementations. Perhaps your traffic is going through a dodgy old firewall/router box? These should also be non-fatal, although the FRTO warning is indicating a potentially serious problem. Please try doing the following (as root) and see if it helps:

echo "0" > /proc/sys/net/ipv4/tcp_frto
Comment 6 Weedy 2008-05-14 14:53:45 UTC
I thought jumbo frames were a gigabit thing? Since it's on a 100 Mbit port, I will make sure to force an MTU of 1500.

I have been setting tcp_frto to 0 in sysctl.conf for a while actually. It has made that shut up.
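
For reference, a minimal sketch of how the /proc path from comment 5 maps to the key used in sysctl.conf. The helper name `proc_to_sysctl_key` is made up for illustration; the file is assumed to be /etc/sysctl.conf.

```shell
#!/bin/sh
# Hypothetical helper: turn a /proc/sys path into the dotted sysctl key.
proc_to_sysctl_key() {
    printf '%s\n' "${1#/proc/sys/}" | tr '/' '.'
}

key=$(proc_to_sysctl_key /proc/sys/net/ipv4/tcp_frto)

# Line to put in /etc/sysctl.conf so the setting survives a reboot.
printf '%s = 0\n' "$key"

# To apply it immediately instead (as root): sysctl -w "$key=0"
```

This prints `net.ipv4.tcp_frto = 0`, which has the same effect at boot as the echo into /proc.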
Comment 7 Duane Griffin 2008-05-14 17:36:43 UTC
Yep, they are a gigabit thing. Strange -- if the driver is using a standard MTU why is it asking for so much memory?
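
To put a number on "so much memory": order:3 in the log header is a request for 2^3 contiguous pages, which is 32 kB with the 4 kB pages shown in the Mem-info dump. The sketch below does that arithmetic and counts how many free blocks of at least that size a buddy-list line reports. Both function names are hypothetical, and the sample line is the "Normal:" line from the description. It only illustrates the numbers and does not explain why the allocation failed.

```shell
#!/bin/sh
# Size in kB of an allocation of the given order.
# $1 = order, $2 = page size in kB (4 assumed, matching the 4kB column in the log).
order_to_kb() {
    echo $(( (1 << $1) * ${2:-4} ))
}

# Count free blocks of at least $1 kB in a buddy-list line read from stdin.
# Tokens look like "9847*4kB": count before the '*', block size after it.
count_blocks_at_least() {
    tr ' ' '\n' | awk -F'[*k]' -v min="$1" \
        '/\*/ && $2 + 0 >= min { n += $1 } END { print n + 0 }'
}

need=$(order_to_kb 3)
echo "order 3 = ${need} kB"
echo "Normal: 9847*4kB 5317*8kB 777*16kB 0*32kB 1*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 94548kB" \
    | count_blocks_at_least "$need"
```

For the Normal zone this reports 2 blocks of 32 kB or larger (one 64 kB and one 128 kB), even though about 94 MB is free in total, most of it in 4 kB and 8 kB fragments.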
Comment 8 Deniss Gaplevsky 2008-09-16 12:33:53 UTC
I have the same problem with the 2.6.25-hardened-r4 kernel.
It looks like the problem is in the e1000 driver: I have three other servers running this kernel, but with 8139too/skge, and they work just fine.
Check this thread: