Bug 117656 - Kernel Bug Report: XFS + Serpent Cipher = BUG soft lockup detected on CPU#1!
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL: http://forums.gentoo.org/viewtopic-p-...
Whiteboard: linux-2.6.16
Keywords:
Depends on:
Blocks:
 
Reported: 2006-01-03 12:57 UTC by Wes L
Modified: 2006-03-20 13:05 UTC

Attachments
Bug report in recommended kernel bug report format (xfsandserpent.txt, 30.77 KB, text/plain)
2006-01-03 14:46 UTC, Wes L
Current kernel make options (.alpha_10.txt, 38.62 KB, text/plain)
2006-01-03 14:58 UTC, Wes L

Description Wes L 2006-01-03 12:57:37 UTC
Steps to reproduce:
    cryptsetup -c serpent -h sha512 create disk1 /dev/hde
    mkfs.xfs /dev/mapper/disk1
    mount /dev/mapper/disk1 /mnt/disk1
    cp (very large directory) to /mnt/disk1
    10-15 seconds later, the following appears in the system log:

Jan  3 09:36:23 alpha BUG: soft lockup detected on CPU#1!
Jan  3 09:36:23 alpha
Jan  3 09:36:23 alpha Pid: 7049, comm:              pdflush
Jan  3 09:36:23 alpha EIP: 0060:[<c0278aa5>] CPU: 1
Jan  3 09:36:23 alpha EIP is at cbc_process_encrypt+0x5d/0x87
Jan  3 09:36:23 alpha EFLAGS: 00000212    Not tainted  (2.6.15)
Jan  3 09:36:23 alpha EAX: a604ac26 EBX: 00000010 ECX: 00000004 EDX: fffab0d0
Jan  3 09:36:23 alpha ESI: fffab0d0 EDI: f09bea50 EBP: f09bea50 DS: 007b ES: 007b
Jan  3 09:36:23 alpha CR0: 8005003b CR2: b777c000 CR3: 37409480 CR4: 000006f0
Jan  3 09:36:23 alpha [<c028b32c>] serpent_encrypt+0x0/0x154b
Jan  3 09:36:23 alpha [<c0278eaa>] xor_128+0x0/0x1a
Jan  3 09:36:23 alpha [<c02788ef>] crypt+0x125/0x1d3
Jan  3 09:36:23 alpha [<c0139422>] bad_range+0x1d/0x29
Jan  3 09:36:23 alpha [<c0278a40>] crypt_iv_unaligned+0xa3/0xab
Jan  3 09:36:23 alpha [<c0278d0f>] cbc_encrypt_iv+0x38/0x3d
Jan  3 09:36:23 alpha [<c028b32c>] serpent_encrypt+0x0/0x154b
Jan  3 09:36:23 alpha [<c0278a48>] cbc_process_encrypt+0x0/0x87
Jan  3 09:36:23 alpha [<c03b47f7>] crypt_convert+0x18b/0x261
Jan  3 09:36:23 alpha [<c0139f0b>] get_page_from_freelist+0x86/0x9a
Jan  3 09:36:23 alpha [<c0138d41>] mempool_alloc+0x1e/0xbd
Jan  3 09:36:23 alpha [<c03b51cf>] crypt_map+0xe3/0x21a
Jan  3 09:36:23 alpha [<c0153d91>] bio_clone+0x93/0xab
Jan  3 09:36:23 alpha [<c03ae8ea>] __map_bio+0x35/0xb2
Jan  3 09:36:23 alpha [<c03aeb05>] __clone_and_map+0xc0/0x2c3
Jan  3 09:36:23 alpha [<c0138d41>] mempool_alloc+0x1e/0xbd
Jan  3 09:36:23 alpha [<c03aeda0>] __split_bio+0x98/0x102
Jan  3 09:36:23 alpha [<c03aee7b>] dm_request+0x71/0x85
Jan  3 09:36:23 alpha [<c029a410>] generic_make_request+0xec/0xfc
Jan  3 09:36:23 alpha [<c0138d41>] mempool_alloc+0x1e/0xbd
Jan  3 09:36:23 alpha [<c029a4bb>] submit_bio+0x9b/0xa3
Jan  3 09:36:23 alpha [<c0153bc5>] bio_alloc_bioset+0x106/0x165
Jan  3 09:36:23 alpha [<c01535d4>] submit_bh+0x130/0x15b
Jan  3 09:36:23 alpha [<c026ab99>] xfs_submit_page+0x86/0xa4
Jan  3 09:36:23 alpha [<c026ada4>] xfs_convert_page+0x1ed/0x201
Jan  3 09:36:23 alpha [<c026adf1>] xfs_cluster_write+0x39/0x45
Jan  3 09:36:23 alpha [<c026b2da>] xfs_page_state_convert+0x4dd/0x52e
Jan  3 09:36:23 alpha [<c01366d9>] find_get_pages_tag+0x2a/0x63
Jan  3 09:36:23 alpha [<c026b86c>] linvfs_writepage+0x91/0xc6
Jan  3 09:36:23 alpha [<c016d584>] mpage_writepages+0x1a5/0x2fe
Jan  3 09:36:23 alpha [<c026b7db>] linvfs_writepage+0x0/0xc6
Jan  3 09:36:23 alpha [<c016be87>] __sync_single_inode+0x5e/0x1ba
Jan  3 09:36:23 alpha [<c016c11a>] __writeback_single_inode+0x137/0x13f
Jan  3 09:36:23 alpha [<c0260782>] xfs_trans_first_ail+0xe/0x16
Jan  3 09:36:23 alpha [<c025463f>] xfs_log_need_covered+0x56/0x86
Jan  3 09:36:23 alpha [<c03b0970>] dm_table_any_congested+0xd/0x47
Jan  3 09:36:23 alpha [<c016c2cc>] sync_sb_inodes+0x1aa/0x271
Jan  3 09:36:23 alpha [<c013bc41>] pdflush+0x0/0x2d
Jan  3 09:36:23 alpha [<c016c40b>] writeback_inodes+0x78/0xc7
Jan  3 09:36:23 alpha [<c013b4f7>] wb_kupdate+0x92/0xf7
Jan  3 09:36:23 alpha [<c013bbaf>] __pdflush+0xe3/0x175
Jan  3 09:36:23 alpha [<c013bc69>] pdflush+0x28/0x2d
Jan  3 09:36:23 alpha [<c013b465>] wb_kupdate+0x0/0xf7
Jan  3 09:36:23 alpha [<c012b257>] kthread+0x75/0x9d
Jan  3 09:36:23 alpha [<c012b1e2>] kthread+0x0/0x9d
Jan  3 09:36:23 alpha [<c0100e79>] kernel_thread_helper+0x5/0xb

The bug is reproducible on two servers and has been tested on multiple kernels.
The problem goes away when either the cipher type or the filesystem type is changed:
    Serpent + XFS = BUG
    Serpent + Ext3 = OK
    AES + XFS = OK
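
For anyone repeating the comparison, the three combinations above can be written out as a dry run. This is only a sketch: it prints the setup commands from the steps to reproduce with the cipher and filesystem swapped in, and executes nothing. The helper name print_setup is hypothetical, and the device and mapping names are the ones from this report.

```sh
#!/bin/sh
# Dry run: print (do not execute) the setup commands for one
# cipher/filesystem pair. /dev/hde and disk1 are taken from the report;
# running the printed commands for real would destroy data on the device.
# Usage: print_setup <cipher> <fstype>
print_setup() {
    echo "cryptsetup -c $1 -h sha512 create disk1 /dev/hde"
    echo "mkfs.$2 /dev/mapper/disk1"
    echo "mount /dev/mapper/disk1 /mnt/disk1"
}

# The three combinations listed in the report.
for pair in "serpent xfs" "serpent ext3" "aes xfs"; do
    set -- $pair            # split the pair into $1 (cipher) and $2 (fstype)
    echo "# $1 + $2"
    print_setup "$1" "$2"
done
```

Between real runs the mapping would have to be unmounted and removed before the next cryptsetup call; that teardown is left out of the sketch.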
Comment 1 Jakub Moc (RETIRED) gentoo-dev 2006-01-03 13:58:29 UTC
emerge --info is missing. We really need to know which kernels this occurs on, and also whether the bug is reproducible with the latest vanilla-sources (2.6.15 at the moment).
Comment 2 Wes L 2006-01-03 14:46:48 UTC
Created attachment 76110
Bug report in recommended kernel bug report format
Comment 3 Wes L 2006-01-03 14:58:49 UTC
Created attachment 76111
Current kernel make options
Comment 4 Wes L 2006-01-03 15:04:12 UTC
here's the emerge info: 
Portage 2.0.53 (default-linux/x86/2005.1, gcc-3.3.6, glibc-2.3.5-r2, 2.6.15 i686)
=================================================================
System uname: 2.6.15 i686 Intel(R) Pentium(R) III CPU family      1133MHz
Gentoo Base System version 1.6.13
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ccache version 2.3 [enabled]
dev-lang/python:     2.3.5-r2, 2.4.2
sys-apps/sandbox:    1.2.12
sys-devel/autoconf:  2.13, 2.59-r6
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils:  2.16.1
sys-devel/libtool:   1.5.20
virtual/os-headers:  2.6.11-r2
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-Os -march=pentium3 -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS=""
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig ccache distcc distlocks sandbox sfperms strict"
GENTOO_MIRRORS="http://distfiles.gentoo.org http://distro.ibiblio.org/pub/linux/distributions/gentoo"
MAKEOPTS="-j5"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="x86 X apm avi berkdb bitmap-fonts bzip2 crypt cups eds emboss encode expat foomaticdb fortran gdbm gif gpm gstreamer gtk gtk2 imlib jpeg kerberos lcms ldap libg++ libwww mad mikmod mmx mng motif mp3 mpeg ncurses nls nptl ogg oggvorbis opengl oss pam pdflib perl png python qt quicktime readline samba sdl spell sse ssl tcpd tiff truetype truetype-fonts type1-fonts udev vorbis winbind xml2 xmms xv zlib userland_GNU kernel_linux elibc_glibc"
Unset:  ASFLAGS, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS, PORTDIR_OVERLAY

As stated, I have tested the following kernels on two separate machines:
lrwxrwxrwx   1 root root   13 Jan  3 08:58 linux -> linux-2.6.15/
drwxr-xr-x  19 root root 4096 Jan  3 00:33 linux-2.6.14-gentoo-r5
drwxr-xr-x  19 root root 4096 Jan  3 08:47 linux-2.6.14.2
drwxr-xr-x  20 root root 4096 Jan  3 11:12 linux-2.6.15
All exhibit the BUG. 

As a side note, someone replied to my forum posting stating that this isn't that big of an issue and can be expected. I just want to make sure that when the system log says BUG I can safely ignore it.
Adding Preempt The Big Kernel Lock to the kernel has no effect.
http://forums.gentoo.org/viewtopic.php?p=3002561#3002561

I have attached some more system info in the kernel bug report format.
As listed there, the bug jumps between CPU0 and CPU1 randomly.

If this is a benign error, is there any way I can stop it from flooding my system log? (A 30-line bug statement every 1-2 seconds is a bit much.)
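
To see how often the message fires and how it is spread across the CPUs, the log can be tallied. This is a minimal sketch: the function name lockups_per_cpu is hypothetical, and it only assumes the message format shown in the description.

```sh
#!/bin/sh
# Count "soft lockup" reports per CPU in a syslog file.
# Matches the header line of each report, e.g.
#   "... BUG: soft lockup detected on CPU#1!"
# Usage: lockups_per_cpu <logfile>
lockups_per_cpu() {
    grep 'BUG: soft lockup detected on CPU#' "$1" |
        sed 's/.*CPU#\([0-9][0-9]*\).*/CPU\1/' |   # keep only the CPU id
        sort | uniq -c                             # one "count CPUn" line per CPU
}
```

It would be called as lockups_per_cpu /var/log/messages, where the path is an assumption that depends on how the system logger is configured.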
Comment 5 Wes L 2006-01-03 15:41:55 UTC
After more kernel tweaking: if I enable both Preempt The Big Kernel Lock AND Preemptible Kernel (Low-Latency Desktop), the errors disappear.
Enabling Preempt The Big Kernel Lock alone has no effect.

Previously the kernel was configured with Preempt The Big Kernel Lock disabled and No Forced Preemption (Server).

This is a server. I assume that with the kernel preempting everything I'm losing some performance? (This server mainly serves files.)

Also, I'm now noticing that disk write throughput is up from 9 MB/s previously to 15-22 MB/s with preemption on. Judging from this exercise alone, it would appear that Low-Latency Desktop is better for server use (heavy disk I/O)?
Comment 6 Daniel Drake (RETIRED) gentoo-dev 2006-01-20 15:47:56 UTC
Please test the latest development kernel (2.6.16-rc1) and see if it is reproducible with the config known to cause problems.
Comment 7 Wes L 2006-01-21 13:24:11 UTC
Loaded kernel 2.6.16-rc1
Linux alpha 2.6.16-rc1 #1 SMP Sat Jan 21 12:25:14 MST 2006 i686 Intel(R) Pentium(R) III CPU family      1133MHz GenuineIntel GNU/Linux

Tested without preemption and without Preempt The Big Kernel Lock.

No bug; tested with over 3.9 GB of data without incident.
However, with 2.6.16-rc1, iptables refuses to work properly.

If I may inquire, what was the fix? (Which subsystem was causing the bug to appear?)
Comment 8 Daniel Drake (RETIRED) gentoo-dev 2006-01-25 05:52:00 UTC
Not sure. If you really want to find out where the problem lies, you could do a bisection: http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/

I'll keep my eyes open for similar reports and possible fixes for this which we could backport to 2.6.15.
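
Because the goal here is to find the commit that fixed the problem rather than the one that introduced it, the bisection runs "backwards": the older kernel is the broken one. A rough sketch follows, assuming a git checkout of the kernel tree that carries the v2.6.15 and v2.6.16-rc1 tags, and a git version with custom bisect terms (2.7 or newer; the git of 2006 only had good/bad, so the two labels had to be swapped by hand). The function name start_fix_bisect is hypothetical.

```sh
#!/bin/sh
# Start a "find the fix" bisection: the older revision shows the bug,
# the newer one does not.
# Usage: start_fix_bisect <broken-rev> <fixed-rev>
start_fix_bisect() {
    git bisect start --term-old=broken --term-new=fixed &&
    git bisect broken "$1" &&
    git bisect fixed "$2"
}

# Example, run inside the kernel tree (tags assumed to be present):
#   start_fix_bisect v2.6.15 v2.6.16-rc1
# Build and boot each revision git checks out, repeat the large copy
# onto the Serpent + XFS volume, then report the result:
#   git bisect broken   # soft lockup still appears
#   git bisect fixed    # copy completes without the BUG message
# Once git names the first fixed commit, clean up with:
#   git bisect reset
```

Each step needs a kernel rebuild and reboot, so the number of test rounds grows roughly with the base-2 logarithm of the number of commits between the two tags.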
Comment 9 Daniel Drake (RETIRED) gentoo-dev 2006-03-20 13:05:57 UTC
gentoo-sources-2.6.16 is now in portage.