Bug 328483 - =sys-kernel/gentoo-sources-2.6.34-r1 ksoftirqd runs at 30%
Summary: =sys-kernel/gentoo-sources-2.6.34-r1 ksoftirqd runs at 30%
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: AMD64 Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-07-15 21:54 UTC by Adam Randall
Modified: 2010-07-19 19:01 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
linux-2.6.34-gentoo-r1 .config (r210.linux-2.6.34-gentoo-r1.config,67.54 KB, text/plain)
2010-07-15 21:58 UTC, Adam Randall
Output of dmesg running on 2.6.34-gentoo-r1 (r210.dmesg.txt,45.49 KB, text/plain)
2010-07-15 22:01 UTC, Adam Randall
/proc/interrupts samples, 300s, non-tickless (interrupts_nodyntick_300s.txt,11.42 KB, text/plain)
2010-07-19 17:58 UTC, Adam Randall
/proc/interrupts samples, 300s, tickless (interrupts_dyntick_300s.txt,11.42 KB, text/plain)
2010-07-19 17:59 UTC, Adam Randall
patch file that fixes the ksoftirqd CPU issue (ipmi_si_intf.c.patch,354 bytes, patch)
2010-07-19 18:43 UTC, Adam Randall
patch file that fixes the ksoftirqd CPU issue (ipmi_si_intf.c.patch,358 bytes, patch)
2010-07-19 19:00 UTC, Adam Randall

Description Adam Randall 2010-07-15 21:54:26 UTC
After building a new server (a PowerEdge R210 with a NetExtremeII Ethernet card), I noticed that ksoftirqd was running at a constant 30% CPU. Downgrading to =gentoo-sources-2.6.32-r7, configured the same way (via make menuconfig, never copying .config or using oldconfig), resolves this issue.

This seems to be related to the bnx2 modules for the NetExtremeII card, though I have not confirmed that. Additional information I found on the Internet also mentions something about a tickless system causing this.

Reproducible: Always

Steps to Reproduce:
1. Get a server with a NetExtremeII card
2. Configure and install 2.6.34-gentoo-r1
3. Run top to see that ksoftirqd is taking a lot of CPU
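
For anyone who wants a number rather than a glance at top, step 3 can be approximated with a small script. This is only a sketch: it assumes a Linux /proc filesystem, and the function names (parse_stat_line, cpu_percent, ksoftirqd_ticks, measure) are made up for illustration. It reads utime and stime (fields 14 and 15 of /proc/<pid>/stat) for each ksoftirqd thread twice and converts the difference into a percentage of one core.

```python
import os
import time
from pathlib import Path


def parse_stat_line(line):
    """Return (comm, utime + stime in clock ticks) from a /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces,
    # so split on the last closing parenthesis.
    head, _, tail = line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # tail starts at field 3 (state); utime is field 14, stime is field 15.
    utime, stime = int(fields[11]), int(fields[12])
    return comm, utime + stime


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU share of one core, in percent, over the sampling interval."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def ksoftirqd_ticks():
    """Map each ksoftirqd thread name to its accumulated CPU ticks."""
    ticks = {}
    for stat in Path("/proc").glob("[0-9]*/stat"):
        try:
            comm, used = parse_stat_line(stat.read_text())
        except (OSError, IndexError, ValueError):
            continue  # process exited, or the line had an unexpected shape
        if comm.startswith("ksoftirqd/"):
            ticks[comm] = used
    return ticks


def measure(interval_s=5.0):
    """Sample twice, interval_s apart, and return percent CPU per thread."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    before = ksoftirqd_ticks()
    time.sleep(interval_s)
    after = ksoftirqd_ticks()
    return {
        name: cpu_percent(before.get(name, 0), used, interval_s, hz)
        for name, used in after.items()
    }
```

Calling measure() on an affected machine should show one or more ksoftirqd/N entries well above zero; the 5-second default is an arbitrary choice.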




# emerge --info
Portage 2.1.8.3 (default/linux/amd64/10.0, gcc-4.4.3, glibc-2.11.2-r0, 2.6.34-gentoo-r1 x86_64)
=================================================================
System uname: Linux-2.6.34-gentoo-r1-x86_64-Intel-R-_Xeon-R-_CPU_X3450_@_2.67GHz-with-gentoo-1.12.13
Timestamp of tree: Thu, 15 Jul 2010 04:45:03 +0000
app-shells/bash:     4.0_p37
dev-lang/python:     2.6.5-r2, 3.1.2-r3
sys-apps/baselayout: 1.12.13
sys-apps/sandbox:    1.6-r2
sys-devel/autoconf:  2.65
sys-devel/automake:  1.11.1
sys-devel/binutils:  2.20.1-r1
sys-devel/gcc:       4.1.2, 4.4.3-r2
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.6b
virtual/os-headers:  2.6.30-r1
ACCEPT_KEYWORDS="amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -O2 -pipe -fomit-frame-pointer"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/eselect/postgresql /etc/fonts/fonts.conf /etc/gconf /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ext-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-march=native -O2 -pipe -fomit-frame-pointer"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--with-bdeps y"
FEATURES="assume-digests distlocks fail-clean fixpackages news parallel-fetch protect-owned sandbox sfperms strict unmerge-logs unmerge-orphans userfetch"
GENTOO_MIRRORS="http://gentoo.osuosl.org/ http://gentoo.cs.uni.edu/ http://mirror.usu.edu/mirrors/gentoo/"
LDFLAGS="-Wl,-O1"
MAKEOPTS="-j9"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://192.168.0.157/gentoo-portage"
USE="acl amd64 bash-completion berkdb bzip2 cli cracklib crypt ctype curl cxx dri fortran ftp gcj gdbm glibc-compat20 gpm iconv ipv6 mmx modules mudflap multilib ncurses nls nptl nptlonly openmp pam pcre perl pppd python readline reflection samba session snmp spl sse sse2 ssl sysfs tcpd threads unicode vim-syntax xml xorg zip zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga neomagic nv r128 radeon savage sis tdfx trident vesa via vmware voodoo" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" 
Unset:  CPPFLAGS, CTARGET, FFLAGS, INSTALL_MASK, LANG, LC_ALL, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY
Comment 1 Adam Randall 2010-07-15 21:58:29 UTC
Created attachment 238937 [details]
linux-2.6.34-gentoo-r1 .config

This is the 2.6.34-gentoo-r1 .config file produced from make menuconfig
Comment 2 Adam Randall 2010-07-15 21:59:33 UTC
Note: I do not know if this affects other architectures as I have only tested this on the R210 in 64-bit mode.
Comment 3 Adam Randall 2010-07-15 22:01:33 UTC
Created attachment 238939 [details]
Output of dmesg running on 2.6.34-gentoo-r1
Comment 4 Adam Randall 2010-07-15 22:05:22 UTC
If there is anything else that I can provide, or test, please let me know.
Comment 5 Adam Randall 2010-07-15 22:38:59 UTC
I configured and built =gentoo-sources-2.6.33-r2 and it does not have the problem I'm seeing in 2.6.34-r1.
Comment 6 Stefan Behte (RETIRED) gentoo-dev Security 2010-07-16 00:17:16 UTC
Can you try to git-bisect the exact problem? I guess this will have to go upstream.
Comment 7 Adam Randall 2010-07-16 00:18:42 UTC
While doing research, I found this:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/575392

The last post mentions disabling all ipmi_* options. I did have these selected
since I want to monitor the system over IPMI. After I deselected the IPMI
options, the ksoftirqd issue went away. With further digging I found that this
option alone triggers the ksoftirqd issue: CONFIG_IPMI_SI

Going further, I disabled the CONFIG_NO_HZ option (Processor type and features
/ Tickless System (Dynamic Ticks)) and reselected the IPMI options. This made
the system behave as I expected with ksoftirqd not taking any CPU, and IPMI
still operational.
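
A quick way to check whether a given kernel configuration has the combination described above is to scan its .config. The following is a rough sketch, not part of any kernel tooling: the function names are hypothetical, and it only looks for the two symbols named in this comment.

```python
def enabled_options(config_text):
    """Return the set of CONFIG_* symbols set to y or m in kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled symbols appear as comments: "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return enabled


def ipmi_tickless_conflict(config_text):
    """True when both CONFIG_IPMI_SI and CONFIG_NO_HZ are enabled."""
    return {"CONFIG_IPMI_SI", "CONFIG_NO_HZ"} <= enabled_options(config_text)
```

Usage would be along the lines of ipmi_tickless_conflict(open("/usr/src/linux/.config").read()), where the path is an assumption about where the kernel sources live on the machine in question.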
Comment 8 Adam Randall 2010-07-16 00:29:30 UTC
(In reply to comment #6)
> Can you try to git-bisect the exact problem? I guess this will have to go
> upstream.
> 

Well, from my last post, I think it's some conflict between IPMI and the tickless system. However, it may just be related to my specific network card too.

I have another system with a different Xeon processor and NIC (Intel Pro) that I'll try to replicate this on. As for the git-bisect, I'm not sure I can help there. My comfort with Git is pretty low, sadly.

Adam.
Comment 9 Adam Randall 2010-07-16 20:03:33 UTC
I tested on another development machine, this one a PowerEdge 2850 with this NIC:

06:07.0 Ethernet controller: Intel Corporation 82541GI Gigabit Ethernet Controller (rev 05)

It has 2 older Xeon Nocona processors @ 3GHz.

The same issue occurs if CONFIG_IPMI_SI and CONFIG_NO_HZ are both enabled, causing ksoftirqd to run between 20% and 30% CPU, constantly.
Comment 10 George Kadianakis (RETIRED) gentoo-dev 2010-07-19 14:20:14 UTC
Can we have a copy of your /proc/interrupts?
Comment 11 George Kadianakis (RETIRED) gentoo-dev 2010-07-19 14:21:58 UTC
Sorry for the double post:
Actually, can you give us a couple of copies of your /proc/interrupts taken every 5 seconds or so?
Thanks.
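
Comparing two such copies by eye is tedious, so a short script can rank the counters by how much they grew between samples. This is a minimal sketch with hypothetical function names; it assumes the usual /proc/interrupts layout of a CPU header row followed by "label: count count ... description" rows.

```python
def parse_interrupts(text):
    """Map each IRQ label to its total count summed over all CPUs."""
    lines = text.strip().splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        counts = []
        for token in rest.split()[:ncpu]:
            if not token.isdigit():
                break  # reached the controller/description columns
            counts.append(int(token))
        totals[label.strip()] = sum(counts)
    return totals


def busiest(before, after, top=5):
    """Return the (label, delta) pairs that grew most between two samples."""
    deltas = {irq: after[irq] - before.get(irq, 0) for irq in after}
    ranked = sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top]
```

Reading /proc/interrupts once, sleeping for 5 seconds, reading it again, and passing both parsed results to busiest() gives the interrupt sources that fired most in that window.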
Comment 12 Adam Randall 2010-07-19 14:30:24 UTC
I'll provide what you requested later today. I'll need to rebuild the kernel first.
Comment 13 George Kadianakis (RETIRED) gentoo-dev 2010-07-19 14:36:29 UTC
(In reply to comment #12)
> I'll provide what you requested later today. I'll need to rebuild the kernel
> first.
> 

If you are rebuilding your kernel, you might want to read up on Kernel Profiling and enable the related options. It usually helps in CPU load debugging :)
Comment 14 George Kadianakis (RETIRED) gentoo-dev 2010-07-19 15:04:47 UTC
Also, did you try the patch mentioned [1] in the kernel.org bug report [2] linked on your launchpad.net report?

[1]: https://bugzilla.kernel.org/attachment.cgi?id=26732
[2]: https://bugzilla.kernel.org/show_bug.cgi?id=16147
Comment 15 Adam Randall 2010-07-19 17:58:07 UTC
Created attachment 239413 [details]
/proc/interrupts samples, 300s, non-tickless

These are the non-tickless /proc/interrupts samples, collected every 5 minutes. The kernel is 2.6.34-gentoo-r1, and CONFIG_IPMI_SI is enabled.
Comment 16 Adam Randall 2010-07-19 17:59:08 UTC
Created attachment 239415 [details]
/proc/interrupts samples, 300s, tickless

These are the tickless /proc/interrupts samples, collected every 5 minutes. The kernel is 2.6.34-gentoo-r1, and both CONFIG_NO_HZ and CONFIG_IPMI_SI are enabled.
Comment 17 Adam Randall 2010-07-19 18:00:51 UTC
I haven't applied any patches to this kernel, just emerged gentoo-sources and ran make menuconfig. I can look into building the kernel with the patch as described in the thread.
Comment 18 George Kadianakis (RETIRED) gentoo-dev 2010-07-19 18:08:42 UTC
(In reply to comment #17)
> I haven't applied any patches to this kernel, just emerged gentoo-sources and
> ran make menuconfig. I can look into building the kernel with the patch as
> described in the thread.
> 

I think it would be a good idea. 2.6.34-r1 does not have that patch applied and it seems like it fits your problem well.

There are many resources on the internet on how to apply kernel patches so you should have no problems.
Comment 19 Adam Randall 2010-07-19 18:24:05 UTC
Yep, that certainly fixes it. I'm not sure the .diff file in the second link would have applied cleanly, since its paths seemed to be relative to a git checkout, but the change was simple enough to make by hand. Should I create a diff for the change I made (changing 0 to 1) and upload it here?
Comment 20 George Kadianakis (RETIRED) gentoo-dev 2010-07-19 18:25:51 UTC
(In reply to comment #19)
> Yep, that certainly fixes it. Not sure if that .diff file in the second link
> would have worked since the pathing seemed to be in regards to a git
> installation, but the change was simple enough. Should I create a diff for the
> change I did (changing 0 to 1) and upload it here?
> 

I don't think it's needed.
So it's over and done? The patch fixed the issue?
Comment 21 Adam Randall 2010-07-19 18:41:44 UTC
Yes, the CPU is sitting at 0%, give or take a percent or two from kipmid, which is normal.

I'm going to end up making a patch file that I can use with my other systems, however, since this one:

https://bugzilla.kernel.org/attachment.cgi?id=26732

is not applicable to 2.6.34-gentoo-r1.

Hopefully, this will help anyone else out that might run into the same issue.
Comment 22 Adam Randall 2010-07-19 18:43:58 UTC
Created attachment 239421 [details, diff]
patch file that fixes the ksoftirqd CPU issue

This patch is based on https://bugzilla.kernel.org/attachment.cgi?id=26732

It allows IPMI to be enabled in the tickless kernel without causing ksoftirqd to take upwards of 30% CPU.
Comment 23 George Kadianakis (RETIRED) gentoo-dev 2010-07-19 18:52:39 UTC
Alright, I guess I can close this bug then.
Thanks for the report and see you around :)
Comment 24 Adam Randall 2010-07-19 19:00:19 UTC
Created attachment 239425 [details, diff]
patch file that fixes the ksoftirqd CPU issue

Replaces original patch file I uploaded
Comment 25 Adam Randall 2010-07-19 19:01:41 UTC
Thank you for resolving this. I've amended my patch to work like other kernel patches. It's my first one, so be gentle :)