The ipw2200 driver included in gentoo-sources-2.6.21 (version 1.2.0) has problems with the kernel's new "Tickless" feature. With it enabled, the download/upload speed over ssh drops to ~700K (while before it was ~2.4Mb), and under heavy traffic the driver reports this error:

ipw2200: Firmware error detected. Restarting.

and the network stalls (I need to stop and restart the init script). The problem vanishes when I remove that kernel feature. I use ipw2200-firmware-3.0 and wpa_supplicant-0.5.7.

My emerge --info:

Portage 2.1.2.9 (default-linux/x86/2007.0/desktop, gcc-4.1.2, glibc-2.5-r4, 2.6.21-gentoo-r4 i686)
=================================================================
System uname: 2.6.21-gentoo-r4 i686 Intel(R) Pentium(R) M processor 1.86GHz
Gentoo Base System release 1.12.9
Timestamp of tree: Tue, 24 Jul 2007 08:50:01 +0000
dev-lang/python: 2.4.4-r4
dev-python/pycrypto: 2.0.1-r5
sys-apps/sandbox: 1.2.17
sys-devel/autoconf: 2.13, 2.61
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10
sys-devel/binutils: 2.17
sys-devel/gcc-config: 1.3.16
sys-devel/libtool: 1.5.23b
virtual/os-headers: 2.6.21
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=i686 -mmmx -msse -msse2 -fomit-frame-pointer -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/X11/xkb /usr/share/config"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/revdep-rebuild /etc/terminfo /etc/texmf/web2c"
CXXFLAGS="-O2 -march=i686 -mmmx -msse -msse2 -fomit-frame-pointer -pipe -fvisibility-inlines-hidden"
DISTDIR="/usr/portage/distfiles"
FEATURES="distlocks metadata-transfer parallel-fetch sandbox sfperms strict userfetch"
GENTOO_MIRRORS="http://mirror.switch.ch/ftp/mirror/gentoo/ http://www.die.unipd.it/pub/Linux/distributions/gentoo-sources/ http://trumpetti.atm.tut.fi/gentoo/ ftp://mirror.switch.ch/mirror/gentoo/"
LANG="it_IT@euro"
LC_ALL="it_IT@euro"
LDFLAGS="-Wl,-O1"
LINGUAS="it"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --filter=H_**/files/digest-*"
PORTAGE_TMPDIR="/var/portage_tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="X a52 aac acpi alsa audiofile bash-completion berkdb bitmap-fonts bzip2 cairo caps cdr cli cracklib crypt cups dbus dga dri dv dvd dvdr dvdread emboss encode evo fam fbcon ffmpeg firefox flac fortran gdbm gif gnutls gpm gtk hal iconv idn ieee1394 ipv6 isdnlog jpeg kdeenablefinal libg++ mad midi mikmod mmap mmx mp3 mpeg mudflap ncurses nls nptl nptlonly offensive ogg opengl openmp pam pcmcia pcre perl png ppds pppd python qt3support quicktime readline reflection samba sdl session spell spl sse sse2 ssl svg tcpd theora tiff truetype truetype-fonts type1-fonts unicode vorbis wifi win32codecs x264 x86 xml xorg xv xvid zlib" ALSA_CARDS="intel8x0" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol" ELIBC="glibc" INPUT_DEVICES="evdev keyboard mouse synaptics" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="it" USERLAND="GNU" VIDEO_CARDS="fbdev radeon vesa vga"
Unset: CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, MAKEOPTS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

Reproducible: Always
Can you post your full dmesg and can you post your default timeout value? cat /sys/class/firmware/timeout
Created attachment 126071 [details]
dmesg output

This is my timeout:

ale@heavensdoor ~ $ cat /sys/class/firmware/timeout
60
Thanks for the dmesg. For more debug info, could you do the following:

Recompile the modules with CONFIG_IPW2200_DEBUG turned on. This will enable full debugging output in the ipw2200 module.

Device Drivers -> Network device support -> Network device support -> Wireless LAN

Please set the debug level to 0x43fff to capture the full firmware status log, and provide the dmesg output containing that log. You can do this via the module parameter 'debug':

% modprobe ipw2200 debug=0x43fff
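For anyone wondering what 0x43fff actually switches on: the debug level is a bitmask, so it can help to see which bit positions are set. The snippet below is only a sketch of that arithmetic. The helper name `set_bits` is made up for illustration, and the meaning of each bit is defined in the driver's own headers, which are not reproduced here.

```python
def set_bits(mask):
    """Return the positions of the bits set in mask, lowest first."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# The debug level requested above.
print(set_bits(0x43FFF))
```

So the value enables the fourteen lowest bits plus one higher bit, which presumably is why the resulting log is so verbose.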
Created attachment 126077 [details]
kern.log.gz

I did as you requested, but the debug output from the driver fills the logs (and dmesg too). I have attached my whole kern.log. The error happened near the end; after it occurred I removed the module and made a copy of the log. I hope it helps, but it is a lot of debug information!
Created attachment 126085 [details]
firmware error dump

Maybe this is the error you were looking for.
OK, thanks for this testing. Could you test gentoo-sources-2.6.22-r1 with tickless both enabled and disabled? Note that you can disable tickless at boot, without recompiling, by passing the kernel parameter nohz=off. (See Documentation/kernel-parameters.txt for more info.) I would also be curious whether vanilla-sources-2.6.23-r1 gives you the same results.
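When comparing the two boots it is easy to lose track of which one is running. As a rough sketch (the function name `tickless_disabled` is hypothetical), the check below looks for nohz=off in a kernel command line string such as the contents of /proc/cmdline. It only reports the boot parameter; it says nothing about whether CONFIG_NO_HZ was compiled in.

```python
def tickless_disabled(cmdline):
    """Return True if the kernel command line contains nohz=off."""
    return "nohz=off" in cmdline.split()


# On a running system the string could come from:
#     open("/proc/cmdline").read()
# The command lines below are made-up examples.
print(tickless_disabled("root=/dev/sda3 nohz=off"))
print(tickless_disabled("root=/dev/sda3 quiet"))
```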
I'll try if I have time this evening, else you'll have to wait a couple of weeks 'cause I'm going on holidays tomorrow! :-)
The problem persists even with 2.6.22 (gentoo-sources-2.6.22-r1). When I run the kernel with nohz=off the bug vanishes again (I was able to download a gigabyte and more via scp without the network freezing). Remember that you have to put your network under heavy traffic to see the problem. Typically I have to download a gigabyte of data before it starts to freeze. Once it has frozen the first time, it freezes again (if I restart the network) after a short time (a few megabytes). Maybe some kind of buffer gets filled, or whatever... I noticed it the first time because I watch films stored on a fileserver from my notebook (connected through a wireless LAN).
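Since the freeze only shows up after a lot of traffic, counting how often the driver logs its restart message during a test run gives a simple number to compare between the tickless and nohz=off boots. This is a minimal sketch, assuming the kernel log is available as text; the function name and the sample log lines (timestamps, hostname) are invented for illustration, and only the error string itself comes from this report.

```python
FIRMWARE_ERROR = "ipw2200: Firmware error detected."


def count_firmware_errors(log_text):
    """Count log lines that contain the ipw2200 firmware error message."""
    return sum(1 for line in log_text.splitlines() if FIRMWARE_ERROR in line)


# Hypothetical kern.log excerpt, not taken from the attached logs.
sample_log = """\
Aug  5 21:10:01 host ipw2200: Firmware error detected. Restarting.
Aug  5 21:10:04 host some other kernel message
Aug  5 21:12:30 host ipw2200: Firmware error detected. Restarting.
"""

print(count_firmware_errors(sample_log))
```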
Can you post your gentoo-sources-2.6.21-r4 config file? It's located at /usr/src/linux/.config

I have a system almost the same as yours and I can get the firmware to reset, but the speed doesn't drop and I don't need to restart the networking scripts either :S
Created attachment 126193 [details]
working .config

This is my WORKING .config. Note that

# CONFIG_NO_HZ is not set

The slowdown bug disappeared some time ago (sorry for not pointing that out) during some kernel recompile... I don't know why or what caused it. I have noticed that sometimes the network starts again by itself, but that takes a long (and random) time, so I just restart the init script (it simply needs to redo the WPA handshake).
So the only difference between the kernel where you have no problems (with NOHZ not set) and the kernel with problems is just the NOHZ option?

I really don't have any noticeable problem apart from the "ipw2200: Firmware error detected. Restarting." message: I transferred a 1.6GB file over the wireless card while it was heavily loaded, and the transfer took 40min :) at around 520kB/s.

Anyway, the only difference is the NOHZ option, right?
yes you are right
ok, one more output just to check something. can you attach the contents of /proc/timer_list please.
This is my /proc/timer_list

Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 54654692584 nsecs

cpu: 0
 clock 0:
  .index: 0
  .resolution: 1 nsecs
  .get_time: ktime_get_real
  .offset: 1186347567936743394 nsecs
active timers:
 clock 1:
  .index: 1
  .resolution: 1 nsecs
  .get_time: ktime_get
  .offset: 0 nsecs
active timers:
 #0: <f6231f00>, tick_sched_timer, S:01
 # expires at 54656000000 nsecs [in 1307416 nsecs]
 #1: <f6231f00>, it_real_fn, S:01
 # expires at 54671471041 nsecs [in 16778457 nsecs]
 #2: <f6231f00>, hrtimer_wakeup, S:01
 # expires at 54744805567 nsecs [in 90112983 nsecs]
 #3: <f6231f00>, hrtimer_wakeup, S:01
 # expires at 54744811993 nsecs [in 90119409 nsecs]
 #4: <f6231f00>, hrtimer_wakeup, S:01
 # expires at 54744817860 nsecs [in 90125276 nsecs]
 #5: <f6231f00>, it_real_fn, S:01
 # expires at 54818748930 nsecs [in 164056346 nsecs]
 #6: <f6231f00>, hrtimer_wakeup, S:01
 # expires at 81916475885 nsecs [in 27261783301 nsecs]
 #7: <f6231f00>, hrtimer_wakeup, S:01
 # expires at 93570043818 nsecs [in 38915351234 nsecs]
  .expires_next : 54656000000 nsecs
  .hres_active : 1
  .nr_events : 8045
  .nohz_mode : 2
  .idle_tick : 54652000000 nsecs
  .tick_stopped : 0
  .idle_jiffies : 4294905958
  .idle_calls : 9451
  .idle_sleeps : 4575
  .idle_entrytime : 54648078795 nsecs
  .idle_sleeptime : 27236941034 nsecs
  .last_jiffies : 4294905958
  .next_jiffies : 4294905960
  .idle_expires : 54656000000 nsecs
jiffies: 4294905959

Tick Device: mode: 1
Clock Event Device: <NULL>
tick_broadcast_mask: 00000000
tick_broadcast_oneshot_mask: 00000000

Tick Device: mode: 1
Clock Event Device: pit
 max_delta_ns: 27461866
 min_delta_ns: 12571
 mult: 5124677
 shift: 32
 mode: 3
 next_event: 54656000000 nsecs
 set_next_event: pit_next_event
 set_mode: init_pit_timer
 event_handler: hrtimer_interrupt
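The fields in this dump that matter most for the tickless question are probably .hres_active, .nohz_mode and .tick_stopped. Below is a small sketch for pulling a numeric field out of a timer_list dump; the helper name `timer_list_field` is hypothetical and the sample string is a shortened excerpt of the output above. As far as I recall, a .nohz_mode of 2 corresponds to the high-resolution NOHZ mode in the kernel's enum, which would mean tickless was active on this boot, but treat that mapping as an assumption and check it against the kernel source.

```python
import re


def timer_list_field(dump, name):
    """Return the integer value of the first '.<name> : <value>' entry, or None."""
    match = re.search(r"\." + re.escape(name) + r"\s*:\s*(\d+)", dump)
    return int(match.group(1)) if match else None


# Shortened excerpt of the /proc/timer_list output posted above.
sample = ".hres_active : 1 .nr_events : 8045 .nohz_mode : 2 .tick_stopped : 0"

for field in ("hres_active", "nohz_mode", "tick_stopped"):
    print(field, timer_list_field(sample, field))
```

The regular expression does not depend on line breaks, so it should work whether the dump is pasted as one line or with its original layout.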
Please reopen if you see this again but from your comment #10, it appears you no longer see the slowdown.
Yes, as I said, the slowdown has disappeared (maybe it was only a coincidence... I don't know if it's related to the bug). The real bug isn't the slowdown; it's the "ipw2200: Firmware error detected. Restarting." error and the resulting freeze of the network under heavy load. Typically it happens when I try to download more than 1 Gb of data (yes, that is a lot, but not so much that it is a rare event :-) )
This is a known issue and has always existed in this setup. It's a hardware/firmware bug which only Intel can fix (and I presume they can't -- they do like to joke about how often the firmware reboots on this hardware). I suggest you write to the ipw2200 mailing list for more info -- if any patches do come out of this, we'll consider them.