Last week I upgraded from gentoo-sources-2.6.16-r13 to 2.6.17-r4. While ripping a CD under the 2.6.17 kernel, I noticed that it was ripping at 1.2x speed instead of its usual 7.5-9.0x. I booted back into the 2.6.16 kernel and my ripping speeds returned to their normal rates. A further test, copying some 2 GB files off a DVD, also took much longer under the 2.6.17 kernel than under the 2.6.16 kernel.
Portage 2.1-r1 (default-linux/amd64/2006.0, gcc-3.4.6, glibc-2.3.6-r4, 2.6.16-gentoo-r13 x86_64)
System uname: 2.6.16-gentoo-r13 x86_64 AMD Athlon(tm) 64 Processor 3200+
Gentoo Base System version 1.6.15
ccache version 2.3 [enabled]
app-admin/eselect-compiler: [Not Present]
dev-lang/python: 2.3.5-r2, 2.4.3-r1
dev-util/confcache: [Not Present]
sys-devel/autoconf: 2.13, 2.59-r7
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2
CFLAGS="-O2 -march=k8 -pipe -fweb -ftracer -fomit-frame-pointer"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/X11/xkb /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/"
CONFIG_PROTECT_MASK="/etc/env.d /etc/gconf /etc/revdep-rebuild /etc/terminfo"
CXXFLAGS="-O2 -march=k8 -pipe -fweb -ftracer -fomit-frame-pointer"
FEATURES="autoconfig ccache distlocks metadata-transfer parallel-fetch sandbox sfperms strict"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude='/distfiles' --exclude='/local' --exclude='/packages'"
USE="amd64 X a52 aac alsa avi berkdb bitmap-fonts bzip2 cdr cli crypt cups dbus dlloader dri dvd dvdr emboss encode ffmpeg firefox flac foomaticdb fortran gif gnome gpm gstreamer gtk gtk2 hal imap imlib ipv6 isdnlog java joystick jpeg lzw lzw-tiff mad matroska mp3 mpeg musepack ncurses nls nptl nptlonly nsplugin nvidia ogg opengl pam pcre pdflib perl png ppds pppd python qt3 qt4 quicktime readline reflection sdl session sndfile spell spl ssl tcpd tiff truetype truetype-fonts type1-fonts unicode usb vorbis xorg xpm xprint xv zlib elibc_glibc input_devices_keyboard input_devices_mouse input_devices_evdev kernel_linux linguas_en userland_GNU video_cards_nvidia"
Unset: CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LDFLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Created attachment 93024
lspci output on my system
Created attachment 93025
2.6.16-r13 dmesg output
Created attachment 93026
2.6.17-r4 dmesg output
Can you reproduce the bug using vanilla-sources-2.6.17?
I was in fact able to reproduce the same problem in vanilla-sources-184.108.40.206.
Is this reproducible on the latest development kernel, currently 2.6.18-rc4?
Bug is verified to occur in vanilla-sources-2.6.18_rc4.
Would you be able to take some measurements so that we have numbers to work with? Time how long it takes to copy a file on 2.6.16, and then time how long it takes to copy the same file on 2.6.17.
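One way to take that measurement is sketched below. This is only an illustration, not something posted in the thread: the function name `time_copy` is made up, and it simply wraps Python's standard `shutil.copyfile` with a monotonic clock so the same script can be run unchanged under both kernels.

```python
import shutil
import time
from pathlib import Path


def time_copy(src, dst):
    """Copy src to dst; return (elapsed_seconds, megabytes_per_second).

    Hypothetical helper for comparing kernels: run it on the same file
    under 2.6.16 and under 2.6.17 and compare the two results.
    """
    src, dst = Path(src), Path(dst)
    size_mb = src.stat().st_size / (1024 * 1024)

    start = time.monotonic()          # monotonic: unaffected by clock changes
    shutil.copyfile(src, dst)
    elapsed = time.monotonic() - start

    # Guard against a zero interval on very small files.
    rate = size_mb / elapsed if elapsed > 0 else float("inf")
    return elapsed, rate
```

For example, `time_copy("/mnt/dvd/bigfile", "/tmp/bigfile")` (both paths are placeholders) returns the wall-clock time and throughput for one copy. A file read a second time may come from the page cache rather than the disc, so numbers are most comparable when each run starts from a fresh boot or a freshly mounted disc.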
At that point, you need to file a bug upstream at http://bugzilla.kernel.org clearly stating that 2.6.16 worked ok *and* 2.6.18-rc4 is still affected.
If you have a lot of time and patience, you could find the exact patch which introduces this bug by testing approximately 13 kernels. See http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/
Use 2.6.16 as good and 2.6.17 as bad.
Given that this is really time consuming it might be worth simply reporting the bug upstream first, and saving this as a last resort method if no solutions are immediately obvious after say 1 week.
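The "approximately 13 kernels" figure follows from bisection halving the candidate range on every test. The sketch below models that on a simple linear list of commits; it is a simplification (real kernel history is a graph, and git bisect handles merges), and the 8192-commit count is an assumed round number chosen because 2^13 = 8192, not the actual number of commits between 2.6.16 and 2.6.17.

```python
def bisect_steps(num_commits):
    """Worst-case number of kernels to build and test for num_commits candidates.

    Equals ceil(log2(num_commits)), computed exactly with integer arithmetic.
    """
    return (num_commits - 1).bit_length() if num_commits > 0 else 0


def find_first_bad(commits, is_bad):
    """Return (first_bad_commit, tests_run) using binary search.

    Assumes commits are ordered oldest to newest, every commit before the
    culprit is good, every commit from the culprit onward is bad, and the
    last commit is known to be bad.
    """
    lo, hi = 0, len(commits) - 1      # invariant: first bad index is in [lo, hi]
    tests = 0
    while lo < hi:
        mid = (lo + hi) // 2
        tests += 1                    # one kernel build + test per iteration
        if is_bad(commits[mid]):
            hi = mid                  # culprit is mid or earlier
        else:
            lo = mid + 1              # culprit is after mid
    return commits[lo], tests


# Illustrative numbers only: 8192 hypothetical commits, culprit at index 5000.
commits = list(range(8192))
culprit, tests = find_first_bad(commits, lambda c: c >= 5000)
print(culprit, tests, bisect_steps(len(commits)))
```

Each iteration corresponds to one build, reboot and rip test, which is why the process is slow even though the number of steps grows only logarithmically with the number of commits.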
*sigh* After trying to get some hard numbers on this, it seems that no application except the CD ripper I use is suffering from a slowdown. And the great thing about the CD ripper I'm using is that I wrote it. Sound Juicer shows no change in performance. As this is obviously an application-level bug, does anyone know what might cause this behavior in the ripper? I'm sort of at a loss as to how to fix it...
git bisect says this patch is what causes the changed behavior:
9430d58e34ec3861e1ca72f8e49105b227aad327 is first bad commit
Author: Mike Galbraith <firstname.lastname@example.org>
Date: Wed Mar 22 00:07:33 2006 -0800
[PATCH] sched: remove sleep_avg multiplier
Remove the sleep_avg multiplier. This multiplier was necessary back when
we had 10 seconds of dynamic range in sleep_avg, but now that we only have
one second, it causes that one second to be compressed down to 100ms in
some cases. This is particularly noticeable when compiling a kernel in a
slow NFS mount, and I believe it to be a very likely candidate for other
recently reported network related interactivity problems.
In testing, I can detect no negative impact of this removal.
Signed-off-by: Mike Galbraith <email@example.com>
Acked-by: Ingo Molnar <firstname.lastname@example.org>
Signed-off-by: Andrew Morton <email@example.com>
Signed-off-by: Linus Torvalds <firstname.lastname@example.org>
:040000 040000 28d2d8f53ab7b5dd89e846f2dcc107ce88cb695f 780a13c0f8ba5465db79c668
This has been filed upstream here: http://bugzilla.kernel.org/show_bug.cgi?id=7027
will track upstream bug
Not a kernel bug