Bug 154752 - gentoo-sources-2.6.18-r1 - ehci_hcd module hangs upon hotplug
Summary: gentoo-sources-2.6.18-r1 - ehci_hcd module hangs upon hotplug
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High major
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
Reported: 2006-11-10 21:33 UTC by Dijital Munky
Modified: 2006-11-20 09:15 UTC

Attachments
patch for drivers/usb/host/ehci-pci.c (ehci-pci-spinlock-fix.patch,478 bytes, patch)
2006-11-10 21:49 UTC, Dijital Munky

Description Dijital Munky 2006-11-10 21:33:36 UTC
In gentoo-sources-2.6.18-r1, the ehci_hcd module hangs upon insertion of a USB 2.0 device.  I have tried this with a USB network adapter as well as a few different USB 2.0 mass storage devices (2 USB drives and a Sony PSP).

This also seems to exist in the 2.6.17 series of gentoo-sources kernels.


Portage 2.1.1-r1 (default-linux/x86/2006.1/desktop, gcc-4.1.1, glibc-2.4-r4, 2.6.18-gentoo-r1 i686)
=================================================================
System uname: 2.6.18-gentoo-r1 i686 AMD Athlon(tm) XP 2600+
Gentoo Base System version 1.12.6
Last Sync: Wed, 08 Nov 2006 09:50:01 +0000
distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.3 [enabled]
app-admin/eselect-compiler: [Not Present]
dev-java/java-config: 1.3.7, 2.0.30
dev-lang/python:     2.4.3-r4
dev-python/pycrypto: 2.0.1-r5
dev-util/ccache:     2.3
dev-util/confcache:  [Not Present]
sys-apps/sandbox:    1.2.17
sys-devel/autoconf:  2.13, 2.60
sys-devel/automake:  1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2
sys-devel/binutils:  2.16.1-r3
sys-devel/gcc-config: 1.3.13-r4
sys-devel/libtool:   1.5.22
virtual/os-headers:  2.6.17-r1
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=athlon-xp -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/X11/xkb /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/"
CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/gconf /etc/java-config/vms/ /etc/revdep-rebuild /etc/splash /etc/terminfo"
CXXFLAGS="-O2 -march=athlon-xp -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig candy ccache distlocks metadata-transfer parallel-fetch sandbox sfperms strict usersandbox"
GENTOO_MIRRORS="http://gentoo.osuosl.org/ http://adelie.polymtl.ca/ ftp://gentoo.arcticnetwork.ca/pub/gentoo/"
LINGUAS="en_CA en_GB en"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude='/distfiles' --exclude='/local' --exclude='/packages'"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/portage/local/misc /usr/portage/local/layman/sunrise /usr/portage/local/layman/zugaina /usr/portage/local/layman/berkano"
SYNC="rsync://192.168.0.10/gentoo-portage"
USE="x86 3dnow 3dnowext X Xaw3d a52 aac aalib acpi addbookmarks aiglx alias alsa amr asf audiofile autoreplace bash-completion berkdb bitmap-fonts branding bzip2 cairo caps cdda cddb cdparanoia cdr chroot cli connectionstatus contactnotes cpudetection cracklib crypt css cups daap dbus dga directfb dlloader dmi dri dts dv dvb dvd dvdr dvdread dxr3 eds elibc_glibc emboss encode exscalibar extrafilters fam fame fbcon ffmpeg firefox flac flash fontconfig fortran fusion gadu gcj gdbm ggi gif gnutls gpm graphviz groupwise gs gstreamer gtk2 hal highlight history hpn iconv idn imagemagick imlib input_devices_evdev input_devices_joystick input_devices_keyboard input_devices_mouse irc isdnlog java javamail jbig jpeg jpeg2k justify kde kdeenablefinal kdrive kernel_linux latex lcms libcaca libedit libg++ linguas_en linguas_en_CA linguas_en_GB live lm_sensors logitech-mouse logrotate lua lzo mad mikmod mjpeg mmx mmxext mng mod modplug mozilla mp3 mp4 mpeg multiuser musepack musicbrainz ncurses nethack netmeeting network nls nowlistening nptl nptlonly nsplugin nvidia objc objc++ objc-gc offensive ogg openal opengl oss pam pam_chroot pam_console pam_timestamp parse-clocks pcre perl php png ppds pppd pwdb python qt3 qt4 quicktime rdesktop readline real reflection rtc rtsp ruby sametime sdl sensord session sftplogging shout skins slang slp sms sndfile socks5 spell spl sse ssl statistics stream svg svga symlink tcpd texteffect theora threads threadsafe tiff translator truetype truetype-fonts type1-fonts ucs2 udev unicode urandom userland_GNU utempter vcd video_cards_fbdev video_cards_nvidia vidix visualization vlm vorbis webpresence wifi win32codecs winpopup wmf x264 xanim xcomposite xine xinerama xml xorg xosd xpm xprint xscreensaver xv xvid xvmc yahoo yv12 zlib"
Unset:  CTARGET, EMERGE_DEFAULT_OPTS, INSTALL_MASK, LANG, LC_ALL, LDFLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 1 Dijital Munky 2006-11-10 21:45:17 UTC
Upon researching this issue on the big ol' net, I came across this link: 
http://groups.google.ca/group/linux.kernel/browse_thread/thread/adf460ef7a1d0452/eb4f8be0e1fcf961?lnk=st&q=ehci_hcd&rnum=2#eb4f8be0e1fcf961

I added the spin_lock_init(&ehci_lock); call to the appropriate place in drivers/usb/host/ehci-pci.c:ehci_pci_setup().  After recompiling, rebooting and modprobing the ehci_hcd module, USB 2.0 hotplug functionality seems to be corrected.

I will add a patch for this once I get a chance to make it (it is my first time making a patch to submit as part of a bug).
Comment 2 Dijital Munky 2006-11-10 21:49:54 UTC
Created attachment 101648
patch for drivers/usb/host/ehci-pci.c
Comment 3 Daniel Drake (RETIRED) gentoo-dev 2006-11-14 11:58:03 UTC
This isn't a valid fix, that report is from over a year ago. Please revert the patch, enable CONFIG_USB_DEBUG, and post debug logs from after a hang has occurred.
Comment 4 Dijital Munky 2006-11-14 20:49:58 UTC
(In reply to comment #3)
> This isn't a valid fix, that report is from over a year ago. Please revert the
> patch, enable CONFIG_USB_DEBUG, and post debug logs from after a hang has
> occurred.
> 

Perhaps you could explain why this isn't valid??  I applied it and it works, so from my POV it is valid.  Just because something is a bit older doesn't mean it is useless or invalid; if that were the case I would've stopped talking to my parents and grandparents years ago....  Whether it is a good idea or merely a hack is another question.  My understanding is that it is more of the latter; however, if it fixes it until the kernel guys fix it properly, what's the issue??  Perhaps it could be a USE flag that people could enable??

I will however turn CONFIG_USB_DEBUG on and try to get a better log from when the hang happens.  I assume that this logging all happens in the kernel log??
Comment 5 Daniel Drake (RETIRED) gentoo-dev 2006-11-15 07:02:29 UTC
The patch there is for an unrelated problem: spinlock corruption. It is also not a fix for that problem; it is a hackish workaround (it simply confirmed the symptoms, and the real fix for that bug was very different).

Also, that patch results in the spin lock being initialized twice, so whatever state the lock held after the first initialization is discarded when the second one runs. This is bad: it effectively overwrites important state data. In this case, the spin lock is already initialised in ehci_init(), which is called by ehci_pci_setup() (the function you modified).
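
To make the double-initialization point concrete, here is a minimal user-space sketch. It assumes a toy lock built on C11 atomic_flag; the names toy_lock, toy_lock_init, toy_trylock, toy_unlock and reinit_breaks_exclusion are hypothetical and are not the kernel's spinlock_t or anything in ehci-hcd. It only illustrates the general hazard of re-initializing a lock that may already be held.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy user-space lock for illustration only; NOT the kernel's spinlock_t. */
struct toy_lock {
    atomic_flag held;
};

/* Reset the lock to the "free" state, regardless of its current state. */
static void toy_lock_init(struct toy_lock *l)
{
    atomic_flag_clear(&l->held);
}

/* Try to take the lock once; returns true if we acquired it. */
static bool toy_trylock(struct toy_lock *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

static void toy_unlock(struct toy_lock *l)
{
    atomic_flag_clear(&l->held);
}

/*
 * Returns true if a second acquisition succeeded only because the lock
 * was initialized again while it was still held.
 */
bool reinit_breaks_exclusion(void)
{
    struct toy_lock l = { ATOMIC_FLAG_INIT };

    toy_lock_init(&l);                    /* first, legitimate init */
    bool first = toy_trylock(&l);         /* lock is now held */
    bool blocked = !toy_trylock(&l);      /* second taker is correctly refused */

    toy_lock_init(&l);                    /* second init: "held" state is wiped */
    bool second = toy_trylock(&l);        /* now succeeds although never unlocked */

    toy_unlock(&l);
    return first && blocked && second;
}
```

Calling reinit_breaks_exclusion() from a small main() shows the second acquisition going through even though the first holder never released the lock, which is the kind of lost state described above.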

That aside, the fact that the hang is related to this lock may be a useful observation later, when more information has been provided.
Comment 6 Daniel Drake (RETIRED) gentoo-dev 2006-11-15 07:09:48 UTC
To answer the other questions, we do not apply hack patches, we do not apply non-upstream patches (other than in exceptional circumstances) and we do not apply patches based on USE flags. What we will do is ensure that a proper fix is developed and is included in the upstream Linux kernel.
For more info on our patching style see http://dev.gentoo.org/~dsd/genpatches/

This may seem a little inefficient in terms of speedy fixes, but it is the only way we've found to keep a beast as big as the kernel maintainable with so few resources in the long term. If we had more resources and developers, maybe we'd be able to rethink our maintenance style; right now it's just 2 people, and typically only myself working on the bug reports.
Comment 7 Dijital Munky 2006-11-15 15:23:25 UTC
Okay, I did the following steps:

1. emerge -C =gentoo-sources-2.6.18-r1
2. rm -r /usr/src/linux-2.6.18-gentoo-r1
3. emerge -av gentoo-sources (ensured the symlink USE flag was enabled)
4. cd /usr/src/linux
5. gzcat /proc/config.gz > .config
6. mount /boot
7. make oldconfig && make menuconfig
8. enabled CONFIG_USB_DEBUG.
9. Added -nopatch to the kernel version to ensure that I was running the unpatched version.
10. make && make modules_install && make install && reboot
11. uname -r to ensure that I was using the unpatched kernel.

Once rebooted, I plugged and unplugged all of the devices that would hang before.  This time everything worked without a hang.  So I rebooted and tried the same thing again, with the same result.  It appears that perhaps the patch didn't fix it after all, but rather that while reading that thread (or another) I enabled something that my hardware required in order for USB 2.0 to work properly.

I am going to leave my kernel as is for a few days and reboot occasionally, till at least Friday, just in case I am having a fluke ATM.  (In the original problem, USB 2.0 would occasionally work upon reboot, but it would usually stop working at some point.)  If I have no more problems with USB 2.0, I'll resolve as INVALID.

BTW, I appreciate your help and patience.  Especially the explanations I asked for.  They were a great help to a relative kernel code newb like me.  I also completely agree with your policy regarding patches...
Comment 8 Daniel Drake (RETIRED) gentoo-dev 2006-11-20 09:15:38 UTC
OK - please reopen if the problem is reproducible