Bug 177266

Summary:	reset high speed USB device using ehci_hcd and address N
Product:	Gentoo Linux	Reporter:	Steve Arnold <nerdboy>
Component:	[OLD] Core system	Assignee:	Gentoo Kernel Bug Wranglers and Kernel Maintainers <kernel>
Status:	RESOLVED NEEDINFO
Severity:	major	CC:	lmiphay, midnightflash, radhermit, stefan, udev-bugs
Priority:	High	Keywords:	Inclusion
Version:	unspecified
Hardware:	All
OS:	Linux
URL:	https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/61235

Description Steve Arnold archtester

2007-05-06 00:12:58 UTC

The above kernel messages, along with filesystem corruption, seem to depend on the kernel version and on the specific hardware combination (USB device, USB host controller chipset, cable, etc.).  Changing any of the hardware often seems to fix the problem; however, the optimal workaround for the moment appears to be setting max_sectors on the block device to 128.  This seems to have a very minor performance impact, if any, and even works with the known Genesys chipset problem (for which the kernel code currently limits max_sectors to 64).

More information is available in the following links:

http://forums.debian.net/viewtopic.php?t=1102

http://www.mail-archive.com/linux-usb-users@lists.sourceforge.net/msg15608.html

http://taint.org/2006/12/13/191554a.html

Not sure if there's a relevant upstream kernel bug, so here's a udev rule that should apply the above change to all USB drives and disk-like devices (e.g., an internal USB card reader).  It works for my card reader...

SUBSYSTEM=="block", BUS=="usb", KERNEL=="sd*", RUN+="/bin/sh -c 'echo 128 > /sys/block/%k/device/max_sectors'"
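
To check whether the rule took effect, the current limit can be read back from sysfs. This is a minimal sketch, assuming the same /sys/block/<dev>/device/max_sectors path the rule writes to. The function name show_max_sectors and its optional sysfs-root argument are made up for illustration and are not part of udev or the kernel.

```shell
#!/bin/sh
# Print "<device> <max_sectors>" for every sd* block device that exposes
# a max_sectors attribute. The sysfs root is a parameter (default: /sys)
# so the function can also be tried against a scratch directory.
# NOTE: show_max_sectors is a hypothetical helper name.
show_max_sectors() {
    root="${1:-/sys}"
    for f in "$root"/block/sd*/device/max_sectors; do
        [ -r "$f" ] || continue        # glob matched nothing, or unreadable
        dev="${f#"$root"/block/}"      # drop the leading "<root>/block/"
        dev="${dev%%/*}"               # keep only the device name, e.g. sdb
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}
```

Calling show_max_sectors with no argument after plugging in a device should list each affected disk with the value currently in force.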

Comment 1 Robin Johnson archtester

2007-05-06 00:47:17 UTC

nerdboy: how does this affect performance on devices where there isn't a problem?

Comment 2 Matthias Schwarzott gentoo-dev

2007-05-06 06:10:21 UTC

Well, Genesys chipsets are speed-limited by the kernel, as they have problems at higher speeds. But I do not know whether the kernel limits them via max_sectors or in some other way.

Comment 3 Robin Johnson archtester

2007-05-06 11:12:36 UTC

zzam: I meant that for other, non-Genesys hardware where there isn't a problem, the max_sectors change may have an unnecessary negative effect on performance.

I don't have any external USB storage anymore (I switched to eSATA gear for performance), but I have seen this problem in the past with some enclosures, due to poor manufacturing (if I used a really short cable, or added my own shielding, it worked fine).

Comment 4 Daniel Drake (RETIRED) gentoo-dev

2007-05-06 21:13:52 UTC

Which kernel(s) have you reproduced this on?

Comment 5 Matthias Schwarzott gentoo-dev

2007-05-07 09:08:09 UTC

Well, I think reducing max_sectors is not much of a problem, although it does reduce performance; but increasing max_sectors CAN be a problem if the limit already set was chosen well.

I think we should not do this by default.
Also, I don't know why the limit for Genesys is exactly 64.

But perhaps we can create some optional component to only decrease max_sectors, and not increase it.

Something like usb_storage_max_sectors="xyz" in udevd.conf, but perhaps also create some additional package for it.
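
The "only decrease, never increase" idea can be sketched as a small shell helper. This is an illustration under assumptions, not an existing udev feature: the name cap_max_sectors is hypothetical, and the caller is assumed to pass the path of the max_sectors attribute plus the desired cap.

```shell
#!/bin/sh
# Lower the value stored in a max_sectors attribute file to at most the
# given cap; leave it untouched if it is already at or below the cap.
# $1: path to the attribute file (e.g. /sys/block/sdb/device/max_sectors)
# $2: cap, in sectors
# NOTE: cap_max_sectors is a hypothetical helper name.
cap_max_sectors() {
    file="$1"
    cap="$2"
    cur="$(cat "$file")" || return 1
    if [ "$cur" -gt "$cap" ]; then
        echo "$cap" > "$file"
    fi
}
```

With a helper like this, a device the kernel already limits to 64 would keep that value, while one at the default of 240 would be brought down to 128.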

Comment 6 Steve Arnold archtester

2007-05-08 06:33:27 UTC

I've seen it on 2.6.18/19/20, and according to the other posts it goes all the way back to early 2.6 and even 2.4 kernels (apparently any kernel with the ehci module and edgy USB hardware).

Setting max_sectors to 128 seems like a pretty safe default, since 1) it seems to work, 2) it even works with the Genesys hardware (the kernel sets max_sectors to 64 in this particular case), and 3) it's not much of a performance hit overall.

We can always document how to increase it (with a warning) for those who just have to squeeze every last ounce...

Comment 7 Daniel Drake (RETIRED) gentoo-dev

2007-05-08 12:16:43 UTC

If the kernel already sets it to 64 for your device, then surely no fix is needed in your case?

Or are you saying that 64 doesn't work but 128 does?

Comment 8 Steve Arnold archtester

2007-05-10 06:31:33 UTC

Neither...  I'm saying that the Genesys device is one of the only ones the kernel sets a specific (non-default) value for, which is max_sectors=64 (almost everything else apparently defaults to 240).  And from what was reported by a couple of those folks in the other posts, the Genesys hardware supposedly works fine at 128 (but no higher).  Lastly, Windoze apparently sets everything to 128, and every reporter I saw that mentioned trying their problematic device on a Windoze box said it worked fine.  Thus my conclusion, at least for now, is that 128 is a safe default setting, even for the Genesys devices which the kernel currently limits to 64.

Comment 9 Daniel Drake (RETIRED) gentoo-dev

2007-05-10 12:13:44 UTC

OK. Well let me say we're not going to consider limiting max_sectors for all devices, as that is just silly.

If you have a problem with a specific device, we'll help take the bug through the usual processes and hopefully find a fix, and the "fix" may be adding a quirk to limit max_sectors for that specific device.

But if I understand it correctly, the kernel already caps max_sectors for your device, so no fix is needed for your specific case anyway?

Comment 10 Steve Arnold archtester

2007-05-11 08:12:14 UTC

No, since I don't have a Genesys USB controller.  In this case, the controllers are on board an MSI motherboard and the device is an internal card reader plugged onto the extra USB header.  The kernel specifically limits almost nothing (Genesys is one of the only ones) - I was simply pointing out that even Genesys is reported to work fine at 128.

And I'm not sure it's all that silly to limit USB devices to a slightly lower default if it means not corrupting the user's data.  Just my opinion...

Comment 11 Daniel Drake (RETIRED) gentoo-dev

2007-05-17 03:29:55 UTC

OK. Please test the latest development kernel (currently 2.6.22-rc1) with usb storage verbose debugging enabled, and attach dmesg logs from after a failure. Also post "lsusb -v" output.

I'm also interested to know how you are able to reproduce this, and also how much success you have had with the max_sectors workaround. Thanks.
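
The information requested above could be gathered along these lines. This is only a sketch: the function name collect_usb_report and the output file names are made up for illustration, and verbose usb-storage debugging is assumed to have been enabled in the kernel configuration beforehand (believed to be the CONFIG_USB_STORAGE_DEBUG option).

```shell
#!/bin/sh
# Save the kernel log and verbose lsusb output into a directory so both
# can be attached to the bug. Run it right after a failure.
# $1: output directory (default: ./usb-report)
# NOTE: collect_usb_report and the file names are hypothetical.
collect_usb_report() {
    out="${1:-usb-report}"
    mkdir -p "$out" || return 1
    # Keep going even if one of the tools is missing or fails.
    dmesg > "$out/dmesg.txt" 2>&1 || echo "dmesg failed" >&2
    lsusb -v > "$out/lsusb-v.txt" 2>&1 || echo "lsusb failed" >&2
}
```

Running lsusb -v as root usually yields more complete descriptors than running it as a normal user.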

Comment 12 Steve Arnold archtester

2007-05-19 23:39:48 UTC

It's not easy to reproduce in many cases, being both hardware and usage-dependent. I've had several cases on multiple machines where data was truncated or corrupted somehow copying multi-megabyte files (mostly mp3s) to USB devices (several different brands of media players with both flash and hard drive storage). The issue is compounded by the fact that desktop feedback often seems to indicate (to the user) that the copy operation is complete when it really isn't.

I've had no instances of corruption or bad copies with the max_sectors=128, and as far as performance goes, I have a good example device - my Sharp laptop cradle mounts as a USB2 storage device, and the internal hard drive is only a 4300 rpm 1.8" disk. With the default settings, I was having problems unmounting after a chroot/build cycle, as well as occasional filesystem corruption (performance in general seems very slow with this drive). The drive also has a slow access time, with burst transfers hitting 10-12 Mb/sec and sustained around 4-5 Mb/sec. With the above 128 setting, I have 1) no problems with the filesystem corruption or unmounting, and 2) only the burst transfers are slower, while sustained transfers actually seem to have gotten faster (about 5-6 Mb/sec).

Others' mileage may vary, since this is kind of a peculiar hard disk, but things are much cleaner and less problematic now, and the performance hit appears to be negligible. The kids are also not having problems with their mp3/video players anymore.

I won't be able to seriously test any new kernels until after I finish the final grades in a week or so.

Comment 13 Daniel Drake (RETIRED) gentoo-dev

2007-05-20 04:40:36 UTC

OK, please reopen when you have time to provide info and testing for one particular case. If there are others remaining after we have addressed the first, you are encouraged to file more bug reports.

To expand on comment #9: I was simply referring to the fact that Gentoo is not prepared to change the default in the kernel sources here; I'm not commenting on whether your suggestion is or isn't sensible. If you truly think it should be changed, you should start upstream discussion by contacting the linux-usb-devel mailing list. When upstream "fixes" it we'll pull in the changes no questions asked, and all distros will benefit as well.

Comment 14 midnightflash 2007-10-08 00:02:14 UTC

Kernel: 2.6.22-gentoo-r8
USB-Chipset:  nVidia Corporation MCP2A USB Controller

And the problem is still there. It seems to be an upstream problem to me, because I have the same problem on the same machine with Ubuntu (Kernel: 2.6.22-12-generic).

Comment 15 Stefan de Konink 2007-12-25 12:46:22 UTC

usb 1-1: new high speed USB device using ehci_hcd and address 11
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: port 1 high speed
ehci_hcd 0000:00:02.1: GetStatus port 1 status 001007 POWER sig=se0 PE CSC CONNECT
hub 1-0:1.0: state 7 ports 10 chg 0000 evt 0002
hub 1-0:1.0: state 7 ports 10 chg 0000 evt 0002
ehci_hcd 0000:00:02.1: GetStatus port 1 status 001803 POWER sig=j CSC CONNECT
hub 1-0:1.0: port 1, status 0501, change 0001, 480 Mb/s
hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x501
ehci_hcd 0000:00:02.1: port 1 high speed
ehci_hcd 0000:00:02.1: GetStatus port 1 status 001005 POWER sig=se0 PE CONNECT
usb 1-1: new high speed USB device using ehci_hcd and address 12
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: port 1 high speed
ehci_hcd 0000:00:02.1: GetStatus port 1 status 001007 POWER sig=se0 PE CSC CONNECT
hub 1-0:1.0: state 7 ports 10 chg 0000 evt 0002
hub 1-0:1.0: state 7 ports 10 chg 0000 evt 0002
ehci_hcd 0000:00:02.1: GetStatus port 1 status 001803 POWER sig=j CSC CONNECT
hub 1-0:1.0: port 1, status 0501, change 0001, 480 Mb/s

...is this the same error as mentioned above?

Comment 16 Rui Vilao 2009-05-14 19:30:00 UTC

Hi,

I'm having the same problem. I've tried several different kernels (2.6.26-r2/4 and 2.6.27-r10 as well as 2.6.28-r4), and changing max_sectors doesn't do the trick. This is really annoying; it usually happens when there are large transactions, like extracting large files or something like that. Does anyone have a clue about this issue? It could be a hardware problem, but that's odd...

Here's my output.

May 14 20:22:16 athlon kernel: usb 1-1: reset high speed USB device using ehci_hcd and address 2
May 14 20:22:56 athlon last message repeated 2 times
May 14 20:22:57 athlon kernel: usb 1-1: USB disconnect, address 2
May 14 20:22:57 athlon kernel: sd 4:0:0:0: Device offlined - not ready after error recovery
May 14 20:22:57 athlon kernel: sd 4:0:0:0: [sdb] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK,SUGGEST_OK
May 14 20:22:57 athlon kernel: end_request: I/O error, dev sdb, sector 468830495
May 14 20:22:57 athlon kernel: sd 4:0:0:0: rejecting I/O to offline device
May 14 20:22:57 athlon kernel: sd 4:0:0:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
May 14 20:22:57 athlon kernel: end_request: I/O error, dev sdb, sector 468830623
May 14 20:22:57 athlon kernel: sd 4:0:0:0: rejecting I/O to offline device

Comment 17 Peter Volkov (RETIRED) gentoo-dev

2009-06-04 09:22:10 UTC

If anybody still manages to reproduce this problem, could you try the following patches and report if they improve situation:

http://bugzilla.kernel.org/show_bug.cgi?id=11159#c68

Comment 18 Paul Healy 2009-08-29 18:37:14 UTC

With 2.6.30-gentoo-r4, and these errors from a newer Corsair Voyager 8GB usb key:
...
usb 2-1.2: reset high speed USB device using ehci_hcd and address 11
usb 2-1.2: device descriptor read/64, error -110
usb 2-1.2: device descriptor read/64, error -110
usb 2-1.2: reset high speed USB device using ehci_hcd and address 11
usb 2-1.2: device descriptor read/64, error -110
usb 2-1.2: device descriptor read/64, error -110
...

This fixes the problem:

 echo 20 > /sys/module/scsi_mod/parameters/inq_timeout
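
The same change can be wrapped so that the previous value is shown before it is overwritten, which makes it easier to report what the timeout was on the affected system. This is a sketch only: set_inq_timeout is a hypothetical helper name, and the optional second argument exists purely so the function can be tried on a scratch file instead of the real parameter.

```shell
#!/bin/sh
# Write a new INQUIRY timeout (in seconds) to scsi_mod's inq_timeout
# parameter and print the old and new values.
# $1: timeout in seconds
# $2: parameter file (default: the sysfs path used in this comment)
# NOTE: set_inq_timeout is a hypothetical helper name.
set_inq_timeout() {
    secs="$1"
    param="${2:-/sys/module/scsi_mod/parameters/inq_timeout}"
    old="$(cat "$param")" || return 1
    echo "$secs" > "$param" || return 1
    echo "inq_timeout: $old -> $(cat "$param")"
}
```

A value written this way does not survive a reboot. Passing scsi_mod.inq_timeout=20 on the kernel command line should make it persistent, though that has not been verified against this kernel version.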

Unlikely to be the same problem as seen by the OP, but the bug summary is an exact match for the initial error message when I searched.

reference/source: https://bugs.launchpad.net/ubuntu/+bug/123167