Summary: | reset high speed USB device using ehci_hcd and address N | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Steve Arnold <nerdboy> |
Component: | [OLD] Core system | Assignee: | Gentoo Kernel Bug Wranglers and Kernel Maintainers <kernel> |
Status: | RESOLVED NEEDINFO | ||
Severity: | major | CC: | lmiphay, midnightflash, radhermit, stefan, udev-bugs |
Priority: | High | Keywords: | Inclusion |
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
URL: | https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/61235 | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- |
Description
Steve Arnold
nerdboy: how does this affect performance on devices where there isn't a problem?

Well, Genesys chipsets are speed-limited by the kernel, as they have problems at higher speeds. But I do not know whether the kernel limits them via max_sectors or by other means.

zzam: I meant for other non-Genesys hardware where there isn't a problem, that the max_sectors change may have an unnecessary negative effect on performance. I don't have any external USB storage anymore (I switched to eSATA gear for performance), but I have seen this problem in the past with some enclosures, due to poor manufacturing (and if I used a really short cable, or added my own shielding, it worked fine). Which kernel(s) have you reproduced this on?

Well, I think reducing max_sectors is not much of a problem, even though it reduces performance, but increasing max_sectors CAN be a problem if the existing limit was chosen well. I think we should not do this by default. Also, I don't know why the limit for Genesys is exactly 64. But perhaps we can create some optional component to only decrease max_sectors, and not increase it. Something like usb_storage_max_sectors="xyz" in udevd.conf, but perhaps also create some additional package for it.

I've seen it on 2.6.18/19/20, and according to the other posts it goes all the way back to early 2.6 and even 2.4 kernels (apparently any kernel with the ehci module and edgy USB hardware). Setting max_sectors to 128 seems like a pretty safe default, since 1) it seems to work, 2) it even works with the Genesys hardware (the kernel sets max_sectors to 64 in this particular case), and 3) it's not much of a performance hit overall. We can always document how to increase it (with a warning) for those who just have to squeeze out every last ounce...

If the kernel already sets it to 64 for your device, then surely no fix is needed in your case? Or are you saying that 64 doesn't work but 128 does?

Neither...
I'm saying that the Genesys device is one of the only ones the kernel sets a specific (non-default) value for, which is max_sectors=64 (almost everything else apparently defaults to 240). And from what was reported by a couple of those folks in the other posts, the Genesys hardware supposedly works fine at 128 (but no higher). Lastly, Windoze apparently sets everything to 128, and every reporter I saw that mentioned trying their problematic device on a Windoze box said it worked fine. Thus my conclusion, at least for now, is that 128 is a safe default setting, even for the Genesys devices which the kernel currently limits to 64.

OK. Well, let me say we're not going to consider limiting max_sectors for all devices, as that is just silly. If you have a problem with a specific device, we'll help take the bug through the usual processes and hopefully find a fix, and the "fix" may be adding a quirk to limit max_sectors for that specific device. But if I understand it correctly, the kernel already caps max_sectors for your device, so no fix is needed for your specific case anyway?

No, since I don't have a Genesys USB controller. In this case, the USB controllers are on board an MSI motherboard and the device is an internal card reader plugged onto the extra USB header. The kernel specifically limits almost nothing (Genesys is one of the only ones); I was simply pointing out that even Genesys is reported to work fine at 128. And I'm not sure it's all that silly to limit USB devices to a slightly lower default if it means not corrupting the user's data. Just my opinion...

OK. Please test the latest development kernel (currently 2.6.22-rc1) with usb storage verbose debugging enabled, and attach dmesg logs from after a failure. Also post "lsusb -v" output. I'm also interested to know how you are able to reproduce this, and also how much success you have had with the max_sectors workaround. Thanks.

It's not easy to reproduce in many cases, being both hardware and usage-dependent.
I've had several cases on multiple machines where data was truncated or corrupted somehow while copying multi-megabyte files (mostly mp3s) to USB devices (several different brands of media players with both flash and hard drive storage). The issue is compounded by the fact that desktop feedback often seems to indicate (to the user) that the copy operation is complete when it really isn't. I've had no instances of corruption or bad copies with max_sectors=128, and as far as performance goes, I have a good example device: my Sharp laptop cradle mounts as a USB2 storage device, and the internal hard drive is only a 4300 rpm 1.8" disk. With the default settings, I was having problems unmounting after a chroot/build cycle, as well as occasional filesystem corruption (performance in general seems very slow with this drive). The drive also has a slow access time, with burst transfers hitting 10-12 Mb/sec and sustained around 4-5 Mb/sec. With the above 128 setting, I have 1) no problems with filesystem corruption or unmounting, and 2) only the burst transfers are slower, while sustained transfers actually seem to have gotten faster (about 5-6 Mb/sec). Others' MMV, since this is kind of a peculiar hard disk, but things are much cleaner and less problematic now, and the performance hit appears to be negligible. The kids are also not having problems with their mp3/video players anymore. I won't be able to seriously test any new kernels until after I finish the final grades in a week or so.

OK, please reopen when you have time to provide info and testing for one particular case. If there are others remaining after we have addressed the first, you are encouraged to file more bug reports.

To expand on comment #9: I was simply referring to the fact that Gentoo are not prepared to change the default in the kernel sources here -- I'm not commenting on whether your suggestion is or isn't sensible.
If you truly think it should be changed, you should start upstream discussion by contacting the linux-usb-devel mailing list. When upstream "fixes" it we'll pull in the changes no questions asked, and all distros will benefit too.

Kernel: 2.6.22-gentoo-r8
USB-Chipset: nVidia Corporation MCP2A USB Controller

And the problem is still there. Seems to be an upstream problem to me, because I have the same problem on the same machine with Ubuntu (Kernel: 2.6.22-12-generic).

usb 1-1: new high speed USB device using ehci_hcd and address 11
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: port 1 high speed
ehci_hcd 0000:00:02.1: GetStatus port 1 status 001007 POWER sig=se0 PE CSC CONNECT
hub 1-0:1.0: state 7 ports 10 chg 0000 evt 0002
hub 1-0:1.0: state 7 ports 10 chg 0000 evt 0002
ehci_hcd 0000:00:02.1: GetStatus port 1 status 001803 POWER sig=j CSC CONNECT
hub 1-0:1.0: port 1, status 0501, change 0001, 480 Mb/s
hub 1-0:1.0: debounce: port 1: total 100ms stable 100ms status 0x501
ehci_hcd 0000:00:02.1: port 1 high speed
ehci_hcd 0000:00:02.1: GetStatus port 1 status 001005 POWER sig=se0 PE CONNECT
usb 1-1: new high speed USB device using ehci_hcd and address 12
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: devpath 1 ep0in 3strikes
ehci_hcd 0000:00:02.1: port 1 high speed
ehci_hcd 0000:00:02.1: GetStatus port 1 status 001007 POWER sig=se0 PE CSC CONNECT
hub 1-0:1.0: state 7 ports 10 chg 0000 evt 0002
hub 1-0:1.0: state 7 ports 10 chg 0000 evt 0002
ehci_hcd 0000:00:02.1: GetStatus port 1 status 001803 POWER sig=j CSC CONNECT
hub 1-0:1.0: port 1, status 0501, change 0001, 480 Mb/s

...is this the same error as mentioned above?

Hi, I'm having the same problem. I tried several different kernels (2.6.26-r2/4 and 2.6.27-r10 as well as 2.6.28-r4), and changing max_sectors doesn't do the trick.
This is really annoying; it usually happens when there are large transactions, like extracting large files or something like that. Does anyone have a clue on this issue? It could be a hardware problem, but that's odd... Here's my output.

May 14 20:22:16 athlon kernel: usb 1-1: reset high speed USB device using ehci_hcd and address 2
May 14 20:22:56 athlon last message repeated 2 times
May 14 20:22:57 athlon kernel: usb 1-1: USB disconnect, address 2
May 14 20:22:57 athlon kernel: sd 4:0:0:0: Device offlined - not ready after error recovery
May 14 20:22:57 athlon kernel: sd 4:0:0:0: [sdb] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK,SUGGEST_OK
May 14 20:22:57 athlon kernel: end_request: I/O error, dev sdb, sector 468830495
May 14 20:22:57 athlon kernel: sd 4:0:0:0: rejecting I/O to offline device
May 14 20:22:57 athlon kernel: sd 4:0:0:0: [sdb] Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK,SUGGEST_OK
May 14 20:22:57 athlon kernel: end_request: I/O error, dev sdb, sector 468830623
May 14 20:22:57 athlon kernel: sd 4:0:0:0: rejecting I/O to offline device

If anybody still manages to reproduce this problem, could you try the following patches and report whether they improve the situation: http://bugzilla.kernel.org/show_bug.cgi?id=11159#c68

With 2.6.30-gentoo-r4, I get these errors from a newer Corsair Voyager 8GB usb key:

...
usb 2-1.2: reset high speed USB device using ehci_hcd and address 11
usb 2-1.2: device descriptor read/64, error -110
usb 2-1.2: device descriptor read/64, error -110
usb 2-1.2: reset high speed USB device using ehci_hcd and address 11
usb 2-1.2: device descriptor read/64, error -110
usb 2-1.2: device descriptor read/64, error -110
...

This fixes the problem:

echo 20 > /sys/module/scsi_mod/parameters/inq_timeout

Unlikely to be the same problem as seen by the OP, but the bug summary is an exact match for the initial error message when I searched.

reference/source: https://bugs.launchpad.net/ubuntu/+bug/123167
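
For anyone wanting to try both workarounds from this thread, here is a rough sketch, not a tested fix. It assumes the per-device limit is exposed at /sys/block/<disk>/device/max_sectors, which the comments above never spell out, so check that the file exists on your kernel. The function names and the SYSFS_ROOT variable are made up for illustration; only the inq_timeout path comes from the comment above. Following the earlier suggestion, the max_sectors helper only ever lowers the value and never raises it.

```shell
#!/bin/sh
# Sketch only. SYSFS_ROOT is a hypothetical override so the helpers can be
# tried against a scratch directory instead of the live /sys tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Lower max_sectors for one disk (e.g. "sdb") to the given limit.
# Never raises the value, so a kernel quirk such as 64 is left alone.
# Assumption: the attribute lives at block/<disk>/device/max_sectors.
cap_max_sectors() {
    disk="$1"
    limit="$2"
    attr="$SYSFS_ROOT/block/$disk/device/max_sectors"
    if [ ! -f "$attr" ]; then
        echo "no max_sectors attribute for $disk" >&2
        return 1
    fi
    current=$(cat "$attr")
    if [ "$current" -gt "$limit" ]; then
        echo "$limit" > "$attr"
    fi
}

# Set the scsi_mod inq_timeout parameter (path as quoted in the thread).
set_inq_timeout() {
    echo "$1" > "$SYSFS_ROOT/module/scsi_mod/parameters/inq_timeout"
}
```

Run as root after sourcing the file, this would look like `cap_max_sectors sdb 128` and `set_inq_timeout 20`, where sdb is just an example device name. Neither setting survives a reboot or a replug of the device, so it would have to be reapplied (for example from a udev rule or a local startup script).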