Bug 236234 - sata_promise driver reported in the wild as unstable at 3 Gbits/s
Status: RESOLVED UPSTREAM
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL: http://user.it.uu.se/~mikpe/linux/pat...
Reported: 2008-08-30 23:29 UTC by kfm
Modified: 2008-10-28 22:51 UTC

Description kfm 2008-08-30 23:29:08 UTC
I'm filing this following an exchange I had with a user in the #gentoo channel. He was migrating to a software RAID-5 array of four devices and had reached the final stage of the process: adding the 4th device (hitherto missing) to the array. Doing so triggered a resync, as expected, which makes for a fairly intensive I/O workload. Unfortunately, the process went awry:

Aug 29 20:48:36 frummel ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x1380000 action 0x2 frozen
Aug 29 20:48:36 frummel ata2: SError: { 10B8B Dispar BadCRC TrStaTrns }
Aug 29 20:48:36 frummel ata2.00: cmd 25/00:f8:3f:28:32/00:03:13:00:00/e0 tag 0 dma 520192 in
Aug 29 20:48:36 frummel res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Aug 29 20:48:36 frummel ata2.00: status: { DRDY }
Aug 29 20:48:41 frummel ata2: port is slow to respond, please be patient (Status 0xff)
Aug 29 20:48:46 frummel ata2: device not ready (errno=-16), forcing hardreset
Aug 29 20:48:46 frummel ata2: hard resetting link
Aug 29 20:48:52 frummel ata2: port is slow to respond, please be patient (Status 0xff)
Aug 29 20:48:56 frummel ata2: COMRESET failed (errno=-16)
Aug 29 20:48:56 frummel ata2: hard resetting link
Aug 29 20:50:46 frummel ata2: reset failed, giving up
Aug 29 20:50:46 frummel sd 1:0:0:0: [sdb] Result: hostbyte=0x00 driverbyte=0x08
Aug 29 20:50:46 frummel sd 1:0:0:0: [sdb] Sense Key : 0xb [current] [descriptor]
Aug 29 20:50:46 frummel Descriptor sense data with sense descriptors (in hex):
Aug 29 20:50:46 frummel 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Aug 29 20:50:46 frummel 00 00 00 00
Aug 29 20:50:46 frummel sd 1:0:0:0: [sdb] ASC=0x0 ASCQ=0x0
Aug 29 20:50:46 frummel end_request: I/O error, dev sdb, sector 322054207
Aug 29 20:50:46 frummel raid5:md4: read error not correctable (sector 322054144 on sdb1).
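
The exception lines in this log can be decoded by hand, which helps confirm that the failure is at the link level. The following is a minimal sketch and not part of the original exchange. It assumes the short SError bit names that libata prints, the usual libata taskfile print order (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device), a 48-bit (LBA48) command and 512-byte sectors. The function names `decode_serr` and `decode_taskfile` are hypothetical.

```python
# SError diagnostic bits (16..26) with the short names libata prints.
# The lower error bits (0..11) are deliberately left out of this sketch.
SERR_DIAG_BITS = {
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serr(value):
    """Return the names of the diagnostic bits set in an SErr value."""
    return [name for bit, name in sorted(SERR_DIAG_BITS.items())
            if value & (1 << bit)]


def decode_taskfile(tf, sector_size=512):
    """Decode the 'cmd' taskfile string from a libata exception report.

    Assumes an LBA48 command, so the LBA and the sector count are
    assembled from both the low and the high-order (hob) registers.
    """
    command, low, hob, _device = tf.split("/")
    _feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in low.split(":"))
    _hfeat, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in hob.split(":"))

    lba = (hob_lbah << 40 | hob_lbam << 32 | hob_lbal << 24
           | lbah << 16 | lbam << 8 | lbal)
    # In 48-bit commands a sector count of 0 means 65536 sectors.
    sectors = (hob_nsect << 8 | nsect) or 65536
    return {
        "command": int(command, 16),
        "lba": lba,
        "sectors": sectors,
        "bytes": sectors * sector_size,
    }


if __name__ == "__main__":
    print(decode_serr(0x1380000))
    # ['10B8B', 'Dispar', 'BadCRC', 'TrStaTrns']
    print(decode_taskfile("25/00:f8:3f:28:32/00:03:13:00:00/e0"))
    # command 0x25 (READ DMA EXT), lba 322054207, 1016 sectors, 520192 bytes
```

The decoded values agree with the rest of the log: the transfer size matches "dma 520192 in", and the LBA matches the sector reported by end_request. The md error cites sector 322054144 on sdb1, which is 63 less; that would be consistent with sdb1 starting at sector 63, although the partition layout was not stated.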

Fortunately, his data survived this mishap. As it turned out, the user had already conducted some research and found this thread:

http://www.mail-archive.com/linux-ide%40vger.kernel.org/msg10106.html

In summary, it seems that one of two things may be the case:

1) The controller simply isn't stable in SATA 300 mode
2) The driver is buggy or unable to drive the controller reliably in SATA 300
   mode for some unknown reason

Unfortunately, I didn't establish precisely which kernel he was running, but I know that it was a version of ~gentoo-sources-2.6.25.

Given that the above-mentioned post dates from September 2007, it seems that the issue has been observed since around 2.6.21. Further, there is a patch that works around the problem simply by arbitrarily dropping to SATA 150 (1.5 Gbits/s) mode. At the time of posting, it is still available here for a recent 2.6.27 release candidate:

http://user.it.uu.se/~mikpe/linux/patches/2.6/patch-sata_promise-limit-sataii-to-1.5Gbps-2.6.27-rc4
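
Whether a given port is running at 3.0 or 1.5 Gbps, with or without such a workaround, can be read from the kernel log. The sketch below is an illustration rather than anything taken from the affected system: it assumes libata's "SATA link up <speed> Gbps (SStatus ... SControl ...)" message, and both the sample lines and the function name `link_speeds` are made up for the example.

```python
import re

# Matches libata link-up messages such as:
#   ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
LINK_UP = re.compile(r"(ata\d+): SATA link up (\d+\.\d+) Gbps")


def link_speeds(log_text):
    """Return {port: speed in Gbps}, keeping the last report per port."""
    speeds = {}
    for match in LINK_UP.finditer(log_text):
        speeds[match.group(1)] = float(match.group(2))
    return speeds


# Hypothetical sample lines, not taken from the reporter's system.
sample = """\
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
"""

if __name__ == "__main__":
    for port, gbps in sorted(link_speeds(sample).items()):
        print(port, gbps)
```

Feeding it the output of dmesg from an affected machine would show which ports negotiated 3.0 Gbps and are therefore candidates for the problem described here.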

The questions that spring to my mind are:

* How widespread is this problem? 
* Can anything be done about it?
* If not, should we adopt this patch until such time as the issue is addressed?
Comment 1 Daniel Drake (RETIRED) gentoo-dev 2008-10-28 22:51:28 UTC
The patch link is dead, so I can't check if this is included yet. I would suggest taking this to the linux-ide list.