When moving large amounts of data, or accessing both discs in the array simultaneously, the Promise controller dies with "ata1: command time out" and the system hangs. At other times a write fails and XFS disables the file system after an out-of-sync error. /var/log/messages is attached.

This is a nasty blocker: it renders the machine unusable. The kernel people have not replied yet; perhaps we can do something about it. Reports on opensubscriber.com speculate that the problem is that the IRQ sometimes does not get sent back from the card to the kernel.

Reproducible: Always

Steps to Reproduce:
1. Use a Promise SATA 378 (TX2Plus) controller with one or more discs.
2. Access both discs at the same time, or move a large amount of data into or out of either of them.
3. The system crashes. Sometimes you get an "ataX: command timeout"; other times, XFS errors on the volumes.

Note that this affects the PDC20378 in my case, and probably other PDC203xx family members too.

Actual Results:
The kernel crashes every time.

Expected Results:
The system should have continued working normally.
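For anyone trying to reproduce this, the "access both discs at the same time" step can be approximated with a small script. This is only a sketch, not the reporter's actual procedure: the mount points in the commented call are hypothetical, the write size is arbitrary, and writing zeros may not stress the controller the same way a real copy does.

```python
import os
import threading

CHUNK = 1024 * 1024  # 1 MiB per write call (arbitrary choice)


def write_large_file(directory, total_bytes, name="stress.bin"):
    """Write total_bytes of zeros to directory/name, fsync, return bytes written."""
    path = os.path.join(directory, name)
    block = b"\0" * CHUNK
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(CHUNK, total_bytes - written)
            f.write(block[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # force the data out to the disc
    return written


def stress_both(dirs, total_bytes):
    """Write to every directory in dirs concurrently.

    Returns a dict mapping each directory to the bytes written,
    or to the OSError raised while writing to it.
    """
    results = {}

    def worker(d):
        try:
            results[d] = write_large_file(d, total_bytes)
        except OSError as exc:
            results[d] = exc

    threads = [threading.Thread(target=worker, args=(d,)) for d in dirs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Hypothetical usage: one mount point per disc on the controller, 4 GiB each.
# print(stress_both(["/mnt/disc1", "/mnt/disc2"], 4 * 1024**3))
```

If the machine hangs outright, as described above, the script never returns, so keep a second console tailing the kernel log while it runs.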
Created attachment 70026: Kernel config
Created attachment 70027: /var/log/messages
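To see which port the timeouts hit in a log like the attached one, something along these lines could help. It is a rough sketch: the pattern only covers the two spellings quoted in the report ("command time out" and "command timeout"), and the exact message format in any given kernel version is an assumption.

```python
import re
from collections import Counter

# Matches "ataN: command time out" and "ataN: command timeout" (assumed format).
TIMEOUT_RE = re.compile(r"\b(ata\d+): command time ?out\b", re.IGNORECASE)


def count_timeouts(lines):
    """Count libata command timeouts per port in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


# Usage against a saved log:
# with open("/var/log/messages", errors="replace") as f:
#     print(count_timeouts(f))
```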
lspci:
0000:00:00.0 Host bridge: Intel Corporation 915G/P/GV/GL/PL/910GL Processor to I/O Controller (rev 04)
0000:00:01.0 PCI bridge: Intel Corporation 915G/P/GV/GL/PL/910GL PCI Express Root Port (rev 04)
0000:00:1b.0 Class 0403: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) High Definition Audio Controller (rev 03)
0000:00:1d.0 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #1 (rev 03)
0000:00:1d.1 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #2 (rev 03)
0000:00:1d.2 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #3 (rev 03)
0000:00:1d.3 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB UHCI #4 (rev 03)
0000:00:1d.7 USB Controller: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) USB2 EHCI Controller (rev 03)
0000:00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d3)
0000:00:1f.0 ISA bridge: Intel Corporation 82801FB/FR (ICH6/ICH6R) LPC Interface Bridge (rev 03)
0000:00:1f.1 IDE interface: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) IDE Controller (rev 03)
0000:00:1f.3 SMBus: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) SMBus Controller (rev 03)
0000:01:00.0 VGA compatible controller: nVidia Corporation NV41.8 [GeForce Go 6800] (rev a2)
0000:0a:00.0 CardBus bridge: Texas Instruments PCI1410 PC card Cardbus Controller (rev 02)
0000:0a:01.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)
0000:0a:02.0 RAID bus controller: Promise Technology, Inc. PDC20378 (FastTrak 378/SATA 378) (rev 02)
0000:0a:03.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
0000:0a:04.0 Multimedia controller: Philips Semiconductors SAA7133 Video Broadcast Decoder (rev f0)
Is this problem new to 2.6.13? Is it reproducible on the latest development kernel (currently vanilla-sources-2.6.14_rc3)?
It probably is. The problem is in the SATA 1.1 driver attached to libata. If those parts have changed, then maybe the problem has been resolved.
We're moving to filing an upstream kernel bug, in which case you *need* to reproduce it on the latest vanilla kernel. Rereading your original post, it sounds like you might have done this already. Is this so?
The bug does occur on vanilla 2.6.13 and 2.6.10.
Please reopen when you have tested on the latest development kernel (currently vanilla-sources-2.6.14_rc5)
Turns out the bug is not really a bug. It's rather a fucked-up Fujitsu Disc.