New VIA motherboards such as the Abit KR7A-RAID have an onboard IDE controller and an onboard RAID controller chip, the HPT372, which you can put a hard drive on. Booting off any hard drive in the system while this controller is enabled leads to a kernel panic. In Gentoo's case, with 2.4.18-pre3, it boots, but there are I/O errors accessing the disk and garbled text on boot-up where the initialization of the HPT372 chip should be. This goes for every motherboard with this controller chip.

Apparently Linux looks at the RAID controller's revision number (the class_rev value read from PCI config space) and only expects 4 or less, but this new controller chip reports 5. So when Linux sees 5 and has no default for it, it panics. Below is a more technical explanation.

In drivers/ide/ide-pci.c, in the definition of function hpt366_device_order_fixup, the line

    pci_read_config_dword(dev, PCI_CLASS_REVISION, &class_rev);

reads class_rev and gets 5 on the Abit KR7A-RAID system. Since the code is only prepared for 4 or less, the problem appears in the subsequent line

    strcpy(d->name, chipset_names[class_rev]);

where chipset_names[class_rev] = chipset_names[5] attempts to access an invalid array element, given that chipset_names[4] is the last valid element.

After the lines

    pci_read_config_dword(dev, PCI_CLASS_REVISION, &class_rev);
    class_rev &= 0xff;

adding this check solves the boot problem:

    if( class_rev >= (sizeof(chipset_names)/sizeof(char *)) ) {
        class_rev = (sizeof(chipset_names)/sizeof(char *)) - 1;
    }

The same problem exists in drivers/ide/hpt366.c and needs the same fix.
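To make the bounds check easier to follow outside the kernel tree, here is a minimal user-space sketch of the same clamp. The table contents and the helper name clamp_class_rev are assumptions for illustration only. The one thing taken from the report is that the table has five entries, so index 4 is the last valid one.

```c
#include <stddef.h>

/* Stand-in for the driver's table. The entries are assumed for
 * illustration; the report only says index 4 is the last valid one. */
static const char *chipset_names[] = {
    "HPT366", "HPT366", "HPT368", "HPT370", "HPT370A"
};

/* Number of entries in the table, computed the same way as in the fix. */
#define NUM_CHIPSET_NAMES (sizeof(chipset_names) / sizeof(char *))

/* Hypothetical helper: keep only the low byte of the value read from
 * PCI_CLASS_REVISION, then clamp it to the last valid table index. */
static unsigned int clamp_class_rev(unsigned int class_rev)
{
    class_rev &= 0xff;
    if (class_rev >= NUM_CHIPSET_NAMES)
        class_rev = NUM_CHIPSET_NAMES - 1;
    return class_rev;
}

/* Hypothetical helper: look up a name without ever indexing past the table. */
static const char *chipset_name_for(unsigned int class_rev)
{
    return chipset_names[clamp_class_rev(class_rev)];
}
```

With this clamp, a revision of 5 (as reported by the HPT372) is treated as the last known entry instead of reading past the end of the array. Whether the newest known chip is the right fallback for an unknown one is a separate question that the sketch does not settle.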
In drivers/ide/hpt366.c, after the lines

    pci_read_config_dword(bmide_dev, PCI_CLASS_REVISION, &class_rev);
    class_rev &= 0xff;

this check must be added:

    if( class_rev >= (sizeof(chipset_names)/sizeof(char *)) ) {
        class_rev = (sizeof(chipset_names)/sizeof(char *)) - 1;
    }

Also, in drivers/ide/ide-pci.c, in the definition of function hpt366_device_order_fixup, the line

    strcpy(d->name, chipset_names[class_rev]);

would (for a test case) copy the 7-character string "HPT370A" over the 6-character string "HPT366", destroying the terminating null byte. What this may lead to is unknown, but it has been fixed by replacing the statement with a strncpy:

    /* DON'T COPY 7 CHARS (HPT370A) OVER 6 CHARS (HPT366) */
    strncpy(d->name, chipset_names[class_rev], strlen(d->name));

KERNEL VERSIONS AFFECTED

The following kernels are known to have this bug:

    2.4.8 (mandrake-8.1)
    2.4.7 (redhat-7.2)
    2.4.17
    2.4.18

AFFECTED FILES

    drivers/ide/ide-pci.c
    drivers/ide/hpt366.c

The problem still exists in 2.4.18. Mandrake 8.2 does not boot at all; Gentoo will boot, but with limited success. This problem SHOULD be addressed using the fix above or some other method. Many users of these motherboards were disappointed to find out that Linux didn't boot with this controller enabled.

Thank you.
Nicholas DePetrillo
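For anyone reading along, the effect of the strncpy replacement in the report above can be sketched in user space as follows. The helper name overwrite_in_place is hypothetical, and a plain char array stands in for d->name; this is only an illustration of the bounded copy, not the kernel code itself.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite the string already stored in dst
 * without writing past its existing terminator. */
static void overwrite_in_place(char *dst, const char *src)
{
    /* Characters available in dst before its terminating null byte. */
    size_t room = strlen(dst);

    /* Copies at most room characters. If src is longer, no terminator is
     * written, but the original one at dst[room] is left untouched. If src
     * is shorter, strncpy pads the remainder with null bytes. */
    strncpy(dst, src, room);
}
```

Applied to a buffer holding "HPT366" with the source "HPT370A", this leaves "HPT370" in the buffer: the name is truncated, but the terminating null byte survives and nothing beyond the buffer is written.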
It appears that this bug has already been resolved by a patch in 2.4.19-pre2-ac3 submitted by Andre Hedrick. xref Bug #321.

Prove to me that it is not, or I will mark this as a duplicate of 321.

-- tabris
It is indeed a duplicate. I am VERY glad to see that future versions of the kernel will resolve this issue. This is an official solution, correct? Every version of the kernel after 2.4.19-pre2-ac3 should include support for the HPT372.

Thank you.
Nicholas DePetrillo
FWIW, the patch was submitted by the "official" ATA/IDE maintainer. Do understand, though, that the ac branch is separate from the Marcelo tree, and the merge date is not guaranteed. Still, I see no reason that Alan won't: #1 keep it in his tree; #2 push it to Marcelo (if ata hasn't already).

*** This bug has been marked as a duplicate of 321 ***