Someone in my office just got a Dell Inspiron 1501 that sports an ATI AHCI SATA chipset. We quickly discovered that Ubuntu x86-64 6.06 and 6.10 absolutely wouldn't report a block device for the SATA hard disk, even after loading ahci (and all of the other SATA drivers) by hand. The 'parted' livecd, on the other hand, could see the disk after loading 'ahci' by hand. It turns out that Gentoo 2006.1/amd64 also could not see the disk, even with all of the SATA drivers loaded manually. However, Gentoo 2006.1/i686 has no problem seeing the device. I'm wondering if this is some sort of 64-bit portability issue.

Reproducible: Always

Steps to Reproduce:
1. Boot an amd64 build of 2.6.15 - 2.6.19
Created attachment 106468 [details] The output of lspci -vvv
I am the afore-mentioned luser. I was able to successfully boot the Gentoo amd64 LiveCD using the boot options "irqpoll pci=nomsi" as suggested on some web pages (I have since completed the install and am writing from the troublesome machine!). Scanning Google results, I believe the "irqpoll" is required because I had a USB mouse plugged in when trying to boot, while the "pci=nomsi" option was needed to recognise the HD.
See http://forums.gentoo.org/viewtopic-p-3849437.html
Still reproducible on the latest testing kernel, currently 2.6.20-rc6?
Built and installed 2.6.20-rc6 (only my second ever kernel build, so please bear with me if I'm doing something stupid). Booting with no kernel options except doscsi produces:

ahci 0000:00:12.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:12.0: flags 64bit ncq ilck pm led clo pmp pio slum part
ata1: SATA max UDMA/133 cmd 0xFFFFC20000016180 ctl 0x0 bmdma 0x0 irq 1277
ata2: SATA max UDMA/133 cmd 0xFFFFC20000016180 ctl 0x0 bmdma 0x0 irq 1277
ata3: SATA max UDMA/133 cmd 0xFFFFC20000016200 ctl 0x0 bmdma 0x0 irq 1277
ata4: SATA max UDMA/133 cmd 0xFFFFC20000016200 ctl 0x0 bmdma 0x0 irq 1277
scsi : ahci
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[HANGS for a bit]
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[HANGS for a bit]
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[HANGS for a bit]
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
scsi1 : ahci
ata2: SATA link down (SStatus 0 SControl 300)
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)
[...]
!! Block device /dev/sda7 is not a valid root device...
!! The root block device is unspecified or not detected.

Using boot option "pci=nomsi", I was able to boot successfully (though I saw some of the messages above flash by). Option irqpoll is not required.
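For anyone triaging a similar boot log, here is a minimal sketch that counts the "failed to IDENTIFY" messages per port. The function name `ports_failing_identify` and the regular expression are my own illustration, not part of libata or any existing tool, and the sample text is just a few lines copied from the log above.

```python
import re

# Matches libata lines such as:
#   ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
# The pattern is an assumption based on the log in this report.
IDENTIFY_FAIL = re.compile(r"\b(ata\d+)\.\d+: failed to IDENTIFY")


def ports_failing_identify(log_text):
    """Return a dict mapping port name (e.g. 'ata1') to its IDENTIFY failure count."""
    counts = {}
    for match in IDENTIFY_FAIL.finditer(log_text):
        port = match.group(1)
        counts[port] = counts.get(port, 0) + 1
    return counts


# A few lines taken from the boot log in this comment.
sample = """\
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata2: SATA link down (SStatus 0 SControl 300)
"""

print(ports_failing_identify(sample))
```

A port whose link is up but which appears in the result is the pattern seen here: the link negotiates, yet the IDENTIFY command times out.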
Please attach dmesg output from successful 2.6.20-rc6 boot with pci=nomsi
Created attachment 108228 [details] dmesg output from booting into 2.6.20-rc6
(In reply to comment #5)
> Using boot option "pci=nomsi", I was able to boot successfully

Thank you, pci=nomsi saved my day with VT8251 and 2.6.19-gentoo-r4. The last kernel I was able to use before applying pci=nomsi was 2.6.17-gentoo-r6. Some messages from /var/log/messages:

1) 2.6.17-gentoo-r6

libata version 1.20 loaded.
ahci 0000:00:0f.0: version 1.2
acpi_bus-0201 [01] bus_set_power : Device is not power manageable
GSI 18 sharing vector 0xC9 and IRQ 18
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 21 (level, low) -> IRQ 201
ahci 0000:00:0f.0: AHCI 0001.0000 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:0f.0: flags: 64bit ncq pm led clo pmp pio slum part
ata1: SATA max UDMA/133 cmd 0xFFFFC20000006D00 ctl 0x0 bmdma 0x0 irq 209
ata2: SATA max UDMA/133 cmd 0xFFFFC20000006D80 ctl 0x0 bmdma 0x0 irq 209
ata3: SATA max UDMA/133 cmd 0xFFFFC20000006E00 ctl 0x0 bmdma 0x0 irq 209
ata4: SATA max UDMA/133 cmd 0xFFFFC20000006E80 ctl 0x0 bmdma 0x0 irq 209
ata1: SATA link up 3.0 Gbps (SStatus 123)
ata1: dev 0 cfg 49:2f00 82:746b 83:7f01 84:4023 85:7469 86:3c01 87:4023 88:80ff
ata1: dev 0 ATA-7, max UDMA7, 488397168 sectors: LBA48
ata1: dev 0 configured for UDMA/133
scsi0 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123)
ata2: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:407f
ata2: dev 0 ATA-7, max UDMA/133, 625142448 sectors: LBA48
ata2: dev 0 configured for UDMA/133
scsi1 : ahci
ata3: SATA link down (SStatus 0)
scsi2 : ahci
ata4: SATA link down (SStatus 0)
scsi3 : ahci

2) 2.6.19-gentoo-r4 without pci=nomsi

ahci 0000:00:0f.0: version 2.0
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 21 (level, low) -> IRQ 21
ahci 0000:00:0f.0: AHCI 0001.0000 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:0f.0: flags: 64bit ncq pm led clo pmp pio slum part
ata1: SATA max UDMA/133 cmd 0xFFFFC20000006D00 ctl 0x0 bmdma 0x0 irq 318
ata2: SATA max UDMA/133 cmd 0xFFFFC20000006D80 ctl 0x0 bmdma 0x0 irq 318
ata3: SATA max UDMA/133 cmd 0xFFFFC20000006E00 ctl 0x0 bmdma 0x0 irq 318
ata4: SATA max UDMA/133 cmd 0xFFFFC20000006E80 ctl 0x0 bmdma 0x0 irq 318
scsi0 : ahci
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
APIC error on CPU0: 00(08)
APIC error on CPU1: 00(08)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: hardreset failed, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: hardreset failed, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: reset failed, giving up
scsi1 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
APIC error on CPU1: 08(08)
APIC error on CPU0: 08(08)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: reset failed, giving up
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)

3) 2.6.19-gentoo-r4 pci=nomsi

ahci 0000:00:0f.0: version 2.0
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 21 (level, low) -> IRQ 21
ahci 0000:00:0f.0: AHCI 0001.0000 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:0f.0: flags: 64bit ncq pm led clo pmp pio slum part
ata1: SATA max UDMA/133 cmd 0xFFFFC20000006D00 ctl 0x0 bmdma 0x0 irq 21
ata2: SATA max UDMA/133 cmd 0xFFFFC20000006D80 ctl 0x0 bmdma 0x0 irq 21
ata3: SATA max UDMA/133 cmd 0xFFFFC20000006E00 ctl 0x0 bmdma 0x0 irq 21
ata4: SATA max UDMA/133 cmd 0xFFFFC20000006E80 ctl 0x0 bmdma 0x0 irq 21
scsi0 : ahci
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7, max UDMA7, 488397168 sectors: LBA48 NCQ (depth 0/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/133
scsi1 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-7, max UDMA/133, 625142448 sectors: LBA48 NCQ (depth 0/32)
ata2.00: ata2: dev 0 multi count 16
ata2.00: configured for UDMA/133
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)
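The visible difference between logs 2) and 3) is the irq number on the port setup lines: 318 without pci=nomsi, 21 with it. My assumption, not something the logs state outright, is that the high number corresponds to an MSI vector and the low one to the legacy interrupt. A rough sketch for pulling those numbers out of two logs and comparing them follows; `port_irqs` and `irq_changes` are hypothetical helper names, and the pattern is based only on the lines quoted in this report.

```python
import re

# Matches the port setup lines printed by the ahci driver in this report, e.g.
#   ata1: SATA max UDMA/133 cmd 0xFFFFC20000006D00 ctl 0x0 bmdma 0x0 irq 318
PORT_IRQ = re.compile(
    r"\b(ata\d+): SATA max \S+ cmd \S+ ctl \S+ bmdma \S+ irq (\d+)"
)


def port_irqs(log_text):
    """Return a dict mapping port name to the irq number it was given."""
    return {port: int(irq) for port, irq in PORT_IRQ.findall(log_text)}


def irq_changes(log_a, log_b):
    """Return {port: (irq_in_a, irq_in_b)} for ports whose irq differs."""
    a, b = port_irqs(log_a), port_irqs(log_b)
    return {
        port: (a[port], b[port])
        for port in sorted(a.keys() & b.keys())
        if a[port] != b[port]
    }


# Lines copied from logs 2) and 3) above.
without_nomsi = """\
ata1: SATA max UDMA/133 cmd 0xFFFFC20000006D00 ctl 0x0 bmdma 0x0 irq 318
ata2: SATA max UDMA/133 cmd 0xFFFFC20000006D80 ctl 0x0 bmdma 0x0 irq 318
"""
with_nomsi = """\
ata1: SATA max UDMA/133 cmd 0xFFFFC20000006D00 ctl 0x0 bmdma 0x0 irq 21
ata2: SATA max UDMA/133 cmd 0xFFFFC20000006D80 ctl 0x0 bmdma 0x0 irq 21
"""

print(irq_changes(without_nomsi, with_nomsi))
```

With the sample lines this reports both ports moving from irq 318 to irq 21, which matches the ACPI line in both logs routing the device to IRQ 21.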
(In reply to comment #8)
> Thank you, pci=nomsi saved my day with VT8251 and 2.6.19-gentoo-r4.

Well, I celebrated too early. What followed was quite catastrophic and cryptic to me; it might be connected, or it might not. The mayhem started inconspicuously with the eth0 interface apparently ceasing to exist in mid-run without my intervention. Once I realised the interface was really gone in some queer way and could not be brought up, I rebooted my machine. Unfortunately I don't remember whether I managed to start the computer that time or whether it froze immediately during the first reboot. Anyway, it froze in the end and refused to boot at all because my root filesystem had gone away. I could not rescue my reiserfs 3.6 even with reiserfsck --rebuild-sb; the new superblock was at least half nonsense (for example, the hash function was not set), and a subsequent reiserfsck --rebuild-tree brought my filesystem back to life with a lot of files missing or corrupted. I booted into the tried 2.6.17-r6 kernel and more or less reinstalled the system. Maybe less (though I thought I had reinstalled the whole of it), because after some time it froze again. Perhaps, having compiled the gentoo 2.6.20 kernel in the meantime, I should not have tried to boot into it, but I cannot exclude the possibility that the new corruption of my filesystem had already happened when it froze. Anyway, I am reinstalling my system again and considering the purchase of a new disk to try a new installation from scratch (while keeping all my not yet lost data aside). If I rescue anything like a log and manage to find anything in it that looks relevant, I'll report the details, of course, whether here or in a bug report of its own.
Coupl'a questions for Joshua...

- You mention that the i686 kernel works, while the x86_64 kernel doesn't. Are these two kernels the same version?
- Turning off MSI on x86_64 seems to be helping some people with the issue you described. Is the problem reproducible on i686 with MSI interrupts *enabled*?

Finally, can you please test this again with the latest kernel (2.6.21) and see if the problem persists?
(In reply to comment #10)
> Coupl'a questions for Joshua...
>
> Finally, can you please test this again with the latest kernel (2.6.21) and see
> if the problem persists?

I think the last question suits me as well. It moved me to compile the newest gentoo kernel, and just now I have 2.6.21 running on my amd64 machine, without pci=nomsi, and everything looks OK. BTW, my problems after attempting kernel upgrade last time appeared to be caused by my power source dying. Of course I'll report if I encounter any problems later. And if there are any reports of persisting problems in 2.6.21, I'll send my config to compare.
Kernel 2.6.21.1 does NOT work without specifying "pci=nomsi". I will attach the dmesg output from booting into 2.6.21.1 with "pci=nomsi". Just in case I'm doing something stupid, let me explain what I did. I downloaded and extracted the 2.6.21.1 sources into /usr/src/linux-2.6.21.1/ Then I moved /usr/src/linux to point to that directory, and did "genkernel --oldconfig all".

* Kernel compiled successfully!

Then the entry in my grub.conf is:

title=Gentoo Linux experimental
root (hd0,4)
kernel /kernel-genkernel-x86_64-2.6.21.1 root=/dev/ram0 init=/linuxrc ramdisk=8192 real_root=/dev/sda7
initrd /initramfs-genkernel-x86_64-2.6.21.1

In order to boot successfully, I had to add "pci=nomsi" to the kernel parameters.
Created attachment 119469 [details] dmesg output from booting into 2.6.21.1
Regarding James' questions about i686, I believe Josh and I were able to successfully boot my amd64 laptop using the 2006.1 i686 Gentoo CD. I don't think we paid any attention to MSI then; iirc it worked "out of the box". I've since thrown the i686 disc away. Let me know if you think it would be worth investigating, and I will try it again, paying attention to MSI.
(In reply to comment #12)
> Kernel 2.6.21.1 does NOT work without specifying "pci=nomsi".

Do I understand correctly that this means vanilla sources? I should have said clearly that what I use is sys-kernel/gentoo-sources, not sys-kernel/vanilla-sources, let alone any sources version downloaded manually. After emerging sys-kernel/gentoo-sources-2.6.21 I compiled them manually (as a friend says, ``make menuconfig, not war''). Perhaps our combined experience means gentoo sources have dealt with the AHCI/MSI problem, while vanilla sources have not yet. I'll attach my config and an excerpt of /var/log/messages describing the boot of my new kernel.
Created attachment 119477 [details] dmesg of kernel-2.6.21-gentoo booting

A faked dmesg, actually an excerpt of /var/log/messages related to the last boot. Ellipses denote omitted quanta of APIC error messages (I should get rid of them, but for now don't know how and hope they're mostly harmless). The third and fourth lines show I've actually used 2.6.21-gentoo and did not request pci=nomsi.
Created attachment 119479 [details] config-2.6.21-gentoo My 2.6.21-gentoo kernel config. Don't laugh at my choices too much, please.
You're right that I was using the kernel source directly from kernel.org (I didn't even get it from vanilla-sources). I just grabbed gentoo-sources-2.6.21, built and installed it, and didn't have any more success than with the kernel.org sources, as far as I could see. I'd be happy to post the dmesg output if desired.
I think we have a fix for this. Please apply the following patch to 2.6.21 and try booting without pci=nomsi:

http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob_plain;f=queue-2.6.21/pci-quirks-disable-msi-on-rs400-200-and-rs480.patch;hb=05ab505f2909acf3a614d3e6a32271c4c1f8a69d

Don't worry about vanilla vs gentoo. gentoo-sources doesn't include any fixes which might affect this particular issue, yet.
That kernel patch did the trick:

price@neverland:/home/price>dmesg
Linux version 2.6.21.1 (root@neverland) (gcc version 4.1.1 (Gentoo 4.1.1-r3)) #2 SMP Wed May 16 16:29:32 HST 2007
Command line: root=/dev/ram0 init=/linuxrc ramdisk=8192 real_root=/dev/sda7
...
PCI: MSI quirk detected. MSI deactivated.
...

Thanks!
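For others who want to confirm the same thing on their own machine, here is a small sketch that checks a saved dmesg for two facts: whether pci=nomsi was on the command line, and whether the quirk message appeared. The function name `msi_status` is made up for illustration, and the quirk string is simply the one from the dmesg above; other kernel versions may word it differently.

```python
# Message text taken from the dmesg quoted in this comment; treat it as an
# assumption for any kernel other than the patched 2.6.21.1.
QUIRK_MESSAGE = "PCI: MSI quirk detected. MSI deactivated."


def msi_status(dmesg_text):
    """Report whether pci=nomsi was requested and whether the MSI quirk fired."""
    nomsi_requested = False
    quirk_fired = False
    for line in dmesg_text.splitlines():
        line = line.strip()
        if line.startswith("Command line:"):
            # Compare whole parameters so a longer option is not matched by accident.
            nomsi_requested = "pci=nomsi" in line.split()
        if QUIRK_MESSAGE in line:
            quirk_fired = True
    return {"nomsi_requested": nomsi_requested, "quirk_fired": quirk_fired}


# Lines copied from the dmesg in this comment.
sample = """\
Command line: root=/dev/ram0 init=/linuxrc ramdisk=8192 real_root=/dev/sda7
PCI: MSI quirk detected. MSI deactivated.
"""

print(msi_status(sample))
```

A result with the quirk fired and pci=nomsi not requested is the outcome reported here: the kernel disabled MSI on its own, so the boot option is no longer needed.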
Fixed in gentoo-sources-2.6.21-r1 (genpatches-2.6.21-2)