Bug#: 161448
Status: RESOLVED
Resolution: FIXED
Assigned To: Daniel Drake <dsd@gentoo.org>
Reporter: Joshua Hoblitt <j_gentoo@hoblitt.com>

Filename | Description | Type | Creator | Created | Size
lspci.txt | The output of lspci -vvv | text/plain | Joshua Hoblitt | 2007-01-11 01:06 0000 | 11.69 KB
dmesg.out | dmesg output from booting into 2.6.20-rc6 | text/plain | Paul Price | 2007-01-26 19:47 0000 | 16.92 KB
kernel_2_6_21_1.dmesg | dmesg output from booting into 2.6.21.1 | text/plain | Paul Price | 2007-05-16 20:07 0000 | 17.35 KB
dmesg.start | dmesg of kernel-2.6.21-gentoo booting | text/plain | Honza Macháček | 2007-05-16 20:58 0000 | 39.52 KB
config-2.6.21-gentoo | config-2.6.21-gentoo | text/plain | Honza Macháček | 2007-05-16 21:01 0000 | 50.60 KB
Description:   Opened: 2007-01-11 01:05 0000
Someone in my office just got a Dell Inspiron 1501 that sports an ATI AHCI SATA
chipset.  We quickly discovered that Ubuntu x86-64 6.06 & 6.10 absolutely
wouldn't report a block device for the SATA hard disk, even after loading ahci
(and all of the other SATA drivers) by hand.  The 'parted' livecd, on the other
hand, could see the disk after loading 'ahci' by hand.  It turns out that Gentoo
2006.1/amd64 also could not see the disk, even with all of the SATA drivers
loaded manually.  However, Gentoo 2006.1/i686 has no problem seeing the device.
I'm wondering if this is some sort of 64-bit portability issue.

Reproducible: Always

Steps to Reproduce:
1. Boot an amd64 build of kernel 2.6.15 - 2.6.19

------- Comment #1 From Joshua Hoblitt 2007-01-11 01:06:23 0000 -------
Created an attachment (id=106468) [details]
The output of lspci -vvv

------- Comment #2 From Paul Price 2007-01-13 01:04:53 0000 -------
I am the aforementioned luser.  I was able to successfully boot the Gentoo
amd64 LiveCD using the boot options "irqpoll pci=nomsi" as suggested on some
web pages (I have since completed the install and am writing from the
troublesome machine!).  Scanning Google results, I believe the "irqpoll" is
required because I had a USB mouse plugged in when trying to boot, while the
"pci=nomsi" option was needed to recognise the HD.

------- Comment #3 From Paul Price 2007-01-17 22:19:15 0000 -------
See http://forums.gentoo.org/viewtopic-p-3849437.html

------- Comment #4 From Daniel Drake 2007-01-25 22:25:54 0000 -------
Still reproducible on the latest testing kernel, currently 2.6.20-rc6?

------- Comment #5 From Paul Price 2007-01-25 23:59:29 0000 -------
Built and installed 2.6.20-rc6 (only my second ever kernel build, so please
bear with me if I'm doing something stupid).

Booting with no kernel options except doscsi produces:

ahci 0000:00:12.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:12.0: flags 64bit ncq ilck pm led clo pmp pio slum part
ata1: SATA max UDMA/133 cmd 0xFFFFC20000016180 ctl 0x0 bmdma 0x0 irq 1277
ata2: SATA max UDMA/133 cmd 0xFFFFC20000016180 ctl 0x0 bmdma 0x0 irq 1277
ata3: SATA max UDMA/133 cmd 0xFFFFC20000016200 ctl 0x0 bmdma 0x0 irq 1277
ata4: SATA max UDMA/133 cmd 0xFFFFC20000016200 ctl 0x0 bmdma 0x0 irq 1277
scsi : ahci
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[HANGS for a bit]
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[HANGS for a bit]
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[HANGS for a bit]
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
scsi1 : ahci
ata2: SATA link down (SStatus 0 SControl 300)
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)
[...]
!! Block device /dev/sda7 is not a valid root device...
!! The root block device is unspecified or not detected.


Using boot option "pci=nomsi", I was able to boot successfully (though I saw
some of the messages above flash by).

Option irqpoll is not required.
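
For anyone triaging similar boots, the failure signature in the log above can be picked out mechanically. The following is only a sketch in Python, not part of any kernel or Gentoo tooling. The function name `find_identify_failures` and the regular expression are assumptions made for illustration, and the pattern matches only the exact message wording quoted in this comment.

```python
import re

# Matches lines such as:
#   ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
# The wording is taken from the 2.6.20-rc6 log in this comment; other
# kernel versions may phrase the message differently.
IDENTIFY_FAILURE = re.compile(
    r"^(ata\d+\.\d+): failed to IDENTIFY \(I/O error, err_mask=(0x[0-9a-fA-F]+)\)"
)


def find_identify_failures(dmesg_text):
    """Return a dict mapping device name -> number of IDENTIFY failures."""
    counts = {}
    for line in dmesg_text.splitlines():
        match = IDENTIFY_FAILURE.match(line.strip())
        if match:
            device = match.group(1)
            counts[device] = counts.get(device, 0) + 1
    return counts


if __name__ == "__main__":
    sample = """\
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata2: SATA link down (SStatus 0 SControl 300)
"""
    print(find_identify_failures(sample))
```

A non-empty result on a boot without "pci=nomsi" and an empty one with it would be consistent with the behaviour described here, but it does not by itself prove that MSI is the cause.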

------- Comment #6 From Daniel Drake 2007-01-26 14:24:54 0000 -------
Please attach dmesg output from successful 2.6.20-rc6 boot with pci=nomsi

------- Comment #7 From Paul Price 2007-01-26 19:47:21 0000 -------
Created an attachment (id=108228) [details]
dmesg output from booting into 2.6.20-rc6

------- Comment #8 From Honza Macháček 2007-01-29 22:08:32 0000 -------
(In reply to comment #5)
> Using boot option "pci=nomsi", I was able to boot successfully

Thank you, pci=nomsi saved my day with VT8251 and 2.6.19-gentoo-r4.

The last kernel I was able to use before applying pci=nomsi was
2.6.17-gentoo-r6.

Some messages from /var/log/messages:

1) 2.6.17-gentoo-r6

libata version 1.20 loaded.
ahci 0000:00:0f.0: version 1.2
acpi_bus-0201 [01] bus_set_power         : Device is not power manageable
GSI 18 sharing vector 0xC9 and IRQ 18
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 21 (level, low) -> IRQ 201
ahci 0000:00:0f.0: AHCI 0001.0000 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:0f.0: flags: 64bit ncq pm led clo pmp pio slum part 
ata1: SATA max UDMA/133 cmd 0xFFFFC20000006D00 ctl 0x0 bmdma 0x0 irq 209
ata2: SATA max UDMA/133 cmd 0xFFFFC20000006D80 ctl 0x0 bmdma 0x0 irq 209
ata3: SATA max UDMA/133 cmd 0xFFFFC20000006E00 ctl 0x0 bmdma 0x0 irq 209
ata4: SATA max UDMA/133 cmd 0xFFFFC20000006E80 ctl 0x0 bmdma 0x0 irq 209
ata1: SATA link up 3.0 Gbps (SStatus 123)
ata1: dev 0 cfg 49:2f00 82:746b 83:7f01 84:4023 85:7469 86:3c01 87:4023 88:80ff
ata1: dev 0 ATA-7, max UDMA7, 488397168 sectors: LBA48
ata1: dev 0 configured for UDMA/133
scsi0 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123)
ata2: dev 0 cfg 49:2f00 82:346b 83:7d01 84:4023 85:3469 86:3c01 87:4023 88:407f
ata2: dev 0 ATA-7, max UDMA/133, 625142448 sectors: LBA48
ata2: dev 0 configured for UDMA/133
scsi1 : ahci
ata3: SATA link down (SStatus 0)
scsi2 : ahci
ata4: SATA link down (SStatus 0)
scsi3 : ahci

2) 2.6.19-gentoo-r4 without pci=nomsi

ahci 0000:00:0f.0: version 2.0
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 21 (level, low) -> IRQ 21
ahci 0000:00:0f.0: AHCI 0001.0000 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:0f.0: flags: 64bit ncq pm led clo pmp pio slum part 
ata1: SATA max UDMA/133 cmd 0xFFFFC20000006D00 ctl 0x0 bmdma 0x0 irq 318
ata2: SATA max UDMA/133 cmd 0xFFFFC20000006D80 ctl 0x0 bmdma 0x0 irq 318
ata3: SATA max UDMA/133 cmd 0xFFFFC20000006E00 ctl 0x0 bmdma 0x0 irq 318
ata4: SATA max UDMA/133 cmd 0xFFFFC20000006E80 ctl 0x0 bmdma 0x0 irq 318
scsi0 : ahci
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
APIC error on CPU0: 00(08)
APIC error on CPU1: 00(08)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: hardreset failed, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: hardreset failed, retrying in 5 secs
ata1: port is slow to respond, please be patient (Status 0x80)
ata1: port failed to respond (30 secs, Status 0x80)
ata1: COMRESET failed (device not ready)
ata1: reset failed, giving up
scsi1 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
APIC error on CPU1: 08(08)
APIC error on CPU0: 08(08)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x104)
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: hardreset failed, retrying in 5 secs
ata2: port is slow to respond, please be patient (Status 0x80)
ata2: port failed to respond (30 secs, Status 0x80)
ata2: COMRESET failed (device not ready)
ata2: reset failed, giving up
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)

3) 2.6.19-gentoo-r4 pci=nomsi

ahci 0000:00:0f.0: version 2.0
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 21 (level, low) -> IRQ 21
ahci 0000:00:0f.0: AHCI 0001.0000 32 slots 4 ports 3 Gbps 0xf impl SATA mode
ahci 0000:00:0f.0: flags: 64bit ncq pm led clo pmp pio slum part 
ata1: SATA max UDMA/133 cmd 0xFFFFC20000006D00 ctl 0x0 bmdma 0x0 irq 21
ata2: SATA max UDMA/133 cmd 0xFFFFC20000006D80 ctl 0x0 bmdma 0x0 irq 21
ata3: SATA max UDMA/133 cmd 0xFFFFC20000006E00 ctl 0x0 bmdma 0x0 irq 21
ata4: SATA max UDMA/133 cmd 0xFFFFC20000006E80 ctl 0x0 bmdma 0x0 irq 21
scsi0 : ahci
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7, max UDMA7, 488397168 sectors: LBA48 NCQ (depth 0/32)
ata1.00: ata1: dev 0 multi count 16
ata1.00: configured for UDMA/133
scsi1 : ahci
ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata2.00: ATA-7, max UDMA/133, 625142448 sectors: LBA48 NCQ (depth 0/32)
ata2.00: ata2: dev 0 multi count 16
ata2.00: configured for UDMA/133
scsi2 : ahci
ata3: SATA link down (SStatus 0 SControl 300)
scsi3 : ahci
ata4: SATA link down (SStatus 0 SControl 300)
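
One visible difference between logs 2) and 3) is the irq number printed on the "ataN:" lines: 318 without pci=nomsi, 21 (the ACPI-routed IRQ) with it. The sketch below, in Python, extracts both numbers so the comparison can be made on other logs. The function names and regular expressions are assumptions for illustration and only match the message formats quoted above. A mismatch is consistent with the port using an MSI vector rather than the routed line, but this is only a heuristic: the working 2.6.17 log in 1) also shows different numbers (201 vs 209).

```python
import re

# Patterns are modelled on the lines quoted in this comment; they are an
# assumption and will not match every kernel version's output.
ACPI_IRQ = re.compile(
    r"^ACPI: PCI Interrupt (\S+)\[[A-D]\] -> GSI \d+ \(.*?\) -> IRQ (\d+)"
)
ATA_IRQ = re.compile(
    r"^(ata\d+): SATA max \S+ cmd \S+ ctl \S+ bmdma \S+ irq (\d+)"
)


def irq_summary(log_text):
    """Return (set of ACPI-routed IRQs, dict of ata port -> reported irq)."""
    acpi_irqs = set()
    port_irqs = {}
    for line in log_text.splitlines():
        line = line.strip()
        match = ACPI_IRQ.match(line)
        if match:
            acpi_irqs.add(int(match.group(2)))
            continue
        match = ATA_IRQ.match(line)
        if match:
            port_irqs[match.group(1)] = int(match.group(2))
    return acpi_irqs, port_irqs


def ports_off_acpi_irq(log_text):
    """List ata ports whose irq is not one of the ACPI-routed IRQs."""
    acpi_irqs, port_irqs = irq_summary(log_text)
    return sorted(p for p, irq in port_irqs.items() if irq not in acpi_irqs)
```

Fed the excerpt from 2), `ports_off_acpi_irq` lists all four ports; fed the excerpt from 3), it returns an empty list.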

------- Comment #9 From Honza Macháček 2007-02-16 10:51:38 0000 -------
(In reply to comment #8)
> Thank you, pci=nomsi saved my day with VT8251 and 2.6.19-gentoo-r4.

  Well, I celebrated too early. What followed was quite catastrophic and
cryptic to me; it may or may not be connected.

  The mayhem started inconspicuously: the eth0 interface apparently ceased to
exist in mid-run without any intervention from me. Once I realised the
interface really was gone in some odd way and could not be brought up, I
rebooted the machine. Unfortunately I don't remember whether it started that
time or froze immediately during the first reboot. Either way it froze in the
end and then refused to boot at all because my root filesystem had gone away.

  I could not rescue my reiserfs 3.6 filesystem even with reiserfsck
--rebuild-sb; the new superblock was at least half nonsense (for example, the
hash function was not set), and a subsequent reiserfsck --rebuild-tree brought
the filesystem back to life with a lot of files missing or corrupted.

  I booted into the tried and tested 2.6.17-gentoo-r6 kernel and more or less
reinstalled the system. Maybe less, though I thought I had reinstalled all of
it, because after some time it froze again. Perhaps, having compiled the gentoo
2.6.20 kernel in the meantime, I should not have tried to boot into it, but I
cannot exclude the possibility that the new corruption of my filesystem
had already happened when it froze. Anyway, I am reinstalling my system again
and considering buying a new disk to try a new installation from scratch
(while keeping all my not yet lost data aside).

  If I rescue anything like a log and manage to find there anything looking
relevant, I'll report the details, of course -- whether here or in a bugreport
of its own.

------- Comment #10 From James 2007-05-01 03:11:30 0000 -------
Coupl'a questions for Joshua...

- You mention that i686 kernel works, while x86_64 kernel doesn't.  Are these
two kernels the same version?
- Turning off MSI on x86_64 seems to be helping some people with the issue you
described.  Is the problem reproducible on i686 with MSI interrupts *enabled*,
or does the i686 kernel work even with MSI enabled?

Finally, can you please test this again with the latest kernel (2.6.21) and see
if the problem persists?

------- Comment #11 From Honza Macháček 2007-05-14 19:02:33 0000 -------
(In reply to comment #10)
> Coupl'a questions for Joshua...
> 
> Finally, can you please test this again with the latest kernel (2.6.21) and see
> if the problem persists?
> 

I think the last question applies to me as well. It prompted me to compile the
newest gentoo kernel, and just now I have 2.6.21 running on my amd64 machine,
without pci=nomsi, and everything looks OK.

BTW, the problems I had after my last kernel upgrade attempt appear to have
been caused by a dying power supply.

Of course I'll report if I encounter any problems later. And if there are any
reports of persisting problems in 2.6.21, I'll send my config to compare.

------- Comment #12 From Paul Price 2007-05-16 20:06:56 0000 -------
Kernel 2.6.21.1 does NOT work without specifying "pci=nomsi".

I will attach the dmesg output from booting into 2.6.21.1 with "pci=nomsi".

Just in case I'm doing something stupid, let me explain what I did.
I downloaded and extracted the 2.6.21.1 sources into /usr/src/linux-2.6.21.1/
Then I changed /usr/src/linux to point to that directory, and ran "genkernel
--oldconfig all".
* Kernel compiled successfully!

Then the entry in my grub.conf is:

title=Gentoo Linux experimental
root (hd0,4)
kernel /kernel-genkernel-x86_64-2.6.21.1 root=/dev/ram0 init=/linuxrc ramdisk=8192 real_root=/dev/sda7
initrd /initramfs-genkernel-x86_64-2.6.21.1

In order to boot successfully, I had to add "pci=nomsi" to the kernel
parameters.
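
For reference, the workaround amounts to appending one token to the "kernel" line of the entry. The following Python sketch shows that edit on grub.conf text held in a string. It is an illustration only: the helper name `add_kernel_param` is hypothetical, it assumes the GRUB legacy layout used in this comment (one "kernel ..." line per entry), and it does not read or write any real file.

```python
def add_kernel_param(grub_conf_text, param="pci=nomsi"):
    """Append `param` to every 'kernel' line that does not already have it.

    Assumes GRUB legacy syntax, where the kernel command line lives on a
    line starting with the word 'kernel'. Other lines are left untouched.
    """
    out = []
    for line in grub_conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("kernel ") and param not in stripped.split():
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    entry = (
        "title=Gentoo Linux experimental\n"
        "root (hd0,4)\n"
        "kernel /kernel-genkernel-x86_64-2.6.21.1 root=/dev/ram0 "
        "init=/linuxrc ramdisk=8192 real_root=/dev/sda7\n"
        "initrd /initramfs-genkernel-x86_64-2.6.21.1\n"
    )
    print(add_kernel_param(entry), end="")
```

Running the helper a second time on its own output changes nothing, so it is safe to apply repeatedly.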

------- Comment #13 From Paul Price 2007-05-16 20:07:45 0000 -------
Created an attachment (id=119469) [details]
dmesg output from booting into 2.6.21.1

------- Comment #14 From Paul Price 2007-05-16 20:14:29 0000 -------
Regarding James' questions about i686, I believe Josh and I were able to
successfully boot my amd64 laptop using the 2006.1 i686 Gentoo CD.  I don't
think we paid any attention to MSI then; iirc it worked "out of the box".  I've
since thrown the i686 disc away.  Let me know if you think it would be worth
investigating, and I will try it again, paying attention to MSI.

------- Comment #15 From Honza Macháček 2007-05-16 20:50:25 0000 -------
(In reply to comment #12)
> Kernel 2.6.21.1 does NOT work without specifying "pci=nomsi".

Do I understand correctly that it means vanilla sources?

I should have said clearly that what I use are sys-kernel/gentoo-sources, not
sys-kernel/vanilla-sources, let alone any manually downloaded sources.
After emerging sys-kernel/gentoo-sources-2.6.21 I compiled them manually (as a
friend says, ``make menuconfig, not war'').

Perhaps our experience combined means gentoo sources have dealt with the
AHCI/MSI problem, while vanilla sources have not yet.

I'll attach my config and an excerpt of /var/log/messages describing the boot
of my new kernel.

------- Comment #16 From Honza Macháček 2007-05-16 20:58:52 0000 -------
Created an attachment (id=119477) [details]
dmesg of kernel-2.6.21-gentoo booting

A faked dmesg, actually an excerpt of /var/log/messages related to the last
boot. Ellipses denote omitted runs of APIC error messages (I should get rid of
them, but for now I don't know how and hope they're mostly harmless). The third
and fourth lines show I've actually used 2.6.21-gentoo and did not request
pci=nomsi.

------- Comment #17 From Honza Macháček 2007-05-16 21:01:56 0000 -------
Created an attachment (id=119479) [details]
config-2.6.21-gentoo

My 2.6.21-gentoo kernel config. Don't laugh at my choices too much, please.

------- Comment #18 From Paul Price 2007-05-16 23:04:33 0000 -------
You're right that I was using the kernel sources directly from kernel.org (I
didn't even get them from vanilla-sources).
I just grabbed gentoo-sources-2.6.21, built and installed it, and didn't have
any more success than with the kernel.org sources, as far as I could see.  I'd
be happy to post the dmesg output if desired.

------- Comment #19 From Daniel Drake 2007-05-17 02:16:19 0000 -------
I think we have a fix for this. Please apply the following patch to 2.6.21 and
try booting without pci=nomsi

http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git;a=blob_plain;f=queue-2.6.21/pci-quirks-disable-msi-on-rs400-200-and-rs480.patch;hb=05ab505f2909acf3a614d3e6a32271c4c1f8a69d

Don't worry about vanilla vs gentoo. gentoo-sources doesn't include any fixes
which might affect this particular issue, yet.

------- Comment #20 From Paul Price 2007-05-17 02:55:02 0000 -------
That kernel patch did the trick:

price@neverland:/home/price>dmesg
Linux version 2.6.21.1 (root@neverland) (gcc version 4.1.1 (Gentoo 4.1.1-r3)) #2 SMP Wed May 16 16:29:32 HST 2007
Command line: root=/dev/ram0 init=/linuxrc ramdisk=8192 real_root=/dev/sda7
...
PCI: MSI quirk detected. MSI deactivated.
...


Thanks!
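
The two facts that matter in the dmesg excerpt above are that the quirk message is present and that "pci=nomsi" is absent from the command line. A minimal Python sketch of that check follows. The function name `msi_quirk_status` is hypothetical, and the strings it looks for are taken verbatim from this comment, so kernels that word the message differently will not be recognised.

```python
def msi_quirk_status(dmesg_text):
    """Report whether the MSI quirk message and pci=nomsi appear in dmesg.

    Looks for the 'Command line:' line and the 'MSI quirk detected'
    message exactly as quoted in this bug; this is an assumption about
    the log format, not a general-purpose parser.
    """
    quirk_seen = False
    nomsi_on_cmdline = False
    for line in dmesg_text.splitlines():
        line = line.strip()
        if line.startswith("Command line:"):
            params = line[len("Command line:"):].split()
            # The pci= option takes a comma-separated list of sub-options.
            nomsi_on_cmdline = any(
                p.startswith("pci=") and "nomsi" in p[len("pci="):].split(",")
                for p in params
            )
        elif "MSI quirk detected" in line:
            quirk_seen = True
    return {"quirk_applied": quirk_seen, "nomsi_on_cmdline": nomsi_on_cmdline}


if __name__ == "__main__":
    sample = (
        "Command line: root=/dev/ram0 init=/linuxrc ramdisk=8192 "
        "real_root=/dev/sda7\n"
        "PCI: MSI quirk detected. MSI deactivated.\n"
    )
    print(msi_quirk_status(sample))
```

A result with the quirk applied and no pci=nomsi on the command line corresponds to the patched boot reported in this comment.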

------- Comment #21 From Daniel Drake 2007-05-19 05:31:38 0000 -------
Fixed in gentoo-sources-2.6.21-r1 (genpatches-2.6.21-2)
