Bug 79629 - gentoo-dev-sources 2.6.10-x fail to boot
Summary: gentoo-dev-sources 2.6.10-x fail to boot
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: AMD64 Linux
Importance: High critical
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
 
Reported: 2005-01-26 12:54 UTC by Brian O'Reilly
Modified: 2005-03-16 10:07 UTC

Attachments
requested kernel config (.config,32.92 KB, text/plain)
2005-02-01 11:56 UTC, Brian O'Reilly

Description Brian O'Reilly 2005-01-26 12:54:52 UTC
Since 2.6.9-gentoo-r12, the gentoo-dev-sources kernel source package fails to boot
my Shuttle SN85G4 box, which has a Silicon Image SATA controller and a Seagate SATA disk.

The kernel generated from the 2.6.10 stream of dev sources starts to initialise the disk bus and then hangs. The last message posted to the console is:
"applying Seagate errata fix"

I have posted this bug upstream, as directed in the header of $linux/drivers/scsi/sata_sil.c to both Jeff Garzik -> jgarzik@pobox.com and to linux-ide@vger.kernel.org as it seems freestanding relative to any gentoo patches, but I haven't actually heard anything back. Perhaps the gentoo kernel package dudes can find out something? I'd call this a showstopper, at least for me. =)

Reproducible: Always
Steps to Reproduce:
1. build gentoo-dev-sources on amd64 shuttle SN85G4
2. reboot
3. system is wedged initialising disk bus

Actual Results:  
System is wedged when rebooted

Expected Results:  
kernel should have booted, and initialised all peripherals.
Comment 1 Daniel Drake (RETIRED) gentoo-dev 2005-01-28 03:56:45 UTC
Please test 2.6.11_rc2
Comment 2 Brian O'Reilly 2005-01-28 15:54:44 UTC
gladly... however, there appears to be no ebuild for 2.6.11_rc2 available on my system... where do I get it?

-B
Comment 3 Daniel Drake (RETIRED) gentoo-dev 2005-01-29 09:41:22 UTC
development-sources-2.6.11_rc2
Comment 4 Brian O'Reilly 2005-01-30 12:54:31 UTC
I've compiled and installed the indicated source tree, and it still fails in exactly the same way, in exactly the same place.

-B
Comment 5 Daniel Drake (RETIRED) gentoo-dev 2005-01-30 13:10:21 UTC
When you say "since 2.6.9-gentoo-r12" do you mean that it worked in 2.6.9-r11 ?

Can you please post the output of "hdparm -i /dev/hda" where hda is the disk in question
Comment 6 Brian O'Reilly 2005-01-31 07:33:27 UTC
I worded my initial report obliquely.. I meant that 2.6.9-gentoo-r12 was the last kernel to boot this machine.

It's a SATA disk/controller, so the disk is modeled on the scsi chain:

infiltrator fade # hdparm -i /dev/scsi/host0/bus0/target0/lun0/disc

/dev/scsi/host0/bus0/target0/lun0/disc:
 HDIO_GET_IDENTITY failed: Operation not supported
Comment 7 Daniel Drake (RETIRED) gentoo-dev 2005-02-01 11:36:30 UTC
Please attach your kernel .config for 2.6.11-rc2
Comment 8 Brian O'Reilly 2005-02-01 11:56:33 UTC
Created attachment 50161
requested kernel config

This is the requested configuration for the 2.6.11-rc2 kernel, which also hangs
at "applying Seagate errata fix". This config is almost identical to the
2.6.9-gentoo-r12 config, which works.
Comment 9 Brian O'Reilly 2005-02-08 08:54:12 UTC
just an update from my side of the bug:

linux-2.6.11-rc3 fails to compile at all with the config sent earlier.
linux-2.6.10-gentoo-r7 builds, but still halts at "applying Seagate errata fix".

... I think the changes in the sata_sil.c file apply the fix to disks it is not applicable to. I'd appreciate any updates you guys could give regarding the current status of this bug. =)

-Brian
Comment 10 Daniel Drake (RETIRED) gentoo-dev 2005-02-09 07:58:25 UTC
We really need to know if 2.6.11 helps. Please start from a blank .config (you should always do this when changing kernel versions) and see if it compiles. If it doesn't, post the compile error.
Comment 11 Brian O'Reilly 2005-02-14 11:34:41 UTC
I have now tried 2.6.11-rc[1-4], and while rc4 compiled relatively cleanly, it still gets wedged at the point where the Silicon Image SATA controller is supposed to initialise the hard disk. The last message printed to screen is:
 "applying Seagate errata fix"

from drivers/scsi/sata_sil.c line 314 in the following clause:

        /* limit requests to 15 sectors */
        if (quirks & SIL_QUIRK_MOD15WRITE) {
                printk(KERN_INFO "ata%u(%u): applying Seagate errata fix\n",
                       ap->id, dev->devno);
                ap->host->max_sectors = 15;
                ap->host->hostt->max_sectors = 15;
                dev->flags |= ATA_DFLAG_LOCK_SECTORS;
                return;
        }

There were changes made to this area of the code as the source tree moved from 2.6.9-gentoo-r12 to the 2.6.10 dev sources and now the 2.6.11 sources I have been using at your direction... I believe that this errata fix is being applied to disks that do not need it. Let me know if I can provide you with any more information.
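
For readers following the thread, here is a minimal sketch of how a driver could decide whether to set a flag like the one tested in the clause quoted above. It assumes a lookup table keyed by the drive's model string. The table contents, the names `quirk_table`, `lookup_quirks` and `max_sectors_for`, and the exact-match rule are all hypothetical choices for illustration and are not copied from sata_sil.c. The only model listed is the one reported later in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag for this sketch; it only echoes the name in the quoted clause. */
#define QUIRK_MOD15WRITE (1u << 0)

/* Illustrative table entry: a model string and the quirk flags it triggers. */
struct drive_quirk {
    const char *model;
    unsigned int quirks;
};

/* Assumed contents, for illustration only; not the driver's real list. */
static const struct drive_quirk quirk_table[] = {
    { "ST3160023AS", QUIRK_MOD15WRITE }, /* model named in this report */
    { NULL, 0 }                          /* sentinel ends the table */
};

/* Return the quirk flags for a model string, or 0 if the model is not listed. */
unsigned int lookup_quirks(const char *model)
{
    const struct drive_quirk *q;

    assert(model != NULL);
    for (q = quirk_table; q->model != NULL; q++) {
        if (strcmp(q->model, model) == 0)
            return q->quirks;
    }
    return 0;
}

/* Same shape as the quoted clause: cap requests at 15 sectors when the quirk is set. */
unsigned int max_sectors_for(const char *model, unsigned int default_max)
{
    if (lookup_quirks(model) & QUIRK_MOD15WRITE)
        return 15;
    return default_max;
}
```

Under this assumption, removing a model's entry from the table means the cap is never applied to that disk, which is the kind of experiment a reporter could use to test whether the errata fix itself is responsible for a hang.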
Comment 12 Daniel Drake (RETIRED) gentoo-dev 2005-02-14 12:14:40 UTC
Just because it's the last message you see doesn't necessarily mean it's causing the lockup. But it's worth trying anyway; please post the output of "hdparm -i /dev/hda" (or whatever your drive is).
Comment 13 Brian O'Reilly 2005-02-14 15:54:24 UTC
er, I posted the output of that command earlier in this bugreport, but
here it is again:

infiltrator ~ # hdparm -i /dev/scsi/host0/bus0/target0/lun0/disc

/dev/scsi/host0/bus0/target0/lun0/disc:
 HDIO_GET_IDENTITY failed: Operation not supported

would you like to see the init order of the machine?
Comment 14 Daniel Drake (RETIRED) gentoo-dev 2005-03-16 06:18:49 UTC
Sorry, didn't notice that. Could you please see if 2.6.11 boots ok, and name the model of your hard disk (will start with ST....)
Comment 15 Brian O'Reilly 2005-03-16 10:07:28 UTC
I was still having trouble up until about ten seconds ago, but for completeness I'll tell the story here. :)

2.6.9-gentoo-r12 was the last kernel I could make boot. I've been following
the development kernels as per the policy on the amd64 wiki. It appeared that
the sata_sil.c driver was erroneously applying a fix that didn't apply to my
disk (Seagate ST3160023AS), so last week, just before 2.6.11.2 came out,
I commented my device out of the errata list in the driver source, and when
the machine failed to come up initialising the ATA bus, I resolved to post
that information back to the bug report... then I got very busy and forgot 
about it until your email today.

I was playing around with a freshly built 2.6.11.3 kernel, and I noticed that
I was still passing 'acpi=off' at boot time from grub. This was to sidestep
a problem I was having before with sound stuttering. I removed the
parameter, and the kernel booted. I've been running on it now for the past
ten minutes or so. Even sound seems fine. This can probably be closed, although
I'd love to know what changed after 2.6.9-gentoo-r12 that caused the break. ;)

-B