Bug 148485 - sys-apps/hdparm + baselayout parallel startup = IO errors
Summary: sys-apps/hdparm + baselayout parallel startup = IO errors
Status: RESOLVED NEEDINFO
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All Linux
Importance: High normal
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-09-21 03:40 UTC by Alon Bar-Lev (RETIRED)
Modified: 2007-10-06 21:05 UTC

See Also:
Package list:
Runtime testing required: ---


Description Alon Bar-Lev (RETIRED) gentoo-dev 2006-09-21 03:40:42 UTC
Hello,

Since hdparm modifies the disk parameters, processes that try to access the device at the same time fail.

So hdparm must be run in non-parallel mode... I don't know if that is possible... But perhaps it could be moved to an earlier stage of startup, since all it requires is the disk devices, and they should be available at an early stage... Perhaps even before localmount?

hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: dma_intr: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: CHECK for good STATUS
hda: cache flushes supported
ide1: Speed warnings UDMA 3/4/5 is not functional.
Comment 1 SpanKY gentoo-dev 2006-09-21 07:06:52 UTC
not a bug in hdparm or baselayout; the kernel should be hiding these details
Comment 2 Alon Bar-Lev (RETIRED) gentoo-dev 2006-09-21 07:14:09 UTC
I don't think that the kernel should hide this.
hdparm should be run before the device is mounted; then you have no problems. If you run it while root is still mounted read-only, there is a good chance that all will be OK.
Comment 3 Daniel Drake (RETIRED) gentoo-dev 2006-09-23 12:08:21 UTC
This is a kernel issue. hdparm should not be able to break things, no matter whether the disk partitions are mounted or not.
Comment 4 Alon Bar-Lev (RETIRED) gentoo-dev 2006-09-23 13:05:21 UTC
Well... I don't agree. Modifying DMA is as basic as powering off the backend device while the drive is mounted.
For example, USB storage: you power off the bus while a filesystem is mounted. How do you expect the filesystem to behave?

I think that the hdparm setting should be executed before filesystems are mounted, in single user mode, so that the problem will not occur.
Comment 5 Bas Nedermeijer 2007-03-06 13:34:26 UTC
I have a similar problem. My HighPoint controller isn't able to set the UDMA mode correctly, so I am using hdparm to correct this. But when my partition has been uncleanly unmounted, a disk check occurs. Because of the wrong UDMA mode it freezes and corrupts the partition even further, and I need to use a live CD to fix it first.

So I have now manually edited the 'checkfs' and 'checkroot' files to start hdparm first.
Comment 6 Steffen Bergner 2007-05-24 13:30:05 UTC
(In reply to comment #5)
> So i have now manually edited the 'checkfs' and 'checkroot' files to start
> hdparm first.
> 

I changed "/sbin/rc" (line 169 on the latest ~x86 version), adding hdparm as the first entry, from:
---
get_critical_services() {
...
        else
                CRITICAL_SERVICES="checkroot modules checkfs localmount clock bootmisc"
        fi

to:
---

get_critical_services() {
...
        else
                CRITICAL_SERVICES="hdparm checkroot modules checkfs localmount clock bootmisc"
        fi

Now hdparm runs before the writable remount of the root filesystem and no longer runs into the "DMA disabled, IDE bus reset" errors (or something similar).

Hope it helps.
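
The edit above amounts to putting hdparm at the front of the critical service list. A minimal sketch of that idea in plain shell, assuming a hypothetical helper named prepend_hdparm (it is not part of baselayout or /sbin/rc) that leaves the list untouched when hdparm is already in it:

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: put "hdparm" at the front of a
# space-separated service list unless it is already listed.
prepend_hdparm() {
    case " $1 " in
        *" hdparm "*) printf '%s\n' "$1" ;;         # already present, keep as-is
        *)            printf '%s\n' "hdparm $1" ;;  # otherwise make it the first entry
    esac
}

# The default list quoted from /sbin/rc above.
CRITICAL_SERVICES="checkroot modules checkfs localmount clock bootmisc"
CRITICAL_SERVICES=$(prepend_hdparm "$CRITICAL_SERVICES")
echo "$CRITICAL_SERVICES"
```

Whether a change like this survives a baselayout update is a separate question; a hand-edited /sbin/rc would presumably be overwritten on upgrade.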
Comment 7 Carlos Silva (RETIRED) gentoo-dev 2007-09-03 18:27:06 UTC
ok, can anyone test this on the latest stable kernel please?
Comment 8 Bas Nedermeijer 2007-09-04 01:33:38 UTC
(In reply to comment #7)
> ok, can anyone test this on the latest stable kernel please?
> 

What has changed? I am one of the reporters of this bug, and the IO corruption has already cost me my entire partition once.
And the suggested fix mentioned before works for me.

But maybe there is a way to test it without forcing an fsck? BTW, I am now running 2.6.22-gentoo-r1
Comment 9 Steffen Bergner 2007-09-06 12:14:55 UTC
(In reply to comment #7)
> ok, can anyone test this on the latest stable kernel please?
> 

running: 2.6.22-gentoo-r2

I still have to use the modified "/sbin/rc" to enable DMA mode before any write access to the IDE hard disk.

Failing, with hdparm started in the normal runlevel startup process:
EXT3 FS on hda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 506512k swap on /dev/hda2.  Priority:-1 extents:1 across:506512k
hda: selected mode 0x45
hda: dma_intr: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
hda: set_drive_speed_status: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
ide0: reset: success
hda: CHECK for good STATUS
hda: cache flushes not supported
e100: eth0: e100_watchdog: link up, 100Mbps, half-duplex


Running perfectly, with hdparm as a critical service before any partition is remounted:
IA-32 Microcode Update Driver: v1.14a <tigran@aivazian.fsnet.co.uk>
hda: selected mode 0x45
hda: cache flushes not supported
kjournald starting.  Commit interval 5 seconds
EXT3 FS on hda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
Adding 506512k swap on /dev/hda2.  Priority:-1 extents:1 across:506512k
e100: eth0: e100_watchdog: link up, 100Mbps, half-duplex
NET: Registered protocol family 17

my /etc/conf.d/hdparm contains:
hda_args="-f -c3 -D1 -M254 -d1 -u1 -W1 -A1 -a248 -m16 -k1 -S60 -X69"

harddisk: hda: ST340810A

ide controller: 00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 Controller (rev 02)
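
For reference, the -X69 in hda_args and the "selected mode 0x45" in the kernel log above appear to be the same number: hdparm's -X takes 64+n for UltraDMA mode n, 32+n for multiword DMA mode n and 8+n for PIO mode n, and 69 decimal is 0x45, which would be UDMA5. A small sketch of that arithmetic, where decode_xfer_mode is a hypothetical helper and not an hdparm feature:

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: map a numeric -X value to a
# readable transfer mode name (64+n UDMA, 32+n multiword DMA, 8+n PIO).
decode_xfer_mode() {
    mode=$1
    if [ "$mode" -ge 64 ] && [ "$mode" -le 71 ]; then
        echo "UDMA$((mode - 64))"
    elif [ "$mode" -ge 32 ] && [ "$mode" -le 34 ]; then
        echo "MWDMA$((mode - 32))"
    elif [ "$mode" -ge 8 ] && [ "$mode" -le 12 ]; then
        echo "PIO$((mode - 8))"
    else
        echo "other"
    fi
}

decode_xfer_mode 69      # the -X69 from hda_args
printf '0x%x\n' 69       # the same value in hex, as the kernel log prints it
```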
Comment 10 Steffen Bergner 2007-09-07 07:59:57 UTC
(In reply to comment #9)
> (In reply to comment #7)
> > ok, can anyone test this on the latest stable kernel please?
> > 
> 
> running: 2.6.22-gentoo-r2
> 

running: 2.6.22-gentoo-r6

Same as above! hdparm still cannot be used to switch DMA on for disks that are in use for writing.

hdparm v7.7
Comment 11 Maarten Bressers (RETIRED) gentoo-dev 2007-09-22 00:56:00 UTC
We should really run some tests to see if this problem happens only during the boot sequence, or if it's a more generic issue.

Bas, I understand you're not eager to do so after losing data. Alon / Steffen, is either one of you willing to do this? Something like this would be involved:

Using the latest development kernel (2.6.23-rc7 as of this writing), make sure you use the new libata driver, not the old IDE one (CONFIG_ATA=y, CONFIG_IDE not set). Mount a partition without turning on DMA on the disk, run e.g. updatedb in the background (anything that generates disk IO), then turn on DMA.

We would then like the complete dmesg output, the kernel .config, and some info on the hardware used (harddrive, controller, motherboard).

Note that we can't guarantee there will be no data loss, so only do this if you're comfortable with it (or have backups).
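
The test sequence requested above could be sketched roughly as follows. This is only an outline under assumptions: the device names, the mount point and the output file name are placeholders, run, run_bg and dma_test are hypothetical helpers, and whether hdparm -d1 is accepted under libata (or a different option is needed) would have to be checked on the machine in question. The script only prints the commands unless DRY_RUN=0 is set explicitly.

```shell
#!/bin/sh
# Sketch of the requested test sequence. Dry run by default: commands are
# only printed unless DRY_RUN=0 is set in the environment.
DRY_RUN=${DRY_RUN:-1}

# Run a command in the foreground, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run a command in the background, or just print it in dry-run mode.
run_bg() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $* &"
    else
        "$@" &
    fi
}

# dma_test DISK PARTITION MOUNTPOINT  (all three are placeholders)
dma_test() {
    disk=$1 part=$2 mnt=$3
    run mount "$part" "$mnt"        # mount while DMA has not been switched on
    run_bg updatedb                 # any source of disk IO will do
    run hdparm -d1 "$disk"          # turn DMA on while the IO is running
    [ "$DRY_RUN" = 1 ] || wait      # let the background IO finish
    run sh -c 'dmesg > dmesg-dma-test.txt'   # capture the kernel log
}

dma_test /dev/sda /dev/sda1 /mnt/test
```

The kernel .config and the hardware details would still have to be collected by hand.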
Comment 12 Daniel Drake (RETIRED) gentoo-dev 2007-09-22 12:17:07 UTC
I suggest you actually do those 2 steps separately. First see if you can reproduce the original problem on the libata pata drivers at all...
Comment 13 András 2007-09-22 12:26:14 UTC
I use libata ata_piix since ck-sources-2.6.22_p1 and haven't seen this message any more.
Comment 14 Alon Bar-Lev (RETIRED) gentoo-dev 2007-09-22 16:58:21 UTC
But libata is still experimental...
Comment 15 Maarten Bressers (RETIRED) gentoo-dev 2007-09-22 17:28:32 UTC
Yes, it's still marked experimental for PATA drives; however, these are the new drivers for both PATA and SATA. Most of the new development in this area goes into libata, whereas the old IDE subsystem will eventually become deprecated.
That being said, if you're not comfortable using them, you don't have to, of course. We do however still need some test data...
Comment 16 Maarten Bressers (RETIRED) gentoo-dev 2007-10-06 21:05:40 UTC
If/when anybody can provide some test data as requested, please reopen.