I have compiled the gentoo-dev-sources kernel 2.6.4-gentoo-r1 (as well as previous 2.6 kernels, with the same results) with ext2, ext3, and reiserfs built into the kernel, and yet on boot the system refuses to mount the two partitions on the slave drive (both ext3). I get the messages "/dev/hdb1 already mounted or /gamesvr busy" and "/dev/hdb2 already mounted or /other busy" even though neither partition is mounted. The disks fsck clean, there doesn't appear to be anything wrong with the hardware, and when I boot a 2.4 kernel the drives mount with no problems. No other computer I run with 2.6.4 has this problem, so it appears to be related to this particular hardware.

Reproducible: Always

Steps to Reproduce:
1. emerge gentoo-dev-sources
2. compile kernel
3. reboot

Actual Results:
vogon sbin # mount
/dev/hda3 on / type reiserfs (rw,noatime)
none on /tmp/.initrd/dev type devfs (rw)
/dev/hda3 on / type reiserfs (rw,noatime)
none on /proc type proc (rw)
none on /sys type sysfs (rw)
none on /dev type devfs (rw)
none on /dev/pts type devpts (rw)
/dev/md0 on /music type reiserfs (rw)
/dev/md1 on /raidstk type ext3 (rw)
none on /dev/shm type tmpfs (rw)
none on /proc/bus/usb type usbfs (rw)
vogon sbin # mount -a
mount: /dev/hdb1 already mounted or /gamesvr busy
mount: /dev/hdb2 already mounted or /other busy

Expected Results:
Both partitions mounted with no problems.

I couldn't find anything here about this particular problem while searching for "mount busy", and found very little on the internet about it either. I would have figured a disk that isn't mounted couldn't produce an "already mounted" error.

vogon etc # vi fstab
/dev/hda1            /boot        reiserfs   notail,noauto,noatime   1 1
/dev/hda3            /            reiserfs   noatime                 0 0
/dev/hda2            none         swap       sw                      0 0
/dev/cdroms/cdrom0   /mnt/cdrom   iso9660    noauto,ro               0 0
/dev/cdroms/cdrom1   /mnt/cdrw    iso9660    noauto                  0 0
/dev/hdb1            /gamesvr     ext3       defaults                0 0
/dev/hdb2            /other       ext3       defaults                0 0
/dev/md0             /music       reiserfs   defaults                0 0
/dev/md1             /raidstk     ext3       defaults                0 0

I am running NFS and am exporting one of these partitions, if that helps. Kernel config attached.
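For anyone trying to reproduce this, the generic checks for a "busy" mount point would look something like the following (a sketch of commands, not output captured from this machine):

  vogon sbin # cat /proc/mounts            # the kernel's own list of what is actually mounted
  vogon sbin # fuser -vm /gamesvr /other   # is anything in userspace holding the mount points open?
  vogon sbin # mount -v /dev/hdb1 /gamesvr # retry one of the mounts verbosely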
Created attachment 28292 [details]
Kernel config file for the kernel that is preventing the mounts.
Created attachment 28394 [details]
dmesg output on the affected machine

It does not appear to be a filesystem-format problem, as both partitions have now been formatted ext2, ext3, and reiserfs, and the problem still isn't going away. Any guess as to whether this is an ACPI-related issue? Maybe I should disable ACPI and see whether I still can't touch those partitions. However, ACPI was working fine on the old 2.4 kernels.
Could you try 2.6.5? Have you had any success with disabling ACPI?
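If it helps, disabling ACPI for a single test boot can be done from the boot loader; a rough sketch for a GRUB legacy setup (the kernel image name is a placeholder, root=/dev/hda3 is taken from the fstab above):

  # /boot/grub/grub.conf -- extra test entry with ACPI disabled
  title Gentoo 2.6.x (acpi=off test)
  root (hd0,0)
  kernel /kernel-2.6.5-gentoo root=/dev/hda3 acpi=off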
No dice on either removing ACPI or using 2.6.5. I've already backed up everything on /dev/hdb, so I'll probably just pull the drive and replace it with another to see whether the problem persists or goes away (fingers crossed). If the replacement fixes it, I'll conclude the original drive was bad (even though it fscked clean every time); I've read a few German sites (which I'm not terribly proficient at reading, but neither is Babelfish) where people were having the same problem and it appeared to be related to a bad disk (or at least a disk the kernel thought was bad). If it doesn't, I'll be back at square one for a while, trying to work out where the 2.4 and 2.6 drivers differ in a way that could cause this unique problem. I'll let you know the outcome either way.
I installed a brand new (and known good) hard drive in the system after removing the old one. The new drive was fdisk'd, the machine rebooted, and the partition formatted with reiserfs. Same exact problem (mount says the drive is busy or already mounted), so it isn't a hard drive problem. Kernel 2.4 mounts the new drive fine, but 2.6 still produces the same errors. I am beginning to believe the problem is in the Intel PIIX/ICH chipset support, or in some option I am turning on in the config; I believe the machine uses the ICH4 chipset. I am really surprised that I am the only one seeing this problem, but I suspect that as more folks move to 2.6 we'll see it a lot more often. My next step is to move one of the two CD-ROMs to the slave slot and see whether I can mount it properly (see the quick test sketched below). If I can, then it is still something to do with the drives themselves; otherwise it is something to do with the drivers or the config. I know this is the price of the bleeding edge, and any sane individual would just drop back to the 2.4 kernel, but I am willing to be the guinea pig to figure out why this isn't working and how to fix it (or at least which chipsets aren't going to work).
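The quick test I have in mind is nothing more than remounting the relocated CD-ROM through the existing fstab entry, roughly (the exact devfs node depends on where the drive ends up):

  vogon sbin # mount /mnt/cdrom    # uses the /dev/cdroms/cdrom0 entry from fstab
  vogon sbin # ls /mnt/cdrom       # confirm the disc actually reads
  vogon sbin # umount /mnt/cdrom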
Are you using an initial ramdisk? If so, could you try to boot without it? Does the problem still exist if you upgrade to a 2.6.5 kernel?
I believe I tried it without an initial ramdisk, but yes, I am currently using one. I am now on 2.6.5, after removing ACPI/APIC. I will try again without the initial ramdisk on 2.6.5 and see if the problem goes away. For the record, 2.4.22 used an initial ramdisk and worked fine; still, if removing the initrd on 2.6.5 works, that will be great too.
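Dropping the initrd for that test should just be a matter of taking the initrd line out of the boot entry; roughly, for GRUB legacy (the kernel image name is a placeholder, not my actual file name):

  # /boot/grub/grub.conf -- test entry booting 2.6.5 without an initrd
  title Gentoo 2.6.5 (no initrd)
  root (hd0,0)
  kernel /kernel-2.6.5-gentoo root=/dev/hda3
  # initrd line removed for this test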
I removed the initrd from 2.6.5, recompiled, and rebooted. The hard drive was still inaccessible, so it isn't the initrd. Then I removed everything I could think of from the config except the ICH4/PIIX driver and whatever was absolutely necessary to boot, and rebooted. The RAID drivers weren't built, so none of the RAID disks were mounted, and yet the second drive still wouldn't mount. I then moved the slave drive to the master slot on the second IDE channel, moved the CD-ROMs to the slave slots, and rebooted. Amazingly, the disk still didn't mount, even though it was in a master slot (yes, I moved the jumpers). Then I went through about five known-good drives, placed each of them in the computer, formatted it ext2, and tried mounting it; no dice. This is the weirdest problem I've ever seen, and it looks like the issue is purely an incompatibility between my hardware and the ICH4 driver in 2.6, even though 2.4 works fine. I'm not sure I'll be able to leave this machine in its current state (the drive contained my game servers) and will need to put it back online before my users revolt. (This isn't one of my work machines...)
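Two quick sanity checks I can still run to confirm the 2.6 IDE layer itself is happy with the drive (generic commands, assuming the disk currently shows up as hdb):

  vogon sbin # dmesg | grep -i hdb   # is the drive detected and driven by the PIIX/ICH driver?
  vogon sbin # hdparm -i /dev/hdb    # can the identify data be read through the 2.6 IDE layer?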
Thanks to the help of Markus Hästbacka (a Debian user who was seeing the same problem I was), the cause has been found, and he told me how to get rid of it. It is still really an open bug, just not one that you guys are going to be able to solve. The error appears to be in the lvm2 driver (Device Mapper support in the kernel), which somehow prevents all non-RAID drives except the root drive and the CD-ROMs from mounting on certain chipsets. Removing Device Mapper support fixed the problem and I am now golden (yippee!). I'm not sure why it wasn't fixed yesterday when I turned everything off; I may have broken something else at the same time. I had originally turned on Device Mapper support because I am using RAID, but apparently that was not necessary, as RAID works perfectly fine without it. Hopefully anyone else seeing this error will be able to find this bug report and use the information here to get out of that jam. If you want to close out this bug ticket, I believe it is "fixed" with this workaround.
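For anyone else hitting this, the relevant corner of the 2.6 .config ends up looking roughly like this after the workaround (a sketch based on the description above, not a copy of my attached config; the RAID1 line is a guess about which personality the arrays use):

  # Device Drivers -> Multi-device support (RAID and LVM)
  CONFIG_MD=y
  CONFIG_BLK_DEV_MD=y              # software RAID (md) stays on for /dev/md0 and /dev/md1
  CONFIG_MD_RAID1=y                # guess at the RAID personality in use
  # CONFIG_BLK_DEV_DM is not set   # Device Mapper (the lvm2 driver) turned off: the workaround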
Spoke too soon, I'm afraid. I can mount the slave disk now, but 2.6 seems to need lvm2 in order to access the RAID stack. So I am once again between a rock and a hard place, but at least we know where the problem is.
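One way to see whether the Device Mapper has quietly claimed a disk (which would explain the "busy" errors) is to ask it directly; a sketch assuming the lvm2 userspace tools are installed, which isn't stated anywhere in this report:

  vogon sbin # dmsetup ls      # list any device-mapper mappings the kernel knows about
  vogon sbin # dmsetup table   # show which underlying block devices each mapping holds open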
Ok, I can turn off lvm2 and use mdadm instead of raidstart, and everything works fine.
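For anyone following the same route, assembling the existing arrays with mdadm instead of raidstart looks roughly like this (the member partitions below are made-up placeholders; only /dev/md0 and /dev/md1 come from my fstab):

  vogon sbin # mdadm --assemble /dev/md0 /dev/hdc1 /dev/hdd1   # placeholder member partitions
  vogon sbin # mdadm --assemble /dev/md1 /dev/hdc2 /dev/hdd2
  # or, once /etc/mdadm.conf describes the arrays:
  vogon sbin # mdadm --assemble --scan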
He found a solution. Excellent.