Bug 217078 - booting with nosmp option causes CD-ROM to no longer work
Summary: booting with nosmp option causes CD-ROM to no longer work
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Release Media
Classification: Unclassified
Component: InstallCD
Hardware: AMD64 Linux
Importance: High normal
Assignee: Gentoo Release Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-04-10 02:02 UTC by Vincent van de Camp
Modified: 2009-01-09 17:56 UTC

See Also:
Package list:
Runtime testing required: ---


Description Vincent van de Camp 2008-04-10 02:02:35 UTC
When booting with the nosmp kernel option, the boot process will get stuck on a lost CD interrupt.

Reproducible: Always

Steps to Reproduce:
1. Boot from the 2008.0 beta 1 livecd
2. Enter "gentoo nosmp" at the boot prompt
3. See actual results
Actual Results:  
ATIIXP: chipset revision 0
ATIIXP: not 100% native blah, will probe later
     ide0 BM-DMA at 0xf900-0xf907
BIOS setting: hda:DMA hdb:pio
ATIIXP: simplex device: DMA disabled
ide1: ATIIXP Bus-Master DMA disabled (BIOS)
hda: YAMAHA CRW2200E, ATAPI CD/DVD-ROM Drive
ide0 at 0x1f0-0x1f7, 0x3f6 on irq 14
ide-cd: cmd 0x5a timed out
hda: lost interrupt
ide-cd: cmd 0x5a timed out
hda: lost interrupt
hda: ATAPI 40x CD-ROM CD-R/RW drive 8192 kB cache, UDMA(33)
UniformCD-ROM drive revision 3.20
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
...

Expected Results:  
Livecd should boot into non-SMP install environment

I was trying to get around bug 216789.

This also happens with the 2007.0 livecd; the error message is slightly different, but it still comes down to a lost interrupt.

Hardware:
Motherboard: Gigabyte GA-MA69VM-S2
CD-ROM: Yamaha CRW2200E CD-ROM/CD-RW drive
2GB 800MHz DDR2 memory
ATI RV370 graphics card
Comment 1 Andrew Gaffney (RETIRED) gentoo-dev 2008-04-10 02:04:28 UTC
Did you ever think that perhaps you've just got really crappy hardware?
Comment 2 Chris Gianelloni (RETIRED) gentoo-dev 2008-04-10 02:13:35 UTC
I'm afraid that I would have to agree.  The issues that you're hitting aren't being reported by anybody else.  Have you verified that your RAM isn't failing?  After all, you're getting kernel paging errors, which pretty much means that it can't write kernel pages to the virtual memory subsystem correctly.

Have you tried booting with any of the countless kernel command line options to try to troubleshoot this?  I see that you're trying nosmp.  Tried something like idedma=off, irqpoll, noapic, nolapic, nosmp?
Comment 3 Mike Doty (RETIRED) gentoo-dev 2008-04-10 02:44:24 UTC
I've seen stuff like this on failing IDE devices and failing IDE cards/chipsets.  Try also replacing the IDE cable, some of those 80 pin ones are realllllly cheap...
Comment 4 Vincent van de Camp 2008-04-10 05:19:39 UTC
Andrew & Chris, I may have thought it, but like I say in the other bug, when I have booted with my own kernel rather than the kernel that someone threw together for the livecd, there are NO problems with kernel paging requests. At all. Ever. Not one. Also, I have let memtest86 run for two or three hours. Not one error in at least seven passes. Yeah, it just has to be really crappy hardware now, doesn't it!:)

Without the nosmp kernel option the livecd _never_ has any problem finding the CD-ROM. I don't know who created the nosmp kernel, exactly what's different from the smp one and how it's loaded exactly but if without the nosmp option the CD-ROM drive always gets found and with it the interrupt gets lost, then you tell me what's the most likely thing that's wrong.
Comment 5 Chris Gianelloni (RETIRED) gentoo-dev 2008-04-10 05:54:21 UTC
There is no nosmp kernel.  You're simply triggering the kernel itself to not load in SMP.  Also, it's *very* likely that the problem is a buggy APIC which doesn't present itself in SMP.  Seriously, man, we're not out to get you or anything.  We're just trying to diagnose the problem.  No need to get so defensive.

Some things that I would check for:

BIOS updates
MPS version in BIOS
trying "nosmp nolapic noapic"
Comment 6 Vincent van de Camp 2008-04-10 17:49:41 UTC
I couldn't find any MPS information in the BIOS and in the manual. I did try nosmp nolapic noapic, and that does work. I have not yet been able to check whether using nosmp solves the kernel paging problem.
Comment 7 Chris Gianelloni (RETIRED) gentoo-dev 2008-04-10 22:04:53 UTC
The BIOS may not have an MPS setting, since it is *technically* a single-CPU system.  Knowing that "nosmp nolapic noapic" works tells me that your motherboard likely *does* have a buggy APIC, which is generally disabled on a non-SMP kernel, and force-enabled on an SMP kernel.  That's why when you boot with "nosmp" without "noapic nolapic" you hit the problems you have.
Comment 8 Chris Gianelloni (RETIRED) gentoo-dev 2008-04-12 17:48:10 UTC
So what would you like to do about this to resolve the bug?  You have a workaround, but everything mentioned here is kernel support, so it'd need to be changed upstream.  About the only thing that we could do, aside from making some kernel changes and hoping that upstream accepts them, is to document a recommendation in the Handbook to always use "noapic nolapic" with "nosmp".
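The recommendation floated above (always pair "nosmp" with "noapic nolapic") can be expressed as a small check on a boot command line. This is only a sketch: the function name `missing_companions` is hypothetical, and the rule it encodes is the workaround from this bug, not a general kernel requirement.

```python
from typing import List


def missing_companions(cmdline: str) -> List[str]:
    """Return the APIC-disabling options absent from a command line that uses nosmp.

    Encodes the workaround discussed in this bug: when "nosmp" is given,
    "noapic" and "nolapic" are suggested alongside it.
    """
    tokens = cmdline.split()
    if "nosmp" not in tokens:
        # Nothing to suggest when nosmp is not in use.
        return []
    return [opt for opt in ("noapic", "nolapic") if opt not in tokens]


if __name__ == "__main__":
    # Boot prompt lines taken from this bug, plus a plain "gentoo" boot.
    for line in ("gentoo nosmp", "gentoo nosmp nolapic noapic", "gentoo"):
        print(f"{line!r}: missing {missing_companions(line)}")
```

On a running system the same function could be fed the contents of /proc/cmdline.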
Comment 9 Vincent van de Camp 2008-04-14 06:30:49 UTC
For the record, I did try an 80 pin cable and also a different CD/DVD player (Mad dog) and they also don't work with only the nosmp option. It doesn't even notify me about losing interrupts, or whatever else may be wrong with it.

Since there is a workaround (try noapic/nolapic if nosmp alone doesn't work), it seems to me that a mention in the install docs might be helpful, but since it's now also documented on bugs.gentoo.org that may not even be necessary. I'll leave that up to the install/documentation people, but it looks to me like this bug can be closed.
Comment 10 Vincent van de Camp 2008-04-14 06:38:56 UTC
FYI, I tried to boot my own kernel with the nosmp option, and without the noapic and nolapic options it had no trouble immediately recognizing the CD/DVD player.
Comment 11 Chris Gianelloni (RETIRED) gentoo-dev 2008-04-16 00:03:32 UTC
Ehh, it also isn't booting from the CD drive, unless you remastered the CD.  Is that what you did?  Sorry, you didn't provide much in the way of information, such as how you actually did this test, which kernel you used, what kernel options you may think are important, whether your kernel was even SMP, etc.
Comment 12 Vincent van de Camp 2008-04-16 02:48:04 UTC
This was with my SMP enabled 2.6.24-gentoo-r4 kernel (why would I use the nosmp option with a kernel that has no SMP compiled in?) off the hard drive, the kernel that also has no problem compiling large packages and that doesn't oops.

My point with the last remark was actually that I don't need the nolapic and noapic options for my own kernel to boot with the nosmp option. My kernel has APIC enabled, so to me that seems to make it unlikely that my motherboard's APIC is buggy (although it doesn't unequivocally rule it out of course). But it could mean that the kernel options used for the livecd don't handle APIC well. I don't remember what the kernel version is on the 2007.0 and 2008.0 beta 1 livecds, but I am curious to see what the 2008.0 beta 2 livecd will do.
Comment 13 Chris Gianelloni (RETIRED) gentoo-dev 2008-04-16 20:27:56 UTC
*sigh*

No, you proved nothing.  The problem here was that when you used "nosmp" *ON THE RELEASE*, it was unable to find your optical drive.  Now, there are different code paths in genkernel for CD and non-CD boots.  Of course, you didn't even say whether you're using genkernel or not, so I can't even say, but unless you mastered a new CD with your kernel, you didn't really help us diagnose anything with your own kernel, especially since you didn't even use the same version/revision as was used on the CD that you're reporting against.  You make very interesting statements like "My kernel has APIC enabled" without backing it up.  How do you know?  You do know that there isn't a kernel option for that, right?  Please don't confuse ACPI and APIC.

You also never specified which ISO you used.  The Minimal InstallCD and the LiveCD use different kernels and you've interchanged what you've called them, so I'm not even sure which kernel config I should be checking.

At any rate, I'm just going to ask you to try the Beta 2 CD, when it hits the mirrors, and see if it is resolved for you.
Comment 14 Vincent van de Camp 2008-04-17 03:31:57 UTC
linux # grep APIC .config
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y

So there are at least three APIC options in .config. I found CONFIG_X86_UP_APIC while poking around on Google; that one is not in my .config, so it seems there are more APIC options than the three I have enabled in my kernel. Based on that I assumed that I have enabled APIC in the kernel. But if you say that those options aren't there, I stand corrected:)
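For anyone repeating the check above, here is a minimal Python sketch that pulls the enabled APIC-related options out of a kernel .config while keeping ACPI entries out of the result. The function name `apic_options` is hypothetical, and the last two lines of the sample text are invented for illustration; only the first three come from the grep output above.

```python
import re

# Matches enabled options whose name contains "APIC" (not "ACPI").
APIC_OPTION = re.compile(r"^(CONFIG_\w*APIC\w*)=(\S+)$")


def apic_options(config_text):
    """Map each enabled APIC-related option name to its value."""
    found = {}
    for line in config_text.splitlines():
        match = APIC_OPTION.match(line.strip())
        if match:
            found[match.group(1)] = match.group(2)
    return found


# First three lines: the grep output quoted above.
# Last two lines: hypothetical entries added to show what gets skipped.
SAMPLE = """\
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
# CONFIG_X86_UP_APIC is not set
CONFIG_ACPI=y
"""

if __name__ == "__main__":
    for name, value in apic_options(SAMPLE).items():
        print(f"{name}={value}")
```

Commented-out ("is not set") entries and ACPI options are ignored, so the output reflects only APIC options that are actually enabled.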

You wouldn't by any chance have an approximate date for the release of beta 2?
Comment 15 Chris Gianelloni (RETIRED) gentoo-dev 2008-04-18 19:55:35 UTC
Sorry, I should have been more clear since I wasn't expecting someone to be so pedantic.  There's no *selectable* option when using a SMP kernel.
Comment 16 Chris Gianelloni (RETIRED) gentoo-dev 2008-04-18 19:55:57 UTC
Also, could you possibly answer some of the questions that I asked?

Thanks
Comment 17 Vincent van de Camp 2008-04-20 04:42:04 UTC
Sure, and could you please answer the most important one for now: when do you expect the beta 2 cd to come out? I don't need an exact date, but is it going to be weeks or months, if you know?

If I still haven't answered some of your questions in the text above, I may have assumed too much and I do apologize. Would you mind letting me know which of the questions you feel are still unanswered and whether you really need an answer to them before the beta 2 cd comes out, on which the new kernel options may have solved many, if not all of my problems?
Comment 18 Vincent van de Camp 2008-04-22 00:29:37 UTC
OK, the Gentoo livecd is off the hook. I just installed the kernel version that the 2008.0 beta 1 AMD64 minimal livecd uses (2.6.19-gentoo-r5) on my workstation. When I boot that with the nosmp option, without the noapic and nolapic options, it will again just mention the (correct) optical device but then lose interrupts and hang.

If the beta 2 livecd is using anything in the 2.6.23 or 2.6.24 range I do not expect there to be any problems.

I'm going out on a limb here and predict that if I should run that in smp mode (boot without nosmp or any other options) I will encounter the same problems that I have described in the other bug (216789). I don't have time at this moment to try that, but as soon as I find out I will update 216789.

Thank you for your patience;)
Comment 19 Chris Gianelloni (RETIRED) gentoo-dev 2008-04-22 19:29:53 UTC
We don't have an updated date for Beta 2, nor will we even make a guess at this point.

I *was* trying to get this resolved prior to the Beta 2 release, but now we're just going to have to blindly hope that it works when the final 2008.0 comes out, since we will not have had a chance to test it.
Comment 20 Vincent van de Camp 2008-04-27 04:00:04 UTC
You don't think this is likely a kernel problem? The motherboard I'm using, to my knowledge, has a chipset that probably came out around 2.6.19, so it wouldn't surprise me if the kernel wasn't entirely ready for the hardware.

I'm not sure how you would fix something on a livecd kernel image that is no longer going to be used. However, if you have things that you want me to try to get deeper into this problem I'll be more than happy to tinker with it.
Comment 21 Martin Mokrejš 2008-04-30 00:22:20 UTC
Vincent, it would be nice if you could add the dmesg outputs captured while you boot with those different kernel versions and/or different kernel command-line flags. You could pass ide0=xxx on your kernel command-line as well. Compare them with the values set by the BIOS.


Also, check firmware of your drive and motherboard:
http://www.yamaha.co.jp/english/product/computer/firmware/firmware.html
http://www.yamaha.co.jp/english/product/computer/firmware/fw-07.html

The motherboard BIOS does not seem to have any known IDE/DMA bug:
http://www.gigabyte.com.tw/Support/Motherboard/BIOS_Model.aspx?ProductID=2500
Comment 22 Martin Mokrejš 2008-04-30 01:21:06 UTC
Actually, if I remember right 'smp' is incompatible with acpi, so try 'noacpi' kernel-command flag and see if it boots (should be same as 'acpi=off').
Comment 23 Chris Gianelloni (RETIRED) gentoo-dev 2008-04-30 19:52:17 UTC
Ehh, SMP has been ACPI compatible for a few years now.  ;]

I was simply going to suggest seeing if this is resolved with 2008.0 Beta 2.
Comment 24 Vincent van de Camp 2008-05-02 05:37:56 UTC
I saw that the 2008 beta2 CD images are out, so I tried to start the amd64 2008 beta2 minimal install livecd by editing the grub entry and adding 'nosmp' to it (without noapic/nolapic) to see whether the optical drive is recognized correctly, but the kernel doesn't seem to do anything with the option. After booting, /proc/cpuinfo still reports two CPUs. Is there another way to try to force nosmp using this livecd?
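The check described above (nosmp on the command line, yet /proc/cpuinfo still listing two CPUs) could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the sample strings are made up for illustration, and on a live system the two inputs would be read from /proc/cmdline and /proc/cpuinfo.

```python
def cpu_count(cpuinfo_text):
    """Count the "processor" entries in /proc/cpuinfo-style text."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


def nosmp_ignored(cmdline, cpuinfo_text):
    """True when nosmp was requested but more than one CPU is still listed."""
    return "nosmp" in cmdline.split() and cpu_count(cpuinfo_text) > 1


# Hypothetical sample data standing in for /proc/cmdline and /proc/cpuinfo.
SAMPLE_CMDLINE = "gentoo nosmp"
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "vendor_id\t: AuthenticAMD\n"
    "\n"
    "processor\t: 1\n"
    "vendor_id\t: AuthenticAMD\n"
)

if __name__ == "__main__":
    print("CPUs listed:", cpu_count(SAMPLE_CPUINFO))
    print("nosmp ignored:", nosmp_ignored(SAMPLE_CMDLINE, SAMPLE_CPUINFO))
```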

I did play around to see whether bug #216789 still happens, but as I mentioned there, this does not seem to be the case.
Comment 25 Martin Mokrejš 2008-05-06 11:30:54 UTC
Please attach lspci output as well as full dmesg output. If the kernel did not recognize the 'nosmp' option, it would mention it at the very beginning of the dmesg output.

Try to force use of the generic IDE driver by adding 'ide-all-generic=1' to the kernel command line and post the dmesg output as well.
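When comparing dmesg captures from different boots, a quick tally of "lost interrupt" lines per device can make the difference easier to see. The sketch below is one hypothetical way to do that (the name `lost_interrupts` is an assumption); the sample text is the excerpt from the description of this bug, with the trailing "..." left out.

```python
import re
from collections import Counter

# Device name followed by the message; search() tolerates a timestamp prefix.
LOST_IRQ = re.compile(r"(\w+): lost interrupt\s*$")


def lost_interrupts(log_text):
    """Count "lost interrupt" messages per device in a boot log."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LOST_IRQ.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Excerpt from the "Actual Results" section of this bug.
SAMPLE = """\
ide-cd: cmd 0x5a timed out
hda: lost interrupt
ide-cd: cmd 0x5a timed out
hda: lost interrupt
hda: ATAPI 40x CD-ROM CD-R/RW drive 8192 kB cache, UDMA(33)
UniformCD-ROM drive revision 3.20
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
hda: lost interrupt
"""

if __name__ == "__main__":
    for device, count in lost_interrupts(SAMPLE).items():
        print(f"{device}: {count} lost interrupts")
```

Running the function over each capture and comparing the counts would show whether a given flag combination makes the messages go away.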
Comment 26 Andrew Gaffney (RETIRED) gentoo-dev 2008-07-12 16:53:46 UTC
Has this been resolved with the 2008.0 final media?
Comment 27 Vincent van de Camp 2009-01-09 17:56:12 UTC
(In reply to comment #26)
> Has this been resolved with the 2008.0 final media?
> 

Yep, it's all fixed now. (sorry for the delay...)