Bug 271590 - Kernel fails at init
Summary: Kernel fails at init
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: AMD64 Linux
Importance: High blocker with 1 vote
Assignee: Gentoo Kernel Bug Wranglers and Kernel Maintainers
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-05-28 22:27 UTC by Lóránt Farkas
Modified: 2009-06-28 09:47 UTC
CC List: 4 users

See Also:
Package list:
Runtime testing required: ---


Attachments
config for kernel 2.6.29-gentoo-r5 (config-2.6.29-gentoo-r5,78.14 KB, text/plain)
2009-05-28 22:29 UTC, Lóránt Farkas
config for the (working) 2.6.25 kernel (config-2.6.25-reiser4-r9,78.12 KB, text/plain)
2009-05-28 22:29 UTC, Lóránt Farkas
config of the (working) kernel (config-2.6.25-reiser4-r9.old,78.12 KB, text/plain)
2009-05-29 14:08 UTC, Lóránt Farkas
working dmesg (working.dmesg,45.41 KB, text/plain)
2009-05-29 14:10 UTC, Lóránt Farkas
working config for 2.6.29-gentoo-r5 (config-2.6.29-gentoo-r5,70.75 KB, text/plain)
2009-06-27 07:05 UTC, Lóránt Farkas

Description Lóránt Farkas 2009-05-28 22:27:39 UTC
I've upgraded my 2.6.25-r9 gentoo-sources kernel to 2.6.29-r5. The kernel loads, but after mounting the root filesystem (read-only) it gives this message:

Kernel panic - not syncing: No init found. Try passing init= option to kernel.

The system boots and works with the old kernel. I've run fsck, but it found nothing.
The same happens with 2.6.28-r6.

If I try to boot with init=/sbin/init or init=/sbin/sh, I get:

cannot execute /sbin/init (or /sbin/sh)

The kernel then falls back to its default behaviour and I get the same error:

Kernel panic - not syncing: No init found. Try passing init= option to kernel.

Reproducible: Always



Expected Results:  
After mounting the root filesystem, the kernel should execute init.
Comment 1 Lóránt Farkas 2009-05-28 22:29:01 UTC
Created attachment 192769 [details]
config for kernel 2.6.29-gentoo-r5
Comment 2 Lóránt Farkas 2009-05-28 22:29:49 UTC
Created attachment 192771 [details]
config for the (working) 2.6.25 kernel
Comment 3 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2009-05-29 06:34:59 UTC
1. The "working" .config you included is from 2.6.29-gentoo-r5, read the top of the file. I think you just copied the wrong one. Please attach the correct file and mark the other one as obsolete.
2. Please capture your kernel boot log output (serial console recommended) for the failed boot, and your full dmesg right after boot from the working kernel.
Comment 4 Lóránt Farkas 2009-05-29 14:08:10 UTC
Created attachment 192884 [details]
config of the (working) kernel
Comment 5 Lóránt Farkas 2009-05-29 14:10:16 UTC
Created attachment 192885 [details]
working dmesg
Comment 6 Lóránt Farkas 2009-05-29 14:19:32 UTC
Sorry, but I can't attach a serial console to my home desktop. However, neither at the start nor at the end of the boot is there an error message saying that the root partition cannot be mounted. It says the partition is mounted (read-only) and that kjournald is starting (write delay 5 s). Then comes the error above: no init found.

I have a new video card, so I needed a new kernel. I have built a new 2.6.25 kernel, and today, when I started the 2.6.29 kernel after having booted the new 2.6.25 kernel, it didn't print anything; it simply froze.
Comment 7 Daniel Drake (RETIRED) gentoo-dev 2009-05-29 16:35:42 UTC
It's possible that sda and sdb are being swapped. Can you check by looking carefully at the boot messages on the failing kernel?
You can see which one is which in your working dmesg: each disk has a different product number and size.

scsi 0:0:0:0: Direct-Access     ATA      HDT722520DLA380  V44O PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 390721968 512-byte hardware sectors (200050 MB)
scsi 2:0:0:0: Direct-Access     ATA      Hitachi HDP72502 GM2O PQ: 0 ANSI: 5
sd 2:0:0:0: [sdb] 488395055 512-byte hardware sectors (250058 MB)
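
To compare the ordering on both kernels, the disk lines can be filtered out of a saved boot log. This is only a minimal sketch: the function name `list_disks` is made up for illustration, and the patterns are taken solely from the four dmesg lines quoted above, so other kernel versions may word these lines differently.

```shell
# Print only the lines that identify SCSI/SATA disks in a saved kernel log.
# Usage: list_disks working.dmesg
list_disks() {
    # "Direct-Access" lines carry the model string;
    # "hardware sectors" lines carry the sdX name and the size.
    grep -E 'Direct-Access|\[sd[a-z]+\] [0-9]+ 512-byte hardware sectors' "$1"
}
```

Running it on the log of each kernel and comparing which model ends up as sda should show whether the order changed.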
Comment 8 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2009-05-29 19:05:16 UTC
I'm in agreement with Daniel here. The devices have probably swapped order. You aren't seeing it fail to mount because the mount IS working, but to a device that doesn't have /bin/init, because it was your secondary drive. The async boot stuff in recent kernels DOES make it possible for the order to be different.

If it does happen that you need to change the order, you'd have to change fstab. To avoid this, I suggest changing the driver for the secondary disk to be a module, so that /dev/sda is fixed at being a specific disk. Alternatively, start using labels or UUIDs with an initramfs.

Please test changing the root= argument, and reopen the bug as needed.
Comment 9 Lóránt Farkas 2009-05-31 07:48:10 UTC
No, nothing was swapped. I tried booting with root set to hda2, hdb2, sdb2, etc. With sdb2, for example, I get a message listing the partitions that can be mounted as root:
hda1, hda2, hda3, hda4, hdb (cannot be mounted because it is a CD/DVD drive, as detected by the kernel), sda1, sda2 (the size matches the real sda2), sda3, sda4
So I cannot mount sdb2. And if I try hda2, the kernel says it cannot mount it (because it is a swap partition).

There is no swap. The kernel mounts the partition and caanot execute a file on ot.
Comment 10 Lóránt Farkas 2009-05-31 07:49:27 UTC
Sry. Cannot execute a file on it.
Comment 11 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2009-05-31 21:05:31 UTC
We really need the kernel log from the failing case then. Either find a serial cable, or explore netconsole. I don't know how workable the console-over-firewire is these days, but console-over-USB-serial should work.
Comment 12 Lóránt Farkas 2009-06-01 07:13:44 UTC
Could you give me a link about console over USB (or FireWire)?
I found a description on the internet, but from it I gather that I need kermit or minicom running to get a console over USB.

Will it run if the kernel can't execute anything on the drive?

Or should I put it in an initrd? Will it run before init?
Comment 13 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2009-06-01 18:10:07 UTC
Rebuild the kernel, compiling into the kernel the following:
- USB host controller
- CONFIG_USB_SERIAL
- The right USB-serial hardware support.
After you have the above selected as built-in, there is a new option:
CONFIG_USB_SERIAL_CONSOLE
Enabling that new option, recompile and install the new kernel.

Boot with the following added to your kernel parameters:
console=tty0 console=ttyUSB0,115200n8
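
Before rebuilding, it may help to confirm that the required options are really built in (=y) rather than modules. A minimal sketch follows; the function name `check_builtin` is hypothetical, and the option names for the USB host controller and the USB-serial adapter depend on the hardware, so only the two options named above appear in the usage line.

```shell
# Report kernel options that are not built in (=y) in a given .config.
# Usage: check_builtin /usr/src/linux/.config CONFIG_USB_SERIAL CONFIG_USB_SERIAL_CONSOLE
check_builtin() {
    config=$1
    shift
    missing=0
    for opt in "$@"; do
        if grep -q "^${opt}=y\$" "$config"; then
            echo "ok:      $opt"
        else
            echo "missing: $opt (not set, or built as a module)"
            missing=1
        fi
    done
    # Non-zero exit status if at least one option is not built in.
    return "$missing"
}
```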

Does your system not have a serial port or something (and you've got no PCI cards that give you a serial port)?

I forgot that serial-console-over-firewire never made it into the mainline Linux kernel from an external module (it is in FreeBSD, however).

Alternatively, do you have a digital camera? You could try to snap each page after the output stops (using the pageup to start from the top). If you take them at high resolution and fully aligned, the OCR stuff in the tree is usually able to convert it back to text (and you can do the manual corrections). This is probably more time-consuming and error-prone than finding some serial rig.
Comment 14 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2009-06-01 18:11:11 UTC
the initramfs is also maybe an idea, if you built a custom busybox with stuff to send the dmesg capture somewhere else over the network.
Comment 15 Lóránt Farkas 2009-06-01 20:32:51 UTC
First of all, I'm not in IT. I have a lot of work and two children, so I will only have time for this later this week. (For now I'm using a 2.6.25 kernel with a different config.)

> "Boot with the following added to your kernel parameters:
console=tty0 console=ttyUSB0,115200n8"

Thanks. I will try. I have an old laptop with Xubuntu. It will be a good console I think.

> "Alternatively, do you have a digital camera? You could try to snap each page
after the output stops (using the pageup to start from the top). If you take
them at high resolution and fully aligned, the OCR stuff"

Yes, I have. I think I have OCR software too. But the batteries are depleted today, so again I can only try it later.

> "the initramfs is also maybe an idea, if you built a custom busybox with stuff
to send the dmesg capture somewhere else over the network."

I could probably manage it (again, I'm not in IT), but I don't have enough time for it.

I will write again, if I have the log. (And yes I will have the log) 

Thanks.
Comment 16 Roland Baldenhofer 2009-06-05 16:00:54 UTC
Hello,

Don't know if it helps you.
I solved the problem by creating a new Kernel.
mv /usr/src/linux/.config /usr/src/linux/.config_bak
mv /usr/src/linux/.config.old /usr/src/linux/.config.old_bak
Then I created the new .config file with
make menuconfig
Problem: you have to configure the complete kernel from scratch.
I think too much changed in the .config file between 2.6.25 and 2.6.29.

Cheers

Roland
Comment 17 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2009-06-05 18:12:28 UTC
roland:
You can use make oldconfig. That's not the issue here at all, you can diff Lorant's .config files he provided, they are nearly identical. The ONLY difference is:
2512c2512,2513
< # CONFIG_SOUND_PRIME is not set
---
> CONFIG_SOUND_PRIME=y
> # CONFIG_SOUND_OSS is not set

(unless he uploaded the wrong .configs again).
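
A comparison like the one above can be made independent of option ordering and header comments. The sketch below is only an illustration: `config_diff` is a made-up helper name, and it assumes that `mktemp` is available on the system.

```shell
# Show which option lines differ between two kernel .config files,
# ignoring ordering and the header/section comments.
# Usage: config_diff config-old config-new
config_diff() {
    a=$(mktemp) || return 2
    b=$(mktemp) || return 2
    # Keep only "CONFIG_X=..." and "# CONFIG_X is not set" lines.
    grep -E '^(# )?CONFIG_' "$1" | sort > "$a"
    grep -E '^(# )?CONFIG_' "$2" | sort > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    # Same convention as diff: 0 means identical, 1 means differences found.
    return "$status"
}
```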
Comment 18 Roman Krylov 2009-06-19 09:55:42 UTC
I have similar error.
Tried 'root=UUID=xxx-yyy-zzz-...' instead of /dev/sd... - did not help.
Comment 19 Stephan Hansen 2009-06-20 11:27:05 UTC
I have exactly the same problem. I didn't report it here before because I thought it was a stupid mistake of mine in the .config, since current Gentoo live CDs work. I still believe it's my mistake, but after the previous messages...

Perhaps this information helps. gentoo-sources-2.6.26-r4 works fine: I can configure it in any sensible way and it boots. Every newer kernel I tried after this release (2.6.27-r8, 2.6.28-r5 and finally 2.6.29-r5) produced this error message, no matter what I passed in the init= parameter. I have only a single hard disk in my notebook (ThinkPad T61, amd64 / nocona, dual core). sda1 is /boot, sda2 is /, sda3 is for hibernation (not yet working :-( ) and sda4 is swap. /sbin/init is a 35k file with mode 0755 and works fine with older kernels.

sda2 is correctly detected as reiserfs 3.6 and is mounted read-only. My only kernel parameter is root=/dev/sda2.
Adding init=<anything> produces an additional line, "cannot execute init, returning to defaults". After this line, the usual failure to execute init follows and the system hangs.

Using similar .config files on my older notebook (T23, x86) and on my desktop (amd64 / athlon64, dual core) works fine.
Comment 20 Daniel Drake (RETIRED) gentoo-dev 2009-06-21 14:58:44 UTC
Stephan, please file a separate bug for your issue unless the ideas below solve your problem.


Lóránt, in your config you have this:
 CONFIG_INITRAMFS_SOURCE="/usr/share/v86d/initramfs"
this means that your root= arguments are being ignored by the kernel and it is building an initramfs from that directory, which is supposed to be able to find the root partition and mount it.
My guess is that directory is empty or incomplete.

As a next step, please blank out that setting so that you boot without an initramfs. If that works then we can reassign to the v86d maintainer to figure out the initramfs problem.
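
The guess about an empty directory can be checked directly from the .config. What follows is a minimal sketch with hypothetical helper names; it assumes the setting holds a single path, and it cannot tell whether a non-empty directory is actually complete (for example, whether it contains a working /init).

```shell
# Print the value of CONFIG_INITRAMFS_SOURCE from a kernel .config (empty if unset).
initramfs_source() {
    sed -n 's/^CONFIG_INITRAMFS_SOURCE="\(.*\)"$/\1/p' "$1"
}

# Describe the state of the configured initramfs source.
# Usage: check_initramfs_source /usr/src/linux/.config
check_initramfs_source() {
    src=$(initramfs_source "$1")
    if [ -z "$src" ]; then
        echo "no built-in initramfs source configured"
    elif [ ! -e "$src" ]; then
        echo "source does not exist: $src"
    elif [ -d "$src" ] && [ -z "$(ls -A "$src")" ]; then
        echo "source directory is empty: $src"
    else
        echo "source present: $src"
    fi
}
```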
Comment 21 Roman Krylov 2009-06-22 10:45:23 UTC
I have in my .config CONFIG_INITRAMFS_SOURCE="", but the same problem with one motherboard and no problems with another (like enforce4).
I think the problem is in the SATA driver for the nVidia chip:
> 00:0e.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller (rev a1)
> 00:0f.0 IDE interface: nVidia Corporation MCP51 Serial ATA Controller (rev a1)
Comment 22 Lóránt Farkas 2009-06-27 07:05:44 UTC
Created attachment 195850 [details]
working config for 2.6.29-gentoo-r5

There was nothing relevant in the log of the non-working kernel. However, later on the same kernel no longer gave the failed-init message but simply hung at the same place.

BUT, now I have a working new kernel. I made it by running

make mrproper 

then

make defconfig

then I selected (with make menuconfig) the proper processor type, IDE/SATA chipset, etc. Now everything is working. I didn't find any serious difference between the working and non-working configs. Maybe it is a problem with importing the old config (as far as I can see, most people here tried to import their old config).

I suggest everyone try configuring their new kernel as above.
Comment 23 Daniel Drake (RETIRED) gentoo-dev 2009-06-27 08:17:23 UTC
I am still reasonably confident that it is a problem with your initramfs. If you rebuilt your working kernel using the same config right now, your newly rebuilt version would encounter the same problem. This is because once upon a time you had a usable initramfs in /usr/share/v86d/initramfs which was built into your kernel, but now for some reason you do not, so any kernel built at this point includes an unusable initramfs.

This is only an educated guess though, but I imagine your new kernel .config does not have CONFIG_INITRAMFS_SOURCE="/usr/share/v86d/initramfs" (and if it did then you would not be able to boot). Either way, if it's working now, I'll close this bug. I also do not recommend copying a .config between a big kernel change e.g. 2.6.29 to 2.6.30 because of the huge amount of change in each kernel release.
Comment 24 Daniel Drake (RETIRED) gentoo-dev 2009-06-27 08:18:25 UTC
Roman, you should file a separate bug.
Comment 25 Lóránt Farkas 2009-06-28 07:35:09 UTC
It has nothing to do with  CONFIG_INITRAMFS_SOURCE="/usr/share/v86d/initramfs". It is only needed for uvesafb. See http://dev.gentoo.org/~spock/projects/uvesafb/

My first kernel that started to work was a 2.6.30 kernel, without CONFIG_INITRAMFS_SOURCE="/usr/share/v86d/initramfs". And it worked (though without uvesafb).

The problem was with the import of an old config. I imported the config-2.6.25-gentoo-r5 manually (in the menuconfig of 2.6.29). I think this was a mistake. If you encounter this problem, try configuring as above.
Comment 26 Daniel Drake (RETIRED) gentoo-dev 2009-06-28 09:47:10 UTC
Having that setting set to *anything* drastically changes the way the kernel boots, to the point where the kernel itself doesn't even look at your hard drives in order to try and find init. So it still seems very likely to me that this was the cause of your problem. If there is one small problem inside that directory then you'll see exactly this problem.