Summary: | gentoo-sources-2.6.32 stops at "Waiting for uevents to be processed" | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Farid <djfarid> |
Component: | [OLD] Core system | Assignee: | Gentoo Kernel Bug Wranglers and Kernel Maintainers <kernel> |
Status: | RESOLVED TEST-REQUEST | ||
Severity: | critical | CC: | letharion, r.deepak.ram, spock, xmw |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | AMD64 | ||
OS: | Linux | ||
Whiteboard: | linux-2.6.32 | ||
Package list: | | Runtime testing required: | --- |
Attachments: |
config file for the working gentoo-sources-2.6.31-r6
config file for the not working gentoo-sources-2.6.32-r1
Working minimal config for 2.6.23-r1 |
Description
Farid
2010-01-01 23:18:27 UTC
Created attachment 214888
config file for the working gentoo-sources-2.6.31-r6
Created attachment 214889
config file for the not working gentoo-sources-2.6.32-r1
Since that message indicates a pause while hardware is detected and udev events are generated, a regression in one of the hardware drivers is the likely cause of the lockup. I saw one instance where a v4l2 driver started causing lockups, but it would have to be something else in your case, since you don't have that stuff enabled. How about testing a minimal kernel with most hardware turned off except the essentials (hard drive controller, framebuffer, USB input) to see whether boot still gets stuck there?

(In reply to comment #3)
Seems that you were right. I built a very minimal kernel, with all drivers removed except the one for the hard drive. That kernel works, or at least it does not stop responding. Do you have an idea which driver it could be? I will take the original working config and exclude one driver at a time. But the system runs as a headless server, so these tests are a bit of a pain. I will report which driver causes this once I have had time to test.

Created attachment 215084
Working minimal config for 2.6.23-r1
Ok, how's this for a strategy:
1. While booted to your 2.6.31 kernel, save the output of 'lsmod' to see which drivers get loaded automatically for your hardware.
2. Add enough drivers to your minimal 2.6.32 kernel that it is no longer headless.
3. If that boots, put back all the options other than modules to match your non-working config.
4. If that boots too, start checking the remaining modules:
 * while booted to your new 2.6.32, use menuconfig to select the drivers from your saved list
 * 'make modules'
 * manually load each of the modules in your list until you find one that hangs on load

I had some fun compiling, changing and compiling for three hours... I pinned it down to the framebuffer driver viafb. I have a working 2.6.32-r1 kernel running now. (I am not using the framebuffer anyway; I don't know why I had it enabled in the first place.) Now what? Is there anything more I can do?

Would you be game for reporting this kernel bug upstream? It would entail building a vanilla kernel to check that the viafb module still breaks your system, and posting 'lspci' hardware info for your VIA system.

I emerged sys-kernel/vanilla-sources-2.6.33-rc3 and compiled it both without and with viafb. Both kernels worked. This must mean that the problem is either fixed in 2.6.33-rc3 or caused by the Gentoo patches. I'll do the same test with vanilla-sources-2.6.32.3 later, just for the sake of it.

I emerged sys-kernel/vanilla-sources-2.6.32.3 and compiled it with viafb. That kernel works without problems...

You're right, it must be a patch that caused the problem. The top suspect is the fbcondecor patch, since it changes framebuffer code. You can extract that patch from your genpatches tarball if you haven't cleaned out your distfiles lately, or else download it separately from here: http://sources.gentoo.org/viewcvs.py/linux-patches/genpatches-2.6/tags/2.6.32-1/
What happens if you apply the fbcondecor patch to a vanilla kernel, does it break?
If so, we can report this bug to fbcondecor.

I just attempted to compile 2.6.33-rc5 vanilla after applying the latest fbcondecor patch. It did indeed break the compile with:
________________________________________________________________________________
  CC      kernel/sysctl.o
kernel/sysctl.c:240: error: unknown field 'ctl_name' specified in initializer
kernel/sysctl.c:240: error: 'CTL_UNNUMBERED' undeclared here (not in a function)
kernel/sysctl.c:246: error: unknown field 'strategy' specified in initializer
kernel/sysctl.c:246: error: 'sysctl_string' undeclared here (not in a function)
make[1]: *** [kernel/sysctl.o] Error 1
make: *** [kernel] Error 2
________________________________________________________________________________
The code in question appears to start at kernel/sysctl.c line 240:
________________________________________________________________________________
#ifdef CONFIG_FB_CON_DECOR
    {
        .ctl_name     = CTL_UNNUMBERED,
        .procname     = "fbcondecor",
        .data         = &fbcon_decor_path,
        .maxlen       = KMOD_PATH_LEN,
        .mode         = 0644,
        .proc_handler = &proc_dostring,
        .strategy     = &sysctl_string,
    },
#endif
________________________________________________________________________________

(In reply to comment #11)
> You're right, it must be a patch that caused the problem.
>
> The top suspect is fbcondecor patch, since it changes framebuffer code.
>
> You can extract that patch from your genpatches tarball if you haven't cleaned
> out your distfiles lately, or else download it separately from here:
> http://sources.gentoo.org/viewcvs.py/linux-patches/genpatches-2.6/tags/2.6.32-1/
>
> What happens if you apply fbcondecor patch to vanilla kernel, does it break? If
> so, we can report this bug to fbcondecor.
>

Spock wrote a new patch for 2.6.33, which is now in gentoo-sources-2.6.33. Can you just confirm everything works ok?

I have the same issue, and it continues in 2.6.33 too. acpi=off fixes the problem.

I know this sounds really wrong/weird/funny, but press a key (or a few keys) on the keyboard.
My systems sometimes get stuck during bootup like yours and continue after a (keyboard) input event (the USB mouse is not loaded yet at that point). It could be an empty random pool on /dev/random or something.

I have the same problem with gentoo-sources-2.6.32-r7 on my VirtualBox guest. It works fine with gentoo-sources-2.6.31-r10. With rc_coldplug="NO" the boot continues, but then hangs when the network driver "pcnet32" is loaded.

Please test with vanilla-sources-2.6.34 and let's get this upstream if this is still an issue.

*** Bug 332267 has been marked as a duplicate of this bug. ***

I have the same problem. For me the solution was to uninstall the nvidia drivers and remove the ko file. Just removing the file would probably have been enough. To the best of my ability I tried stripping my kernel down to the basic needs for hard drives, network and input, but the problem remains if nvidia-drivers is installed. Can I prevent the nvidia-drivers from being autoloaded, so that they only load when needed? Switching from gentoo-sources-2.6.34-r1 to vanilla-sources-2.6.34 makes no difference for me. If I can help anyone else out by testing other things, let me know. I also tried going back to gentoo-sources-2.6.31-10, which makes the issue go away. |
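
For anyone else trying to find the offending driver: the module-by-module check from the strategy earlier in this thread (save 'lsmod' on the working kernel, then load each module by hand on the new one) could be scripted roughly as below. This is only a sketch under assumptions: the function names, the file paths in the example, and the optional loader argument are made up for illustration and do not come from any comment above. Each module name is written to a log and synced to disk before the load is attempted, so after a hang and a reboot the last "loading" line should name the suspect.

```shell
#!/bin/sh
# Sketch only: function names and paths are hypothetical.

# Print the module names from a file holding saved `lsmod` output.
# The first line of `lsmod` output is a header, so it is skipped.
module_names() {
    awk 'NR > 1 { print $1 }' "$1"
}

# Try to load every module named in the saved list, one at a time.
#   $1  file with saved `lsmod` output
#   $2  log file to append to
#   $3  loader command (defaults to modprobe; pass `true` for a dry run)
try_modules() {
    list=$1
    log=$2
    loader=${3:-modprobe}
    module_names "$list" | while read -r mod; do
        echo "loading $mod" >> "$log"
        sync    # flush the log to disk in case the next load hangs
        "$loader" "$mod" < /dev/null || echo "failed $mod" >> "$log"
    done
}

# Example, assuming the list was saved while booted to the working kernel
# with `lsmod > /root/lsmod-working.txt`; run as root on the new kernel:
#   try_modules /root/lsmod-working.txt /root/modload.log
```

Modules that were not built for the new kernel simply show up as "failed" lines in the log; a hang shows up as a "loading" line with nothing after it.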