Hello,

I have rarely used such a high level when reporting in Bugzilla, but really, if you remerge udev-090 you will not be able to reboot again without a LiveCD or a distribution CD. Sorry if that hurts any of your devs' feelings :)

A really nasty bug appears in udev-090 and udev-089. It affects the early boot process, so it is difficult to diagnose because you cannot stop the kernel messages from scrolling past on the display. What is strange is that it worked until now, that is, until I emerged the same version of the package again!

The problem resides in the udev-start.sh file.

Facts:
The populate_udev() function calls udevtrigger since udev-098, instead of trigger_events(), which is no longer supported since udev-090. But udevtrigger doesn't work on Athlon XP (at least on my computer). Increasing the 300 loops doesn't help either.

Possible solution:
As a workaround, you can add to /lib/rcscripts/addons/udev-start.sh the "old" trigger_events() function already present in the udev-089 package, and call that function instead of udevtrigger in populate_udev().

Hope that helps.
Jj
I have the same problem on amd64.
(In reply to comment #1)
> I have the same problem on amd64.

Sorry for the late reply. To make my Athlon XP boot again, I modified the udev-start.sh script as follows:

diff -ruN /lib/rcscripts/addons/udev-start.sh /var/tmp/udev-start.sh
--- /lib/rcscripts/addons/udev-start.sh 2006-04-17 21:30:21.000000000 +0200
+++ /var/tmp/udev-start.sh 2006-06-01 16:09:31.000000000 +0200
@@ -51,7 +51,7 @@
 	# populate /dev with devices already found by the kernel
 	if [ "$(get_KV)" -gt "$(KV_to_int '2.6.14')" ] ; then
 		ebegin "Populating /dev with existing devices through uevents"
-		udevtrigger
+		trigger_events
 		eend 0
 	else
 		ebegin "Populating /dev with existing devices with udevstart"
That means that some module is being automatically loaded for your machine that is causing it to lock up. Any chance you can modify the trigger_events function to handle all devices also (take the comment out of the line it says to) and see if it still locks up there? If so, can you add some "echos" to that function to narrow it down to what device is causing the problem?
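A minimal sketch of the "add some echos" idea, assuming a trigger_events-style loop that writes "add" to each uevent file under sysfs. The function name trigger_events_verbose and its optional SYSFS_ROOT argument are hypothetical, introduced only for illustration; the function actually shipped in udev-start.sh may walk sysfs differently.

```shell
#!/bin/sh
# Hypothetical debugging helper, NOT the function shipped in udev-start.sh.
# Usage: trigger_events_verbose [SYSFS_ROOT]   (SYSFS_ROOT defaults to /sys)
trigger_events_verbose() {
	sysfs_root="${1:-/sys}"
	for uevent in \
		"$sysfs_root"/bus/*/devices/*/uevent \
		"$sysfs_root"/class/*/*/uevent \
		"$sysfs_root"/block/*/uevent \
		"$sysfs_root"/block/*/*/uevent
	do
		# A glob that matches nothing stays literal; skip it.
		[ -e "$uevent" ] || continue
		# Print the device path before triggering it, so the last line
		# on screen names the device being handled if the box locks up.
		echo "triggering: ${uevent%/uevent}"
		echo add > "$uevent" || echo "failed: $uevent"
	done
}
```

Called without arguments it walks /sys; passing another directory makes it possible to try the loop on a scratch tree first, without touching real devices.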
(In reply to comment #3)
> That means that some module is being automatically loaded for your
> machine that is causing it to lock up.

We should not describe this as a module locking up during initialisation :). The boot process doesn't stop in the udev-start.sh script but later, when trying to access the filesystems. Because the kernel does not find anything in /dev, it panics. The problem is certainly with udevtrigger: it doesn't do its job and does nothing when it is called in the udev-start.sh script. That's why /dev/.udev stays empty.

> Any chance you can modify the trigger_events function to handle all devices
> also (take the comment out of the line it says to) and see if it still
> locks up there?

trigger_events works well and populates /dev, so all the modules can be loaded. What bothers me is that, although my two machines have different processors (Athlon XP vs. AMD64), they have the same boot config (root=/dev/ram0 init=/linuxrc real_root=/dev/evms/root doevms2), and the problem occurs only on the Athlon XP. It will definitely not work with udevtrigger.

> If so, can you add some "echos" to that function to narrow it down to what
> device is causing the problem?

The easiest way to read anything on the display was to increase the loop size in populate_udev() and to add --verbose to udevtrigger. udevtrigger doesn't print anything and, of course, doesn't populate /dev. With the trigger_events function I didn't have this problem.

PS: I would run some more tests on the server, but I had so many problems booting it from a LiveCD. Only with a lot of luck could I access the root filesystem and restore udev-start.sh. I'm not in a hurry to start again ;)

Thx
Jj
Ah, so this might be an evms issue, not a udev one. You didn't mention that the boot process failed because the root partition was not found, that's very important. Are you using the genkernel package? What happens if you do not use any initramfs/initrd for your kernel?
(In reply to comment #5)
> Ah, so this might be an evms issue, not a udev one.

I really don't believe this could be an evms issue. The /dev devices are simply not created when udev-start.sh is called. Replacing udevtrigger with trigger_events corrects the issue. As I mentioned above, both computers have exactly the same config and I encounter the issue only on the 32-bit kernel. The real difference is the kernel level: the Athlon amd64 and the Athlon XP are using, respectively, a 2.6.16-gentoo-r9 64-bit kernel and a linux-2.6.17-rc5 vanilla 32-bit kernel. Both are using acpi, sysfs, evms2, etc.

> You didn't mention that the boot process failed because the root partition
> was not found, that's very important.

Yes, root is on an evms partition, so I need to load the kernel and the evms2 module into memory to access the root partition. Only /boot stays on an ext2 partition. Swap is on its own standalone evms container. I guess it is not necessary any longer.

> Are you using the genkernel package?

Yes, that makes more sense than doing all the work by hand ;). Really a great tool. I'm using sys-kernel/genkernel-3.3.11d. Works great. Here is the command:

genkernel --gensplash --gensplash-res=1024x768 --no-install --no-clean --evms2 --kerneldir=/usr/src/linux all

The genkernel.conf is all standard except for CACHE_DIR="/var/genkernel/pkg/%%ARCH%%"

> What happens if you do not use any initramfs/initrd for your kernel?

Of course, it's impossible for the kernel to find the root partition ;) I cannot get past the first step, when the modules are loaded into memory (/dev/ram). When the issue occurs, the evms root partition is mounted, so it cannot be related to the evms module, since that is already loaded.
Hello,

I'm not sure what really happened, but after remerging the whole world with gcc 4.1.1 instead of gcc 4.2, the problem vanished. Probably the -O3 flag or some gcc 4.2 optimization corrupted the udevtrigger code. The problem only affected the 32-bit Athlon XP computer. My flags were:

CFLAGS="-march=athlon-xp -O3 -pipe"

Jj
Then this was a compiler issue. Closing bug.