Bug 61562 - grub hangs with "GRUB" after updating to 0.95 - possibly bad link against libgtop 2.6?
Summary: grub hangs with "GRUB" after updating to 0.95 - possibly bad link against lib...
Status: RESOLVED TEST-REQUEST
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High major
Assignee: Robert Moss (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-08-24 14:16 UTC by Sven Bauknecht
Modified: 2004-09-10 10:19 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
my grub.conf (grub.conf,947 bytes, text/plain)
2004-09-01 16:15 UTC, Geoff Leach
Description Sven Bauknecht 2004-08-24 14:16:33 UTC
After updating to grub 0.95 (today, Aug 24th 2004) and updating gnome to 2.6.2
(I don't know whether the gnome update matters in this case, but I mention it because that's what I did),
I rebooted. grub then hung, printing only "GRUB" and nothing else, meaning no further error message.

Sorry if this report isn't state of the art, but it's my first - I'll try my best and will improve :)

Reproducible: Didn't try

I was able to fix it by booting from the Gentoo live CD,
chrooting into my system,
emerging grub-0.94-r2 with the local ebuild still present in
/usr/portage/sys-boot/grub/grub
***libgtop was also downgraded there to 2.0.8 (if I remember correctly; in any case it was
a 2.0.x)***
and reinstalling grub with
grub --no-floppy
root (you already know how that works)
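
A minimal sketch of the recovery steps above, written as a dry run: it only prints the commands, since the real ones need root and a live CD. The device /dev/hda3 and the mount point /mnt/gentoo are assumed placeholders, the /proc mount is an assumption not stated in the report, and the function name print_recovery_plan is hypothetical.

```shell
# print_recovery_plan ROOT_DEV MOUNT_POINT
# Prints (does not run) the commands for downgrading grub from a live CD.
print_recovery_plan() {
    root_dev=$1
    mount_point=$2
    cat <<EOF
mount $root_dev $mount_point
mount -t proc none $mount_point/proc
chroot $mount_point /bin/bash
emerge '=sys-boot/grub-0.94-r2'
grub --no-floppy
EOF
}

# Placeholder values; substitute the real root partition and mount point.
print_recovery_plan /dev/hda3 /mnt/gentoo
```

The last printed command opens the grub shell, where the root and setup commands still have to be entered by hand.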

Hope that helps

--- my make.conf --- if helpful ---

# Build-time functionality
# ========================
#
# The USE variable is used to enable optional build-time functionality. For
# example, quite a few packages have optional X, gtk or GNOME functionality
# that can only be enabled or disabled at compile-time. Gentoo Linux has a
# very extensive set of USE variables described in our USE variable HOWTO at
# http://www.gentoo.org/doc/use-howto.html
#
# The available list of use flags with descriptions is in your portage tree.
# Use 'less' to view them:  --> less /usr/portage/profiles/use.desc <--
#
# 'ufed' is an ncurses/dialog interface available in portage to make handling
# useflags for you. 'emerge app-admin/ufed'
#
# Example:
#USE="X gtk gnome -alsa"
LINGUAS="de en"
USE="divx4linux dvd nls xv bidi truetype wxwindows imlib matroska faad png dba
gd gd-external"


# Host Setting
# ============
#
# If you are using a Pentium Pro or greater processor, leave this line as-is;
# otherwise, change to i586, i486 or i386 as appropriate. All modern systems
# (even Athlons) should use "i686-pc-linux-gnu". All K6's are i586.
#
CHOST="i686-pc-linux-gnu"

# Host and optimization settings 
# ==============================
#
# For optimal performance, enable a CFLAGS setting appropriate for your CPU.
#
# Please note that if you experience strange issues with a package, it may be
# due to gcc's optimizations interacting in a strange way. Please test the
# package (and in some cases the libraries it uses) at default optimizations
# before reporting errors to developers.
#
# -mcpu=<cpu-type> means optimize code for the particular type of CPU without
# breaking compatibility with other CPUs.
#
# -march=<cpu-type> means to take full advantage of the ABI and instructions
# for the particular CPU; this will break compatibility with older CPUs (for
# example, -march=athlon-xp code will not run on a regular Athlon, and
# -march=i686 code will not run on a Pentium Classic.
#
# CPU types supported in gcc-3.2 and higher: athlon-xp, athlon-mp,
# athlon-tbird, athlon, k6, k6-2, k6-3, i386, i486, i586 (Pentium), i686
# (PentiumPro), pentium, pentium-mmx, pentiumpro, pentium2 (Celeron), pentium3.
# Note that Gentoo Linux 1.4 and higher include at least gcc-3.2.
# 
# CPU types supported in gcc-2.95*: k6, i386, i486, i586 (Pentium), i686
# (Pentium Pro), pentium, pentiumpro Gentoo Linux 1.2 and below use gcc-2.95*
Comment 1 Robert Moss (RETIRED) gentoo-dev 2004-08-24 23:29:44 UTC
Please post the output of both "emerge info" and "ldd /sbin/grub".
Comment 2 Michael Knight 2004-08-25 02:08:51 UTC
I also am experiencing this problem. I am currently using a Knoppix CD to chroot into my install, and thought I'd contribute before I downgrade GRUB.

Following is the output of `ldd /sbin/grub' and `emerge --info':

# ldd /sbin/grub
        linux-gate.so.1 =>  (0xffffe000)
        libncursesw.so.5 => /lib/libncursesw.so.5 (0x40034000)
        libc.so.6 => /lib/libc.so.6 (0x40082000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

# emerge --info
[] bash: /dev/null: Permission denied
[] bash: /dev/null: Permission denied
Portage 2.0.50-r10 (default-x86-1.4, gcc-3.3.4, glibc-2.3.4.20040808-r0, 2.6.7)
=================================================================
System uname: 2.6.7 i686
Gentoo Base System version 1.5.3
distcc 2.17 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
ccache version 2.3 [enabled]
Autoconf: sys-devel/autoconf-2.59-r4
Automake: sys-devel/automake-1.8.5-r1
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-march=athlon-xp -O3 -pipe -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-O2 -mcpu=i686 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox"
GENTOO_MIRRORS="http://mirror.internode.on.net/pub/gentoo http://mirror.pacific.net.au/gentoo http://gentoo.oregonstate.edu http://www.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://mirror.internode.on.net/gentoo-portage"
USE="3dnow 3dnow2 3dnowex X aalib alsa apm arts avi berkdb bonobo cdr crypt cups directfb emacs encode esd flac foomaticdb gdbm gif gnome gpm gtk gtk2 gtkhtml guile imap imlib java jpeg kde libg++ libwww mad mikmod mmx mmx2 motif mozilla moznocompose moznoirc mpeg nas ncurses nls nptl oggvorbis opengl oss pam pdflib perl png python qt quicktime readline samba sdl slang spell sse ssl svga tcltk tcpd tetex theora tiff truetype unicode x86 xml2 xmms xv xvid zlib"

I believe the permission-denied stuff with /dev/null is a Knoppix thing.
Comment 3 Sven Bauknecht 2004-08-25 09:27:56 UTC
emerge info is as follows:

AUTOCLEAN="yes"
CFLAGS="-mcpu=athlon-tbird -march=athlon-tbird -O3 -pipe -fomit-frame-pointer"
CHOST="i686-pc-linux-gnu"
COMPILER=""
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3.1/share/config /usr/kde/3.2/share/config /usr/kde/3.3/share/config:/usr/kde/3.3/env:/usr/kde/3.3/shutdown /usr/kde/3/share/config /usr/lib/mozilla/defaults/pref /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/alias /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -mcpu=i686 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox"
GENTOO_MIRRORS="http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/ ftp://gd.tuwien.ac.at/opsys/linux/gentoo/ http://gd.tuwien.ac.at/opsys/linux/gentoo/ http://ftp.easynet.nl/mirror/gentoo/ ftp://ftp.easynet.nl/mirror/gentoo/ ftp://sunsite.informatik.rwth-aachen.de/pub/Linux/gentoo ftp://ftp.wh2.tu-dresden.de/pub/mirrors/gentoo http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/ ftp://ftp.tu-clausthal.de/pub/linux/gentoo/ http://gentoo.zie.pg.gda.pl ftp://linux.rz.ruhr-uni-bochum.de/gentoo-mirror/ http://ftp.snt.utwente.nl/pub/os/linux/gentoo ftp://ftp.snt.utwente.nl/pub/os/linux/gentoo http://ftp.linux.ee/pub/gentoo/distfiles/ ftp://mir.zyrianes.net/gentoo/ ftp://ftp.linux.ee/pub/gentoo/distfiles/ http://ftp.heanet.ie/pub/gentoo/ http://src.gentoo.pl ftp://ftp.heanet.ie/pub/gentoo/ ftp://ftp.ntua.gr/pub/linux/gentoo/"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY=""
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="X aalib alsa apm arts avi berkdb bidi cdr crypt cups dba divx4linux dvd encode esd faad foomaticdb gd gd-external gdbm gif gnomegpm gtk gtk2 imlib java jpeg kde libg++ libwww linguas_de linguas_en mad matroska mikmod motif mozilla mpeg mysql ncurses nls oggvorbis opengl oss pam pdflib perl png python qt quicktime readline sdl slang spell ssl svga tcltk tcpd tetex truetype wxwindows x86 xml2 xmms xv zlib"
---------------------------------------------------------------
ldd /sbin/grub
        libncurses.so.5 => /lib/libncurses.so.5 (0x40037000)
        libc.so.6 => /lib/libc.so.6 (0x4007c000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
---------------------------------------------------------------
Note that this is the output for the reinstalled grub-0.94-r2

hope that helps...
Comment 4 Robert Moss (RETIRED) gentoo-dev 2004-08-25 12:49:26 UTC
Sven, I'm confused here. When you say you "updated to grub 0.95", did you simply do an "emerge -u grub", or did you do anything more than that? Theoretically, grub shouldn't be doing anything to /boot/grub if there's anything grub-related already.

I suspect that the problem may be stage1-related, in that it appears that you've managed to get a 0.95 stage 1 in conjunction with a 0.94 MBR. But then, that wouldn't explain Michael's problem. Whether Knoppix does something funny to the MBR I don't know. But anyway, Sven, a fuller explanation of exactly what you did would be most helpful. By the way, if anyone else is having these problems, please say so, don't hide - grub problems are very tricky to diagnose because debugging information tends not to exist.
Comment 5 Michael Knight 2004-08-26 03:14:08 UTC
Hi guys. I now have the problem fixed. I'm not sure how it got stuffed in the first place, but here's what I did:

1) Booted into the Gentoo 2004.2 LiveCD with the SMP kernel (as it's a 2.6 one, and I would get 'FATAL: Kernel too old' errors if I used a 2.4) and downgraded GRUB.

2) This did the trick and I booted normally into my Gentoo install. From there I emerged the newer version of GRUB again.

3) When trying to re-install GRUB onto the partition I was getting errors about /dev/hda7 not being in device.map, but fixed the problem as described at http://lists.debian.org/debian-user/2004/03/msg16261.html

4) I use NTLDR as I dual boot and chose to do it this way around, so I thought I'd update the boot image (or whatever it's called) on the Windows partition that it uses to access GRUB, via `dd if=/dev/hda7 count=1 bs=512 of=/mnt/fat32/linux.bin'.

5) After booting into Windows, overwriting the linux.bin file with the new one and restarting I was successfully able to access GRUB from NTLDR and boot into Gentoo.

One weird thing I noticed was that even though I had installed the new GRUB onto the partition, the GRUB menu screen still said it was 0.94..? Anyway, HTH!
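
Step 4 above copies the first 512-byte sector of the GRUB partition into a file that NTLDR can chain-load. A minimal sketch of that copy follows, demonstrated on a scratch file rather than a real partition. The function name copy_boot_sector is hypothetical; on the system described above the source would be /dev/hda7 and the destination /mnt/fat32/linux.bin.

```shell
# copy_boot_sector SRC DEST
# Copies only the first 512-byte sector of SRC into DEST.
copy_boot_sector() {
    src=$1
    dest=$2
    dd if="$src" of="$dest" bs=512 count=1 2>/dev/null
}

# Demonstration on a scratch file standing in for the partition.
scratch=$(mktemp -d)
dd if=/dev/zero of="$scratch/fake-partition" bs=512 count=4 2>/dev/null
copy_boot_sector "$scratch/fake-partition" "$scratch/linux.bin"
ls -l "$scratch/linux.bin"
```

Because the file is only a snapshot of the sector, it has to be copied again whenever GRUB is reinstalled onto the partition.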
Comment 6 Robert Moss (RETIRED) gentoo-dev 2004-08-26 09:54:40 UTC
Right. I think I've got it. I suspect that you're using udev? Am I right? This affects what happens when you run grub itself.

We need to go back to using grub-install, I think. I need to go find myself a willing sacrificial lamb who just happens to be doing a reinstall... actually, it's probably time for a reinstall anyway, nothing's been broken here for at least a week ;-p

Anyway, thanks for the info. It appears that grub's handling of devices is now no longer quite so graceful as it should be. It's possible that I may just pull the bit out of the grub-install script that generates the device.map file.
Comment 7 Michael Knight 2004-08-26 15:54:46 UTC
Actually no, I'm not using udev :)
Comment 8 Robert Moss (RETIRED) gentoo-dev 2004-08-27 19:11:48 UTC
It turns out that udev is in fact irrelevant. grub-0.95 appears to be somewhat stricter in its treatment of devices. Sure, you can pass "/dev/hda3" on to the kernel as a root= option, as that starts devfsd or udevd, but as far as I can tell grub, having slimmed down a bit for 0.95, now requires a proper device.map in certain circumstances.
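
A rough sketch of a sanity check for a device.map follows. What counts as "proper" here is an assumption: every non-comment entry has the form "(drive) /dev/node" and the node exists. The function name check_device_map is hypothetical, and the demonstration uses a scratch file with /dev/null standing in for a real disk.

```shell
# check_device_map FILE
# Prints one line per problem; returns non-zero if any entry is
# malformed or names a device node that does not exist.
check_device_map() {
    map_file=$1
    rc=0
    while read -r drive device rest; do
        # Skip blank lines and comment lines.
        case $drive in
            ''|'#'*) continue ;;
        esac
        case $drive in
            '('*')') ;;
            *) echo "malformed entry: $drive $device"; rc=1; continue ;;
        esac
        if [ -z "$device" ] || [ ! -e "$device" ]; then
            echo "missing device for $drive: $device"
            rc=1
        fi
    done < "$map_file"
    return $rc
}

# Demonstration on a scratch map; the real file is /boot/grub/device.map.
sample_map=$(mktemp)
printf '# sample map\n(fd0) /dev/null\n(hd0) /nonexistent/disk\n' > "$sample_map"
check_device_map "$sample_map" || echo "device map needs attention"
```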

I think I might add two new local USE flags to grub - one to enable or disable (default) netboot stuff, and one to enable or disable (default) automatic update of /boot/grub/* and the MBR. Hopefully this should mean this gets fixed - I'll look into it tomorrow (it's 3:15am here...).
Comment 9 Michael Knight 2004-08-28 06:24:42 UTC
Sounds like a plan Rob. I'll be happy to test anything you need :)
Comment 10 Robert Moss (RETIRED) gentoo-dev 2004-08-28 06:48:41 UTC
Thanks very much! It's always nice to have a tester who isn't me...

I think I can actually get grub to auto-install itself on the partition it's already installed on. I'll put some legwork in on this one over the coming week as and when and if I get time (I'm supposed to be on holiday...).
Comment 11 Geoff Leach 2004-08-31 18:41:36 UTC
I think I had a similar problem ...

I just tried to reboot my machine after it locked up - first reboot
for about a week. Failed in grub with an error. Grub was reporting
0.95.  Basically grub couldn't find the boot partition or any files
(kernels) in it, but I didn't note the specific grub error numbers
(there was a 15, not sure if there was a 12). But essentially grub
wouldn't let me set the root device or the kernel using the 'root' and
'kernel' commands.  I ran setup in grub and that fixed things.

The grub-0.95 update occurred in the period since my last reboot.  It
was one of a number of packages (27) in an 'emerge -uD world'.  I
rebuild kernels fairly regularly (tracking the development-sources),
and I haven't seen the problem before - so I don't think it was kernel
related, but grub related.

Before simply running setup in grub, I actually tried a few other
things including booting off another hard disk I keep with an older
kernel for special occasions :) and changing boot disk order in my
bios, plugging and unplugging disks.

I've just looked back through the log file for the emerge which
included the grub update (I redirect the output of all multi-package
emerges I do non-interactively, in case there are those little
messages which otherwise get missed). I emerged grub-0.95 on Aug
25. At the end of the ebuild for grub it says

 * Your boot partition was not mounted as /boot, but portage
 * was able to mount it without additional intervention.
 * Files will be installed there for grub to function correctly.
 *
 * Linking from new grub.conf name to menu.lst
ln: `/boot/grub/menu.lst': File exists
 * Copying files from /usr/lib/grub to /boot
cp: omitting directory `/usr/lib/grub/grub'

Just wondering if it is possible that I also had a 0.94 MBR with grub
0.95? 
Comment 12 Robert Moss (RETIRED) gentoo-dev 2004-09-01 10:08:54 UTC
Geoff - I think it is, yes. I don't know why this only happens in certain cases, but it seems to be the case that on some set-ups, the test to check whether or not files should be copied from /usr/lib/grub and /lib/grub isn't working. I'll see what I can do about that - right now this is broken and needs fixing.
Comment 13 Geoff Leach 2004-09-01 16:15:18 UTC
Created attachment 38715
my grub.conf
Comment 14 Geoff Leach 2004-09-01 16:25:28 UTC
I just checked the dates of the files in /boot/grub. Most of them are from Aug 25, the day I emerged grub, so I think the files were copied OK. A diff of {/usr/lib/grub,/lib/grub} and /boot/grub shows all files are the same, except for /boot/stage2, although it has identical size and timestamp. Hmm. I don't get that.
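
A small sketch of the byte-for-byte comparison described above, using cmp rather than relying on size and timestamp. The function name compare_stage_files is hypothetical, and the demonstration runs on scratch directories; on a real system the arguments would be something like /lib/grub and /boot/grub.

```shell
# compare_stage_files SRC_DIR DEST_DIR
# For every regular file in SRC_DIR, reports whether DEST_DIR is missing
# it or holds a different copy. Returns non-zero if anything is reported.
compare_stage_files() {
    src_dir=$1
    dest_dir=$2
    rc=0
    for src in "$src_dir"/*; do
        [ -f "$src" ] || continue
        name=${src##*/}
        if [ ! -f "$dest_dir/$name" ]; then
            echo "missing: $name"
            rc=1
        elif ! cmp -s "$src" "$dest_dir/$name"; then
            echo "differs: $name"
            rc=1
        fi
    done
    return $rc
}

# Demonstration: identical stage1, same-sized but different stage2.
lib_dir=$(mktemp -d)
boot_dir=$(mktemp -d)
printf 'aaaa' > "$lib_dir/stage1"; printf 'aaaa' > "$boot_dir/stage1"
printf 'bbbb' > "$lib_dir/stage2"; printf 'bbbc' > "$boot_dir/stage2"
compare_stage_files "$lib_dir" "$boot_dir" || echo "copies are out of sync"
```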

I can't see anything in my log or in the ebuild (not that I'm an ebuild whiz) that would indicate the MBR being updated by issuing a setup command to/in grub.

I've just had a closer look at the ebuild. The final part of the ebuild has

[ -e /boot/grub/grub.conf ] \
        && /usr/sbin/grub \
            --batch \
            --device-map=/boot/grub/device.map \
            < /boot/grub/grub.conf > /dev/null 2>&1

but for me grub is in /sbin/grub not /usr/sbin/grub, and it seems that /sbin is where the ebuild puts grub.
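
One way to avoid depending on a single hard-coded path would be to probe the known locations in order. This is only a sketch under that assumption, not what the ebuild does; the function name find_grub_shell is hypothetical.

```shell
# find_grub_shell CANDIDATE...
# Prints the first candidate that is an executable file; fails if none is.
find_grub_shell() {
    for candidate in "$@"; do
        if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Probe the two locations mentioned above, current one first.
if grub_bin=$(find_grub_shell /sbin/grub /usr/sbin/grub); then
    echo "grub shell found at $grub_bin"
else
    echo "no grub shell found in the candidate paths"
fi
```

The result could then replace the fixed /usr/sbin/grub in the snippet quoted above.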

When I run grub manually using the command 

/sbin/grub --batch --device-map=/boot/grub/device.map < /boot/grub/grub.conf

there don't appear to be any commands in my grub.conf that would update the MBR - if an MBR/grub version mismatch was indeed the problem. 

My grub.conf (see attachments) originally came from a redhat installation (anaconda) many moons ago. It has a list of kernels, memtest86 (essential for gentoo :)) and a windows boot option (ignore, to be fixed). I see that the sample grub.conf does have a setup command.
  
So I guess now that to recreate what happened I would need to back down on grub and iterate ... maybe later.

I should also say that I am using a mixture of sata and pata drives. My device.map seems ok

 # this device map was generated by anaconda
(fd0)     /dev/fd0
(hd0)     /dev/sda
#(hd2)     /dev/hda

Again originally from my redhat days. I can't remember editing it, but the /dev/sda entry is correct, and post-redhat. 
Comment 15 Robert Moss (RETIRED) gentoo-dev 2004-09-02 17:03:21 UTC
Oh, bugger. Thanks for pointing that one out. It *used* to live in /usr/sbin/grub when grub was in its 0.92 days. I hadn't realised that, I just glossed over it. I'll check to make sure that grub is in fact at /sbin/grub for all versions I can build on here and fix that. But thanks for pointing it out, that should fix this problem... (I hope!).
Comment 16 Robert Moss (RETIRED) gentoo-dev 2004-09-08 23:27:47 UTC
Please test. A remerge of grub may well be required. After talking to the guys in #grub it appears that device.map must be valid, which should now be the case.
Comment 17 Geoff Leach 2004-09-10 06:29:43 UTC
Does the MBR version always need to match the grub version? I'm wondering because my grub.conf doesn't have any 'setup' commands which would effect that, and I think the point of the last part of the ebuild script is to run grub with 'grub.conf' as input so as to overwrite the MBR? 
Comment 18 Robert Moss (RETIRED) gentoo-dev 2004-09-10 10:18:36 UTC
The ebuild doesn't touch the MBR. I'm going to add an ewarn once I fully understand what's going on, but here's the simple version I've got so far. The version in the MBR must use the same API/ABI/whatever as the version in /boot/grub. New stuff in /boot/grub isn't always backwards compatible. So, after each update of grub, you should really be rerunning the "grub; root (hd0,0); setup (hd0)" thing (obviously tailored for your system) to match the versions up again.
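
The "root; setup" step can also be fed to the grub shell non-interactively. A minimal sketch follows: the function only prints the commands, and the function name grub_setup_script is hypothetical. The values (hd0,0) and (hd0) are the example ones from the comment above and must be tailored to the actual system.

```shell
# grub_setup_script BOOT_PARTITION TARGET_DISK
# Prints the grub shell commands that point grub at the partition holding
# /boot/grub and then reinstall the boot code on TARGET_DISK.
grub_setup_script() {
    boot_partition=$1
    target_disk=$2
    printf 'root %s\nsetup %s\nquit\n' "$boot_partition" "$target_disk"
}

# Show the commands for the example values.
grub_setup_script '(hd0,0)' '(hd0)'

# To apply them for real (as root, after checking the values):
#   grub_setup_script '(hd0,0)' '(hd0)' | grub --batch --no-floppy
```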
Comment 19 Robert Moss (RETIRED) gentoo-dev 2004-09-10 10:19:13 UTC
Oh - and that bit in the ebuild just makes sure that grub and the kernel are on speaking terms when the "root=" option is passed to it, that's all - nothing to do with the MBR.