I'm on stable x86. grub-0.97-r10 segfaults on pkg_postinst phase. bug #279536 seems similar.

> * Copying files from /lib/grub, /usr/lib/grub and /usr/share/grub to //boot/grub
> /var/tmp/portage/sys-boot/grub-0.97-r10/temp/environment: line 4077: 17476 Done egrep -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' "${grub_config}"
> 17477 Segmentation fault | /sbin/grub --batch --device-map="${dir}"/device.map > /dev/null

Reproducible: Always

# emerge --info grub
Portage 2.1.8.3 (default/linux/x86/10.0/desktop/gnome, gcc-4.4.3, glibc-2.11.2-r0, 2.6.34-gentoo-r1 i686)
=================================================================
System Settings
=================================================================
System uname: Linux-2.6.34-gentoo-r1-i686-Intel-R-_Core-TM-2_CPU_E7400_@_2.80GHz-with-gentoo-1.12.13
Timestamp of tree: Fri, 23 Jul 2010 07:45:02 +0000
ccache version 2.4 [disabled]
app-shells/bash: 4.0_p37
dev-java/java-config: 2.1.11
dev-lang/python: 2.6.5-r2, 3.1.2-r3
dev-util/ccache: 2.4-r7
dev-util/cmake: 2.6.4-r3
sys-apps/baselayout: 1.12.13
sys-apps/sandbox: 1.6-r2
sys-devel/autoconf: 2.13, 2.65
sys-devel/automake: 1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils: 2.20.1-r1
sys-devel/gcc: 4.4.3-r2
sys-devel/gcc-config: 1.4.1
sys-devel/libtool: 2.2.6b
virtual/os-headers: 2.6.30-r1
ACCEPT_KEYWORDS="x86"
ACCEPT_LICENSE="* -@EULA"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=i686 -pipe -mmmx -msse"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/X11/xkb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=i686 -pipe -mmmx -msse"
DISTDIR="/portage/distfiles"
EMERGE_DEFAULT_OPTS="--with-bdeps=y"
FEATURES="assume-digests buildpkg distlocks fixpackages keeptemp keepwork news parallel-fetch protect-owned sandbox sfperms strict unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync"
GENTOO_MIRRORS="ftp://mirror.ovh.net/gentoo-distfiles/ ftp://ftp.free.fr/mirrors/ftp.gentoo.org/ "
LANG="fr_FR.utf8"
LDFLAGS="-Wl,-O1,--hash-style=gnu,--sort-common -Wl,--as-needed"
LINGUAS="fr"
MAKEOPTS="-j5"
PKGDIR="/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/portage/trees/gentoo"
PORTDIR_OVERLAY="/portage/trees/perso /portage/trees/tempo"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="X a52 aac acl acpi alsa berkdb branding bzip2 cairo cdr cli consolekit cracklib cups cxx dbus dri dts dvd dvdr emboss encode evo exif fam ffmpeg firefox flac fortran gdbm gdu gif gnome gnome-keyring gnutls gstreamer gtk hal iconv java jpeg lcms libnotify mad mikmod mmx mng modules mp3 mp4 mpeg mudflap nautilus ncurses network-cron nls nptl nptlonly ogg openal opengl openmp oss pam pango pcre pdf perl png policykit ppds pppd python qt3support qt4 readline reflection sdl session spell spl sse ssl startup-notification svg sysfs tcpd tiff truetype unicode usb vorbis x264 x86 xcb xml xorg xulrunner xv xvid zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="fr" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="nvidia nv" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: CPPFLAGS, CTARGET, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

=================================================================
Package Settings
=================================================================
sys-boot/grub-0.97-r10 was built with the following:
USE="ncurses -custom-cflags -netboot -static"
CFLAGS=""
Created attachment 239913 [details] build log LANG=C emerge grub build log.
Created attachment 239915 [details] environment environment
Does it segfault if you run it directly (i.e. just "grub" as root on a command line), or only while emerge runs the postinstall phase? Would you also mind moving your /boot/grub/device.map out of the way and re-emerging grub? Oh, and bug #279536 is not similar: that kind of segfault cannot occur on Gentoo unless you are using hardened or have messed with unsupported stuff.
(In reply to comment #3) > Does it segfault if you run it (i.e. just "grub" as root on a command line) or > only while emerge does postinstall? It segfaults only while emerge does postinstall. > Would you also mind try moving your /boot/grub/device.map and re-emerge grub? Tried it, same segfault.
I'm having the same issue across 6 of my 7 x86 servers. This is easily reproduced by calling:

mount /boot; emerge --config =sys-boot/grub-0.97-r10

The output, including the directory I entered at the prompt, is:

Configuring pkg...
 * Enter the directory where you want to setup grub:
/boot
 * *** IMPORTANT NOTE: you must run grub and install
 * the new version's stage1 to your MBR. Until you do,
 * stage1 and stage2 will still be the old version, but
 * later stages will be the new version, which could
 * cause problems such as an unbootable system.
 * This means you must use either grub-install or perform
 * root/setup manually! For more help, see the handbook:
 * http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=1&chap=10#grub-install-auto
 * Copying files from /lib/grub, /usr/lib/grub and /usr/share/grub to /boot/grub
/var/tmp/portage/sys-boot/grub-0.97-r10/temp/environment: line 4058: 25772 Done egrep -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' "${grub_config}"
25773 Segmentation fault | /sbin/grub --batch --device-map="${dir}"/device.map > /dev/null
 * Grub has been installed to /boot successfully.
Would you mind checking whether the following command crashes grub when you run it directly in a terminal?

egrep \
  -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' \
  "${grub_config}" | \
  /sbin/grub --batch \
  --device-map="${dir}"/device.map

If so, would you mind posting the output of just the following part:

egrep \
  -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' \
  "${grub_config}"
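For anyone following along, here is a minimal sketch of what that egrep filter lets through. The pattern is copied from the ebuild line in the error message. The sample config, the `filter_grub_conf` helper name and the kernel path are invented for illustration, and `grep -E` is used as the equivalent spelling of `egrep`.

```shell
#!/bin/sh
# Hypothetical helper: apply the same filter the ebuild uses to a grub config.
# Comments, blank lines and the listed menu keywords are dropped; what remains
# (root, kernel, ...) is what gets piped into "grub --batch".
filter_grub_conf() {
    grep -E -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' "$1"
}

# Invented sample config, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# sample config, for illustration only
default 0
timeout 15

title Gentoo Linux
root (hd0,0)
kernel /boot/kernel-example root=/dev/sda1
EOF

# Print the lines that would be sent to grub.
filter_grub_conf "$sample"
rm -f "$sample"
```

Only the root and kernel lines survive the filter, so grub in batch mode ends up loading each kernel image listed in the config.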
Created attachment 239983 [details]
running same commands as root

I had already tried to run these commands myself, and no, I do not get a crash when I run them as root. But running them through emerge produces a segfault.
(In reply to comment #7) > Created an attachment (id=239983) [details] > running same commands as root > > I had already tried to run these commands myself, and no, I do not have a crash > when I run them as root. But running them through emerge produce a segfault. > Have you tried with your /boot/grub/grub.conf?
Created attachment 239995 [details] egrep result on /boot/grub/grub.conf (In reply to comment #8) > Have you tried with your /boot/grub/grub.conf? It segfaults with it, but anyway I don't see any reference to this file in the ebuild ?
(In reply to comment #9) > but anyway I don't see any reference to this file in the ebuild ? > In fact yes, I see it now...
My guess is that the grep is supposed to single out some element from the grub.conf. However it fails and sends stuff to grub that grub is not capable of handling. This leads to grub crashing. So it would be nice to have the output from the grep, and also a backtrace of grub crashing while trying to parse what grep is sending it.
See attachment #239995 [details] from comment #9. It segfaults after the first kernel line.
Created attachment 240001 [details] grub segfaults when running egrep output
@reporter: could you post the whole grub.conf for us please?

@maintainers: What is that part of the ebuild supposed to do? My guess is that it tries to guess a couple of partitions and install grub on them (i.e. a "root (bla) && setup" approach). However, it currently seems to be trying to run Linux inside of itself, and since Linux is already booted it crashes on us. I saw this on my systems before I switched to grub2, so this is an old issue.
@reporter: also could you post a backtrace from grub? The following instructions should be sufficient: http://www.gentoo.org/proj/en/qa/backtraces.xml
Created attachment 240015 [details] grub.conf
Created attachment 240017 [details]
backtrace.log from gdb

I can reproduce the segfault by typing the following commands by hand in grub :

> root (hd0,0)
> kernel /boot/kernel-2.6.34-gentoo-r1 root=/dev/sda1 video=vesafb:mtrr:3,ywrap vga=792

So the problem does not come from the grub.conf itself.
Nope, the problem comes from grub trying to execute the kernel inside of itself, and for some reason that fails. I cannot reproduce it, and grub should be able to handle this, so it might be an issue with something your kernel does to it. I just do not know what it could be.
Created attachment 240025 [details]
config file for gentoo-sources-2.6.34-r1

Ok. Strange. I have been running this kernel config for a long time; generally I upgrade from kernel to kernel with make oldconfig. Here is my current config file. Maybe you can build a kernel with it and try to crash grub using the above commands.

Time to stabilize grub-2 ? :p
All of my kernels are 2.6.32-r7, as I am waiting for more revisions of 2.6.34-r1 to come through the pipeline. I build all my kernels from scratch each time, and don't use the oldconfig to make them. I do not use genkernel. Since these are servers, the hardware is mostly similar, with varying processors among them.

The only thing that I can note is that I use UUIDs in my /etc/fstab, but I don't know if that's even considered by grub since I'm not using UUIDs in the configuration there (could never get that to work with 0.97). The grub configuration is minimalistic at best, with this as an example:

default saved
timeout 15
fallback 2

title Gentoo Linux 2.6.32-r7
root (hd0,0)
kernel /boot/kernel-2.6.32-gentoo-r7 root=/dev/sda1 panic=30

title Gentoo Linux 2.6.32-r7 (rescue)
root (hd0,0)
kernel /boot/kernel-2.6.32-gentoo-r7 root=/dev/sda1 init=/bin/bb panic=30

title Gentoo Linux 2.6.31-r10
root (hd0,0)
kernel /boot/linux-2.6.31-gentoo-r10 root=/dev/sda1

title Gentoo Linux 2.6.31-r10 (rescue)
root (hd0,0)
kernel /boot/linux-2.6.31-gentoo-r10 root=/dev/sda1 init=/bin/bb

What gets me is that this starts happening for my systems after a -r9 to -r10 update. What exactly changed between the two, and is there a way to see those changes somewhere documented?
(In reply to comment #6)
> Do you mind see if the following command crashes grub when you run it
> straightly in a terminal?
>
> egrep \
>   -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' \
>   "${grub_config}" | \
>   /sbin/grub --batch \
>   --device-map="${dir}"/device.map
>
> If so do you mind give what just the following part gives you for kind of
> output:
>
> egrep \
>   -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' \
>   "${grub_config}"

The combination of piping in the egrep into grub causes the segfault for me. My egrep output is:

# egrep -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' "/boot/grub/menu.lst"
root (hd0,0)
kernel /boot/kernel-2.6.32-gentoo-r7 root=/dev/sda1 panic=30
root (hd0,0)
kernel /boot/kernel-2.6.32-gentoo-r7 root=/dev/sda1 init=/bin/bb panic=30
root (hd0,0)
kernel /boot/linux-2.6.31-gentoo-r10 root=/dev/sda1
root (hd0,0)
kernel /boot/linux-2.6.31-gentoo-r10 root=/dev/sda1 init=/bin/bb

I could not figure out how to pipe the data from egrep into grub via gdb. However, the direct output of it without gdb was:

# egrep -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' "/boot/grub/menu.lst" | /sbin/grub --batch --device-map=/boot/grub/device.map

GNU GRUB version 0.97 (640K lower / 3072K upper memory)

[ Minimal BASH-like line editing is supported. For the first word, TAB
  lists possible command completions. Anywhere else TAB lists the possible
  completions of a device/filename. ]

grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83

grub> kernel /boot/kernel-2.6.32-gentoo-r7 root=/dev/sda1 panic=30
 [Linux-bzImage, setup=0x2e00, size=0x404a80]
Segmentation fault
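On feeding the egrep output to grub under gdb: one possible approach, sketched here under assumptions, is to save the filtered commands to a file and let gdb's `run` command redirect stdin from it. The `make_gdb_script` helper and the file locations are hypothetical; the grub arguments are the ones from the ebuild line. The sketch only writes the gdb command file and prints it, it does not start gdb.

```shell
#!/bin/sh
# Hypothetical helper: write a gdb command file that starts grub in batch mode
# with stdin redirected from a file of saved grub commands, then prints a
# backtrace once the program has stopped.
make_gdb_script() {
    cmds_file=$1    # file holding the filtered grub commands (the egrep output)
    device_map=$2   # device map passed to grub
    gdb_script=$3   # where to write the gdb commands
    {
        echo "run --batch --device-map=$device_map < $cmds_file"
        echo "bt"
    } > "$gdb_script"
}

# Example with placeholder paths; adjust them to the local setup.
script=$(mktemp)
make_gdb_script /tmp/grub-cmds.txt /boot/grub/device.map "$script"
echo "gdb command file written to $script:"
cat "$script"
```

The file of grub commands would be produced first by redirecting the egrep output to /tmp/grub-cmds.txt, and the session would then be started as root with something like `gdb -batch -x <script> /sbin/grub`. A useful backtrace also assumes grub was built with debugging symbols, as described in the backtrace guide linked earlier.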
Does this bug even affect anything? I cleared out my /boot/grub directory (saved the grub.conf first though). Reinstalled grub 0.97-r10, ran the emerge --config =grub-0.97-r10 which ran without issue, and then did grub-install --no-floppy /dev/sda. After rebooting, it worked as expected. I also noticed that if I went backwards, to 0.97-r9, and did the emerge --config =grub-0.97-r9 it also segfaulted the same way. Since I've always been at -r9 since I went to grub from lilo, maybe this is just a non-issue.
Problem solved. The answer is in the src_unpack() function of the ebuild :

> # Grub will not handle a kernel larger than EXTENDED_MEMSIZE Mb as
> # discovered in bug 160801. We can change this, however, using larger values
> # for this variable means that Grub needs more memory to run and boot. For a
> # kernel of size N, Grub needs (N+1)*2. Advanced users should set a custom
> # value in make.conf, it is possible to make kernels ~16Mb in size, but it
> # needs the kitchen sink built-in.
> local t="custom"
> if [[ -z ${GRUB_MAX_KERNEL_SIZE} ]] ; then
>     case $(tc-arch) in
>         amd64) GRUB_MAX_KERNEL_SIZE=7 ;;
>         x86) GRUB_MAX_KERNEL_SIZE=3 ;;
>     esac
>     t="default"
> fi

> $ ls -lh /boot | grep 'kernel'
> -rw-r--r-- 1 root root 3,1M 28 juin 11:14 kernel-2.6.31-gentoo-r10
> -rw-r--r-- 1 root root 3,3M 17 juil. 12:22 kernel-2.6.34-gentoo-r1

> $ tail -n 3 /etc/make.conf
>
> GRUB_MAX_KERNEL_SIZE=4

And the crash disappeared.
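To check other machines for the same condition, a rough sketch along these lines may help. It assumes the limit can be approximated by comparing each image's file size against GRUB_MAX_KERNEL_SIZE in MiB, which is a simplification since the exact threshold inside grub may differ. The function names and the kernel-*/linux-* naming are hypothetical, taken from the file names seen in this bug. The memory estimate simply evaluates the (N+1)*2 formula from the ebuild comment.

```shell
#!/bin/sh
# Hypothetical helper: list kernel images in a directory whose file size
# exceeds a limit given in MiB (the limit mirrors GRUB_MAX_KERNEL_SIZE).
oversized_kernels() {
    dir=$1
    limit_mb=$2
    limit_bytes=$((limit_mb * 1024 * 1024))
    for img in "$dir"/kernel-* "$dir"/linux-*; do
        [ -f "$img" ] || continue          # skip unmatched glob patterns
        size=$(($(wc -c < "$img")))        # file size in bytes
        if [ "$size" -gt "$limit_bytes" ]; then
            echo "$img: $size bytes exceeds $limit_mb MiB"
        fi
    done
}

# Memory grub needs for a kernel size limit N, per the ebuild comment: (N+1)*2.
grub_mem_needed_mb() {
    echo $(( ($1 + 1) * 2 ))
}

# Example: check /boot against the current setting, defaulting to the x86 value 3.
oversized_kernels /boot "${GRUB_MAX_KERNEL_SIZE:-3}"
```

After raising GRUB_MAX_KERNEL_SIZE in make.conf, grub has to be rebuilt for the new value to take effect, since the variable is read in src_unpack().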
In my make.conf I have these variables set :

> PORTAGE_ELOG_CLASSES="log warn error qa"
> PORTAGE_ELOG_SYSTEM="mail mail_summary save_summary"

I don't want the 'info' class because most of the time my mailbox is spammed with epatch messages. Portage should handle 2 levels of einfo messages :
- a first level for internal emerge things (* Running eautoreconf in foo, * Applying bar.patch, ...)
- a second level for information that developers want to communicate

Anyway, maybe you should ewarn this explanation about the kernel size that grub can handle. Recent kernels keep growing.
I've raised the default GRUB_MAX_KERNEL_SIZE value some more. A word of warning to anybody trying to use it on low-memory systems, however: if you raise it too far, GRUB will refuse to run.
(In reply to comment #23)
> Problem solved. The answer is in the src_unpack() function of the ebuild :
>
> > # Grub will not handle a kernel larger than EXTENDED_MEMSIZE Mb as
> > # discovered in bug 160801. We can change this, however, using larger values
> > # for this variable means that Grub needs more memory to run and boot. For a
> > # kernel of size N, Grub needs (N+1)*2. Advanced users should set a custom
> > # value in make.conf, it is possible to make kernels ~16Mb in size, but it
> > # needs the kitchen sink built-in.
> > local t="custom"
> > if [[ -z ${GRUB_MAX_KERNEL_SIZE} ]] ; then
> >     case $(tc-arch) in
> >         amd64) GRUB_MAX_KERNEL_SIZE=7 ;;
> >         x86) GRUB_MAX_KERNEL_SIZE=3 ;;
> >     esac
> >     t="default"
> > fi
>
> > $ ls -lh /boot | grep 'kernel'
> > -rw-r--r-- 1 root root 3,1M 28 juin 11:14 kernel-2.6.31-gentoo-r10
> > -rw-r--r-- 1 root root 3,3M 17 juil. 12:22 kernel-2.6.34-gentoo-r1
>
> > $ tail -n 3 /etc/make.conf
> >
> > GRUB_MAX_KERNEL_SIZE=4
>
> And the crash disappeared.

I added GRUB_MAX_KERNEL_SIZE=5 to my make.conf file and it didn't seem to make any sort of difference. Is there something special that needs to be done for make.conf to recognize the increased value?
randalla: The new defaults are x86=5, amd64=9. Emerge --sync, then recompile grub, and try again. If it still persists, include your emerge --info output here.