Bug 329569 - sys-boot/grub-0.97-r10 segfaults on pkg_postinst phase when GRUB_MAX_KERNEL_SIZE is too small
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: High normal
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 329325
 
Reported: 2010-07-23 12:20 UTC by Fab
Modified: 2010-07-30 20:26 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
build log (build.log, 147.17 KB, text/plain)
2010-07-23 12:23 UTC, Fab
environment (environment, 154.58 KB, text/plain)
2010-07-23 12:24 UTC, Fab
running same commands as root (running_commands, 565 bytes, text/plain)
2010-07-24 09:33 UTC, Fab
egrep result on /boot/grub/grub.conf (egrep_result, 341 bytes, text/plain)
2010-07-24 11:05 UTC, Fab
grub segfaults when running egrep output (grub_segfault, 678 bytes, text/plain)
2010-07-24 12:41 UTC, Fab
grub.conf (grub.conf, 831 bytes, text/plain)
2010-07-24 17:27 UTC, Fab
backtrace.log from gdb (backtrace.log, 1.31 KB, text/plain)
2010-07-24 17:29 UTC, Fab
config file for gentoo-sources-2.6.34-r1 (config-2.6.34-gentoo-r1, 54.95 KB, text/plain)
2010-07-24 18:28 UTC, Fab

Description Fab 2010-07-23 12:20:53 UTC
I'm on stable x86. grub-0.97-r10 segfaults in the pkg_postinst phase. Bug #279536 seems similar.

>  * Copying files from /lib/grub, /usr/lib/grub and /usr/share/grub to //boot/grub
/var/tmp/portage/sys-boot/grub-0.97-r10/temp/environment: line 4077: 17476 Done                    egrep -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' "${grub_config}"
     17477 Segmentation fault      | /sbin/grub --batch --device-map="${dir}"/device.map > /dev/null



Reproducible: Always




# emerge --info grub
Portage 2.1.8.3 (default/linux/x86/10.0/desktop/gnome, gcc-4.4.3, glibc-2.11.2-r0, 2.6.34-gentoo-r1 i686)
=================================================================
                        System Settings
=================================================================
System uname: Linux-2.6.34-gentoo-r1-i686-Intel-R-_Core-TM-2_CPU_E7400_@_2.80GHz-with-gentoo-1.12.13
Timestamp of tree: Fri, 23 Jul 2010 07:45:02 +0000
ccache version 2.4 [disabled]
app-shells/bash:     4.0_p37
dev-java/java-config: 2.1.11
dev-lang/python:     2.6.5-r2, 3.1.2-r3
dev-util/ccache:     2.4-r7
dev-util/cmake:      2.6.4-r3
sys-apps/baselayout: 1.12.13
sys-apps/sandbox:    1.6-r2
sys-devel/autoconf:  2.13, 2.65
sys-devel/automake:  1.9.6-r3, 1.10.3, 1.11.1
sys-devel/binutils:  2.20.1-r1
sys-devel/gcc:       4.4.3-r2
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   2.2.6b
virtual/os-headers:  2.6.30-r1
ACCEPT_KEYWORDS="x86"
ACCEPT_LICENSE="* -@EULA"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=i686 -pipe -mmmx -msse"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/X11/xkb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=i686 -pipe -mmmx -msse"
DISTDIR="/portage/distfiles"
EMERGE_DEFAULT_OPTS="--with-bdeps=y"
FEATURES="assume-digests buildpkg distlocks fixpackages keeptemp keepwork news parallel-fetch protect-owned sandbox sfperms strict unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync"
GENTOO_MIRRORS="ftp://mirror.ovh.net/gentoo-distfiles/ ftp://ftp.free.fr/mirrors/ftp.gentoo.org/ "
LANG="fr_FR.utf8"
LDFLAGS="-Wl,-O1,--hash-style=gnu,--sort-common -Wl,--as-needed"
LINGUAS="fr"
MAKEOPTS="-j5"
PKGDIR="/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/portage/trees/gentoo"
PORTDIR_OVERLAY="/portage/trees/perso /portage/trees/tempo"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="X a52 aac acl acpi alsa berkdb branding bzip2 cairo cdr cli consolekit cracklib cups cxx dbus dri dts dvd dvdr emboss encode evo exif fam ffmpeg firefox flac fortran gdbm gdu gif gnome gnome-keyring gnutls gstreamer gtk hal iconv java jpeg lcms libnotify mad mikmod mmx mng modules mp3 mp4 mpeg mudflap nautilus ncurses network-cron nls nptl nptlonly ogg openal opengl openmp oss pam pango pcre pdf perl png policykit ppds pppd python qt3support qt4 readline reflection sdl session spell spl sse ssl startup-notification svg sysfs tcpd tiff truetype unicode usb vorbis x264 x86 xcb xml xorg xulrunner xv xvid zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVICES="evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="fr" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="nvidia nv" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account" 
Unset:  CPPFLAGS, CTARGET, FFLAGS, INSTALL_MASK, LC_ALL, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

=================================================================
                        Package Settings
=================================================================

sys-boot/grub-0.97-r10 was built with the following:
USE="ncurses -custom-cflags -netboot -static" 
CFLAGS=""
Comment 1 Fab 2010-07-23 12:23:13 UTC
Created attachment 239913
build log

LANG=C emerge grub build log.
Comment 2 Fab 2010-07-23 12:24:32 UTC
Created attachment 239915
environment

environment
Comment 3 Xake 2010-07-23 14:48:37 UTC
Does it segfault if you run it (i.e. just "grub" as root on a command line) or only while emerge does postinstall?

Would you also mind moving your /boot/grub/device.map out of the way and re-emerging grub?

Oh, and bug #279536 is not similar; that kind of segfault cannot occur on Gentoo unless you are using hardened or have messed with unsupported stuff.
Comment 4 Fab 2010-07-23 15:57:38 UTC
(In reply to comment #3)
> Does it segfault if you run it (i.e. just "grub" as root on a command line) or
> only while emerge does postinstall?

It segfaults only while emerge does postinstall.

> Would you also mind try moving your /boot/grub/device.map and re-emerge grub?

Tried it, same segfault.
Comment 5 Adam Randall 2010-07-24 01:32:03 UTC
I'm having the same issue across 6 of my 7 x86 servers. This is easily reproduced by calling:

mount /boot; emerge --config =sys-boot/grub-0.97-r10

The output, including the directory I entered, is:

Configuring pkg...

 * Enter the directory where you want to setup grub:
/boot
 * *** IMPORTANT NOTE: you must run grub and install
 * the new version's stage1 to your MBR.  Until you do,
 * stage1 and stage2 will still be the old version, but
 * later stages will be the new version, which could
 * cause problems such as an unbootable system.
 * This means you must use either grub-install or perform
 * root/setup manually! For more help, see the handbook:
 * http://www.gentoo.org/doc/en/handbook/handbook-x86.xml?part=1&chap=10#grub-install-auto
 * Copying files from /lib/grub, /usr/lib/grub and /usr/share/grub to /boot/grub
/var/tmp/portage/sys-boot/grub-0.97-r10/temp/environment: line 4058: 25772 Done                    egrep -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' "${grub_config}"
     25773 Segmentation fault      | /sbin/grub --batch --device-map="${dir}"/device.map > /dev/null
 * Grub has been installed to /boot successfully.
Comment 6 Xake 2010-07-24 08:16:27 UTC
Would you mind checking whether the following command crashes grub when you run it directly in a terminal?

egrep \
  -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' \
  "${grub_config}" | \
/sbin/grub --batch \
  --device-map="${dir}"/device.map


If so, would you mind posting the output of just the following part:

egrep \
  -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' \
  "${grub_config}"
Comment 7 Fab 2010-07-24 09:33:40 UTC
Created attachment 239983
running same commands as root

I had already tried to run these commands myself, and no, I do not get a crash when I run them as root. But running them through emerge produces a segfault.
Comment 8 Xake 2010-07-24 10:10:46 UTC
(In reply to comment #7)
> Created an attachment (id=239983)
> running same commands as root
> 
> I had already tried to run these commands myself, and no, I do not have a crash
> when I run them as root. But running them through emerge produce a segfault.
> 

Have you tried with your /boot/grub/grub.conf?
Comment 9 Fab 2010-07-24 11:05:59 UTC
Created attachment 239995
egrep result on /boot/grub/grub.conf

(In reply to comment #8)
> Have you tried with your /boot/grub/grub.conf?

It segfaults with it, but anyway I don't see any reference to this file in the ebuild?
Comment 10 Fab 2010-07-24 11:09:48 UTC
(In reply to comment #9)
> but anyway I don't see any reference to this file in the ebuild ?
> 

In fact yes, I see it now...
Comment 11 Xake 2010-07-24 12:27:04 UTC
My guess is that the grep is supposed to single out some element from the grub.conf. However it fails and sends stuff to grub that grub is not capable of handling. This leads to grub crashing. So it would be nice to have the output from the grep, and also a backtrace of grub crashing while trying to parse what grep is sending it.
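
To illustrate what that filter passes on to grub, here is a minimal sketch. The pattern is the one shown in the postinst error output; the sample config, the temporary file, and the helper name `grub_batch_commands` are made up for illustration and are not taken from the ebuild or from the reporter's attachments. It uses `grep -E`, which is equivalent to `egrep`.

```shell
#!/bin/sh
# Pattern copied from the egrep call shown in the error output of this report.
pattern='^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)'

# Hypothetical helper: print the lines of a grub config that survive the filter.
grub_batch_commands() {
    grep -E -v "$pattern" "$1"
}

# Made-up sample config (an assumption, not the reporter's real grub.conf).
sample=$(mktemp)
cat > "$sample" <<'EOF'
default saved
timeout 15

title Gentoo Linux 2.6.32-r7
root (hd0,0)
kernel /boot/kernel-2.6.32-gentoo-r7 root=/dev/sda1 panic=30
EOF

# Show what would be piped into "/sbin/grub --batch".
grub_batch_commands "$sample"
rm -f "$sample"
```

With this sample only the `root` and `kernel` lines remain, so the batch run effectively asks grub to load each kernel image listed in the config.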
Comment 12 Fab 2010-07-24 12:34:43 UTC
See attachment #239995 from comment #9.
It segfaults after the first kernel line.
Comment 13 Fab 2010-07-24 12:41:49 UTC
Created attachment 240001
grub segfaults when running egrep output
Comment 14 Xake 2010-07-24 16:10:50 UTC
@reporter: could you post the whole grub.conf for us please?


@maintainers:

What is that part of the ebuild supposed to do? My guess is that it tries to guess a couple of partitions and install grub on them (i.e. a "root (bla) && setup" approach).
However, currently it seems like it is trying to run Linux inside of itself, and since Linux is already running, it crashes on us.

I saw this on my systems before I switched to grub2, so this is an old issue.
Comment 15 Xake 2010-07-24 16:39:56 UTC
@reporter: also could you post a backtrace from grub?
The following instructions should be sufficient:
http://www.gentoo.org/proj/en/qa/backtraces.xml

Comment 16 Fab 2010-07-24 17:27:38 UTC
Created attachment 240015
grub.conf
Comment 17 Fab 2010-07-24 17:29:39 UTC
Created attachment 240017
backtrace.log from gdb

I can reproduce the segfault by typing the following commands by hand in grub:

> root (hd0,0)
> kernel /boot/kernel-2.6.34-gentoo-r1 root=/dev/sda1 video=vesafb:mtrr:3,ywrap vga=792

So the problem does not come from the grub.conf itself.
Comment 18 Xake 2010-07-24 18:01:59 UTC
Nope, the problem comes from grub trying to execute the kernel inside of itself, and for some reason that fails. I cannot reproduce it, and grub should be able to handle this, so it might be an issue with something your kernel does to it. I just do not know what it could be.
Comment 19 Fab 2010-07-24 18:28:13 UTC
Created attachment 240025
config file for gentoo-sources-2.6.34-r1

Ok. Strange. I have been running this kernel config for a long time. I generally upgrade from kernel to kernel with make oldconfig. Here is my current config file. Maybe you can build a kernel with it and try to crash grub using the above commands.

Time to stabilize grub-2? :p
Comment 20 Adam Randall 2010-07-24 18:44:18 UTC
All of my kernels are 2.6.32-r7, as I am waiting for more revisions of 2.6.34-r1 to come through the pipeline.

I build all my kernels from scratch each time, and don't use the oldconfig to make them. I do not use genkernel. Since these are servers, the hardware is mostly similar, with varying processors among them.

The only thing that I can note is that I use UUIDs in my /etc/fstab, but I don't know if that's even considered by grub since I'm not using UUIDs in the configuration there (could never get that to work with 0.97).

The grub configuration is minimalistic at best, with this as an example:

default saved
timeout 15
fallback 2

title Gentoo Linux 2.6.32-r7
root (hd0,0)
kernel /boot/kernel-2.6.32-gentoo-r7 root=/dev/sda1 panic=30

title Gentoo Linux 2.6.32-r7 (rescue)
root (hd0,0)
kernel /boot/kernel-2.6.32-gentoo-r7 root=/dev/sda1 init=/bin/bb panic=30

title Gentoo Linux 2.6.31-r10
root (hd0,0)
kernel /boot/linux-2.6.31-gentoo-r10 root=/dev/sda1

title Gentoo Linux 2.6.31-r10 (rescue)
root (hd0,0)
kernel /boot/linux-2.6.31-gentoo-r10 root=/dev/sda1 init=/bin/bb


What gets me is that this started happening on my systems after a -r9 to -r10 update. What exactly changed between the two, and are those changes documented somewhere?
Comment 21 Adam Randall 2010-07-24 20:00:08 UTC
(In reply to comment #6)
> Do you mind see if the following command crashes grub when you run it
> straightly in a terminal?
> 
> egrep \
>   -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' \
>   "${grub_config}" | \
> /sbin/grub --batch \
>   --device-map="${dir}"/device.map
> 
> 
> If so do you mind give what just the following part gives you for kind of
> output:
> 
> egrep \
>   -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' \
>   "${grub_config}"
> 

The combination of piping in the egrep into grub causes the segfault for me. My egrep output is:

# egrep -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' "/boot/grub/menu.lst"
root (hd0,0)
kernel /boot/kernel-2.6.32-gentoo-r7 root=/dev/sda1 panic=30
root (hd0,0)
kernel /boot/kernel-2.6.32-gentoo-r7 root=/dev/sda1 init=/bin/bb panic=30
root (hd0,0)
kernel /boot/linux-2.6.31-gentoo-r10 root=/dev/sda1
root (hd0,0)
kernel /boot/linux-2.6.31-gentoo-r10 root=/dev/sda1 init=/bin/bb

I could not figure out how to pipe the data from egrep into grub via gdb. However, the direct output of it without gdb was:

# egrep -v '^[[:space:]]*(#|$|default|fallback|initrd|password|splashimage|timeout|title)' "/boot/grub/menu.lst" | /sbin/grub --batch --device-map=/boot/grub/device.map


    GNU GRUB  version 0.97  (640K lower / 3072K upper memory)

 [ Minimal BASH-like line editing is supported.  For the first word, TAB
   lists possible command completions.  Anywhere else TAB lists the possible
   completions of a device/filename. ]
grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0x83
grub> kernel /boot/kernel-2.6.32-gentoo-r7 root=/dev/sda1 panic=30
   [Linux-bzImage, setup=0x2e00, size=0x404a80]
Segmentation fault
Comment 22 Adam Randall 2010-07-24 20:53:13 UTC
Does this bug even affect anything? I cleared out my /boot/grub directory (saved the grub.conf first though). Reinstalled grub 0.97-r10, ran the emerge --config =grub-0.97-r10 which ran without issue, and then did grub-install --no-floppy /dev/sda.

After rebooting, it worked as expected.

I also noticed that if I went backwards, to 0.97-r9, and did the emerge --config =grub-0.97-r9 it also segfaulted the same way. Since I've always been at -r9 since I went to grub from lilo, maybe this is just a non-issue.

Comment 23 Fab 2010-07-25 08:13:05 UTC
Problem solved. The answer is in the src_unpack() function of the ebuild:

> # Grub will not handle a kernel larger than EXTENDED_MEMSIZE Mb as
> # discovered in bug 160801. We can change this, however, using larger values
> # for this variable means that Grub needs more memory to run and boot. For a
> # kernel of size N, Grub needs (N+1)*2.  Advanced users should set a custom
> # value in make.conf, it is possible to make kernels ~16Mb in size, but it
> # needs the kitchen sink built-in.
> local t="custom"
> if [[ -z ${GRUB_MAX_KERNEL_SIZE} ]] ; then
> 	case $(tc-arch) in
> 		amd64) GRUB_MAX_KERNEL_SIZE=7 ;;
> 		x86)   GRUB_MAX_KERNEL_SIZE=3 ;;
> 	esac
> 	t="default"
> fi



> $ ls -lh /boot | grep 'kernel'
> -rw-r--r-- 1 root root 3,1M 28 juin  11:14 kernel-2.6.31-gentoo-r10
> -rw-r--r-- 1 root root 3,3M 17 juil. 12:22 kernel-2.6.34-gentoo-r1


> $ tail -n 3 /etc/make.conf
> 
> GRUB_MAX_KERNEL_SIZE=4

And the crash disappeared.
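
For anyone who wants to check their own /boot, here is a rough sketch of that comparison. It rests on assumptions: that the on-disk file size is the relevant quantity, that the limit is counted in units of 1024*1024 bytes, and that kernel images are named `/boot/kernel-*` or `/boot/linux-*` as in this report. The helper name `kernel_fits` is made up; none of this comes from the ebuild itself.

```shell
#!/bin/sh
# Hypothetical helper: succeed if SIZE_BYTES is at most LIMIT_MB * 1024 * 1024.
kernel_fits() {
    size_bytes=$1
    limit_mb=$2
    [ "$size_bytes" -le $(( limit_mb * 1024 * 1024 )) ]
}

# Fall back to the old x86 default of 3 if the variable is not set.
limit=${GRUB_MAX_KERNEL_SIZE:-3}

# The file name patterns are an assumption based on the names used in this report.
for k in /boot/kernel-* /boot/linux-*; do
    [ -f "$k" ] || continue
    size=$(( $(wc -c < "$k") ))
    if kernel_fits "$size" "$limit"; then
        echo "ok:        $k ($size bytes)"
    else
        echo "too large: $k ($size bytes, limit $limit MB)"
    fi
done
```

Run it as `GRUB_MAX_KERNEL_SIZE=4 sh check.sh` (the script name is arbitrary) to test a candidate value before putting it in make.conf.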
Comment 24 Fab 2010-07-25 09:25:25 UTC
In my make.conf I have these variables set:

> PORTAGE_ELOG_CLASSES="log warn error qa"
> PORTAGE_ELOG_SYSTEM="mail mail_summary save_summary"

I don't want the 'info' class because most of the time my mailbox gets spammed with epatch messages. Portage should handle 2 levels of einfo messages:
 - a first level for internal emerge things (* Running eautoreconf in foo, * Applying bar.patch, ...)
 - a second level for information that developers want to communicate

Anyway, maybe the ebuild should ewarn this explanation about the kernel size that grub can handle. Recent kernels keep growing.
Comment 25 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-07-30 19:58:32 UTC
I've raised the default GRUB_MAX_KERNEL_SIZE value some more.

A word of warning to anybody trying to use it on low-memory systems, however: if you raise it too far, GRUB will refuse to run.
Comment 26 Adam Randall 2010-07-30 20:00:50 UTC
(In reply to comment #23)
> Problem solved. The answer is in the src_unpack() function of the ebuild :
> 
> > # Grub will not handle a kernel larger than EXTENDED_MEMSIZE Mb as
> > # discovered in bug 160801. We can change this, however, using larger values
> > # for this variable means that Grub needs more memory to run and boot. For a
> > # kernel of size N, Grub needs (N+1)*2.  Advanced users should set a custom
> > # value in make.conf, it is possible to make kernels ~16Mb in size, but it
> > # needs the kitchen sink built-in.
> > local t="custom"
> > if [[ -z ${GRUB_MAX_KERNEL_SIZE} ]] ; then
> > 	case $(tc-arch) in
> >		amd64) GRUB_MAX_KERNEL_SIZE=7 ;;
> >		x86)   GRUB_MAX_KERNEL_SIZE=3 ;;
> >	esac
> >	t="default"
> > fi
> 
> 
> 
> > $ ls -lh /boot | grep 'kernel'
> > -rw-r--r-- 1 root root 3,1M 28 juin  11:14 kernel-2.6.31-gentoo-r10
> > -rw-r--r-- 1 root root 3,3M 17 juil. 12:22 kernel-2.6.34-gentoo-r1
> 
> 
> > $ tail -n 3 /etc/make.conf
> > 
> > GRUB_MAX_KERNEL_SIZE=4
> 
> And the crash disappeared.
> 

I added GRUB_MAX_KERNEL_SIZE=5 to my make.conf file and it didn't seem to make any sort of difference. Is there something special that needs to be done for make.conf to recognize the increased value?
Comment 27 Robin Johnson archtester Gentoo Infrastructure gentoo-dev Security 2010-07-30 20:26:36 UTC
randalla:
The new defaults are x86=5, amd64=9. Emerge --sync, then recompile grub, and try again. If it still persists, include your emerge --info output here.
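
As a rough guide to the low-memory warning, the ebuild comment quoted in comment 23 says that for a kernel of size N, Grub needs (N+1)*2. The sketch below only does that arithmetic, assuming the result is in the same megabyte units as GRUB_MAX_KERNEL_SIZE; the helper name `grub_mem_needed_mb` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: apply the (N+1)*2 rule from the ebuild comment.
grub_mem_needed_mb() {
    echo $(( ($1 + 1) * 2 ))
}

# Old defaults (x86=3, amd64=7) and new defaults (x86=5, amd64=9).
for n in 3 5 7 9; do
    echo "GRUB_MAX_KERNEL_SIZE=$n -> about $(grub_mem_needed_mb "$n") MB"
done
```

By that rule the old x86 default of 3 works out to 8 and the new default of 5 to 12; for amd64 the figures are 16 and 20.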