For some time now I have been unable to build a working kernel. Compilation goes fine and a kernel image is created. GRUB finds the compressed kernel image, but immediately after the kernel is loaded the computer reboots without any further information. I cannot even read the "Uncompressing" message. I have several systems (AMD64/EM64T and IA32) that all have the same problem. I have used gcc-4.4.0, gcc-4.4.1 and gcc-4.3.3-r2; additionally I downgraded binutils to 2.18 because I found a similar Debian bug, but that did not solve the problem either. I tried to build kernel 2.6.16-r13 (unconfigured) but the problem stayed: no panic, just a reboot. I have not changed the kernel configuration. The last working kernel was 2.6.30-r1; a kernel built later from the same sources and config would not boot.

Reproducible: Always

Steps to Reproduce:
1. configure kernel (or not, it does not matter)
2. build and install kernel
3. reboot

Actual Results:
System does not boot properly; it reboots immediately instead.

Expected Results:
System boots

Portage 2.2_rc33 (default/linux/amd64/2008.0, gcc-4.4.1, glibc-2.10.1-r0, 2.6.30-gentoo-r1 x86_64)
=================================================================
System uname: Linux-2.6.30-gentoo-r1-x86_64-AMD_Athlon-tm-_64_X2_Dual_Core_Processor_5200+-with-gentoo-2.0.1
Timestamp of tree: Mon, 27 Jul 2009 07:45:01 +0000
app-shells/bash: 4.0_p28
dev-java/java-config: 2.1.8-r1
dev-lang/python: 2.6.2-r1
dev-util/cmake: 2.6.4-r1
sys-apps/baselayout: 2.0.1
sys-apps/openrc: 0.4.3-r3
sys-apps/sandbox: 2.0
sys-devel/autoconf: 2.13, 2.63-r1
sys-devel/automake: 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.2, 1.11
sys-devel/binutils: 2.19.1-r1
sys-devel/gcc-config: 1.4.1
sys-devel/libtool: 2.2.6a
virtual/os-headers: 2.6.30-r1
ACCEPT_KEYWORDS="amd64 ~amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-pipe -Os -march=athlon64"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/config"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/eselect/postgresql /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-pipe -Os -march=athlon64"
DISTDIR="/gentoo/portage/distfiles"
EMERGE_DEFAULT_OPTS="--with-bdeps y"
FEATURES="distlocks fixpackages parallel-fetch preserve-libs protect-owned sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/ http://mirror.gentoo.no/ http://ftp.uni-erlangen.de/pub/mirrors/gentoo/ http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/ http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/ http://pandemonium.tiscali.de/pub/gentoo/ ftp://ftp.join.uni-muenster.de/pub/linux/distributions/gentoo/ ftp://ftp.tu-clausthal.de/pub/linux/gentoo/ ftp://ftp.wh2.tu-dresden.de/pub/mirrors/gentoo/distfiles/"
LC_ALL="en_US.UTF-8"
LDFLAGS="-Wl,-O1"
LINGUAS=" en de"
MAKEOPTS="-j3"
PKGDIR="/gentoo/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/gentoo/tmp"
PORTDIR="/gentoo/portage"
SYNC="rsync://rsync.de.gentoo.org/gentoo-portage"
USE="3dnow 3dnowext X aac acl acpi alsa amd64 bash-completion berkdb bzip2 cli cracklib crypt cups dbus dri flac fortran gdbm gif gpm htmlhandbook iconv ipv6 isdnlog jpeg kde lm_sensors midi mmx mmxext modplug mp2 mp3 mudflap multilib ncurses nls nptl nptlonly opengl openmp pam pcre perl plasma png pppd python readline reflection session smp spell spl sse sse2 ssl svg sysfs tcpd threads tiff truetype unicode vdpau vim-syntax vorbis wma wmf x264 xcomposite xinerama xorg xscreensaver xv xvid xvmc zlib"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol"
APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias"
ELIBC="glibc"
INPUT_DEVICES=" keyboard mouse"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
LINGUAS=" en de"
USERLAND="GNU"
VIDEO_CARDS=" fbdev nv nvidia vesa vga"
Unset: CPPFLAGS, CTARGET, FFLAGS, INSTALL_MASK, LANG, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, PORTDIR_OVERLAY
Created attachment 199339
Emerge log generated by genlop -l for my system

This is the emerge log of my system. The problem first occurred at the beginning of July; the last working kernel was emerged on June 12th, 2009, and its kernel image is dated June 15th, 2009. The problem was present at the latest by the merge of gentoo-sources-2.6.30-r2 (July 9th, 2009).
July 9 was the day when the 2.6.30 kernel sources were emerged, but are you sure that you built the kernel image the same day and not the next day, which happened to be a big toolchain upgrade? Do you still have the kernel image dated July 9?
On another machine I still have a (non-functional) image of 2.6.30-r2 dated July 5th, 2009. The 2.6.30-r1 image is from June 15th, 2009. I think the system was at the update level of June 10th, 2009.
You're not running exactly the same version of gcc-4.3 as before, so my current theory is that both gcc-4.4.1 and gcc-4.3.3-r2 happen to be breaking your kernel build. How about trying this:

Emerge a known-good compiler, the venerable gcc-4.1.2
Remove your 2.6.30-r1 source tree
Unpack a clean 2.6.30-r1 source tree
Compile with gcc-4.1.2 and your old config
Hopefully, boot successfully
The 2.6.30-r1 was built both times with 4.3.3. The earlier build boots, the later one does not. But anyway I tried your steps and built a freshly emerged 2.6.30-r4 with gcc-4.1.2, without success -> immediate reboot.
It looked like earlier compiles used 4.3.3, and later used 4.3.3-r2 which should be ALMOST the same, but perhaps a rev bump introduced a subtle regression... Ok, how about going back to a 2.6.29 kernel, with your gcc 4.1. Downgrade the binutils if you need to. That one had better work. Then, once you are back to a working baseline, you can do a git bisect to see what commit is breaking your systems.
Over the weekend I did a complete rollback to a stable Gentoo. The only things I did not revert were portage, python, boost and (because portage really complained about it) glibc. Portage because I like sets; boost because version 1.35 did not build; and python _should_ really not interfere with the kernel build, as on other systems it is not needed to build the kernel. After the downgrade I ran "emerge -e system" and then built kernel 2.6.29-r5. It rebooted immediately. As of now, kernel 2.6.30-r4 seems to have been stabilized, but that did not yield a working kernel either. Maybe I could downgrade glibc. Does anyone have experience with this? Is it certain to crash my system? (In a way, it is crashed anyway.)
Wow this is absolutely puzzling. You can't build working kernels any more using the old config and either gcc 4.1 or 4.3. You say this same problem is happening on several systems, so you can't build decent kernels on any of them after the upgrade? Er, would you believe it was a solar flare that zapped the disk controllers on all your systems so they subtly scramble /usr/src/linux contents? Nope, didn't think so... I'll go ahead and assign this to the kernel team, hoping somebody there will have more ideas because I'm stumped right now.
Have you tried with the newest kernel? If yes, please post your .config.
I had to wipe the first computer, so from now on I will use one of my home computers, which has the problem like all of my Gentoo systems. I updated the system to the tree of August 10th, 2009 and rebuilt @system. This system never had a gcc >=4.4, so the current compiler is gcc-4.3.4. I removed the build tree of kernel 2.6.30-r4 and re-emerged it to get a clean tree. Then I ran "make mrproper" and copied in the attached config. After "make oldconfig" the kernel was built. I copied the kernel from arch/x86_64/boot/bzImage to /boot, entered it in grub.conf and rebooted. The system now hangs right after I hit enter on that entry. It shows the GRUB information on the kernel and file system and then nothing more. Not even "Decompressing ...".

emerge --info on this system gives:

Portage 2.2_rc38 (default/linux/amd64/2008.0, gcc-4.3.4, glibc-2.10.1-r0, 2.6.30-gentoo-r1 x86_64)
=================================================================
System uname: Linux-2.6.30-gentoo-r1-x86_64-Intel-R-_Core-TM-2_CPU_6300_@_1.86GHz-with-gentoo-2.0.1
Timestamp of tree: Mon, 10 Aug 2009 17:20:01 +0000
distcc 3.1 x86_64-pc-linux-gnu [disabled]
app-shells/bash: 4.0_p28
dev-java/java-config: 2.1.8-r1
dev-lang/python: 2.6.2-r1, 3.1
dev-util/cmake: 2.6.4-r2
sys-apps/baselayout: 2.0.1
sys-apps/openrc: 0.4.3-r3
sys-apps/sandbox: 2.0
sys-devel/autoconf: 2.13, 2.63-r1
sys-devel/automake: 1.7.9-r1, 1.9.6-r2, 1.10.2, 1.11
sys-devel/binutils: 2.19.1-r1
sys-devel/gcc-config: 1.4.1
sys-devel/libtool: 2.2.6a
virtual/os-headers: 2.6.30-r1
ACCEPT_KEYWORDS="amd64 ~amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-pipe -Os -march=core2"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/config"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-pipe -Os -march=core2"
DISTDIR="/mnt/sda7/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS="--with-bdeps y"
FEATURES="assume-digests distlocks fixpackages parallel-fetch preserve-libs protect-owned sandbox sfperms strict unmerge-logs unmerge-orphans userfetch"
GENTOO_MIRRORS="ftp://ftpl.tu-chemnitz.de/pub/linux/gentoo/ http://gentoo.osuosl.org/ http://ftp.uni-erlangen.de/pub/mirrors/gentoo/ http://mirrors.sec.informatik.tu-darmstadt.de/gentoo/ http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/ http://pandemonium.tiscali.de/pub/gentoo/ ftp://ftp.tu-clausthal.de/pub/linux/gentoo/ ftp://ftp.join.uni-muenster.de/pub/linux/distributions/gentoo/ ftp://ftp.wh2.tu-dresden.de/pub/mirrors/gentoo/distfiles/ http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/"
LDFLAGS="-Wl,-O1"
LINGUAS=" en de"
MAKEOPTS="--jobs=3"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.de.gentoo.org/gentoo-portage"
USE="7zip X Xaw3d a52 aac acl acpi aim alsa amd64 arts avi bash-completion berkdb bittorrent bluetooth bzip2 cairo cdparanoia cdr cli cracklib crypt cups curl dbus dri dts dvb dvd dvdr dvdread dvi encode expat fbcon ffmpeg flac foomaticdb fortran ftp gdbm geoip gif gimp gphoto2 gpm graphviz gs iconv icq ieee1394 imap ipv6 isdnlog jabber java jpeg jpeg2k kde kdehiddenvisibility latex lm_sensors mad matroska mbox mikmod mime mmx mng modplug mono mozilla mp3 mpeg msn mudflap multilib ncurses nls nptl nptlonly nsplugin ogg oggvorbis openal opengl openmp pam pcntl pcre pdf perl png posix postgres pppd python quicktime readline reflection samba sdl seamonkey session smp snmp speex spell spl sse sse2 ssl ssse3 subversion svg sysfs sysvipc szip tcpd tetex tga theora threads threadsafe tiff truetype unicode usb vcd videos vorbis wma wmf x264 xanim xcomposite xine xinerama xml2 xorg xpm xv xvid xvmc yahoo zlib"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol"
APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias"
ELIBC="glibc"
INPUT_DEVICES=" keyboard mouse"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
LINGUAS=" en de"
USERLAND="GNU"
VIDEO_CARDS=" fbdev vesa mga nv radeon radeonhd"
Unset: CPPFLAGS, CTARGET, FFLAGS, INSTALL_MASK, LANG, LC_ALL, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Created attachment 200891
Configuration of a non-working kernel

This is the configuration of my kernel. The only differences from the config of 2.6.30-r1 are the date and a stack size option that was introduced after 2.6.30-r1. But since 2.6.30-r1 no longer yields a working kernel either, I think that is not important.
(In reply to comment #10)
> kernel was built. I copied the kernel from arch/x86_64/boot/bzImage to /boot,
> entered it in grub.conf and rebooted. The system now hangs right after I
> hit enter on that entry. It shows the GRUB information on the kernel
> and file system and then nothing more. Not even "Decompressing ...".
>

Sorry for this dumb question but you were detailing your steps and I have to ask. You copied over the System.map file, also, right?
My steps after the kernel build:

cp arch/x86_64/boot/bzImage /boot/kernel-<version>
cp System.map /boot/System.map-<version>
make modules_install
make firmware_install
vim /boot/grub/grub.conf (to enter the new kernel)

Optional (I skipped these for this test):

make clean
make modules_prepare
cp /boot/System.map-<version> System.map

If the kernel is newly configured:

cp .config ../config-<version>

A grub.conf entry looks like:

title GNU/Linux64-2.6.30-gentoo-r4
root (hd0,7)
kernel (hd0,7)/boot/kernel-2.6.30-gentoo-r4 root=/dev/sda8 ro vga=0x317

None of these steps has changed for many kernel versions. On this machine I have no out-of-kernel modules, so even preserving the System.map is not really necessary. Some packages want a configured kernel, so I at least run make modules_prepare.
Do you use JFS as your root filesystem? It seems to be the only main filesystem built in. I was thinking of trying your config as-is with 2.6.30-gentoo-r4 on my own machine, but I should probably turn on EXT3, since you didn't mention using an initramfs.
Yes, all my systems use the JFS file system, so it is the only one built into the kernel. There is no initrd on my systems.
Well your kernel config does boot on my system, but I just noticed something that might be of interest: CONFIG_X86_PAT=y From the help for that option, "Say N here if you see bootup problems (boot crash, boot hang, spontaneous reboots) or a non-working video driver." Since spontaneous reboots is your symptom, how about turning this off?
I will try this option on my 64-bit machine on Sunday. For now I can tell you that I have tried it on a 32-bit machine (Thinkpad T42) and it still reboots. As I stated before, the configuration of the last working kernel version now also produces a rebooting kernel, so I doubt that it is directly linked to my config.
I have found the bad guy: it was gentoo-sources-2.6.30-r1. But one step after another. I updated the compiler to gcc-4.4.1, as this is now the testing gcc. I also emerged gentoo-sources-2.6.30-r5. Don't ask how, but I got the idea of trying an older kernel, so I booted 2.6.29-r3. I built the new kernel and it booted nicely; initially I had disabled the mentioned PAT option. I then rebuilt this new 2.6.30-r5 with the PAT option enabled and it also booted. Then I used gcc-4.3.4 to build the kernel again and it also worked. Finally I booted 2.6.30-r1, built the kernel there as well, and it did not boot. So my assumption is that the running kernel somehow interferes with the kernel build (and I guess that shouldn't be the case, other than maybe for detecting the current architecture). So I finally have a working kernel again and the means to build a new one. Thanks for the comments on this. I am leaving the bug open as I want to verify this solution on my other installations.
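Since the kernel that was running during the build turned out to be the deciding variable, it may help to record it next to each image. The following is only a minimal sketch, assuming a POSIX shell; the function name, the log path and the log format are hypothetical and not taken from this report.

```shell
# Append the running kernel release and compiler version to a log file.
# The function name and the log format are hypothetical illustrations.
record_build_env() {
    log=$1
    {
        printf 'running_kernel=%s\n' "$(uname -r)"
        # gcc -dumpversion prints only the version; fall back if gcc is missing.
        printf 'compiler=%s\n' "$(gcc -dumpversion 2>/dev/null || echo unknown)"
        printf 'date=%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    } >> "$log"
}
```

It could be called right after copying the image, for example as record_build_env /boot/kernel-<version>.buildenv, so that a non-booting image can later be matched to the kernel it was built and copied under.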
I can confirm this "fix". I upgraded to gcc-4.4.1; the kernel was vanilla 2.6.30.1, and I did not downgrade gcc. Using the same .config (after make oldconfig) I could not build a working kernel. The versions tested were vanilla 2.6.30.1/4/5. After booting vanilla 2.6.29.1 and just running make clean, make, make install for 2.6.30.5, the new kernel does boot.
I updated my other systems (IA-32) and it worked, so I suggest closing this bug. I have another interesting discovery though: I built a working kernel, installed it, and it booted. Then I booted the defective 2.6.30-r1 and only copied the kernel image again under another image name, and the copy did not boot. The md5sum and sha1 checksums of both images are the same. So I thought the kernel somehow corrupts the copy command. I think this is really interesting. After that I booted the working kernel and copied the corrupted image to a new name, and what would you say, it booted. So in the end the filesystem driver must somehow have been defective. Carsten, could you tell me which filesystem your kernel image was on for booting? (That's only out of interest; I have no means to follow this further, but maybe somebody else wants to.)
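Matching checksums only show that both files read back the same bytes through the running kernel's filesystem driver. They say nothing about how the blocks are laid out on disk, which is what GRUB has to handle with its own filesystem code. A minimal sketch of checking both aspects, assuming a POSIX shell with coreutils; the function names are hypothetical, and whether filefrag (from e2fsprogs) can map files on the filesystem in question is an assumption that would need checking.

```shell
# Return 0 when two files have identical content, 1 otherwise.
# Identical content does not imply an identical on-disk block layout.
same_content() {
    sum_a=$(sha1sum < "$1" | cut -d' ' -f1)
    sum_b=$(sha1sum < "$2" | cut -d' ' -f1)
    [ "$sum_a" = "$sum_b" ]
}

# Print the extent summary for a file if filefrag is available.
# Assumption: filefrag can query the extent map on the filesystem in use.
show_extents() {
    if command -v filefrag >/dev/null 2>&1; then
        filefrag "$1" 2>/dev/null || echo "$1: extent map not available"
    else
        echo "$1: filefrag not installed"
    fi
}
```

For two images such as /boot/kernel-a and /boot/kernel-b (hypothetical names), same_content would be expected to succeed in the situation described above, while show_extents might reveal a very different number of extents for the copy that does not boot.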
I use jfs like you.
I did some research in the kernel changelogs for 2.6.30.1 and found the following commit:

commit 206f0f05bdc291a9358ba59248e2bc44e8b3127d
Author: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
Date:   Tue Jun 16 13:43:22 2009 -0500

    jfs: fix regression preventing coalescing of extents

    commit f7c52fd17a7dda42fc9e88c2b2678403419bfe63 upstream.

    Commit fec1878fe952b994125a3be7c94b1322db586f3b caused a regression in
    which contiguous blocks being allocated to the end of an extent were
    getting a new extent created. This typically results in files entirely
    made up of 1-block extents even though the blocks are contiguous on
    disk.

    Apparently grub doesn't handle a jfs file being fragmented into too many
    extents, since it refuses to boot a kernel from jfs that was created by
    the 2.6.30 kernel.

    Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com>
    Reported-by: Alex <alevkovich@tut.by>
    Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

So I think that bug is gone.
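To make the effect described in the commit message concrete, here is a toy model in POSIX shell. It is not JFS code, and the function names are hypothetical. It only counts how many extents a list of allocated block numbers produces with and without coalescing of adjacent blocks.

```shell
# Count extents for a list of allocated block numbers (toy model, not JFS code).
# With coalescing, a block adjacent to the previous one extends the current extent.
count_extents_coalesced() {
    prev=""
    extents=0
    for blk in "$@"; do
        if [ -z "$prev" ] || [ "$blk" -ne $((prev + 1)) ]; then
            extents=$((extents + 1))
        fi
        prev=$blk
    done
    echo "$extents"
}

# Without coalescing (the regression described in the commit message),
# every allocated block starts a new extent.
count_extents_uncoalesced() {
    echo "$#"
}
```

For the contiguous blocks 100 101 102 103 the coalescing variant reports 1 extent and the other variant reports 4. A kernel image spans far more blocks than that, which fits the commit's remark that GRUB cannot handle a file split into too many extents.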