Bug 216990 - Installer (GTK+) Corrupts Grub Folder on Primary Hard Drive?
Status: RESOLVED INVALID
Alias: None
Product: Gentoo Release Media
Classification: Unclassified
Component: Installer
Hardware: All Linux
Importance: High blocker
Assignee: Gentoo Linux Installer
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-04-09 08:20 UTC by Roger
Modified: 2008-04-09 23:27 UTC
0 users

See Also:
Package list:
Runtime testing required: ---



Description Roger 2008-04-09 08:20:26 UTC
This is going to be interesting for me to document properly.  I'll start with the install scenario:

1) Primary SATA hard drive with standard swap (sda1) and the root filesystem on ReiserFS (sda2).
2) Secondary hard drive with Windows XP on NTFS (sdb1), a Linux ext3 partition where Gentoo 2008.0 beta1 was installed (labeled as sdb3 according to libsata), and an archive ReiserFS partition (/dev/sdb2) for use by the default OS on the primary hard drive.

Grub is installed on sda with stage 2 on /dev/sda2 (root filesystem).

After the install completed, I noticed there was no option for choosing between grub and lilo (no biggy) and *no* option for customizing the Grub menu.lst file.  This was extremely important to see with the layout listed above -- being as unique as it is, I did not expect most installers to get it correct.

Also noticed that the GTK+ installer had a log for viewing, but the applet button complained that no log was available -- I believe a bug is already filed for this.

After the install finalized, I could have sworn I had properly shut down the system.  However, on reboot, I was presented with a blank grub menu.

Rebooted with a recent Gentoo Minimal CD and noticed /boot on my root filesystem was not present!  Not only this, but fsck.reiserfs also required the rebuild tree option!  Completely shocked at this point, as these two drives are no more than a year old and are Seagate drives!  Never seen anything like this at all until this install with 2008.0 beta1!



Reproducible: Didn't try

Actual Results:  
I'm still in the process of cleaning things up.  My entire /boot folder is missing and pushed into lost+found.  Noticed awk was missing on reboot and init.d/rc scripts are screwed still.


I'm marking this as a blocker.  However, reproducing this bug is more than likely going to depend on other testers seeing a similar scenario for now.

In my 10 years on Linux, this is probably the worst bug I've seen with an installer so far.  I highly doubt it's due to bad blocks; it seems more likely due to an improper umount of the filesystems.  But even I haven't seen corruption to this extent on my accidental shutdowns in the past.
Comment 1 Roger 2008-04-09 08:30:56 UTC
I'll do my best to follow-up on this bug when others post here, but my production box is now down.  This was a pretty darn stable box too!
Comment 2 Roger 2008-04-09 09:54:39 UTC
OK.  My production host is back up.

Here's what was missing:

 - Entire /boot folder
 - Bin files awk, basename, umount, (depscan.sh?), mktemp, hostname, loadkeys

<shrugs>  All I've got for now.  I could be looking at two separate bugs here: one that is filesystem related (or from some other source), and another in the Installer's grub install code.  It doesn't look like the Installer touched the boot folder at all on /dev/sda2 here.  I'm also wondering if /dev/sda2 was even mounted during the install to /dev/sdb3?

The only thing I can think of, if more files are missing, is to re-emerge world.  Also, even though unlikely, watch for a failing /dev/sda -- one never knows!
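
Whether /dev/sda2 was mounted during an install is something that could be checked from a second terminal while the installer runs. A minimal sketch, assuming the Linux /proc/mounts layout (device, mount point, filesystem type, options, separated by whitespace); the helper names and the sample text are hypothetical and are not taken from the installer:

```python
def parse_mounts(mounts_text):
    """Map each device to a list of (mount_point, fs_type, options) tuples."""
    mounts = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type, options = fields[:4]
        mounts.setdefault(device, []).append(
            (mount_point, fs_type, options.split(","))
        )
    return mounts


def is_mounted_rw(mounts_text, device):
    """Return True if the device is listed with the 'rw' mount option."""
    entries = parse_mounts(mounts_text).get(device, [])
    return any("rw" in options for _, _, options in entries)


# Hypothetical sample resembling an install onto /dev/sdb3 (not real data).
MOUNTS_SAMPLE = """\
/dev/sdb3 /mnt/gentoo ext3 rw,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
"""

print(is_mounted_rw(MOUNTS_SAMPLE, "/dev/sdb3"))
print(is_mounted_rw(MOUNTS_SAMPLE, "/dev/sda2"))
```

On a live system the text would come from reading /proc/mounts instead of the sample. Note that the kernel may list a device under a different name than the one used on the command line, so a missing entry is a hint rather than proof that the partition was untouched.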
Comment 3 Roger 2008-04-09 10:00:20 UTC
Oh wow!  On the partition I installed 2008.0 beta1 to, I get a trillion of the following prompts <y>:

--- snip ---
# fsck.ext3 /dev/sdb3
e2fsck 1.40.6 (09-Feb-2008)
Resize inode not valid.  Recreate<y>? yes

/dev/sdb3 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Root inode is not a directory.  Clear<y>? yes

Reserved inode 4 (<The ACL data inode>) has invalid mode.  Clear<y>? yes

Inode 4, i_blocks is 114024, should be 0.  Fix<y>? yes

Reserved inode 6 (<The undelete directory inode>) has invalid mode.  Clear<y>? yes

Inode 6, i_blocks is 32, should be 0.  Fix<y>? yes

Inode 8, i_blocks is 0, should be 262416.  Fix<y>? 
--- snip ---

Somehow, I seriously doubt I have two failing (~150G/~350G)Seagate drives at the same time!

fsck is still going using '-y' option.  Somehow I doubt the install of 2008.0 is even usable at this point.

<shakes head> First time I've seen an install fail so miserably!
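
With this many prompts, a tally by proposed action gives a quicker picture of the damage than reading them one by one. A rough sketch, assuming the e2fsck output has been saved as text and that each prompt line ends in an action word followed by "?" (optionally with "<y>" and the answer), as in the snip above; the function name and the regular expression are illustrative only:

```python
import re
from collections import Counter

# Matches the action word at the end of a prompt line, e.g. "Clear<y>? yes".
PROMPT = re.compile(r"(\w+)(?:<y>)?\?\s*(?:yes|no)?\s*$")


def tally_prompts(fsck_output):
    """Count e2fsck prompts by the action they propose (Clear, Fix, ...)."""
    counts = Counter()
    for line in fsck_output.splitlines():
        match = PROMPT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# A few lines copied from the output above.
FSCK_SAMPLE = """\
Resize inode not valid.  Recreate<y>? yes
/dev/sdb3 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Root inode is not a directory.  Clear<y>? yes
Inode 4, i_blocks is 114024, should be 0.  Fix<y>? yes
Inode 8, i_blocks is 0, should be 262416.  Fix<y>?
"""

for action, count in sorted(tally_prompts(FSCK_SAMPLE).items()):
    print(action, count)
```

The same tally could be run on the output of a read-only check (e2fsck's -n option opens the filesystem read-only and answers "no" to every question), which avoids modifying a partition before deciding whether it is worth recovering.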
Comment 4 Roger 2008-04-09 10:18:56 UTC
--- snip ---
# mount /dev/sdb3 /mnt/test

# ls /mnt/test/
lost+found

# du -kh --max-depth=1
995M    ./lost+found
--- snip ---

LOL!
Comment 5 Preston Cody (RETIRED) gentoo-dev 2008-04-09 10:29:11 UTC
Hrm, ok a couple things to note here.
1. We don't support ReiserFS, and never will.
2. The advanced mode of the command-line installer will let you choose which drive's MBR to install GRUB to.  I think this is what you want, but I can't tell for sure.
3. I would recommend reinstalling rather than trying to recover.  You'll save time.

I think that the grub issue isn't related to some filesystem issue you're having.  Perhaps you chose the wrong filesystem and that is why you're seeing funky things?
Comment 6 Roger 2008-04-09 11:22:11 UTC
(In reply to comment #5)
> Hrm, ok a couple things to note here.
> 1. We don't support ReiserFS, and never will.

I don't understand this one. ReiserFS is provided by gentoo-sources and reiserfs is an option within the installer.

> 2. The advanced mode of the command-line installer will let you choose which
> drive's MBR to install GRUB to.  I think this is what you want, but I can't
> tell for sure.

More than likely, it does.  However, a person starting the install probably isn't going to know whether or not the installer is going to give them the opportunity to customize the Grub install module of the installer.  (Past history of popular Linux distro installers shows that they do allow customization whether or not expert mode was initially specified.  Mainly, for a non-expert mode install, the option is there during the grub install to make sure the menu.lst or lilo.conf file is in order.)

<shrugs>  Bug #208396 demonstrates this.

> 3. I would recommend reinstalling rather than trying to recover.  You'll save
> time.

As you may have noted in comment #4, that was the root filesystem of the 2008.0 beta1 install, and the entire root filesystem was moved to lost+found!

BTW, I did use ext3 for 2008.0 beta1 install.  Maybe this is what you were referring to, concerning supported file systems for 2008.0 beta1 -- only ext3 is currently supported?  Still, I vividly recall reiserfs & ext3 both being options.
 
 
> I think that the grub issue isn't related to some filesystem issue you're
> having.  Perhaps you chose the wrong filesystem and that is why you're seeing
> funky things?

No.  fdisk/cfdisk reported the filesystem as ext3.  fsck.ext3 would probably not have proceeded if it were not an ext3 filesystem.

I *should* mention here, I used the livecd-i686-installer-2008.0_beta1.iso.

My only guess currently is that maybe the kernel segfaulted during or just prior to umount, or the installer's grub-install segfaulted.  Even then, this must've been the biggest segfault hitting just the right G spot on this box!  I've never seen data corruption this massive.  Another thought is a libsata stability issue.  Right now, I'm kind of stumped as to how I can replicate this ... safely.  Using my spare IDE drives to replicate this is going to loop around using SATA drivers, etc.

For now, I'll kick-back and relax and hopefully somebody else will encounter this issue & hopefully provide a tidbit more data.



Comment 7 Roger 2008-04-09 11:31:09 UTC
In the past, I've had to make sure I compiled grub using no optimizations beyond -O2.  With further optimizations on my SMP Pentium3 box, grub got cranky, usually only refusing (with an error) to install to the MBR.

I don't want to speculate any further without specific logged debug data indicating that libsata/grub could be acting up.

...mmm.. wait a second, I did use an i686 livecd.  (Still need more data to verify this.  I can't risk installing on this box again. If I get time, I'll install on my pentium3 laptop.)
Comment 8 Andrew Gaffney (RETIRED) gentoo-dev 2008-04-09 13:07:02 UTC
The installer does not touch *any* partition that you don't tell it to. If you've got filesystem corruption all over, something completely unrelated to the installer happened. We can't help you.
Comment 9 Roger 2008-04-09 23:27:21 UTC
<shrugs>  What help?

I wasn't even asking for help, only documenting a critical/severe bug I encountered.  (To my knowledge, this is a bug reporting system and not a help channel.)

A more proper resolution for this bug would be to "postpone for more information needed" (obviously because the installer's log feature was unable to pull a log after the install) or to push it to the 2008.0 release team for info (as there could be a bug with libsata/grub).

(... was considering spending a few hours looking at the installer code and helping.  Leaving as closed due to the inability to obtain relevant log data.)