Bug 36328 - Booting with openmosix-sources-2.4.22-r2 damages
Summary: Booting with openmosix-sources-2.4.22-r2 damages
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: x86 Linux
Importance: High normal
Assignee: Gentoo Cluster Team
Depends on:
Reported: 2003-12-22 17:35 UTC by Ricardo Ferreira
Modified: 2010-09-10 18:59 UTC

The kernel config for 2.4.22-openmosix-r2 (.config,35.54 KB, text/plain)
2003-12-22 17:39 UTC, Ricardo Ferreira

Description Ricardo Ferreira 2003-12-22 17:35:14 UTC
Just booting into the mentioned kernel and waiting for the init process to end renders the lib unusable and by consequence also emerge (depends on python which depends on said lib).

Have to copy the lib over from another box to fix this.

Reproducible: Always
Steps to Reproduce:
1. Boot into openmosix-sources-2.4.22-r2
2. Wait for full init process to end
3. Run python

Actual Results:  
Python failed because it couldn't load the lib

Expected Results:  
Worked just as it does under vanilla-sources-2.4.22 (what i was running 
Comment 1 Ricardo Ferreira 2003-12-22 17:39:16 UTC
Created attachment 22563
The kernel config for 2.4.22-openmosix-r2
Comment 2 Ricardo Ferreira 2003-12-22 17:41:33 UTC
Here is the emerge info when running 2.4.22 vanilla (as I can't run emerge on openmosix).

Portage 2.0.49-r15 (default-x86-1.4, gcc-3.2.3, glibc-2.3.2-r3, 2.4.22)
System uname: 2.4.22 i686 Intel(R) Xeon(TM) CPU 2.40GHz
Gentoo Base System version
distcc 2.11.1 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ccache version 2.3 [enabled]
CFLAGS="-march=i686 -O2 -pipe"
CONFIG_PROTECT="/etc /var/qmail/control /usr/kde/2/share/config /usr/kde/3/share/config /usr/X11R6/lib/X11/xkb /usr/kde/cvs/share/config /usr/kde/3.1/share/config /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/config"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-march=i686 -O2 -pipe"
FEATURES="sandbox autoaddcvs ccache distcc notitles"
USE="x86 oss apm avi crypt cups encode foomaticdb gif jpeg libg++ mad mikmod mpeg ncurses nls pdflib png quicktime spell truetype xml2 xmms xv zlib gtkhtml alsa gdbm berkdb slang readline arts tetex aalib bonobo svga ggi tcltk java guile mysql X sdl gpm tcpd pam libwww ssl perl python esd imlib oggvorbis gnome gtk qt kde motif opengl mozilla gphoto2 cdr scanner gtk2"
Comment 3 Michael Imhof (RETIRED) gentoo-dev 2003-12-27 03:49:57 UTC
Could you investigate a little bit more?
I can't reproduce this problem over here.
Comment 4 Ricardo Ferreira 2003-12-27 08:08:18 UTC
Well, believe it or not, I left for the holidays and now I can't reproduce it either.

I'll say something if it appears again. I'll be putting openmosix-sources on other machines shortly.

This machine is a 2x P4 Xeon 2.4GHz with HT enabled, if it matters.
Comment 5 Michael Imhof (RETIRED) gentoo-dev 2003-12-27 08:35:44 UTC
Good. So I'll close this bug now. If you experience any problems, feel free to open another bug.
Comment 6 Ricardo Ferreira 2003-12-27 20:48:00 UTC
Hmm, just happened again. I rebooted, chose the openmosix kernel like before & after startup all programs that require were segfaulting.

I copied a spare I kept in another directory (on the same machine) over the system one, and everything is working again. So as far as I can tell, something in the init system / init scripts is messing with that lib.

Note that stopping/starting openmosix after the lib is busted has no effect; I have to copy it over.
Comment 7 Ricardo Ferreira 2003-12-27 21:39:10 UTC
Note that even after I fix it by copying the lib over, emerge still segfaults occasionally. I get occasional segfaults in everything that uses C++.
Comment 8 Michael Imhof (RETIRED) gentoo-dev 2003-12-28 01:58:08 UTC
Have you looked into the following bugs? and

Most of the problems are due to march/mcpu and non-homogeneous machines.
Read through the bugs, try out the solutions provided there and report back if this solves your problem.
Comment 9 Ricardo Ferreira 2003-12-28 12:31:23 UTC
I'm still having problems with Even before trying all the proposed solutions.

Surely a mismatch of capabilities/arch couldn't cause the following:

1. Copy known copy of over the systems /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/
2. Reboot a few times.
3. Every time I reboot, even if the lib can be loaded (not always), it comes up with a different md5sum from the lib I originally copied over it. This happens even if I give it the +i (immutable) attribute with chattr.

What could be causing this? It only happens on this machine. On the other one I only get the "normal" emerge segfaults.
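The checksum comparison in the steps above can be sketched as follows. This is only a minimal Python sketch, not what the reporter actually ran; the function names `md5_of` and `matches_reference` and both path arguments are hypothetical and would have to be filled in with the system copy and the spare copy of the library.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reference(system_copy, known_good_copy):
    """True if both files have the same MD5 digest (hypothetical helper)."""
    return md5_of(system_copy) == md5_of(known_good_copy)
```

Running `matches_reference` on the same pair of files after each reboot would show whether the system copy has drifted from the spare, which is the symptom described in step 3.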
Comment 10 Ricardo Ferreira 2003-12-29 09:15:17 UTC
The problem has been fixed. It was /etc/init.d/hdparm. Don't ask me why, but one of my HDDs didn't like what hdparm did and kept modifying the lib. HW problem.

Now I only get the occasional segfault.
Even after emerge -e system with "-mcpu=i686 -march=i686 -O2 -pipe" as CFLAGS.

The machines are:
1. 2 x Xeon 2.4GHz with HT enabled. 512MB DDR
2. Duron 850MHz. 256MB SDRAM

With those CFLAGS, gcc shouldn't use any special features that one of the machines doesn't have. Or will it?
Comment 11 Ricardo Ferreira 2004-01-04 16:16:47 UTC
OK, I changed CFLAGS to "-mcpu=i686 -O2 -pipe". CHOST is "i686-pc-linux-gnu".

emerge -e system && emerge -e world

and emerge still segfaults on the slower machine (the Duron). But it doesn't segfault if I tell OM to run it on the Xeon node:

mosrun -2 emerge

2 is the Xeon node id. If the problem is differing capabilities on the CPUs, then it should segfault.

I really am out of ideas.
Comment 12 Michael Imhof (RETIRED) gentoo-dev 2004-01-05 02:53:49 UTC
The problem only occurs if, for example, you want to compile something for the Xeon, but openMosix balances the gcc calls to be executed on the Duron. Then it segfaults.

I assume node 1 is the Duron and node 2 the Xeon.
You write that when you run it on the Xeon, it works like a charm.
Try running it on the Duron (mosrun -1) and see if it segfaults then.

The Xeon has all the capabilities of the Duron plus some others. So running anything on the Xeon should work. Running some things on the Duron may segfault because of missing capabilities.
Look into /proc/cpuinfo:

flags           : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 tm pbe tm2 est

Those flags (just an example over here) should differ on your nodes, where the Xeon has more flags than the Duron.
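The flag comparison suggested above could be done along these lines. This is a minimal Python sketch that assumes the contents of /proc/cpuinfo from each node have been saved as text; the function names `cpu_flags` and `missing_flags` are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found in the given text.
    return set()


def missing_flags(build_node_text, run_node_text):
    """Flags the build node has that the run node lacks (hypothetical helper)."""
    return cpu_flags(build_node_text) - cpu_flags(run_node_text)
```

A non-empty result from `missing_flags` would mean that code built to use every feature of the first node could hit an instruction the second node does not support. An empty result would point away from a capability mismatch.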
Comment 13 Ricardo Ferreira 2004-01-05 17:49:35 UTC
It doesn't need to be running a compile to segfault. Just running emerge with no args segfaults sometimes.

2: Xeon
5: Duron
(I use these ids because they're derived from the IP address)

On the Xeon node:

mosrun -5 emerge sync -> No segfault
mosrun -2 emerge sync -> No segfault
emerge sync           -> No segfault

On the Duron node:

mosrun -2 emerge sync -> No segfault
mosrun -5 emerge sync -> No segfault
emerge sync           -> Segfault

Note that it might not segfault right away, but it will eventually segfault during the sync process.
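Because the segfault is intermittent, a test matrix like the one above is easier to trust when each command is run several times. The following is a minimal Python sketch under that assumption; the function name `count_segfaults` and the `runs` parameter are hypothetical. It relies on `subprocess` reporting a child killed by a signal as a negative return code on POSIX systems.

```python
import signal
import subprocess


def count_segfaults(command, runs=5):
    """Run `command` (a list of arguments) `runs` times and count SIGSEGV deaths."""
    segfaults = 0
    for _ in range(runs):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, a child terminated by signal N has returncode == -N.
        if result.returncode == -signal.SIGSEGV:
            segfaults += 1
    return segfaults
```

For example, `count_segfaults(["mosrun", "-2", "emerge", "sync"])` and `count_segfaults(["emerge", "sync"])` could be compared on each node. Note that a segfault inside a child process that emerge itself spawns may not show up in the return code of the top-level command.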