Simply booting into the mentioned kernel and waiting for the init process to finish renders the lib unusable, and as a consequence emerge as well (it depends on python, which depends on said lib). I have to copy the lib over from another box to fix this.

Reproducible: Always

Steps to Reproduce:
1. Boot into openmosix-sources-2.4.22-r2
2. Wait for the full init process to end
3. Run python

Actual Results:
Python failed because it couldn't load the libstdc++.so.5 lib.

Expected Results:
Worked just as it does under vanilla-sources-2.4.22 (what I was running before).
Created attachment 22563: The kernel config for 2.4.22-openmosix-r2
Here is emerge info when running 2.4.22 vanilla (as I can't run emerge on openmosix).

Portage 2.0.49-r15 (default-x86-1.4, gcc-3.2.3, glibc-2.3.2-r3, 2.4.22)
=================================================================
System uname: 2.4.22 i686 Intel(R) Xeon(TM) CPU 2.40GHz
Gentoo Base System version 1.4.3.10
distcc 2.11.1 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ccache version 2.3 [enabled]
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-march=i686 -O2 -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /var/qmail/control /usr/kde/2/share/config /usr/kde/3/share/config /usr/X11R6/lib/X11/xkb /usr/kde/cvs/share/config /usr/kde/3.1/share/config /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/config"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-march=i686 -O2 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="sandbox autoaddcvs ccache distcc notitles"
GENTOO_MIRRORS=" http://ftp.belnet.be/mirror/rsync.gentoo.org/gentoo/ ftp://ftp.belnet.be/mirror/rsync.gentoo.org/gentoo/ http://212.219.56.162/sites/www.ibiblio.org/gentoo/ http://194.83.57.2/sites/www.ibiblio.org/gentoo/ http://194.83.57.3/sites/www.ibiblio.org/gentoo/"
MAKEOPTS="-j12"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="x86 oss apm avi crypt cups encode foomaticdb gif jpeg libg++ mad mikmod mpeg ncurses nls pdflib png quicktime spell truetype xml2 xmms xv zlib gtkhtml alsa gdbm berkdb slang readline arts tetex aalib bonobo svga ggi tcltk java guile mysql X sdl gpm tcpd pam libwww ssl perl python esd imlib oggvorbis gnome gtk qt kde motif opengl mozilla gphoto2 cdr scanner gtk2"
Could you investigate a little bit more? I can't reproduce this problem over here.
Well, believe it or not, I left for the holidays and now I can't reproduce it either. I'll report back if it appears again. I'll be putting openmosix-sources on other machines shortly. This machine is a 2x P4 Xeon 2.4GHz with HT enabled, if it matters.
Good. So I'll close this bug now. If you experience any problems, feel free to open another bug.
Hmm, it just happened again. I rebooted, chose the openmosix kernel like before, and after startup all programs that require libstdc++.so.5.0.3 were segfaulting. I copied a spare I had kept in another dir (on the same machine) over the system one, and everything is working again. So as far as I can tell, something in the init system / init scripts is messing with that lib. Note that stopping/starting openmosix after the lib is busted has no effect; I have to copy it over.
Note that even after I fix it by copying the lib over, emerge still segfaults occasionally. I get occasional segfaults in everything that uses C++.
Have you looked into the following bugs?

http://bugs.gentoo.org/show_bug.cgi?id=27615
http://bugs.gentoo.org/show_bug.cgi?id=17343

Most of the problems are due to march/mcpu and non-homogeneous machines. Read through the bugs, try out the solutions provided there, and report back whether this solves your problem.
I'm still having problems with libstdc++.so.5.0.3. Even before trying all the proposed solutions: surely a mismatch of capabilities/arch couldn't cause the following?

1. Copy a known-good copy of libstdc++.so.5.0.3 over the system's /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/libstdc++.so.5.0.3
2. Reboot a few times.
3. Every time I reboot, even if the lib can be loaded (not always), it comes up with a different md5sum from the lib I originally copied over it. This happens even if I give it the +i (immutable) attribute with chattr.

What could be causing this? It only happens on this machine. On the other one I only get the "normal" emerge segfaults.
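For anyone who wants to repeat the check in step 3 after each reboot, here is a minimal sketch in Python. The helper names `md5_of_file` and `first_difference` are made up for illustration and are not part of any tool mentioned in this bug. It compares the system lib against a spare copy and, if they differ, reports the first byte offset where they diverge, which may help tell random disk corruption from a deliberate rewrite.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a != chunk_b:
                for i, (x, y) in enumerate(zip(chunk_a, chunk_b)):
                    if x != y:
                        return offset + i
                # One file is a prefix of the other within this chunk.
                return offset + min(len(chunk_a), len(chunk_b))
            if not chunk_a:
                # Both files ended at the same point with no difference.
                return None
            offset += len(chunk_a)
```

Usage would be along the lines of `md5_of_file("/usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/libstdc++.so.5.0.3")` compared with the digest of the spare copy (wherever it is kept), followed by `first_difference(system_lib, spare_copy)` when the digests disagree.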
The libstdc++.so problem has been fixed. It was /etc/init.d/hdparm. Don't ask me why, but one of my HDDs didn't like what hdparm did and kept modifying the lib. HW problem.

Now I only get the occasional segfault, even after emerge -e system with "-mcpu=i686 -march=i686 -O2 -pipe" as CFLAGS. The machines are:

1. 2 x Xeon 2.4GHz with HT enabled, 512MB DDR
2. Duron 850MHz, 256MB SDRAM

With those CFLAGS, gcc shouldn't use any special features that one of the machines doesn't have. Or will it?
OK, I changed CFLAGS to "-mcpu=i686 -O2 -pipe". CHOST is "i686-pc-linux-gnu". I ran:

emerge -e system && emerge -e world

and emerge still segfaults on the slower machine (the Duron). But it doesn't segfault if I tell OM to run it on the Xeon node:

mosrun -2 emerge

where 2 is the Xeon node id. If the problem is differing capabilities on the CPUs, then it should segfault. I'm really out of ideas.
The problem only occurs if, e.g., you want to compile something for the Xeon but openmosix balances the gcc calls to be executed on the Duron; then it segfaults. I assume node 1 is the Duron and node 2 the Xeon. You write that when you run it on the Xeon, it works like a charm. Try running it on the Duron (mosrun -1) and see if it segfaults then.

The Xeon has all the capabilities of the Duron plus some others, so running anything on the Xeon should work. Running some things on the Duron may segfault because of missing capabilities. Look into /proc/cpuinfo:

flags : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 tm pbe tm2 est

Those flags (just an example from over here) should differ on your nodes, with the Xeon having more flags than the Duron.
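To make the flag comparison less error-prone than eyeballing two long lines, a small Python sketch along these lines could be used. The function names `cpu_flags` and `missing_on` are illustrative only. It takes the text of /proc/cpuinfo from each node (e.g. copied into two files) and lists the flags one node has that the other lacks. Checking both directions is worthwhile, since CPUs from different vendors may each carry flags the other does not.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (unexpected on x86 Linux).
    return set()


def missing_on(target_text, source_text):
    """Return the sorted flags present on the source node but absent on the target."""
    return sorted(cpu_flags(source_text) - cpu_flags(target_text))
```

With the two cpuinfo dumps read into strings `xeon` and `duron`, `missing_on(duron, xeon)` gives what code built for the Xeon might rely on that the Duron cannot execute, and `missing_on(xeon, duron)` gives the reverse.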
It doesn't need to be running a compile to segfault. Just running emerge with no args segfaults sometimes.

2: Xeon
5: Duron
(I use these ids because they're derived from the IP address.)

On the Xeon node:
mosrun -5 emerge sync -> No segfault
mosrun -2 emerge sync -> No segfault
emerge sync -> No segfault

On the Duron node:
mosrun -2 emerge sync -> No segfault
mosrun -5 emerge sync -> No segfault
emerge sync -> Segfault

Note that it might not segfault right away, but it will eventually segfault during the sync process.
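Since the segfault is intermittent, a single run per combination may not say much. A rough Python sketch for repeating a command and counting the crashes is below. The name `count_segfaults` is hypothetical, and it relies on the POSIX convention that a child killed by a signal reports a negative return code.

```python
import signal
import subprocess


def count_segfaults(command, runs=10):
    """Run `command` `runs` times and return how many runs died from SIGSEGV."""
    segfaults = 0
    for _ in range(runs):
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # On POSIX, a return code of -N means the child was killed by signal N.
        if result.returncode == -signal.SIGSEGV:
            segfaults += 1
    return segfaults
```

For example, `count_segfaults(["mosrun", "-5", "emerge", "sync"], runs=5)` on each node would fill in the table above with counts instead of single observations. Repeated syncs put load on the rsync mirror, so a cheaper command that still triggers the crash would be a better candidate if one is found.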