Most packages cannot be emerged while openmosix is running. When I stop openmosix, everything is OK. Some packages CAN be emerged under openmosix if the compilation processes do not migrate to other nodes, or if there is only one compilation process (MAKEOPTS="-j1" in make.conf).

Reproducible: Sometimes

Steps to Reproduce:
1. Start openmosix (/etc/init.d/openmosix start)
2. Emerge anything (e.g. emerge vim)

Actual Results:

emerge vim produces (tail of emerge log):

(...)
!!! ERROR: app-editors/vim-6.3 failed.
!!! Function src_compile, Line 260, Exitcode 2
!!! emake failed

------------------------------------------------
emerge screen produces (tail of emerge log):

(...)
/usr/lib/portage/bin/emake: line 14: 16375 Segmentation fault make ${MAKEOPTS} ${EXTRA_EMAKE} "$@"
!!! ERROR: app-misc/screen-4.0.2 failed.
!!! Function src_compile, Line 84, Exitcode 139
!!! emake failed

------------------------------------------------
emerge distcc produces (tail of emerge log):

(...)
make: *** read jobs pipe: No such file or directory. Stop.
make: *** Waiting for unfinished jobs....
!!! ERROR: sys-devel/distcc-2.13-r1 failed.
!!! Function src_compile, Line 67, Exitcode 2
!!! emake failed

Gentoo Base System version 1.4.16
Portage 2.0.50-r9 (default-x86-2004.0, gcc-3.3.3, glibc-2.3.3.20040420-r0, 2.4.26-openmosix-r4)
=================================================================
System uname: 2.4.26-openmosix-r4 i686 Intel(R) Pentium(R) 4 CPU 2.60GHz
distcc 2.13 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
Autoconf: sys-devel/autoconf-2.59-r3
Automake: sys-devel/automake-1.8.3
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-march=pentium4 -O3 -fomit-frame-pointer -mfpmath=sse"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/gconf:/etc/terminfo /etc/env.d"
CXXFLAGS="-march=pentium4 -O3 -fomit-frame-pointer -mfpmath=sse"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache sandbox"
GENTOO_MIRRORS="http://gentoo.prz.rzeszow.pl http://gentoo.zie.pg.gda.pl ftp://mirrors.sec.informatik.tu-darmstadt.de/gentoo/ http://gentoo.oregonstate.edu http://www.ibiblio.org/pub/Linux/distributions/gentoo"
MAKEOPTS="-j10"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY=""
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="NPTL X alsa apache2 apm arts avi berkdb bzlib crypt cups encode esd flash foomaticdb gdbm gif gnome gpm gtk gtk2 imlib jack java javascript jpeg kde libg++ libwww mad mikmod motif mozilla mpeg ncurses nls oggvorbis opengl oss pam pdflib perl php png postgres python qt quicktime readline samba sdl slang spell sse ssl svga tcpd truetype x86 xml2 xmms xosd xv zlib"

-----------------------------------------
My openmosix cluster consists of 4 nodes, each a P4 2.6GHz HT with 1GB RAM. All nodes are identical (the Gentoo installation was cloned to them by HDD replication). Applications like the stress test work fine, and load balancing works fine.
I haven't tested this cluster very hard, but it seems that only emerging/compiling crashes. I haven't noticed any other problems.
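The single-job workaround mentioned in the report can be tried per invocation instead of editing make.conf, because Portage also reads MAKEOPTS from the environment. The sketch below is not a fix, only a way to retry a failed build with one make job. The function name emerge_with_fallback and the EMERGE variable are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper: run the normal (parallel) build first and, if it
# fails, retry once with a single make job.
# EMERGE is an assumed override hook so the command can be swapped out.
EMERGE="${EMERGE:-emerge}"

emerge_with_fallback() {
    # First attempt uses whatever MAKEOPTS make.conf or the environment sets.
    if "$EMERGE" "$@"; then
        return 0
    fi
    echo "build failed, retrying with MAKEOPTS=-j1" >&2
    # Second attempt: force a single compilation process for this run only.
    MAKEOPTS="-j1" "$EMERGE" "$@"
}
```

Usage would be something like `emerge_with_fallback vim`. According to the report, builds with -j1 only succeed sometimes, so the retry can still fail.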
Created attachment 35813: output from omtest (openmosix stress test)
Can you show the disassembled output of the crashed function?

# ulimit -c unlimited
# emerge somethingBig

After the "Segmentation fault" you will have a core dump:

# gdb -c core.pid segfaultedProgram
(gdb) disassemble

(I bet segfaultedProgram will be python.)
Cannot reproduce with 2.4.26-openmosix-r5.
I tested these kernels:

2.4.26-openmosix-r3
2.4.26-openmosix-r4
2.4.26-openmosix-r5

All of them cause similar problems (described above). 2.4.26-openmosix-r5 additionally freezes my computer! The system hangs during compilation (only a hard reset helps), and only on the node where the emerge was started. Different kinds of errors appear (see the three examples above). It is not 100% reproducible: sometimes the emerge finishes successfully, sometimes it fails.

Core inspection:

# emerge screen
(...)
# cd /var/tmp/portage/screen-4.0.2/work/screen-4.0.2/
# gdb -c core.4273
GNU gdb 6.0
Copyright 2003 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu".
Core was generated by `make -j6'.
Program terminated with signal 11, Segmentation fault.
#0 0x4015bcc2 in ?? ()

--------------------------------------------
Unfortunately I'm not familiar with program debugging :( Should I attach the core file, or do something else with it? Please tell me what I should do.

Kwant!
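The "?? ()" frame is what gdb prints when it cannot map the crash address to a symbol, which is expected when the core is loaded without the executable that produced it. Since the banner says the core came from make, loading the make binary together with the core may give a more useful backtrace. The sketch below uses a command file with -batch and -x, which old gdb releases accept. The function names are hypothetical, and whether any symbols resolve depends on how the binaries and libraries were built.

```shell
#!/bin/sh
# Hypothetical helper: write the gdb commands to run against the core.
# $1: path of the command file to create
write_gdb_commands() {
    cat > "$1" <<'EOF'
bt
info registers
x/8i $pc
EOF
}

# Hypothetical helper: load executable + core and print the crash context.
# $1: executable that crashed, $2: core file
inspect_core() {
    cmds=$(mktemp) || return 1
    write_gdb_commands "$cmds"
    gdb -batch -x "$cmds" "$1" "$2"
    status=$?
    rm -f "$cmds"
    return $status
}
```

For the core above this would be called as `inspect_core "$(command -v make)" core.4273` from the screen work directory.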
Created attachment 36209: configuration of kernels 2.4.26-openmosix-r4 and 2.4.26-openmosix-r5

This is the kernel configuration on the tested cluster. Each node has the same kernel.
I don't know how familiar you are with openmosix, but it is intended for use on SSI (single system image) clusters. If the nodes in your cluster have different architectures, you will likely see problems. If your main node is an i686 but the other nodes have older CPUs, openmosix will migrate the cc calls to the older machines, which can't execute the code and will of course segfault.
As I have already mentioned, all 4 nodes have exactly the same hardware configuration. The library versions are the same too: I installed Gentoo once and replicated it to the remaining 3 nodes. So there are no hardware or software compatibility problems. I have tested the cluster with different kernels, but each time the kernels on all nodes were exactly (binary) the same. The last kernel I tested causes an additional problem: one node hangs. I have no logs or core from this crash. It was a serious hang where only a hard reset helped (the keyboard did not respond and I could not ping the machine).
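The claim that all nodes match can be checked mechanically. The sketch below compares the kernel release and the CPU flags reported by each node. It assumes password-less ssh to every node; the function names are hypothetical, and the node names in the usage line are placeholders.

```shell
#!/bin/sh
# Succeeds only if every line on stdin is identical (or stdin is empty).
all_same() {
    [ $(sort -u | wc -l) -le 1 ]
}

# Hypothetical helper: print one checksum line per node, built from the
# kernel release/machine and the first "flags" line of /proc/cpuinfo.
# Assumes password-less ssh; node names are passed as arguments.
node_fingerprints() {
    for node in "$@"; do
        ssh "$node" 'uname -rm; grep -m1 "^flags" /proc/cpuinfo' | md5sum
    done
}
```

A run would look like `node_fingerprints node1 node2 node3 node4 | all_same && echo "nodes match"`. This covers only the kernel and CPU features, not library versions.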