Both my machines have the same versions of qt (3.2.3), gcc (3.2.3-r3) and glibc (2.3.2-r3). When emerging kde using distcc, the compilation fails:

In file included from /usr/qt/3/include/qtoolbar.h:42,
                 from /usr/kde/3.1/include/ktoolbar.h:27,
                 from kuickshow.cpp:52:
/usr/qt/3/include/qdockwindow.h:161:8: warning: null character(s) ignored
In file included from /usr/qt/3/include/qtoolbar.h:42,
                 from /usr/kde/3.1/include/ktoolbar.h:27,
                 from kuickshow.cpp:52:
/usr/qt/3/include/qdockwindow.h:161: syntax error before numeric constant
/usr/qt/3/include/qdockwindow.h:161: stray '\300' in program
/usr/qt/3/include/qdockwindow.h:161:16: warning: null character(s) ignored
distcc[10915] ERROR: compile on pbienst failed
make[3]: *** [kuickshow.lo] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: Leaving directory `/var/tmp/portage/kdegraphics-3.1.4/work/kdegraphics-3.1.4/kuickshow/src'

qdockwindow.h looks perfectly normal and is the same on both machines. Turning off distcc fixes the problem.

Reproducible: Always
Hi. Please provide the information listed at http://distcc.samba.org/problems.html and also the 'emerge info' output for the machines being used.
These things seem hard to reproduce exactly. I did another run, which finished without any problem, presumably because the offending file got compiled locally rather than remotely. I then removed 'localhost' from the distcc hosts, did another run, and now I get a different error, but also one involving stray characters. It almost looks as if the preprocessed source didn't get transmitted correctly. This is the end of the verbose log:

distcc[22618] (dcc_note_state) note state 2, file "slideshowwidget.cpp", host "pbienst"
distcc[22618] (dcc_connect_timed) nonblocking connect to 192.168.2.37:3632
distcc[22618] (dcc_select_for_write) select for write on fd4
distcc[22618] (dcc_connect_by_addr) client got connection to pbienst port 3632 on fd4
distcc[22618] (dcc_x_token_int) send DIST00000001
distcc[22618] (dcc_x_token_int) send ARGC00000018
distcc[22618] (dcc_x_token_int) send ARGV00000003
distcc[22618] (dcc_x_token_int) send ARGV00000012
distcc[22618] (dcc_x_token_int) send ARGV0000000e
distcc[22618] (dcc_x_token_int) send ARGV00000007
distcc[22618] (dcc_x_token_int) send ARGV00000005
distcc[22618] (dcc_x_token_int) send ARGV00000009
distcc[22618] (dcc_x_token_int) send ARGV00000002
distcc[22618] (dcc_x_token_int) send ARGV0000000f
distcc[22618] (dcc_x_token_int) send ARGV0000000f
distcc[22618] (dcc_x_token_int) send ARGV00000005
distcc[22618] (dcc_x_token_int) send ARGV0000000c
distcc[22618] (dcc_x_token_int) send ARGV0000000c
distcc[22618] (dcc_x_token_int) send ARGV00000003
distcc[22618] (dcc_x_token_int) send ARGV00000003
distcc[22618] (dcc_x_token_int) send ARGV0000000f
distcc[22618] (dcc_x_token_int) send ARGV0000000e
distcc[22618] (dcc_x_token_int) send ARGV00000014
distcc[22618] (dcc_x_token_int) send ARGV00000005
distcc[22618] (dcc_x_token_int) send ARGV0000000f
distcc[22618] (dcc_x_token_int) send ARGV0000000e
distcc[22618] (dcc_x_token_int) send ARGV00000002
distcc[22618] (dcc_x_token_int) send ARGV00000013
distcc[22618] (dcc_x_token_int) send ARGV00000002
distcc[22618] (dcc_x_token_int) send ARGV00000011 distcc[22618] (dcc_note_state) note state 3, file "(NULL)", host "(NULL)" distcc[22618] (dcc_collect_child) cpp child 22619 terminated with status 0 distcc[22618] (dcc_collect_child) cpp times: user 0.282000s, system 0.142000s, 35944 minflt, 707 majflt distcc[22618] cpp on localhost completed ok distcc[22618] (dcc_note_state) note state 4, file "(NULL)", host "(NULL)" distcc[22618] (dcc_x_file) send 1123720 byte file /var/tmp/portage/kdegraphics-3.1.4/temp/distcc_7820e682.ii with token DOTI distcc[22618] (dcc_x_token_int) send DOTI00112588 distcc[22618] (dcc_send_job) client finished sending request to server distcc[22618] (dcc_note_state) note state 5, file "(NULL)", host "pbienst" distcc[22618] (dcc_r_token_int) got DONE00000001 distcc[22618] (dcc_r_result_header) got response header distcc[22618] (dcc_note_state) note state 6, file "(NULL)", host "(NULL)" distcc[22618] (dcc_r_token_int) got STAT00000000 distcc[22618] (dcc_r_token_int) got SERR00000000 distcc[22618] (dcc_r_token_int) got SOUT00000000 distcc[22618] (dcc_r_token_int) got DOTO00004130 distcc[22618] (dcc_r_file) received 16688 bytes to file slideshowwidget.o distcc[22618] (dcc_r_file_timed) 16688 bytes received in 0.000852s, rate 19128kB/s distcc[22618] (dcc_unlock) release lock fd3 distcc[22618] compile on pbienst completed ok distcc[22618] elapsed compilation time 3.138399s distcc[22618] (dcc_exit) exit: code 0; self: 0.001000 user 0.011000 sys; children: 0.282000 user 0.142000 sys distcc[22618] (dcc_cleanup_tempfiles) deleted 1 temporary files /usr/qt/3/bin/moc ./printing.h -o printing.moc /bin/sh ../../libtool --silent --mode=compile --tag=CXX g++ -DHAVE_CONFIG_H -I. -I. -I../.. 
-I/usr/kde/3.1/include -I/usr/qt/3/include -I/usr/X11R6/include -DQT_THREAD_SUPPORT -D_REENTRANT -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -Wcast-align -Wconversion -DNDEBUG -DNO_DEBUG -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -DQT_CLEAN_NAMESPACE -DQT_NO_ASCII_CAST -c -o printing.lo `test -f 'printing.cpp' || echo './'`printing.cpp distcc[22771] (dcc_trace_version) distcc 2.11.1 i686-pc-linux-gnu; built Dec 21 2003 10:17:18 distcc[22771] (dcc_recursion_safeguard) safeguard level=0 distcc[22771] (dcc_set_path) setting PATH=/sbin:/usr/sbin:/usr/lib/portage/bin:/bin:/usr/bin:/usr/local/bin:/opt/bin:/usr/i686-pc-linux-gnu/gcc-bin/3.2:/usr/X11R6/bin:/opt/blackdown-jdk-1.4.1/bin:/opt/blackdown-jdk-1.4.1/jre/bin:/usr/qt/3/bin:/usr/kde/3.1/sbin:/usr/kde/3.1/bin distcc[22771] (dcc_scan_args) scanning arguments: g++ -DHAVE_CONFIG_H -I. -I. -I../.. -I/usr/kde/3.1/include -I/usr/qt/3/include -I/usr/X11R6/include -DQT_THREAD_SUPPORT -D_REENTRANT -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -Wcast-align -Wconversion -DNDEBUG -DNO_DEBUG -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -DQT_CLEAN_NAMESPACE -DQT_NO_ASCII_CAST -c printing.cpp -DPIC distcc[22771] (dcc_scan_args) found input file "printing.cpp" distcc[22771] (dcc_scan_args) no visible output file, going to add "-o printing.o" at end distcc[22771] compile from printing.cpp to printing.o distcc[22771] (dcc_get_hostlist) not reading /var/tmp/portage/kdegraphics-3.1.4/temp/fakehome/.distcc/hosts: No such file or directory distcc[22771] (dcc_parse_hosts_file) load hosts from /etc/distcc/hosts distcc[22771] (dcc_parse_hosts) found tcp token "pbienst" distcc[22771] (dcc_lock_host) got cpu lock on pbienst slot 0 as fd3 distcc[22771] 
(dcc_strip_dasho) result: g++ -DHAVE_CONFIG_H -I. -I. -I../.. -I/usr/kde/3.1/include -I/usr/qt/3/include -I/usr/X11R6/include -DQT_THREAD_SUPPORT -D_REENTRANT -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -Wcast-align -Wconversion -DNDEBUG -DNO_DEBUG -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -DQT_CLEAN_NAMESPACE -DQT_NO_ASCII_CAST -c printing.cpp -DPIC distcc[22771] (dcc_spawn_child) forking to execute: g++ -DHAVE_CONFIG_H -I. -I. -I../.. -I/usr/kde/3.1/include -I/usr/qt/3/include -I/usr/X11R6/include -DQT_THREAD_SUPPORT -D_REENTRANT -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -Wcast-align -Wconversion -DNDEBUG -DNO_DEBUG -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -DQT_CLEAN_NAMESPACE -DQT_NO_ASCII_CAST -E printing.cpp -DPIC distcc[22771] (dcc_spawn_child) child started as pid22772 distcc[22771] (dcc_strip_local_args) result: g++ -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -Wcast-align -Wconversion -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -c printing.cpp -o printing.o distcc[22771] exec on pbienst: g++ -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -Wcast-align -Wconversion -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -c printing.cpp -o printing.o distcc[22771] (dcc_note_state) note state 2, file "printing.cpp", host "pbienst" distcc[22771] (dcc_connect_timed) nonblocking connect to 192.168.2.37:3632 distcc[22771] (dcc_select_for_write) select for write on fd4 distcc[22772] (dcc_increment_safeguard) setting safeguard: _DISTCC_SAFEGUARD=1 distcc[22771] (dcc_connect_by_addr) 
client got connection to pbienst port 3632 on fd4 distcc[22771] (dcc_x_token_int) send DIST00000001 distcc[22771] (dcc_x_token_int) send ARGC00000018 distcc[22771] (dcc_x_token_int) send ARGV00000003 distcc[22771] (dcc_x_token_int) send ARGV00000012 distcc[22771] (dcc_x_token_int) send ARGV0000000e distcc[22771] (dcc_x_token_int) send ARGV00000007 distcc[22771] (dcc_x_token_int) send ARGV00000005 distcc[22771] (dcc_x_token_int) send ARGV00000009 distcc[22771] (dcc_x_token_int) send ARGV00000002 distcc[22771] (dcc_x_token_int) send ARGV0000000f distcc[22771] (dcc_x_token_int) send ARGV0000000f distcc[22771] (dcc_x_token_int) send ARGV00000005 distcc[22771] (dcc_x_token_int) send ARGV0000000c distcc[22771] (dcc_x_token_int) send ARGV0000000c distcc[22771] (dcc_x_token_int) send ARGV00000003 distcc[22771] (dcc_x_token_int) send ARGV00000003 distcc[22771] (dcc_x_token_int) send ARGV0000000f distcc[22771] (dcc_x_token_int) send ARGV0000000e distcc[22771] (dcc_x_token_int) send ARGV00000014 distcc[22771] (dcc_x_token_int) send ARGV00000005 distcc[22771] (dcc_x_token_int) send ARGV0000000f distcc[22771] (dcc_x_token_int) send ARGV0000000e distcc[22771] (dcc_x_token_int) send ARGV00000002 distcc[22771] (dcc_x_token_int) send ARGV0000000c distcc[22771] (dcc_x_token_int) send ARGV00000002 distcc[22771] (dcc_x_token_int) send ARGV0000000a distcc[22771] (dcc_note_state) note state 3, file "(NULL)", host "(NULL)" distcc[22771] (dcc_collect_child) cpp child 22772 terminated with status 0 distcc[22771] (dcc_collect_child) cpp times: user 0.361000s, system 0.205000s, 49056 minflt, 839 majflt distcc[22771] cpp on localhost completed ok distcc[22771] (dcc_note_state) note state 4, file "(NULL)", host "(NULL)" distcc[22771] (dcc_x_file) send 1430733 byte file /var/tmp/portage/kdegraphics-3.1.4/temp/distcc_c03ae686.ii with token DOTI distcc[22771] (dcc_x_token_int) send DOTI0015d4cd distcc[22771] (dcc_send_job) client finished sending request to server distcc[22771] (dcc_note_state) 
note state 5, file "(NULL)", host "pbienst" distcc[22771] (dcc_r_token_int) got DONE00000001 distcc[22771] (dcc_r_result_header) got response header distcc[22771] (dcc_note_state) note state 6, file "(NULL)", host "(NULL)" distcc[22771] (dcc_r_token_int) got STAT00000100 distcc[22771] (dcc_r_token_int) got SERR00000e64 In file included from /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/map:66, from /usr/qt/3/include/qmap.h:51, from /usr/qt/3/include/qmime.h:43, from /usr/qt/3/include/qevent.h:45, from /usr/qt/3/include/qobject.h:45, from /usr/qt/3/include/qwidget.h:43, from /usr/qt/3/include/qbutton.h:42, from /usr/qt/3/include/qcheckbox.h:42, from printing.cpp:19: /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h: In member function `std::_Rb_tree_iterator<_Val, _Val&, _Val*> std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare, _Alloc>::insert_equal(std::_Rb_tree_iterator<_Val, _Val&, _Val*>, const _Val&)': /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h:1110: stray '\10' in program /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h:1110: stray '\4' in program /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h:1110: stray '\220' in program /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h:1110: parse error before `@' token /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h:1110: syntax error before `.' 
token In file included from printing.cpp:27: /usr/qt/3/include/qpainter.h:650:45: warning: null character(s) ignored In file included from printing.cpp:27: /usr/qt/3/include/qpainter.h: At global scope: /usr/qt/3/include/qpainter.h:650: stray '\262' in program /usr/qt/3/include/qpainter.h:651: parse error before `@' token /usr/qt/3/include/qpainter.h:651:4: warning: null character(s) ignored /usr/qt/3/include/qpainter.h:651: syntax error before `&' token /usr/qt/3/include/qpainter.h:653: prototype for `void QPainter::drawTiledPixmap(...)' does not match any in class `QPainter' /usr/qt/3/include/qpainter.h:231: candidates are: void QPainter::drawTiledPixmap(const QRect&, const QPixmap&) /usr/qt/3/include/qpainter.h:230: void QPainter::drawTiledPixmap(const QRect&, const QPixmap&, const QPoint&) /usr/qt/3/include/qpainter.h:228: void QPainter::drawTiledPixmap(int, int, int, int, const QPixmap&, int = 0, int = 0) /usr/qt/3/include/qpainter.h: In member function `void QPainter::drawTiledPixmap(...)': /usr/qt/3/include/qpainter.h:654: `r' undeclared (first use this function) /usr/qt/3/include/qpainter.h:654: (Each undeclared identifier is reported only once for each function it appears in.) 
/usr/qt/3/include/qpainter.h:654: `pm' undeclared (first use this function) /usr/qt/3/include/qpainter.h:654: `sp' undeclared (first use this function) In file included from /usr/include/Imlib_types.h:1, from /usr/include/Imlib.h:4, from imlibwidget.h:33, from imagewindow.h:26, from printing.cpp:40: /usr/X11R6/include/X11/Xlib.h: At global scope: /usr/X11R6/include/X11/Xlib.h:1298: stray '\240' in program /usr/X11R6/include/X11/Xlib.h:1298: stray '\231' in program /usr/X11R6/include/X11/Xlib.h:1298: stray '\231' in program /usr/X11R6/include/X11/Xlib.h:1298: parse error before `@' token /usr/X11R6/include/X11/Xlib.h:1298: syntax error before `encoding_is_wchar' printing.cpp: In member function `void KuickPrintDialogPage::setScaleWidth(int)': printing.cpp:302: warning: passing `float' for argument 1 of `void KIntNumInput::setValue(int)' printing.cpp: In member function `void KuickPrintDialogPage::setScaleHeight(int)': printing.cpp:307: warning: passing `float' for argument 1 of `void KIntNumInput::setValue(int)' distcc[22771] (dcc_r_token_int) got SOUT00000000 distcc[22771] (dcc_r_token_int) got DOTO00000000 distcc[22771] (dcc_unlock) release lock fd3 distcc[22771] ERROR: compile on pbienst failed distcc[22771] elapsed compilation time 4.195878s distcc[22771] (dcc_exit) exit: code 1; self: 0.004000 user 0.008000 sys; children: 0.361000 user 0.205000 sys distcc[22771] (dcc_cleanup_tempfiles) deleted 1 temporary files make[3]: *** [printing.lo] Error 1 make[3]: Leaving directory `/var/tmp/portage/kdegraphics-3.1.4/work/kdegraphics-3.1.4/kuickshow/src' make[2]: *** [all-recursive] Error 1 make[2]: Leaving directory `/var/tmp/portage/kdegraphics-3.1.4/work/kdegraphics-3.1.4/kuickshow' make[1]: *** [all-recursive] Error 1 make[1]: Leaving directory `/var/tmp/portage/kdegraphics-3.1.4/work/kdegraphics-3.1.4' make: *** [all] Error 2 emerge info for the remote host: Portage 2.0.49-r15 (default-x86-1.4, gcc-3.2.3, glibc-2.3.2-r3, 2.4.22-gentoo-test-r1) 
=================================================================
System uname: 2.4.22-gentoo-test-r1 i686 Intel(R) Pentium(R) 4 CPU 2.40GHz
Gentoo Base System version 1.4.3.10p1
distcc 2.11.1 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-march=pentium4 -O3 -funroll-loops -fomit-frame-pointer -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /var/qmail/control /usr/kde/2/share/config /usr/kde/3/share/config /usr/X11R6/lib/X11/xkb /usr/kde/cvs/share/config /usr/kde/3.1/share/config /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/config"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-march=pentium4 -O3 -funroll-loops -fomit-frame-pointer -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="sandbox ccache autoaddcvs distcc"
GENTOO_MIRRORS="ftp://ftp.snt.utwente.nl/pub/os/linux/gentoo/ "
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="x86 oss avi foomaticdb gif gtk2 libg++ mad mikmod nls gtkhtml gdbm berkdb slang arts aalib bonobo guile tcpd libwww esd -ldap mysql qt usb p44da motif alsa acpi atlas apm cdr crypt cups dga directfb dvd encode gphoto2 flash gpm qtmt imap imlib java jpeg jikes kde mpeg mmx ncurses opengl oggvorbis pam pda pdflib perl plotutils pic png pnp python quicktime readline samba sdl spell sse ssl tcltk svga tetex tiff truetype wmf xml xml2 xmms xv zlib X gtk gnome"

emerge info for the local host:

Portage 2.0.49-r15 (default-x86-1.4, gcc-3.2.3, glibc-2.3.2-r3, 2.4.20-gentoo-r9)
=================================================================
System uname: 2.4.20-gentoo-r9 i686 Intel(R) Pentium(R) M processor 1000MHz
Gentoo Base System version 1.4.3.10
distcc[22896] (dcc_trace_version) distcc 2.11.1 i686-pc-linux-gnu; built Dec 21 2003 10:17:18 [enabled]
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes" CFLAGS="-O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe" CHOST="i686-pc-linux-gnu" COMPILER="gcc3" CONFIG_PROTECT="/etc /var/qmail/control /usr/kde/2/share/config /usr/kde/3/share/config /usr/X11R6/lib/X11/xkb /usr/kde/3.1/share/config /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/config" CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d" CXXFLAGS="-O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe" DISTDIR="/usr/portage/distfiles" FEATURES="sandbox ccache autoaddcvs distcc" GENTOO_MIRRORS="http://212.219.247.15/sites/www.ibiblio.org/gentoo/ http://212.219.247.18/sites/www.ibiblio.org/gentoo/ http://212.219.247.11/sites/www.ibiblio.org/gentoo/ http://212.219.247.12/sites/www.ibiblio.org/gentoo/" MAKEOPTS="-j3" PKGDIR="/usr/portage/packages" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="/usr/local/portage" SYNC="rsync://rsync.gentoo.org/gentoo-portage" USE="x86 oss apm foomaticdb gpm gtk2 libg++ mad mikmod ncurses nls png gdbm berkdb slang svga sdl tcpd libwww perl gtk motif X qt kde -gnome acpi alsa arts atlas avi bidi cdr crypt cups dga directfb dvd encode emacs ethereal gif gphoto2 guile imap imlib jpeg java ldap lirc mmx mpeg mpi mysql oggvorbis opengl pam pda pcmcia pdflib plotutils pnp python quicktime readline samba spell sse ssl tcltk tetex tiff truetype unicode usb videos wmf zlib xv xml2 xmms" distcc version is 2.11.1
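A note for anyone reading the verbose trace above: each token in the log looks like a four-letter name followed by eight hexadecimal digits, and for DOTI and DOTO the value matches the byte count printed on the neighbouring line (1123720 and 16688 bytes). Below is a minimal Python sketch for decoding such tokens from a log line. It assumes exactly that layout, as inferred from this log only. The function names are made up for illustration, and reading STAT as a wait()-style status is a guess that merely agrees with the "exit: code 1" line.

```python
import re

# Assumed layout, inferred from the trace: 4 uppercase letters + 8 hex digits.
TOKEN_RE = re.compile(r"\b([A-Z]{4})([0-9a-f]{8})\b")


def parse_token(text):
    """Return (name, value) for the first token found in a log line."""
    match = TOKEN_RE.search(text)
    if match is None:
        raise ValueError(f"no token found in {text!r}")
    return match.group(1), int(match.group(2), 16)


def exit_code_from_stat(value):
    """Assumption: STAT holds a wait()-style status, exit code in bits 8-15."""
    return (value >> 8) & 0xFF


if __name__ == "__main__":
    samples = [
        "distcc[22618] (dcc_x_token_int) send DOTI00112588",
        "distcc[22771] (dcc_x_token_int) send DOTI0015d4cd",
        "distcc[22771] (dcc_r_token_int) got STAT00000100",
    ]
    for line in samples:
        name, value = parse_token(line)
        print(name, value)
```

Comparing the decoded DOTI length with the size of the .ii file that actually arrives on the server would be one way to tell truncation apart from in-place corruption.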
Somebody else seems to have similar problems. Quoting from this forum thread: http://forums.gentoo.org/viewtopic.php?t=117112&highlight=distcc

[quote]
I'm encountering similar problems using gcc (GCC) 3.2.3 20030422 (Gentoo Linux 1.4 3.2.3-r2, propolice), ccache 2.3 and distcc-2.11.2-r1 on a setup with 2 athlons. The most remarkable thing though is that it appears that the erronous file is being sent twice to compile on the remote host:

Dec 24 17:49:11 athlonia distccd[18206]: (dcc_set_input) changed input from "/tmp/.ccache/fakes.tmp.bluebird.14705.i" to "/tmp/distccd_ee93c387.i"
Dec 24 17:49:12 athlonia distccd[18204]: (dcc_set_input) changed input from "fakes.c" to "/tmp/distccd_e808c387.i"

And when i diff those files i get

athlonia tmp # diff distccd_e808c387.i distccd_ee93c387.i
469c469
< struct __sched_param __schedparGeÀÆót __inheritsched;
---
> struct __sched_param __schedpart __inheritsched;

And it shouldn't be a mystery what's wrong with that. I'll be locking myself up over xmas to see if its ccache or distcc or cosmic rays.
[/quote]
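A diff like the one quoted only helps when a good copy is at hand. As a rough alternative, here is a small Python sketch that scans a preprocessed file for runs of NUL or non-ASCII bytes and reports their offsets. It assumes the preprocessed source should be plain ASCII, which will not hold for every project. The function name is hypothetical, and the sample bytes assume the forum rendered the corruption as Latin-1.

```python
import sys


def suspicious_runs(data):
    """Return (offset, run) pairs for runs of NUL bytes or bytes >= 0x80."""
    runs = []
    start = None
    for offset, byte in enumerate(data):
        bad = byte == 0 or byte >= 0x80
        if bad and start is None:
            start = offset
        elif not bad and start is not None:
            runs.append((start, data[start:offset]))
            start = None
    if start is not None:
        runs.append((start, data[start:]))
    return runs


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Read the file in binary mode so the corrupted bytes are untouched.
        with open(sys.argv[1], "rb") as handle:
            data = handle.read()
    else:
        # Assumed Latin-1 bytes for the corrupted line quoted in the thread.
        data = b"struct __sched_param __schedparGe\xc0\xc6\xf3t __inheritsched;"
    for offset, run in suspicious_runs(data):
        print(f"offset {offset}: {len(run)} byte(s) {run.hex()}")
```

If the reported offsets cluster around multiples of some block size, that might point at a buffering problem rather than random damage.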
Why is it that KDE stuff fails with distcc? Martin, is this a distcc-related quirk?
I've got a nice reproducible non-KDE testcase: samba-3.0.1-r1. Using 2 machines, bluebird and athlonia, and doing an 'emerge samba' on bluebird with 'distcc-config --set-hosts' set to athlonia/5, it always stops at the same place with the same stray tokens. Both machines use distcc 2.12 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632).

From distcc.log on bluebird:

distcc[10053] (dcc_spawn_child) child started as pid10058
distcc[10053] (dcc_strip_local_args) result: gcc -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o
distcc[10053] exec on athlonia/5: gcc -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o
distcc[10053] (dcc_note_state) note state 2, file "snprintf.c", host "athlonia"
distcc[10053] (dcc_connect_timed) nonblocking connect to 10.0.0.96:3632
distcc[10053] (dcc_select_for_write) select for write on fd8
distcc[10053] (dcc_connect_by_addr) client got connection to athlonia port 3632 on fd8
distcc[10053] (dcc_x_token_int) send DIST00000001
distcc[10053] (dcc_x_token_int) send ARGC00000012
distcc[10053] (dcc_x_token_int) send ARGV00000003
distcc[10053] (dcc_x_token_int) send ARGV00000013
distcc[10053] (dcc_x_token_int) send ARGV00000003
distcc[10053] (dcc_x_token_int) send ARGV00000005
distcc[10053] (dcc_x_token_int) send ARGV00000005
distcc[10053] (dcc_x_token_int) send ARGV00000007
distcc[10053] (dcc_x_token_int) send ARGV00000014
distcc[10053] (dcc_x_token_int) send ARGV0000000b
distcc[10053] (dcc_x_token_int) send ARGV0000000e
distcc[10053] (dcc_x_token_int) send ARGV0000000c
distcc[10053] (dcc_x_token_int) send ARGV00000013
distcc[10053] (dcc_x_token_int) send ARGV0000001a
distcc[10053] (dcc_x_token_int) send ARGV0000000a
distcc[10053] 
(dcc_x_token_int) send ARGV00000005 distcc[10053] (dcc_x_token_int) send ARGV00000002 distcc[10053] (dcc_x_token_int) send ARGV0000000e distcc[10053] (dcc_x_token_int) send ARGV00000002 distcc[10053] (dcc_x_token_int) send ARGV0000000e distcc[10053] (dcc_note_state) note state 3, file "(NULL)", host "(NULL)" distcc[10053] (dcc_collect_child) cpp child 10058 terminated with status 0 distcc[10053] (dcc_collect_child) cpp times: user 0.117982s, system 0.076988s, 6159 minflt, 593 majflt distcc[10053] cpp on localhost completed ok distcc[10053] (dcc_note_state) note state 4, file "(NULL)", host "(NULL)" distcc[10053] (dcc_x_file) send 62615 byte file /var/tmp/portage/samba-3.0.1-r1/temp/distcc_cbdfe644.i with token DOTI distcc[10053] (dcc_x_token_int) send DOTI0000f497 distcc[10053] (dcc_send_job) client finished sending request to server distcc[10053] (dcc_note_state) note state 5, file "(NULL)", host "athlonia" distcc[10053] (dcc_r_token_int) got DONE00000001 distcc[10053] (dcc_r_result_header) got response header distcc[10053] (dcc_note_state) note state 6, file "(NULL)", host "(NULL)" distcc[10053] (dcc_r_token_int) got STAT00000100 distcc[10053] (dcc_r_token_int) got SERR00000a05 distcc[10053] (dcc_r_token_int) got SOUT00000000 distcc[10053] (dcc_r_token_int) got DOTO00000000 distcc[10053] (dcc_unlock) release lock fd7 distcc[10053] ERROR: compile on athlonia/5 failed distcc[10053] elapsed compilation time 0.364662s distcc[10053] (dcc_exit) exit: code 1; self: 0.010998 user 0.011998 sys; children: 0.258960 user 0.146977 sys distcc[10053] (dcc_cleanup_tempfiles) deleted 1 temporary files Console output from 'emerge samba' on bluebird: Compiling lib/bitmap.c Compiling lib/crc32.c Compiling lib/snprintf.c In file included from /usr/include/string.h:375, from lib/snprintf.c:113: /usr/include/bits/string2.h:1002: error: stray '\252' in program /usr/include/bits/string2.h:1002: error: syntax error before "o" /usr/include/bits/string2.h:1002: error: stray '\341' in 
program /usr/include/bits/string2.h:1002: error: stray '\377' in program /usr/include/bits/string2.h:1002: error: stray '\312' in program /usr/include/bits/string2.h:1002: error: stray '\373' in program /usr/include/bits/string2.h: In function `__strspn_c3': /usr/include/bits/string2.h:1003: error: number of arguments doesn't match proto type /usr/include/bits/string2.h:1000: error: prototype declaration /usr/include/bits/string2.h:1006: error: `__s' undeclared (first use in this fun ction) /usr/include/bits/string2.h:1006: error: (Each undeclared identifier is reported only once /usr/include/bits/string2.h:1006: error: for each function it appears in.) /usr/include/bits/string2.h:1006: error: `__accept1' undeclared (first use in th is function) /usr/include/bits/string2.h:1007: error: `__accept2' undeclared (first use in th is function) /usr/include/bits/string2.h:1007: error: `__accept3' undeclared (first use in th is function) In file included from /usr/include/string.h:375, from lib/snprintf.c:113: /usr/include/bits/string2.h: At top level: /usr/include/bits/string2.h:1235: error: syntax error before "siz" /usr/include/bits/string2.h:1235: error: stray '\252' in program /usr/include/bits/string2.h:1235: error: stray '\341' in program /usr/include/bits/string2.h:1235: error: stray '\377' in program /usr/include/bits/string2.h:1235: error: stray '\312' in program /usr/include/bits/string2.h:1235: error: stray '\373' in program In file included from lib/snprintf.c:122: /usr/include/sys/types.h:135: error: stray '\252' in program /usr/include/sys/types.h:135: error: stray '\341' in program /usr/include/sys/types.h:135: error: stray '\377' in program /usr/include/sys/types.h:135: error: syntax error before "P" /usr/include/sys/types.h:135: error: stray '\312' in program /usr/include/sys/types.h:135: error: stray '\373' in program /usr/include/sys/types.h:135: error: syntax error before "__useconds_t" line-map.c: file "/usr/include/bits/pthreadtypes.h" left but not 
entered In file included from lib/snprintf.c:125: /usr/include/stdlib.h:166: error: stray '\252' in program /usr/include/stdlib.h:166: error: syntax error before "o" /usr/include/stdlib.h:166: error: stray '\341' in program /usr/include/stdlib.h:166: error: stray '\377' in program /usr/include/stdlib.h:166: error: stray '\312' in program /usr/include/stdlib.h:166: error: stray '\373' in program /usr/include/stdlib.h:314: error: stray '\252' in program /usr/include/stdlib.h:314: error: stray '\341' in program /usr/include/stdlib.h:314: error: stray '\377' in program /usr/include/stdlib.h:314: error: syntax error before "P" /usr/include/stdlib.h:314: error: stray '\312' in program /usr/include/stdlib.h:314: error: stray '\373' in program /usr/include/stdlib.h: In function `strtol': /usr/include/stdlib.h:316: error: number of arguments doesn't match prototype /usr/include/stdlib.h:177: error: prototype declaration /usr/include/stdlib.h:317: error: `__nptr' undeclared (first use in this functio n) /usr/include/stdlib.h:317: error: `__endptr' undeclared (first use in this funct ion) /usr/include/stdlib.h:317: error: `__base' undeclared (first use in this functio n) /usr/include/stdlib.h: At top level: /usr/include/stdlib.h:526: error: stray '\252' in program /usr/include/stdlib.h:526: error: syntax error before "o" /usr/include/stdlib.h:526: error: stray '\341' in program /usr/include/stdlib.h:526: error: stray '\377' in program /usr/include/stdlib.h:526: error: stray '\312' in program /usr/include/stdlib.h:526: error: stray '\373' in program In file included from lib/snprintf.c:125: /usr/include/stdlib.h:803: error: stray '\252' in program /usr/include/stdlib.h:803: error: syntax error before "o" /usr/include/stdlib.h:803: error: stray '\341' in program /usr/include/stdlib.h:803: error: stray '\377' in program /usr/include/stdlib.h:803: error: stray '\312' in program /usr/include/stdlib.h:803: error: stray '\373' in program In file included from 
/usr/include/_G_config.h:44, from /usr/include/libio.h:32, from /usr/include/stdio.h:72, from lib/snprintf.c:130: /usr/include/gconv.h:66: error: stray '\252' in program /usr/include/gconv.h:66: error: stray '\341' in program /usr/include/gconv.h:66: error: stray '\377' in program /usr/include/gconv.h:66: error: syntax error before "P" /usr/include/gconv.h:66: error: stray '\312' in program /usr/include/gconv.h:66: error: stray '\373' in program In file included from /usr/include/stdio.h:72, from lib/snprintf.c:130: /usr/include/libio.h:351: error: stray '\252' in program /usr/include/libio.h:351: error: syntax error before "o" /usr/include/libio.h:351: error: stray '\341' in program /usr/include/libio.h:351: error: stray '\377' in program /usr/include/libio.h:351: error: stray '\312' in program /usr/include/libio.h:351: error: stray '\373' in program /usr/include/libio.h:376: error: syntax error before "cookie_read_function_t" /usr/include/libio.h:384: error: syntax error before "__io_read_fn" /usr/include/libio.h:388: error: syntax error before '}' token /usr/include/libio.h:389: error: syntax error before "cookie_io_functions_t" /usr/include/libio.h:395: error: syntax error before "_IO_cookie_io_functions_t" In file included from lib/snprintf.c:130: /usr/include/stdio.h:281: error: syntax error before "_IO_cookie_io_functions_t" /usr/include/stdio.h:290: error: stray '\252' in program /usr/include/stdio.h:290: error: syntax error before "o" /usr/include/stdio.h:290: error: stray '\341' in program /usr/include/stdio.h:290: error: stray '\377' in program /usr/include/stdio.h:290: error: stray '\312' in program /usr/include/stdio.h:290: error: stray '\373' in program /usr/include/stdio.h:560: error: stray '\252' in program /usr/include/stdio.h:560: error: syntax error before "o" /usr/include/stdio.h:560: error: stray '\341' in program /usr/include/stdio.h:560: error: stray '\377' in program /usr/include/stdio.h:560: error: stray '\312' in program 
/usr/include/stdio.h:560: error: stray '\373' in program
make: *** [lib/snprintf.o] Error 1

!!! ERROR: net-fs/samba-3.0.1-r1 failed.
!!! Function src_compile, Line 169, Exitcode 2
!!! SAMBA pieces

The 2 'emerge info's:

athlonia root # emerge info /tmp/.packages
Portage 2.0.50_pre9 (default-x86-1.4, gcc-3.3.2, glibc-2.3.3_pre20031212-r0, 2.4.22-gentoo-r1)
=================================================================
System uname: 2.4.22-gentoo-r1 i686 AMD-K7(tm) Processor
Gentoo Base System version 1.4.3.12
distcc 2.12 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
Autoconf: sys-devel/autoconf-2.58
Automake: sys-devel/automake-1.7.8
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-march=athlon -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-march=athlon -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4"
DISTDIR="/tmp/distfiles"
FEATURES="autoaddcvs ccache distcc sandbox userpriv usersandbox"
GENTOO_MIRRORS="http://ftp.snt.utwente.nl/pub/os/linux/gentoo ftp://ftp.snt.utwente.nl/pub/os/linux/gentoo"
MAKEOPTS="-j1"
PKGDIR="/tmp/.packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="3dnow X X509 amd apache2 autofs berkdb crypt cscope curl ethereal gd gdbm imlib ipv6 jpeg ldap libg++ libwww md5sum memlimit mmx mpi mysql ncurses nocstrike nodod nojoystick noqmax notfc objc offensive oggvorbis opengl openssh pam parse-clocks passfile pcap pcmcia pda pdflib perl php physfs pic plotutils png python readline ruby sasl sdl skey slang slp snmp spell sse ssl tcltk tcpd tiff truetype usb vim-with-x wmf x86 xml xml2 zlib"

bluebird root # emerge info /usr/portage/packages
Portage 2.0.50_pre9 (default-x86-1.4, gcc-3.3.2, glibc-2.3.3_pre20031212-r0, 2.6.0-gentoo)
=================================================================
System uname: 2.6.0-gentoo i686 AMD Athlon(tm) processor
Gentoo Base System version 1.4.3.12
distcc 2.12 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ccache version 2.3 [enabled]
Autoconf: sys-devel/autoconf-2.58
Automake: sys-devel/automake-1.7.8
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="no"
CFLAGS="-march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3.2/share/config /usr/kde/3/share/config /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args"
DISTDIR="/tmp/distfiles"
FEATURES="autoaddcvs buildpkg ccache distcc sandbox userpriv usersandbox"
GENTOO_MIRRORS="http://www.mirror.ac.uk/sites/www.ibiblio.org/gentoo/"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.nl.gentoo.org/gentoo-portage"
USE="3dnow X X509 aalib amd apache2 apm arts artswrappersuid autofs avi berkdb cdr crypt cscope cups curl dga doc encode ethereal faad ffmpeg foomaticdb foreign-sysvinit gd gdbm ggi gif ginac glut gmtfull gmthigh gmtsuppl gmttria gtk gtk2 imap imlib ipv6 jabber java javascript jikes jpeg junit kde ldap libg++ libwww mad md5sum memlimit mmx motif mozilla moznocompose moznoirc moznomail mpeg mpi msn mysql ncurses nocstrike nodod nojoystick noqmax notfc nvidia nviz objc offensive oggvorbis opengl openssh oscar oss pam parse-clocks passfile pcap pcmcia pda pdflib perl php physfs pic plotutils png python qhull qt quicktime readline ruby samba sasl sdk sdl skey slang slp snmp spell sse ssl tcltk tcpd tetex tiff truetype usb v4l vim-with-x wmf x86 xchatnogtk xinerama xml xml2 xmms xosd xv xvid zeo zlib zvbi"
If anyone can reproduce this: Please set DISTCC_SAVE_TEMPS on both client and server, and post the temporary files corresponding to the run that failed. Please post the server log messages as well as the client ones. If you could also include a tcpdump that would be helpful. Can you reproduce the failures using a kernel.org kernel? Is everyone seeing this bug using the propolice patches? Thanks
Some additional info: my gcc install is 3.2.3-r3 and thus has the propolice patch, but I haven't used the -fstack-protector option anywhere. My client kernel is a stock 2.6.0, server kernel is 2.4.22-gentoo-test-r1
Created attachment 23153: tcpdump from compilehost
Created attachment 23154: source on receiver
Created attachment 23155: source on sender
Same setup, emerging on bluebird, sending jobs to athlonia: Log from bluebird: distcc[27816] (dcc_trace_version) distcc 2.12 i686-pc-linux-gnu; built Dec 29 2003 15:47:39 distcc[27816] (dcc_recursion_safeguard) safeguard level=0 distcc[27816] (main) compiler name is "gcc" distcc[27816] (dcc_set_path) setting PATH=/usr/bin:/usr/sbin:/usr/local/bin:/opt/bin:/usr/i686-pc-linux-gnu/gcc-bin/3.3:/opt/Acrobat5:/usr/X11R6/bin:/opt/blackdown-jdk-1.4.1/bin:/opt/blackdown-jdk-1.4.1/jre/bin:/usr/qt/3/bin:/usr/kde/3.2/sbin:/usr/kde/3.2/bin:/opt/vmware/bin distcc[27816] (dcc_scan_args) scanning arguments: gcc -qlanglvl=ansi -I. -I. -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -I/usr/include/mysql -mcpu=i686 -pipe -DHAVE_ERRNO_AS_DEFINE=1 -DUSE_OLD_FUNCTIONS -I/usr/include/libxml2 -Iinclude -I./include -I./ubiqx -I./smbwrapper -I. -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -I. -c lib/snprintf.c -o lib/snprintf.o distcc[27816] (dcc_scan_args) found input file "lib/snprintf.c" distcc[27816] (dcc_scan_args) found object/output file "lib/snprintf.o" distcc[27816] compile from snprintf.c to snprintf.o distcc[27816] (dcc_get_hostlist) not reading /root/.distcc/hosts: No such file or directory distcc[27816] (dcc_parse_hosts_file) load hosts from /etc/distcc/hosts distcc[27816] (dcc_parse_hosts) found tcp token "athlonia/5" distcc[27816] (dcc_lock_host) got cpu lock on athlonia/5 slot 0 as fd4 distcc[27816] (dcc_strip_dasho) result: gcc -qlanglvl=ansi -I. -I. -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -I/usr/include/mysql -mcpu=i686 -pipe -DHAVE_ERRNO_AS_DEFINE=1 -DUSE_OLD_FUNCTIONS -I/usr/include/libxml2 -Iinclude -I./include -I./ubiqx -I./smbwrapper -I. -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -I. 
-c lib/snprintf.c distcc[27816] (dcc_spawn_child) forking to execute: gcc -qlanglvl=ansi -I. -I. -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -I/usr/include/mysql -mcpu=i686 -pipe -DHAVE_ERRNO_AS_DEFINE=1 -DUSE_OLD_FUNCTIONS -I/usr/include/libxml2 -Iinclude -I./include -I./ubiqx -I./smbwrapper -I. -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -I. -E lib/snprintf.c distcc[27817] (dcc_increment_safeguard) setting safeguard: _DISTCC_SAFEGUARD=1 distcc[27816] (dcc_spawn_child) child started as pid27817 distcc[27816] (dcc_strip_local_args) result: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o distcc[27816] exec on athlonia/5: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o distcc[27816] (dcc_note_state) note state 2, file "snprintf.c", host "athlonia" distcc[27816] (dcc_connect_timed) nonblocking connect to 10.0.0.96:3632 distcc[27816] (dcc_select_for_write) select for write on fd5 distcc[27816] (dcc_connect_by_addr) client got connection to athlonia port 3632 on fd5 distcc[27816] (dcc_x_token_int) send DIST00000001 distcc[27816] (dcc_x_token_int) send ARGC00000013 distcc[27816] (dcc_x_token_int) send ARGV00000003 distcc[27816] (dcc_x_token_int) send ARGV0000000e distcc[27816] (dcc_x_token_int) send ARGV00000013 distcc[27816] (dcc_x_token_int) send ARGV00000003 distcc[27816] (dcc_x_token_int) send ARGV00000005 distcc[27816] (dcc_x_token_int) send ARGV00000005 distcc[27816] (dcc_x_token_int) send ARGV00000007 distcc[27816] (dcc_x_token_int) send ARGV00000014 distcc[27816] (dcc_x_token_int) send 
ARGV0000000b distcc[27816] (dcc_x_token_int) send ARGV0000000e distcc[27816] (dcc_x_token_int) send ARGV0000000c distcc[27816] (dcc_x_token_int) send ARGV00000013 distcc[27816] (dcc_x_token_int) send ARGV0000001a distcc[27816] (dcc_x_token_int) send ARGV0000000a distcc[27816] (dcc_x_token_int) send ARGV00000005 distcc[27816] (dcc_x_token_int) send ARGV00000002 distcc[27816] (dcc_x_token_int) send ARGV0000000e distcc[27816] (dcc_x_token_int) send ARGV00000002 distcc[27816] (dcc_x_token_int) send ARGV0000000e distcc[27816] (dcc_note_state) note state 3, file "(NULL)", host "(NULL)" distcc[27816] (dcc_collect_child) cpp child 27817 terminated with status 0 distcc[27816] (dcc_collect_child) cpp times: user 0.061990s, system 0.010998s, 254 minflt, 545 majflt distcc[27816] cpp on localhost completed ok distcc[27816] (dcc_note_state) note state 4, file "(NULL)", host "(NULL)" distcc[27816] (dcc_x_file) send 62615 byte file /tmp/distcc_bb20772f.i with token DOTI distcc[27816] (dcc_x_token_int) send DOTI0000f497 distcc[27816] (dcc_send_job) client finished sending request to server distcc[27816] (dcc_note_state) note state 5, file "(NULL)", host "athlonia" distcc[27816] (dcc_r_token_int) got DONE00000001 distcc[27816] (dcc_r_result_header) got response header distcc[27816] (dcc_note_state) note state 6, file "(NULL)", host "(NULL)" distcc[27816] (dcc_r_token_int) got STAT00000100 distcc[27816] (dcc_r_token_int) got SERR0000256c distcc[27816] (dcc_r_token_int) got SOUT00000000 distcc[27816] (dcc_r_token_int) got DOTO00000000 distcc[27816] (dcc_unlock) release lock fd4 distcc[27816] ERROR: compile on athlonia/5 failed distcc[27816] elapsed compilation time 0.303556s distcc[27816] (dcc_exit) exit: code 1; self: 0.002999 user 0.002999 sys; children: 0.061990 user 0.010998 sys Log from athlonia: distccd[18752] (dcc_check_client) connection from 192.168.1.4:60714 distccd[18752] (dcc_r_token_int) got DIST00000001 distccd[18752] (dcc_r_token_int) got ARGC00000013 distccd[18752] 
(dcc_r_argv) reading 19 arguments from job submission distccd[18752] (dcc_r_token_int) got ARGV00000003 distccd[18752] (dcc_r_argv) argv[0] = "gcc" distccd[18752] (dcc_r_token_int) got ARGV0000000e distccd[18752] (dcc_r_argv) argv[1] = "-qlanglvl=ansi" distccd[18752] (dcc_r_token_int) got ARGV00000013 distccd[18752] (dcc_r_argv) argv[2] = "-march=athlon-tbird" distccd[18752] (dcc_r_token_int) got ARGV00000003 distccd[18752] (dcc_r_argv) argv[3] = "-O3" distccd[18752] (dcc_r_token_int) got ARGV00000005 distccd[18752] (dcc_r_argv) argv[4] = "-pipe" distccd[18752] (dcc_r_token_int) got ARGV00000005 distccd[18752] (dcc_r_argv) argv[5] = "-mmmx" distccd[18752] (dcc_r_token_int) got ARGV00000007 distccd[18752] (dcc_r_argv) argv[6] = "-m3dnow" distccd[18752] (dcc_r_token_int) got ARGV00000014 distccd[18752] (dcc_r_argv) argv[7] = "-fomit-frame-pointer" distccd[18752] (dcc_r_token_int) got ARGV0000000b distccd[18752] (dcc_r_argv) argv[8] = "-ffast-math" distccd[18752] (dcc_r_token_int) got ARGV0000000e distccd[18752] (dcc_r_argv) argv[9] = "-funroll-loops" distccd[18752] (dcc_r_token_int) got ARGV0000000c distccd[18752] (dcc_r_argv) argv[10] = "-fforce-addr" distccd[18752] (dcc_r_token_int) got ARGV00000013 distccd[18752] (dcc_r_argv) argv[11] = "-falign-functions=4" distccd[18752] (dcc_r_token_int) got ARGV0000001a distccd[18752] (dcc_r_argv) argv[12] = "-maccumulate-outgoing-args" distccd[18752] (dcc_r_token_int) got ARGV0000000a distccd[18752] (dcc_r_argv) argv[13] = "-mcpu=i686" distccd[18752] (dcc_r_token_int) got ARGV00000005 distccd[18752] (dcc_r_argv) argv[14] = "-pipe" distccd[18752] (dcc_r_token_int) got ARGV00000002 distccd[18752] (dcc_r_argv) argv[15] = "-c" distccd[18752] (dcc_r_token_int) got ARGV0000000e distccd[18752] (dcc_r_argv) argv[16] = "lib/snprintf.c" distccd[18752] (dcc_r_token_int) got ARGV00000002 distccd[18752] (dcc_r_argv) argv[17] = "-o" distccd[18752] (dcc_r_token_int) got ARGV0000000e distccd[18752] (dcc_r_argv) argv[18] = "lib/snprintf.o" 
distccd[18752] (dcc_r_argv) got arguments: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o distccd[18752] (dcc_scan_args) scanning arguments: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o distccd[18752] (dcc_scan_args) found input file "lib/snprintf.c" distccd[18752] (dcc_scan_args) found object/output file "lib/snprintf.o" distccd[18752] compile from snprintf.c to snprintf.o distccd[18752] (dcc_run_job) output file lib/snprintf.o distccd[18752] (dcc_input_tmpnam) input file lib/snprintf.c distccd[18752] (dcc_r_token_int) got DOTI0000f497 distccd[18752] (dcc_r_file) received 62615 bytes to file /tmp/distccd_254e8503.i distccd[18752] (dcc_r_file_timed) 62615 bytes received in 0.061983s, rate 987kB/s distccd[18752] (dcc_set_input) changed input from "lib/snprintf.c" to "/tmp/distccd_254e8503.i" distccd[18752] (dcc_set_input) command after: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c /tmp/distccd_254e8503.i -o lib/snprintf.o distccd[18752] (dcc_set_output) changed output from "lib/snprintf.o" to "/tmp/distccd_22ce8503.o" distccd[18752] (dcc_set_output) command after: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c /tmp/distccd_254e8503.i -o /tmp/distccd_22ce8503.o distccd[18752] (dcc_check_compiler_masq) Warning: gcc on distccd's path is /usr/lib/distcc/bin/gcc and really a link to /usr/bin/distcc distccd[18752] (dcc_spawn_child) 
forking to execute: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c /tmp/distccd_254e8503.i -o /tmp/distccd_22ce8503.o distccd[18752] (dcc_spawn_child) child started as pid18794 distccd[18794] (dcc_increment_safeguard) setting safeguard: _DISTCC_SAFEGUARD=1 distccd[18752] (dcc_collect_child) cc child 18794 terminated with status 0x100 distccd[18752] (dcc_collect_child) cc times: user 0.085000s, system 0.010000s, 687 minflt, 1005 majflt distccd[18752] (dcc_x_token_int) send DONE00000001 distccd[18752] (dcc_x_token_int) send STAT00000100 distccd[18752] (dcc_x_file) send 9580 byte file /tmp/distcc_a07a8503.stderr with token SERR distccd[18752] (dcc_x_token_int) send SERR0000256c distccd[18752] (dcc_x_file) send 0 byte file /tmp/distcc_a1de8503.stdout with token SOUT distccd[18752] (dcc_x_token_int) send SOUT00000000 distccd[18752] (dcc_x_token_int) send DOTO00000000 distccd[18752] gcc on localhost failed distccd[18752] job complete
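For anyone reading the logs above: the tokens appear to be a four-letter name followed by eight hex digits, since the client reports "send 62615 byte file" right before "send DOTI0000f497", and 0xf497 is 62615. A minimal sketch for decoding them, assuming that layout holds for every token (`parse_token` is a made-up helper, not part of distcc):

```python
import re

# Assumed layout, inferred from the logs: 4 uppercase letters + 8 hex digits.
TOKEN_RE = re.compile(r"^([A-Z]{4})([0-9a-f]{8})$")

def parse_token(token):
    """Split a logged token such as 'DOTI0000f497' into (name, value)."""
    match = TOKEN_RE.match(token)
    if match is None:
        raise ValueError("not a 4+8 token: %r" % token)
    return match.group(1), int(match.group(2), 16)

# Tokens copied from the client log in this report.
for tok in ("ARGC00000013", "DOTI0000f497", "SERR0000256c", "STAT00000100"):
    name, value = parse_token(tok)
    print(name, value)
```

Read this way, ARGC is 19 (matching "reading 19 arguments"), SERR is 9580 (matching "send 9580 byte file ... with token SERR"), and STAT is 0x100, the same status the server logs for the cc child.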
Your attachment 23155 is just 22 bytes, containing "/tmp/distcc_bb20772f.i". Is that *really* what was in the client's temporary file?
Thanks for getting the tcpdump. Attachment 23153 seems to show that, from the point of view of the client's TCP stack, the data was sent out correctly. From attachment 23154 we can see that the data was corrupt in the server's temporary file. You don't need to worry about getting the client's tmpfile because the data was correct when it left the client.

It seems like the remaining possibilities are:
- the data was corrupted in transit and not detected by the TCP stack
- the data was corrupted by the server's kernel
- distccd made a mistake when receiving the data
- distccd scribbled over the data in the temporary file at some later point

It would be helpful to get a tcpdump recorded on the server, so that we can see whether it is coming in to that machine correctly.
A few more observations from comparing the tcpdump and the server's temporary file:

All of the errors in this report seem to be substitutions, rather than insertions or deletions. In other words, all the correct bytes are at the same offset.

All of the error runs are 8 bytes, which would be consistent with some software accidentally writing a uint64 or two ints into the buffer. That might conceivably be either distcc or the kernel.

Also, there is a common pattern to the bytes that are written in, which perhaps supports the idea that the buffer is being overwritten by some other value:

80 79 6d d7 c7 c6 6e a2
80 79 6d d7 c7 c6 6e a2
80 79 6d d7 c7 c6 6e a2

In fact, in the dump we have here that seems to be the same value for every error. That pattern doesn't look familiar to me.

What is your networking setup? Are you using any kind of iptables or similar packet-mangling software?

The errors are at these offsets within the .i file:

3085 7181 11277 15373 19469 23565 27661

So they start at a strange offset, but are evenly spaced every 4096 bytes! Counting within the client's TCP stream, the first offset is 3554, which does not seem very meaningful either.
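A quick check of the arithmetic on those offsets, as a sketch. The offsets and the 3554 figure are from the comment above; the header-size estimate is an assumption: it takes every protocol token to be 12 bytes on the wire, with each argument string following its ARGV token unpadded, and uses the ARGV lengths from the client log posted earlier.

```python
PAGE_SIZE = 4096

# Error offsets within the .i file, as listed in the comment above.
offsets = [3085, 7181, 11277, 15373, 19469, 23565, 27661]

gaps = [b - a for a, b in zip(offsets, offsets[1:])]
residues = {off % PAGE_SIZE for off in offsets}
print("gaps:", gaps)
print("offset mod 4096:", residues)

# ARGV string lengths taken from the client log (hex values after "ARGV").
argv_lengths = [0x03, 0x0e, 0x13, 0x03, 0x05, 0x05, 0x07, 0x14, 0x0b, 0x0e,
                0x0c, 0x13, 0x1a, 0x0a, 0x05, 0x02, 0x0e, 0x02, 0x0e]

# Assumption: DIST + ARGC + one ARGV per argument + DOTI, 12 bytes each,
# plus the raw argument strings, all sent before the file body.
TOKEN_BYTES = 12
header_bytes = TOKEN_BYTES * (len(argv_lengths) + 3) + sum(argv_lengths)
print("estimated header bytes:", header_bytes)
print("3554 - header:", 3554 - header_bytes)
```

Under those assumptions the request header comes to 469 bytes, which is exactly the difference between 3554 and 3085, so the TCP-stream offset and the file offset would describe the same byte.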
Let's see what code in distccd could be causing this: distccd is not using mmap to receive the file, because it's less than 64kB. It is not compressed. Therefore it goes through the alternative path in dcc_r_bulk_plain, which basically reads the whole thing into a big malloc'd buffer and then writes it out again. I don't see any buffer sizes or steps of 4kB in that code, although 4kB is of course the kernel page size.
Sorry for the missing/22-byte file from the client; I should have checked whether it actually uploaded the file. The tcpdump was done on the server (10.0.0.96), so I assume the transmission over the wire was successful and my network setup hasn't affected it.
Created attachment 23170: source on sender
OK, some things to do: Could you try running the daemon under valgrind? Could you get a tcpdump recorded on the server machine? Thanks for your help.
Oh, OK. I misunderstood your previous comment about the tcpdump. I guess now we're just down to a few options:
1- run distccd under valgrind to try to find a memory corruption bug that's scribbling over the buffer
2- run distccd under strace with -s 65536 to see what it's getting from the network and writing to the file
3- try stock 2.4.23 kernels if possible.
When running distccd under valgrind with various options, there is no sign of corruption, and everything compiles fine. When running distccd under strace, I'm getting various results depending on the options for strace: more overhead seems to increase the chance of a successful compile, so I suppose this would indicate a race condition somewhere. Because of log entries like "distcc[21143] (dcc_unlock) release lock fd4", I took a wild stab, assumed that fd4 is the network socket, and ran 'strace -e read -s 65535 ./distccd --wizard >& /tmp/distccd', which got this:

read(4, "__rawmemchr (__retval, __reject) : strchr (__retval, __reject)))) != ((void *)0))\n *(*__s)++ = \'\\0\';\n return __retval;\n}\n\nextern __inline char *__strsep_2c (char **__s, char __reject1, char __reject2);\nextern __inline char *\n__strsep_2c (char **__s, char __reject1, char __reject2)\n{\n register char *__retval = *__s;\n if (__retval != ((void *)0))\n {\n register char *__cp = __retval;\n while (1)\n {\n if (*__cp == \'\\0\')\n {\n __cp = ((void *)0);\n break;\n }\n if (*__cp == __reject1 || *__cp == __reject2)\n {\n *__cp++ = \'\\0\';\n break;\n \0\0\0\0\0\0\0\0\n ++__cp;\n }\n *__s = __cp;\n }\n return __retval;\n}\n\nextern __inline char *__strsep_3c (char **__s, char __reject1, char __reject2,\n char __reject3);\nextern __inline char *\n__strsep_3c (char **__s, char __reject1, char __reject2, char __reject3)\n{\n register char *__retval = *__s;\n if (__retval != ((void *)0))\n {\n register char *__cp = __retval;\n while (1)\n {\n if (*__cp == \'\\0\')\n {\n __cp = ((void *)0);\n break;\n }\n if (*__cp == __reject1 || *__cp == __reject2 || *__cp == __reject3)\n {\n *__cp++ = \'\\0\';\n break;\n }\n ++__cp;\n }\n *__s = __cp;\n }\n return __retval;\n}\n# 12", 48604) = 1448

But the payload according to my simultaneous tcpdump capture on the server says it should be:

0000 00 60 97 d6 e0 d9 b4 66 c2 68 98 81 08 00 45 00 .`.ÖàÙ´f Âh....E.
0010 05 dc 9d aa 40 00 3f 06 cc 65 c0 a8 01 04 0a 00 .Ü.ª@.?. ÌeÀ¨.... 0020 00 60 8b 28 0e 30 64 fd b7 4c 41 40 e0 6d 80 10 .`.(.0dý ·LA@àm.. 0030 16 d0 ee cc 00 00 01 01 08 0a 33 b8 e3 a2 17 db .ÐîÌ.... ..3¸ã¢.Û 0040 66 67 5f 5f 72 61 77 6d 65 6d 63 68 72 20 28 5f fg__rawm emchr (_ 0050 5f 72 65 74 76 61 6c 2c 20 5f 5f 72 65 6a 65 63 _retval, __rejec 0060 74 29 20 3a 20 73 74 72 63 68 72 20 28 5f 5f 72 t) : str chr (__r 0070 65 74 76 61 6c 2c 20 5f 5f 72 65 6a 65 63 74 29 etval, _ _reject) 0080 29 29 29 20 21 3d 20 28 28 76 6f 69 64 20 2a 29 ))) != ( (void *) 0090 30 29 29 0a 20 20 20 20 2a 28 2a 5f 5f 73 29 2b 0)). *(*__s)+ 00a0 2b 20 3d 20 27 5c 30 27 3b 0a 20 20 72 65 74 75 + = '\0' ;. retu 00b0 72 6e 20 5f 5f 72 65 74 76 61 6c 3b 0a 7d 0a 0a rn __ret val;.}.. 00c0 65 78 74 65 72 6e 20 5f 5f 69 6e 6c 69 6e 65 20 extern _ _inline 00d0 63 68 61 72 20 2a 5f 5f 73 74 72 73 65 70 5f 32 char *__ strsep_2 00e0 63 20 28 63 68 61 72 20 2a 2a 5f 5f 73 2c 20 63 c (char **__s, c 00f0 68 61 72 20 5f 5f 72 65 6a 65 63 74 31 2c 20 63 har __re ject1, c 0100 68 61 72 20 5f 5f 72 65 6a 65 63 74 32 29 3b 0a har __re ject2);. 0110 65 78 74 65 72 6e 20 5f 5f 69 6e 6c 69 6e 65 20 extern _ _inline 0120 63 68 61 72 20 2a 0a 5f 5f 73 74 72 73 65 70 5f char *._ _strsep_ 0130 32 63 20 28 63 68 61 72 20 2a 2a 5f 5f 73 2c 20 2c (char **__s, 0140 63 68 61 72 20 5f 5f 72 65 6a 65 63 74 31 2c 20 char __r eject1, 0150 63 68 61 72 20 5f 5f 72 65 6a 65 63 74 32 29 0a char __r eject2). 0160 7b 0a 20 20 72 65 67 69 73 74 65 72 20 63 68 61 {. regi ster cha 0170 72 20 2a 5f 5f 72 65 74 76 61 6c 20 3d 20 2a 5f r *__ret val = *_ 0180 5f 73 3b 0a 20 20 69 66 20 28 5f 5f 72 65 74 76 _s;. if (__retv 0190 61 6c 20 21 3d 20 28 28 76 6f 69 64 20 2a 29 30 al != (( void *)0 01a0 29 29 0a 20 20 20 20 7b 0a 20 20 20 20 20 20 72 )). { . r 01b0 65 67 69 73 74 65 72 20 63 68 61 72 20 2a 5f 5f egister char *__ 01c0 63 70 20 3d 20 5f 5f 72 65 74 76 61 6c 3b 0a 20 cp = __r etval;. 
01d0 20 20 20 20 20 77 68 69 6c 65 20 28 31 29 0a 20 whi le (1). 01e0 20 20 20 20 20 20 20 7b 0a 20 20 20 20 20 20 20 { . 01f0 20 20 20 69 66 20 28 2a 5f 5f 63 70 20 3d 3d 20 if (* __cp == 0200 27 5c 30 27 29 0a 20 20 20 20 20 20 20 20 20 20 '\0'). 0210 20 20 7b 0a 20 20 20 20 20 20 20 20 20 20 20 20 {. 0220 20 20 5f 5f 63 70 20 3d 20 28 28 76 6f 69 64 20 __cp = ((void 0230 2a 29 30 29 3b 0a 20 20 20 20 20 20 20 20 20 20 *)0);. 0240 62 72 65 61 6b 3b 0a 20 20 20 20 20 20 20 20 20 break;. 0250 20 20 20 7d 0a 20 20 20 20 20 20 20 20 20 20 69 }. i 0260 66 20 28 2a 5f 5f 63 70 20 3d 3d 20 5f 5f 72 65 f (*__cp == __re 0270 6a 65 63 74 31 20 7c 7c 20 2a 5f 5f 63 70 20 3d ject1 || *__cp = 0280 3d 20 5f 5f 72 65 6a 65 63 74 32 29 0a 20 20 20 = __reje ct2). 0290 20 20 20 20 20 20 20 20 20 7b 0a 20 20 20 20 20 {. 02a0 20 20 20 20 20 20 20 20 20 2a 5f 5f 63 70 2b 2b *__cp++ 02b0 20 3d 20 27 5c 30 27 3b 0a 20 20 20 20 20 20 20 = '\0'; . 02c0 20 20 20 20 20 20 20 62 72 65 61 6b 3b 0a 20 20 b reak;. 02d0 20 20 20 20 20 20 20 20 20 20 7d 0a 20 20 20 20 }. 02e0 20 20 20 20 20 20 2b 2b 5f 5f 63 70 3b 0a 20 20 ++ __cp;. 02f0 20 20 20 20 20 20 7d 0a 20 20 20 20 20 20 2a 5f }. *_ 0300 5f 73 20 3d 20 5f 5f 63 70 3b 0a 20 20 20 20 7d _s = __c p;. } 0310 0a 20 20 72 65 74 75 72 6e 20 5f 5f 72 65 74 76 . retur n __retv 0320 61 6c 3b 0a 7d 0a 0a 65 78 74 65 72 6e 20 5f 5f al;.}..e xtern __ 0330 69 6e 6c 69 6e 65 20 63 68 61 72 20 2a 5f 5f 73 inline c har *__s 0340 74 72 73 65 70 5f 33 63 20 28 63 68 61 72 20 2a trsep_3c (char * 0350 2a 5f 5f 73 2c 20 63 68 61 72 20 5f 5f 72 65 6a *__s, ch ar __rej 0360 65 63 74 31 2c 20 63 68 61 72 20 5f 5f 72 65 6a ect1, ch ar __rej 0370 65 63 74 32 2c 0a 20 20 20 20 20 20 20 20 20 20 ect2,. 
0380 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 0390 20 20 20 20 20 20 20 20 20 63 68 61 72 20 5f 5f char __ 03a0 72 65 6a 65 63 74 33 29 3b 0a 65 78 74 65 72 6e reject3) ;.extern 03b0 20 5f 5f 69 6e 6c 69 6e 65 20 63 68 61 72 20 2a __inlin e char * 03c0 0a 5f 5f 73 74 72 73 65 70 5f 33 63 20 28 63 68 .__strse p_3c (ch 03d0 61 72 20 2a 2a 5f 5f 73 2c 20 63 68 61 72 20 5f ar **__s , char _ 03e0 5f 72 65 6a 65 63 74 31 2c 20 63 68 61 72 20 5f _reject1 , char _ 03f0 5f 72 65 6a 65 63 74 32 2c 20 63 68 61 72 20 5f _reject2 , char _ 0400 5f 72 65 6a 65 63 74 33 29 0a 7b 0a 20 20 72 65 _reject3 ).{. re 0410 67 69 73 74 65 72 20 63 68 61 72 20 2a 5f 5f 72 gister c har *__r 0420 65 74 76 61 6c 20 3d 20 2a 5f 5f 73 3b 0a 20 20 etval = *__s;. 0430 69 66 20 28 5f 5f 72 65 74 76 61 6c 20 21 3d 20 if (__re tval != 0440 28 28 76 6f 69 64 20 2a 29 30 29 29 0a 20 20 20 ((void * )0)). 0450 20 7b 0a 20 20 20 20 20 20 72 65 67 69 73 74 65 {. registe 0460 72 20 63 68 61 72 20 2a 5f 5f 63 70 20 3d 20 5f r char * __cp = _ 0470 5f 72 65 74 76 61 6c 3b 0a 20 20 20 20 20 20 77 _retval; . w 0480 68 69 6c 65 20 28 31 29 0a 20 20 20 20 20 20 20 hile (1) . 0490 20 7b 0a 20 20 20 20 20 20 20 20 20 20 69 66 20 {. if 04a0 28 2a 5f 5f 63 70 20 3d 3d 20 27 5c 30 27 29 0a (*__cp = = '\0'). 04b0 20 20 20 20 20 20 20 20 20 20 20 20 7b 0a 20 20 {. 04c0 20 20 20 20 20 20 20 20 20 20 20 20 5f 5f 63 70 __cp 04d0 20 3d 20 28 28 76 6f 69 64 20 2a 29 30 29 3b 0a = ((voi d *)0);. 04e0 20 20 20 20 20 20 20 20 20 20 62 72 65 61 6b 3b break; 04f0 0a 20 20 20 20 20 20 20 20 20 20 20 20 7d 0a 20 . }. 0500 20 20 20 20 20 20 20 20 20 69 66 20 28 2a 5f 5f if (*__ 0510 63 70 20 3d 3d 20 5f 5f 72 65 6a 65 63 74 31 20 cp == __ reject1 0520 7c 7c 20 2a 5f 5f 63 70 20 3d 3d 20 5f 5f 72 65 || *__cp == __re 0530 6a 65 63 74 32 20 7c 7c 20 2a 5f 5f 63 70 20 3d ject2 || *__cp = 0540 3d 20 5f 5f 72 65 6a 65 63 74 33 29 0a 20 20 20 = __reje ct3). 0550 20 20 20 20 20 20 20 20 20 7b 0a 20 20 20 20 20 {. 
0560 20 20 20 20 20 20 20 20 20 2a 5f 5f 63 70 2b 2b *__cp++ 0570 20 3d 20 27 5c 30 27 3b 0a 20 20 20 20 20 20 20 = '\0'; . 0580 20 20 20 20 20 20 20 62 72 65 61 6b 3b 0a 20 20 b reak;. 0590 20 20 20 20 20 20 20 20 20 20 7d 0a 20 20 20 20 }. 05a0 20 20 20 20 20 20 2b 2b 5f 5f 63 70 3b 0a 20 20 ++ __cp;. 05b0 20 20 20 20 20 20 7d 0a 20 20 20 20 20 20 2a 5f }. *_ 05c0 5f 73 20 3d 20 5f 5f 63 70 3b 0a 20 20 20 20 7d _s = __c p;. } 05d0 0a 20 20 72 65 74 75 72 6e 20 5f 5f 72 65 74 76 . retur n __retv 05e0 61 6c 3b 0a 7d 0a 23 20 31 32 al;.}.# 12 Which shows an 8 byte difference where strace says 8x \00 and tcpdump 8x \20. I'll have a stab at trying to recreate these results with a newer kernel later today.
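Comparing the two payloads by eye is error-prone, so here is a minimal sketch of how the same comparison could be done mechanically. `diff_runs` is a hypothetical helper, and the sample bytes are made up to mimic the kind of damage described above (eight spaces turned into NULs); they are not taken from the attachments.

```python
def diff_runs(expected, actual):
    """Return (offset, length) for each maximal run of differing bytes.

    Both inputs must have the same length, which fits the observation
    that the errors are substitutions rather than insertions/deletions.
    """
    if len(expected) != len(actual):
        raise ValueError("inputs differ in length")
    runs = []
    start = None
    for i, (a, b) in enumerate(zip(expected, actual)):
        if a != b:
            if start is None:
                start = i          # a new run of differences begins here
        elif start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:          # run extends to the end of the data
        runs.append((start, len(expected) - start))
    return runs

# Made-up sample: what tcpdump saw versus what read() returned.
on_wire = b"break;\n" + b" " * 12 + b"}\n"
from_read = bytearray(on_wire)
from_read[8:16] = bytes(8)         # eight 0x20 bytes become 0x00

print(diff_runs(on_wire, bytes(from_read)))
```

Run against the real captures (the TCP payload extracted from the tcpdump and the server's temporary .i file), this would list every damaged run with its offset and length.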
Thanks for the additional info. That log message refers to fd4 being used for the lock file in the client, but the fd assignments are of course different in distccd. So I assume you do in fact have the network stream in that strace line. It certainly does look like some 0x20 (space) characters in the network stream are being transmuted into 0x00 when read by distccd. That is pretty weird. Assuming the traces are correct, I think you have a kernel bug on the server (athlonia?), or perhaps a very weird hardware bug. Is it still running 2.4.22-gentoo-r1? Your first step should be to try kernel.org 2.4.24 or the Gentoo .24. Try to work out if it's only present in the Gentoo deltas; if so, cc the Gentoo kernel maintainer.
I am having similar problems. Both machines I am getting distcc up and running on have the exact same motherboard, memory, CPU, CFLAGS, distcc version and gcc version. The CFLAGS are the default CFLAGS that come on the Athlon live CD. Software used: gentoo-sources-2.4.22-r3, gcc-3.2.3-r3, distcc-2.11.1. I have this problem whether I am compiling a C++ or C ebuild. I can set distcc to just use localhost and I still have problems. The second I take "distcc" out of FEATURES in /etc/make.conf all my problems go away (well, except for the slow compiles :-) ). Both machines have been rock solid for months and have *never* had an emerge failure. I'm not using ~x86 either. If you need any more information I can try to accommodate your requests. I really would like to see this working. I am going to try an older distcc version to see if it fixes anything.
Kenneth, thanks for your report. What kernel are you using? When you say "I can set distcc to just use localhost and I still have problems", do you really mean setting it to "localhost", or setting it to 127.0.0.1? If the former then it's almost certainly a different problem, since distcc just invokes the compiler directly, and the network protocol is not involved. Since this problem seems to only be occurring on Gentoo it seems likely that it is a Gentoo kernel problem. If you want to try something else the most useful thing would be a kernel.org kernel.
I am using gentoo-sources-2.4.22-r3 for my kernel. I have tried (because of your input) both localhost and 127.0.0.1. The problem occurs with either setting. I also tried distcc-2.9, without success. I am going to try a couple of different vanilla kernels (vanilla-sources-2.4.20 and vanilla-sources-2.4.24) and see if there is any success. I must say I'm a bit skeptical about the kernel being the issue, as I have had no file/network IO corruption in other applications. I am ready to be surprised though :-P I would also like to add that despite the errors being mostly random, one of the packages I was using to test a distcc emerge would continually break in the same place (alas, I can't remember the package, but I will add a note if I can remember it).
If you can really reproduce this with DISTCC_HOSTS=localhost, then please post a verbose client log of such a failure, plus the compiler error messages.
I'm not having any problems with vanilla linux-2.4.25-pre4 or the gentoo-dev-sources-2.6.0 sources, so my bet is that (at least in my instance) the combination of 2.4.22 and the Gentoo patches is triggering something. Thanks Martin for pushing me in the right direction; I had several hours of fun trying out new software =)
Can you kernel guys point us in the right place?
What NIC/drivers are being used for the network between hosts? Preferably info about both ends.
For the information of the kernel people: As you can see earlier in the report, the data seems to be leaving the client correctly. The problem is somewhere on the read end. The code here is pretty straightforward: just a big old read(2) loop to pull the data into a memory buffer then write it to disk. The two main unusual factors are that distcc uses TCP_CORK to form big packets, and that it does very fast network IO compared to other programs. (It found a bug in 2.5 that only network benchmark programs could reproduce because it flooded so much traffic through.) The problem seems to be that the data is corrupt before it is returned from read. However, tcpdump seems to see the data on the receiver correctly. So perhaps it's somewhere in the TCP stack after the network driver.
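For the kernel people who don't want to dig through the distcc source, the receive path described above has roughly this shape. This is only a sketch in Python of a generic "read loop into a memory buffer, then write to disk"; it is not distccd's actual code, and `read_exact` and `receive_to_file` are made-up names.

```python
import os

def read_exact(fd, nbytes, chunk=4096):
    """Read exactly nbytes from fd, looping because read() may return less."""
    buf = bytearray()
    while len(buf) < nbytes:
        data = os.read(fd, min(chunk, nbytes - len(buf)))
        if not data:
            raise EOFError("peer closed after %d of %d bytes" % (len(buf), nbytes))
        buf += data
    return bytes(buf)

def receive_to_file(fd, nbytes, path):
    """Pull the whole payload into memory, then write it out in one go."""
    data = read_exact(fd, nbytes)
    with open(path, "wb") as out:
        out.write(data)
    return len(data)

# Small self-check over a pipe standing in for the network socket.
r, w = os.pipe()
payload = b"int main(void) { return 0; }\n" * 100
os.write(w, payload)
os.close(w)
received = read_exact(r, len(payload))
os.close(r)
print(received == payload)
```

There is little room in a loop like this for a stray 8-byte write at page-sized intervals, which is part of why the suspicion falls on what read(2) returns rather than on what is done with it afterwards.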
Just as further confirmation, I am now using vanilla 2.4.24 and everything is working great.
Using gentoo-sources-2.4.22-r4 I'm able to recreate the faulty behaviour. Using vanilla-2.4.22 it works flawlessly. Server and client use the same NIC: Ethernet controller: 3Com Corporation 3c905 100BaseTX [Boomerang] (rev 0), using the vortex drivers.
Using vanilla-2.4.22 plus the 036_fast-csum patch, I'm able to recreate the faulty behaviour. Since this patch 'zeroes' out stuff (according to the comments anyway), I assume there's an off-by-one error in there at times, but as it's been 16 years since I did assembler on an m68k, who knows =)

The pertinent part of dmesg:

Measuring network checksumming speed
basic : 313.600 MB/sec
simple : 236.800 MB/sec
3Dnow! : 659.200 MB/sec
AMD-MMX : 659.200 MB/sec
func SSE1+ skipped: not supported by CPU
csum: using csum function: 3Dnow!
basic : 230.400 MB/sec
simple : 211.200 MB/sec
AMD-MMX : 268.800 MB/sec
func SSE1+ skipped: not supported by CPU
func SSE1 skipped: not supported by CPU
csum: using csum_copy function: AMD-MMX
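To illustrate why a checksum patch could plausibly damage data at all: the dmesg line "using csum_copy function" suggests a routine that copies a buffer while checksumming it. The following is a plain Python sketch of that general idea using the standard 16-bit Internet checksum; it is not the patch's code, and `csum_and_copy` is a made-up name.

```python
def csum_and_copy(src, dst):
    """Copy src into dst and return the 16-bit Internet checksum of src.

    The copy and the sum happen in the same pass, so a bug in the copy
    half can damage dst even when the checksum itself comes out right.
    """
    if len(dst) < len(src):
        raise ValueError("destination too small")
    total = 0
    # Process the data as big-endian 16-bit words.
    for i in range(0, len(src) - 1, 2):
        dst[i] = src[i]
        dst[i + 1] = src[i + 1]
        total += (src[i] << 8) | src[i + 1]
    if len(src) % 2:               # odd trailing byte is padded with zero
        dst[len(src) - 1] = src[-1]
        total += src[-1] << 8
    while total >> 16:             # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

data = bytes.fromhex("0001f203f4f5f6f7")
copy = bytearray(len(data))
print(hex(csum_and_copy(data, copy)), bytes(copy) == data)
```

If the optimized copy mishandled a few bytes around some boundary, that would be consistent with what was observed here: tcpdump sees the packet intact, while the buffer handed to read(2) has a short run of wrong bytes. That is speculation on my part, not something the traces prove.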
athlonia root # cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 1
model name : AMD-K7(tm) Processor
stepping : 2
cpu MHz : 704.964
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat mmx syscall mmxext 3dnowext 3dnow
bogomips : 1405.74
gentoo-sources-2.4.22-r4 minus the 036_fast-csum patch does not show the faulty behaviour.
Thanks for testing that out for us. I'm preparing to release a -r5; it'll be fixed.
kernel bug. you guys can have it and close it. :)
-r5 (just added to cvs, so give it a few mins) is fixed, update when it shows up