
Bug 36230

Summary: garbage characters in compiles using distcc
Product: Portage Development Reporter: Peter Bienstman (RETIRED) <pbienst>
Component: Unclassified Assignee: x86-kernel (DEPRECATED) <x86-kernel>
Status: RESOLVED FIXED    
Severity: normal CC: bokkepoot, lisa, mbp
Priority: High    
Version: unspecified   
Hardware: All   
OS: Linux   
Whiteboard:
Package list:
Runtime testing required: ---
Attachments: tcpdump from compilehost
source on receiver
source on sender
source on sender

Description Peter Bienstman (RETIRED) gentoo-dev 2003-12-21 04:58:35 UTC
Both my machines have the same version of qt (3.2.3), gcc (3.2.3-r3) and glibc (2.3.2-r3). When emerging kde using distcc, the compilation fails:

 In file included from /usr/qt/3/include/qtoolbar.h:42, 
                  from /usr/kde/3.1/include/ktoolbar.h:27, 
                  from kuickshow.cpp:52: 
 /usr/qt/3/include/qdockwindow.h:161:8: warning: null character(s) ignored 
 In file included from /usr/qt/3/include/qtoolbar.h:42, 
                  from /usr/kde/3.1/include/ktoolbar.h:27, 
                  from kuickshow.cpp:52: 
 /usr/qt/3/include/qdockwindow.h:161: syntax error before numeric constant 
 /usr/qt/3/include/qdockwindow.h:161: stray '\300' in program 
 /usr/qt/3/include/qdockwindow.h:161:16: warning: null character(s) ignored 
 distcc[10915] ERROR: compile on pbienst failed 
 make[3]: *** [kuickshow.lo] Error 1 
 make[3]: *** Waiting for unfinished jobs.... 
 make[3]: Leaving directory `/var/tmp/portage/kdegraphics-3.1.4/work/kdegraphics-3.1.4/kuickshow/src' 
 
qdockwindow.h looks perfectly normal and is the same on both machines. Turning off distcc fixes the problem.




Reproducible: Always
Comment 1 Lisa Seelye (RETIRED) gentoo-dev 2003-12-21 08:59:46 UTC
hi.  please provide the information listed here: http://distcc.samba.org/problems.html

also provide emerge info for the machines being used.
Comment 2 Peter Bienstman (RETIRED) gentoo-dev 2003-12-21 12:31:52 UTC
These things seem hard to reproduce exactly. I did another run, which finished without any problem, presumably because the offending file got compiled locally rather than remotely. I then removed 'localhost' from the distcc hosts, did another run, and now I get a different error, but also one involving stray characters. It almost looks as if the preprocessed source didn't get transmitted correctly.

This is the end of the verbose log:

distcc[22618] (dcc_note_state) note state 2, file "slideshowwidget.cpp", host "pbienst"
distcc[22618] (dcc_connect_timed) nonblocking connect to 192.168.2.37:3632
distcc[22618] (dcc_select_for_write) select for write on fd4
distcc[22618] (dcc_connect_by_addr) client got connection to pbienst port 3632 on fd4
distcc[22618] (dcc_x_token_int) send DIST00000001
distcc[22618] (dcc_x_token_int) send ARGC00000018
distcc[22618] (dcc_x_token_int) send ARGV00000003
distcc[22618] (dcc_x_token_int) send ARGV00000012
distcc[22618] (dcc_x_token_int) send ARGV0000000e
distcc[22618] (dcc_x_token_int) send ARGV00000007
distcc[22618] (dcc_x_token_int) send ARGV00000005
distcc[22618] (dcc_x_token_int) send ARGV00000009
distcc[22618] (dcc_x_token_int) send ARGV00000002
distcc[22618] (dcc_x_token_int) send ARGV0000000f
distcc[22618] (dcc_x_token_int) send ARGV0000000f
distcc[22618] (dcc_x_token_int) send ARGV00000005
distcc[22618] (dcc_x_token_int) send ARGV0000000c
distcc[22618] (dcc_x_token_int) send ARGV0000000c
distcc[22618] (dcc_x_token_int) send ARGV00000003
distcc[22618] (dcc_x_token_int) send ARGV00000003
distcc[22618] (dcc_x_token_int) send ARGV0000000f
distcc[22618] (dcc_x_token_int) send ARGV0000000e
distcc[22618] (dcc_x_token_int) send ARGV00000014
distcc[22618] (dcc_x_token_int) send ARGV00000005
distcc[22618] (dcc_x_token_int) send ARGV0000000f
distcc[22618] (dcc_x_token_int) send ARGV0000000e
distcc[22618] (dcc_x_token_int) send ARGV00000002
distcc[22618] (dcc_x_token_int) send ARGV00000013
distcc[22618] (dcc_x_token_int) send ARGV00000002
distcc[22618] (dcc_x_token_int) send ARGV00000011
distcc[22618] (dcc_note_state) note state 3, file "(NULL)", host "(NULL)"
distcc[22618] (dcc_collect_child) cpp child 22619 terminated with status 0
distcc[22618] (dcc_collect_child) cpp times: user 0.282000s, system 0.142000s, 35944 minflt, 707 majflt
distcc[22618] cpp on localhost completed ok
distcc[22618] (dcc_note_state) note state 4, file "(NULL)", host "(NULL)"
distcc[22618] (dcc_x_file) send 1123720 byte file /var/tmp/portage/kdegraphics-3.1.4/temp/distcc_7820e682.ii with token DOTI
distcc[22618] (dcc_x_token_int) send DOTI00112588
distcc[22618] (dcc_send_job) client finished sending request to server
distcc[22618] (dcc_note_state) note state 5, file "(NULL)", host "pbienst"
distcc[22618] (dcc_r_token_int) got DONE00000001
distcc[22618] (dcc_r_result_header) got response header
distcc[22618] (dcc_note_state) note state 6, file "(NULL)", host "(NULL)"
distcc[22618] (dcc_r_token_int) got STAT00000000
distcc[22618] (dcc_r_token_int) got SERR00000000
distcc[22618] (dcc_r_token_int) got SOUT00000000
distcc[22618] (dcc_r_token_int) got DOTO00004130
distcc[22618] (dcc_r_file) received 16688 bytes to file slideshowwidget.o
distcc[22618] (dcc_r_file_timed) 16688 bytes received in 0.000852s, rate 19128kB/s
distcc[22618] (dcc_unlock) release lock fd3
distcc[22618] compile on pbienst completed ok
distcc[22618] elapsed compilation time 3.138399s
distcc[22618] (dcc_exit) exit: code 0; self: 0.001000 user 0.011000 sys; children: 0.282000 user 0.142000 sys
distcc[22618] (dcc_cleanup_tempfiles) deleted 1 temporary files
/usr/qt/3/bin/moc ./printing.h -o printing.moc
/bin/sh ../../libtool --silent --mode=compile --tag=CXX g++ -DHAVE_CONFIG_H -I. -I. -I../.. -I/usr/kde/3.1/include -I/usr/qt/3/include -I/usr/X11R6/include   -DQT_THREAD_SUPPORT  -D_REENTRANT  -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -Wcast-align -Wconversion -DNDEBUG -DNO_DEBUG -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -DQT_CLEAN_NAMESPACE -DQT_NO_ASCII_CAST  -c -o printing.lo `test -f 'printing.cpp' || echo './'`printing.cpp
distcc[22771] (dcc_trace_version) distcc 2.11.1 i686-pc-linux-gnu; built Dec 21 2003 10:17:18
distcc[22771] (dcc_recursion_safeguard) safeguard level=0
distcc[22771] (dcc_set_path) setting PATH=/sbin:/usr/sbin:/usr/lib/portage/bin:/bin:/usr/bin:/usr/local/bin:/opt/bin:/usr/i686-pc-linux-gnu/gcc-bin/3.2:/usr/X11R6/bin:/opt/blackdown-jdk-1.4.1/bin:/opt/blackdown-jdk-1.4.1/jre/bin:/usr/qt/3/bin:/usr/kde/3.1/sbin:/usr/kde/3.1/bin
distcc[22771] (dcc_scan_args) scanning arguments: g++ -DHAVE_CONFIG_H -I. -I. -I../.. -I/usr/kde/3.1/include -I/usr/qt/3/include -I/usr/X11R6/include -DQT_THREAD_SUPPORT -D_REENTRANT -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -Wcast-align -Wconversion -DNDEBUG -DNO_DEBUG -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -DQT_CLEAN_NAMESPACE -DQT_NO_ASCII_CAST -c printing.cpp -DPIC
distcc[22771] (dcc_scan_args) found input file "printing.cpp"
distcc[22771] (dcc_scan_args) no visible output file, going to add "-o printing.o" at end
distcc[22771] compile from printing.cpp to printing.o
distcc[22771] (dcc_get_hostlist) not reading /var/tmp/portage/kdegraphics-3.1.4/temp/fakehome/.distcc/hosts: No such file or directory
distcc[22771] (dcc_parse_hosts_file) load hosts from /etc/distcc/hosts
distcc[22771] (dcc_parse_hosts) found tcp token "pbienst"
distcc[22771] (dcc_lock_host) got cpu lock on pbienst slot 0 as fd3
distcc[22771] (dcc_strip_dasho) result: g++ -DHAVE_CONFIG_H -I. -I. -I../.. -I/usr/kde/3.1/include -I/usr/qt/3/include -I/usr/X11R6/include -DQT_THREAD_SUPPORT -D_REENTRANT -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -Wcast-align -Wconversion -DNDEBUG -DNO_DEBUG -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -DQT_CLEAN_NAMESPACE -DQT_NO_ASCII_CAST -c printing.cpp -DPIC
distcc[22771] (dcc_spawn_child) forking to execute: g++ -DHAVE_CONFIG_H -I. -I. -I../.. -I/usr/kde/3.1/include -I/usr/qt/3/include -I/usr/X11R6/include -DQT_THREAD_SUPPORT -D_REENTRANT -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -Wcast-align -Wconversion -DNDEBUG -DNO_DEBUG -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -DQT_CLEAN_NAMESPACE -DQT_NO_ASCII_CAST -E printing.cpp -DPIC
distcc[22771] (dcc_spawn_child) child started as pid22772
distcc[22771] (dcc_strip_local_args) result: g++ -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -Wcast-align -Wconversion -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -c printing.cpp -o printing.o
distcc[22771] exec on pbienst: g++ -Wnon-virtual-dtor -Wno-long-long -Wundef -Wall -pedantic -W -Wpointer-arith -Wwrite-strings -ansi -Wcast-align -Wconversion -O2 -O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe -fno-exceptions -fno-check-new -c printing.cpp -o printing.o
distcc[22771] (dcc_note_state) note state 2, file "printing.cpp", host "pbienst"
distcc[22771] (dcc_connect_timed) nonblocking connect to 192.168.2.37:3632
distcc[22771] (dcc_select_for_write) select for write on fd4
distcc[22772] (dcc_increment_safeguard) setting safeguard: _DISTCC_SAFEGUARD=1
distcc[22771] (dcc_connect_by_addr) client got connection to pbienst port 3632 on fd4
distcc[22771] (dcc_x_token_int) send DIST00000001
distcc[22771] (dcc_x_token_int) send ARGC00000018
distcc[22771] (dcc_x_token_int) send ARGV00000003
distcc[22771] (dcc_x_token_int) send ARGV00000012
distcc[22771] (dcc_x_token_int) send ARGV0000000e
distcc[22771] (dcc_x_token_int) send ARGV00000007
distcc[22771] (dcc_x_token_int) send ARGV00000005
distcc[22771] (dcc_x_token_int) send ARGV00000009
distcc[22771] (dcc_x_token_int) send ARGV00000002
distcc[22771] (dcc_x_token_int) send ARGV0000000f
distcc[22771] (dcc_x_token_int) send ARGV0000000f
distcc[22771] (dcc_x_token_int) send ARGV00000005
distcc[22771] (dcc_x_token_int) send ARGV0000000c
distcc[22771] (dcc_x_token_int) send ARGV0000000c
distcc[22771] (dcc_x_token_int) send ARGV00000003
distcc[22771] (dcc_x_token_int) send ARGV00000003
distcc[22771] (dcc_x_token_int) send ARGV0000000f
distcc[22771] (dcc_x_token_int) send ARGV0000000e
distcc[22771] (dcc_x_token_int) send ARGV00000014
distcc[22771] (dcc_x_token_int) send ARGV00000005
distcc[22771] (dcc_x_token_int) send ARGV0000000f
distcc[22771] (dcc_x_token_int) send ARGV0000000e
distcc[22771] (dcc_x_token_int) send ARGV00000002
distcc[22771] (dcc_x_token_int) send ARGV0000000c
distcc[22771] (dcc_x_token_int) send ARGV00000002
distcc[22771] (dcc_x_token_int) send ARGV0000000a
distcc[22771] (dcc_note_state) note state 3, file "(NULL)", host "(NULL)"
distcc[22771] (dcc_collect_child) cpp child 22772 terminated with status 0
distcc[22771] (dcc_collect_child) cpp times: user 0.361000s, system 0.205000s, 49056 minflt, 839 majflt
distcc[22771] cpp on localhost completed ok
distcc[22771] (dcc_note_state) note state 4, file "(NULL)", host "(NULL)"
distcc[22771] (dcc_x_file) send 1430733 byte file /var/tmp/portage/kdegraphics-3.1.4/temp/distcc_c03ae686.ii with token DOTI
distcc[22771] (dcc_x_token_int) send DOTI0015d4cd
distcc[22771] (dcc_send_job) client finished sending request to server
distcc[22771] (dcc_note_state) note state 5, file "(NULL)", host "pbienst"
distcc[22771] (dcc_r_token_int) got DONE00000001
distcc[22771] (dcc_r_result_header) got response header
distcc[22771] (dcc_note_state) note state 6, file "(NULL)", host "(NULL)"
distcc[22771] (dcc_r_token_int) got STAT00000100
distcc[22771] (dcc_r_token_int) got SERR00000e64
In file included from /usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/map:66,
                 from /usr/qt/3/include/qmap.h:51,
                 from /usr/qt/3/include/qmime.h:43,
                 from /usr/qt/3/include/qevent.h:45,
                 from /usr/qt/3/include/qobject.h:45,
                 from /usr/qt/3/include/qwidget.h:43,
                 from /usr/qt/3/include/qbutton.h:42,
                 from /usr/qt/3/include/qcheckbox.h:42,
                 from printing.cpp:19:
/usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h: In
   member function `std::_Rb_tree_iterator<_Val, _Val&, _Val*>
   std::_Rb_tree<_Key, _Val, _KeyOfValue, _Compare,
   _Alloc>::insert_equal(std::_Rb_tree_iterator<_Val, _Val&, _Val*>, const
   _Val&)':
/usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h:1110: stray
   '\10' in program
/usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h:1110: stray
   '\4' in program
/usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h:1110: stray
   '\220' in program
/usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h:1110: parse
   error before `@' token
/usr/lib/gcc-lib/i686-pc-linux-gnu/3.2.3/include/g++-v3/bits/stl_tree.h:1110: syntax
   error before `.' token
In file included from printing.cpp:27:
/usr/qt/3/include/qpainter.h:650:45: warning: null character(s) ignored
In file included from printing.cpp:27:
/usr/qt/3/include/qpainter.h: At global scope:
/usr/qt/3/include/qpainter.h:650: stray '\262' in program
/usr/qt/3/include/qpainter.h:651: parse error before `@' token
/usr/qt/3/include/qpainter.h:651:4: warning: null character(s) ignored
/usr/qt/3/include/qpainter.h:651: syntax error before `&' token
/usr/qt/3/include/qpainter.h:653: prototype for `void
   QPainter::drawTiledPixmap(...)' does not match any in class `QPainter'
/usr/qt/3/include/qpainter.h:231: candidates are: void
   QPainter::drawTiledPixmap(const QRect&, const QPixmap&)
/usr/qt/3/include/qpainter.h:230:                 void
   QPainter::drawTiledPixmap(const QRect&, const QPixmap&, const QPoint&)
/usr/qt/3/include/qpainter.h:228:                 void
   QPainter::drawTiledPixmap(int, int, int, int, const QPixmap&, int = 0, int =
   0)
/usr/qt/3/include/qpainter.h: In member function `void
   QPainter::drawTiledPixmap(...)':
/usr/qt/3/include/qpainter.h:654: `r' undeclared (first use this function)
/usr/qt/3/include/qpainter.h:654: (Each undeclared identifier is reported only
   once for each function it appears in.)
/usr/qt/3/include/qpainter.h:654: `pm' undeclared (first use this function)
/usr/qt/3/include/qpainter.h:654: `sp' undeclared (first use this function)
In file included from /usr/include/Imlib_types.h:1,
                 from /usr/include/Imlib.h:4,
                 from imlibwidget.h:33,
                 from imagewindow.h:26,
                 from printing.cpp:40:
/usr/X11R6/include/X11/Xlib.h: At global scope:
/usr/X11R6/include/X11/Xlib.h:1298: stray '\240' in program
/usr/X11R6/include/X11/Xlib.h:1298: stray '\231' in program
/usr/X11R6/include/X11/Xlib.h:1298: stray '\231' in program
/usr/X11R6/include/X11/Xlib.h:1298: parse error before `@' token
/usr/X11R6/include/X11/Xlib.h:1298: syntax error before `encoding_is_wchar'
printing.cpp: In member function `void
   KuickPrintDialogPage::setScaleWidth(int)':
printing.cpp:302: warning: passing `float' for argument 1 of `void
   KIntNumInput::setValue(int)'
printing.cpp: In member function `void
   KuickPrintDialogPage::setScaleHeight(int)':
printing.cpp:307: warning: passing `float' for argument 1 of `void
   KIntNumInput::setValue(int)'
distcc[22771] (dcc_r_token_int) got SOUT00000000
distcc[22771] (dcc_r_token_int) got DOTO00000000
distcc[22771] (dcc_unlock) release lock fd3
distcc[22771] ERROR: compile on pbienst failed
distcc[22771] elapsed compilation time 4.195878s
distcc[22771] (dcc_exit) exit: code 1; self: 0.004000 user 0.008000 sys; children: 0.361000 user 0.205000 sys
distcc[22771] (dcc_cleanup_tempfiles) deleted 1 temporary files
make[3]: *** [printing.lo] Error 1
make[3]: Leaving directory `/var/tmp/portage/kdegraphics-3.1.4/work/kdegraphics-3.1.4/kuickshow/src'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/var/tmp/portage/kdegraphics-3.1.4/work/kdegraphics-3.1.4/kuickshow'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/var/tmp/portage/kdegraphics-3.1.4/work/kdegraphics-3.1.4'
make: *** [all] Error 2

emerge info for the remote host:

Portage 2.0.49-r15 (default-x86-1.4, gcc-3.2.3, glibc-2.3.2-r3, 2.4.22-gentoo-test-r1)
=================================================================
System uname: 2.4.22-gentoo-test-r1 i686 Intel(R) Pentium(R) 4 CPU 2.40GHz
Gentoo Base System version 1.4.3.10p1
distcc 2.11.1 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-march=pentium4 -O3 -funroll-loops -fomit-frame-pointer -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /var/qmail/control /usr/kde/2/share/config /usr/kde/3/share/config /usr/X11R6/lib/X11/xkb /usr/kde/cvs/share/config /usr/kde/3.1/share/config /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/config"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-march=pentium4 -O3 -funroll-loops -fomit-frame-pointer -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="sandbox ccache autoaddcvs distcc"
GENTOO_MIRRORS="ftp://ftp.snt.utwente.nl/pub/os/linux/gentoo/ "
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="x86 oss avi foomaticdb gif gtk2 libg++ mad mikmod nls gtkhtml gdbm berkdb slang arts aalib bonobo guile tcpd libwww esd -ldap mysql qt usb p44da motif alsa acpi atlas apm cdr crypt cups dga directfb dvd encode gphoto2 flash gpm qtmt imap imlib java jpeg jikes kde mpeg mmx ncurses opengl oggvorbis pam pda pdflib perl plotutils pic png pnp python quicktime readline samba sdl spell sse ssl tcltk svga tetex tiff truetype wmf xml xml2 xmms xv zlib X gtk gnome"

emerge info for the local host:

Portage 2.0.49-r15 (default-x86-1.4, gcc-3.2.3, glibc-2.3.2-r3, 2.4.20-gentoo-r9)
=================================================================
System uname: 2.4.20-gentoo-r9 i686 Intel(R) Pentium(R) M processor 1000MHz
Gentoo Base System version 1.4.3.10
distcc[22896] (dcc_trace_version) distcc 2.11.1 i686-pc-linux-gnu; built Dec 21 2003 10:17:18 [enabled]
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /var/qmail/control /usr/kde/2/share/config /usr/kde/3/share/config /usr/X11R6/lib/X11/xkb /usr/kde/3.1/share/config /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/config"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-O3 -march=pentium4 -funroll-loops -fomit-frame-pointer -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="sandbox ccache autoaddcvs distcc"
GENTOO_MIRRORS="http://212.219.247.15/sites/www.ibiblio.org/gentoo/ http://212.219.247.18/sites/www.ibiblio.org/gentoo/ http://212.219.247.11/sites/www.ibiblio.org/gentoo/ http://212.219.247.12/sites/www.ibiblio.org/gentoo/"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="x86 oss apm foomaticdb gpm gtk2 libg++ mad mikmod ncurses nls png gdbm berkdb slang svga sdl tcpd libwww perl gtk motif X qt kde -gnome acpi alsa arts atlas avi bidi cdr crypt cups dga directfb dvd encode emacs ethereal gif gphoto2 guile imap imlib jpeg java ldap lirc mmx mpeg mpi mysql oggvorbis opengl pam pda pcmcia pdflib plotutils pnp python quicktime readline samba spell sse ssl tcltk tetex tiff truetype unicode usb videos wmf zlib xv xml2 xmms"


distcc version is 2.11.1
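
The `send`/`got` lines in the trace above carry a four-letter token followed by an eight-digit hexadecimal value. For the file tokens that value agrees with the byte counts the log itself reports (for example `DOTI00112588` next to "send 1123720 byte file"). Below is a minimal Python sketch for decoding them. The helper name `decode_tokens` and the regular expression are illustrative and not part of distcc. Reading `STAT` as a wait()-style status with the exit code in the high byte is an assumption, although it is consistent with the "exit: code 1" line in this log.

```python
import re

# Four uppercase letters followed by eight lowercase hex digits,
# as printed by dcc_x_token_int / dcc_r_token_int in the trace.
TOKEN_RE = re.compile(r"\b([A-Z]{4})([0-9a-f]{8})\b")


def decode_tokens(log_text):
    """Return (name, value) pairs for every token found in the log text."""
    return [(name, int(hexval, 16)) for name, hexval in TOKEN_RE.findall(log_text)]


# Lines copied from the verbose log in this comment.
sample = """\
distcc[22618] (dcc_x_token_int) send DOTI00112588
distcc[22618] (dcc_r_token_int) got DOTO00004130
distcc[22771] (dcc_x_token_int) send DOTI0015d4cd
distcc[22771] (dcc_r_token_int) got STAT00000100
distcc[22771] (dcc_r_token_int) got SERR00000e64
"""

for name, value in decode_tokens(sample):
    if name == "STAT":
        # Assumption: wait()-style status, exit code stored in bits 8..15.
        print(f"{name}: status {value}, exit code {value >> 8}")
    else:
        print(f"{name}: {value} bytes")
```

Comparing the decoded `DOTI` length on the client with the size of the temporary file the server writes would be one way to check whether the right number of bytes arrived, although equal lengths do not rule out corrupted content.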
Comment 4 Peter Bienstman (RETIRED) gentoo-dev 2004-01-02 03:46:25 UTC
Somebody else seems to have similar problems. Quoting from this forum thread:

http://forums.gentoo.org/viewtopic.php?t=117112&highlight=distcc

[quote]

I'm encountering similar problems using gcc (GCC) 3.2.3 20030422 (Gentoo Linux 1.4 3.2.3-r2, propolice), ccache 2.3 and distcc-2.11.2-r1 on a setup with 2 athlons. 
 
 The most remarkable thing though is that it appears that the erroneous file is being sent twice to compile on the remote host: 
 
 Dec 24 17:49:11 athlonia distccd[18206]: (dcc_set_input) changed input from "/tmp/.ccache/fakes.tmp.bluebird.14705.i" to "/tmp/distccd_ee93c387.i" 
 Dec 24 17:49:12 athlonia distccd[18204]: (dcc_set_input) changed input from "fakes.c" to "/tmp/distccd_e808c387.i" 
 
 And when i diff those files i get 

 athlonia tmp # diff distccd_e808c387.i distccd_ee93c387.i 
 469c469 
 < struct __sched_param __schedparGeÀÆót __inheritsched; 
 --- 
 > struct __sched_param __schedpart __inheritsched; 
 
 And it shouldn't be a mystery what's wrong with that. 
 
 I'll be locking myself up over xmas to see if its ccache or distcc or cosmic rays. 
 [/quote]
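
The quoted diff locates the damage by comparing two copies of the preprocessed file. A small Python sketch of the same idea follows: it finds the first differing byte between two buffers and lists bytes that would be unexpected in plain-ASCII C source. The function names are hypothetical, the sample strings are taken from the diff above and encoded as Latin-1 (an assumption about the original bytes), and the "suspicious" rule assumes the sources contain no legitimate non-ASCII characters.

```python
def first_difference(a: bytes, b: bytes):
    """Offset of the first byte where a and b differ, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One input is a prefix of the other: they differ where the shorter ends.
    return None if len(a) == len(b) else min(len(a), len(b))


# Control characters that are normal in source text:
# tab, newline, form feed, carriage return.
ALLOWED_CONTROL = {0x09, 0x0A, 0x0C, 0x0D}


def suspicious_bytes(data: bytes):
    """List (offset, byte) for NULs, other control bytes, DEL and anything above."""
    return [
        (offset, byte)
        for offset, byte in enumerate(data)
        if (byte < 0x20 and byte not in ALLOWED_CONTROL) or byte >= 0x7F
    ]


# The two sides of the diff quoted above (Latin-1 encoding is assumed).
good = "struct __sched_param __schedpart __inheritsched;\n".encode("latin-1")
bad = "struct __sched_param __schedparGe\u00c0\u00c6\u00f3t __inheritsched;\n".encode("latin-1")

print("first difference at offset", first_difference(good, bad))
for offset, byte in suspicious_bytes(bad):
    print(f"offset {offset}: 0x{byte:02x}")
```

To run this against real files, read them with `open(path, "rb").read()` so that no text decoding hides the stray bytes.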
Comment 5 Lisa Seelye (RETIRED) gentoo-dev 2004-01-02 11:59:06 UTC
Why is it that KDE stuff fails with Distcc?

Martin, is this a Distcc-related quirk?
Comment 6 jan kuipers 2004-01-03 08:45:13 UTC
I've got a nice reproducible non-KDE test case: samba-3.0.1-r1

Using two machines, bluebird and athlonia, and running 'emerge samba' on bluebird with the hosts set to athlonia/5 via 'distcc-config --set-hosts', the build always stops at the same place with the same stray tokens.

Both machines run distcc 2.12 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632).

From distcc.log on bluebird: 

distcc[10053] (dcc_spawn_child) child started as pid10058
distcc[10053] (dcc_strip_local_args) result: gcc -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o
distcc[10053] exec on athlonia/5: gcc -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o
distcc[10053] (dcc_note_state) note state 2, file "snprintf.c", host "athlonia"
distcc[10053] (dcc_connect_timed) nonblocking connect to 10.0.0.96:3632
distcc[10053] (dcc_select_for_write) select for write on fd8
distcc[10053] (dcc_connect_by_addr) client got connection to athlonia port 3632 on fd8
distcc[10053] (dcc_x_token_int) send DIST00000001
distcc[10053] (dcc_x_token_int) send ARGC00000012
distcc[10053] (dcc_x_token_int) send ARGV00000003
distcc[10053] (dcc_x_token_int) send ARGV00000013
distcc[10053] (dcc_x_token_int) send ARGV00000003
distcc[10053] (dcc_x_token_int) send ARGV00000005
distcc[10053] (dcc_x_token_int) send ARGV00000005
distcc[10053] (dcc_x_token_int) send ARGV00000007
distcc[10053] (dcc_x_token_int) send ARGV00000014
distcc[10053] (dcc_x_token_int) send ARGV0000000b
distcc[10053] (dcc_x_token_int) send ARGV0000000e
distcc[10053] (dcc_x_token_int) send ARGV0000000c
distcc[10053] (dcc_x_token_int) send ARGV00000013
distcc[10053] (dcc_x_token_int) send ARGV0000001a
distcc[10053] (dcc_x_token_int) send ARGV0000000a
distcc[10053] (dcc_x_token_int) send ARGV00000005
distcc[10053] (dcc_x_token_int) send ARGV00000002
distcc[10053] (dcc_x_token_int) send ARGV0000000e
distcc[10053] (dcc_x_token_int) send ARGV00000002
distcc[10053] (dcc_x_token_int) send ARGV0000000e
distcc[10053] (dcc_note_state) note state 3, file "(NULL)", host "(NULL)"
distcc[10053] (dcc_collect_child) cpp child 10058 terminated with status 0
distcc[10053] (dcc_collect_child) cpp times: user 0.117982s, system 0.076988s, 6159 minflt, 593 majflt
distcc[10053] cpp on localhost completed ok
distcc[10053] (dcc_note_state) note state 4, file "(NULL)", host "(NULL)"
distcc[10053] (dcc_x_file) send 62615 byte file /var/tmp/portage/samba-3.0.1-r1/temp/distcc_cbdfe644.i with token DOTI
distcc[10053] (dcc_x_token_int) send DOTI0000f497
distcc[10053] (dcc_send_job) client finished sending request to server
distcc[10053] (dcc_note_state) note state 5, file "(NULL)", host "athlonia"
distcc[10053] (dcc_r_token_int) got DONE00000001
distcc[10053] (dcc_r_result_header) got response header
distcc[10053] (dcc_note_state) note state 6, file "(NULL)", host "(NULL)"
distcc[10053] (dcc_r_token_int) got STAT00000100
distcc[10053] (dcc_r_token_int) got SERR00000a05
distcc[10053] (dcc_r_token_int) got SOUT00000000
distcc[10053] (dcc_r_token_int) got DOTO00000000
distcc[10053] (dcc_unlock) release lock fd7
distcc[10053] ERROR: compile on athlonia/5 failed
distcc[10053] elapsed compilation time 0.364662s
distcc[10053] (dcc_exit) exit: code 1; self: 0.010998 user 0.011998 sys; children: 0.258960 user 0.146977 sys
distcc[10053] (dcc_cleanup_tempfiles) deleted 1 temporary files

Console output from 'emerge samba' on bluebird:
Compiling lib/bitmap.c
Compiling lib/crc32.c
Compiling lib/snprintf.c
In file included from /usr/include/string.h:375,
                 from lib/snprintf.c:113:
/usr/include/bits/string2.h:1002: error: stray '\252' in program
/usr/include/bits/string2.h:1002: error: syntax error before "o"
/usr/include/bits/string2.h:1002: error: stray '\341' in program
/usr/include/bits/string2.h:1002: error: stray '\377' in program
/usr/include/bits/string2.h:1002: error: stray '\312' in program
/usr/include/bits/string2.h:1002: error: stray '\373' in program
/usr/include/bits/string2.h: In function `__strspn_c3':
/usr/include/bits/string2.h:1003: error: number of arguments doesn't match prototype
/usr/include/bits/string2.h:1000: error: prototype declaration
/usr/include/bits/string2.h:1006: error: `__s' undeclared (first use in this function)
/usr/include/bits/string2.h:1006: error: (Each undeclared identifier is reported only once
/usr/include/bits/string2.h:1006: error: for each function it appears in.)
/usr/include/bits/string2.h:1006: error: `__accept1' undeclared (first use in this function)
/usr/include/bits/string2.h:1007: error: `__accept2' undeclared (first use in this function)
/usr/include/bits/string2.h:1007: error: `__accept3' undeclared (first use in this function)
In file included from /usr/include/string.h:375,
                 from lib/snprintf.c:113:
/usr/include/bits/string2.h: At top level:
/usr/include/bits/string2.h:1235: error: syntax error before "siz"
/usr/include/bits/string2.h:1235: error: stray '\252' in program
/usr/include/bits/string2.h:1235: error: stray '\341' in program
/usr/include/bits/string2.h:1235: error: stray '\377' in program
/usr/include/bits/string2.h:1235: error: stray '\312' in program
/usr/include/bits/string2.h:1235: error: stray '\373' in program
In file included from lib/snprintf.c:122:
/usr/include/sys/types.h:135: error: stray '\252' in program
/usr/include/sys/types.h:135: error: stray '\341' in program
/usr/include/sys/types.h:135: error: stray '\377' in program
/usr/include/sys/types.h:135: error: syntax error before "P"
/usr/include/sys/types.h:135: error: stray '\312' in program
/usr/include/sys/types.h:135: error: stray '\373' in program
/usr/include/sys/types.h:135: error: syntax error before "__useconds_t"
line-map.c: file "/usr/include/bits/pthreadtypes.h" left but not entered
In file included from lib/snprintf.c:125:
/usr/include/stdlib.h:166: error: stray '\252' in program
/usr/include/stdlib.h:166: error: syntax error before "o"
/usr/include/stdlib.h:166: error: stray '\341' in program
/usr/include/stdlib.h:166: error: stray '\377' in program
/usr/include/stdlib.h:166: error: stray '\312' in program
/usr/include/stdlib.h:166: error: stray '\373' in program
/usr/include/stdlib.h:314: error: stray '\252' in program
/usr/include/stdlib.h:314: error: stray '\341' in program
/usr/include/stdlib.h:314: error: stray '\377' in program
/usr/include/stdlib.h:314: error: syntax error before "P"
/usr/include/stdlib.h:314: error: stray '\312' in program
/usr/include/stdlib.h:314: error: stray '\373' in program
/usr/include/stdlib.h: In function `strtol':
/usr/include/stdlib.h:316: error: number of arguments doesn't match prototype
/usr/include/stdlib.h:177: error: prototype declaration
/usr/include/stdlib.h:317: error: `__nptr' undeclared (first use in this function)
/usr/include/stdlib.h:317: error: `__endptr' undeclared (first use in this function)
/usr/include/stdlib.h:317: error: `__base' undeclared (first use in this function)
/usr/include/stdlib.h: At top level:
/usr/include/stdlib.h:526: error: stray '\252' in program
/usr/include/stdlib.h:526: error: syntax error before "o"
/usr/include/stdlib.h:526: error: stray '\341' in program
/usr/include/stdlib.h:526: error: stray '\377' in program
/usr/include/stdlib.h:526: error: stray '\312' in program
/usr/include/stdlib.h:526: error: stray '\373' in program
In file included from lib/snprintf.c:125:
/usr/include/stdlib.h:803: error: stray '\252' in program
/usr/include/stdlib.h:803: error: syntax error before "o"
/usr/include/stdlib.h:803: error: stray '\341' in program
/usr/include/stdlib.h:803: error: stray '\377' in program
/usr/include/stdlib.h:803: error: stray '\312' in program
/usr/include/stdlib.h:803: error: stray '\373' in program
In file included from /usr/include/_G_config.h:44,
                 from /usr/include/libio.h:32,
                 from /usr/include/stdio.h:72,
                 from lib/snprintf.c:130:
/usr/include/gconv.h:66: error: stray '\252' in program
/usr/include/gconv.h:66: error: stray '\341' in program
/usr/include/gconv.h:66: error: stray '\377' in program
/usr/include/gconv.h:66: error: syntax error before "P"
/usr/include/gconv.h:66: error: stray '\312' in program
/usr/include/gconv.h:66: error: stray '\373' in program
In file included from /usr/include/stdio.h:72,
                 from lib/snprintf.c:130:
/usr/include/libio.h:351: error: stray '\252' in program
/usr/include/libio.h:351: error: syntax error before "o"
/usr/include/libio.h:351: error: stray '\341' in program
/usr/include/libio.h:351: error: stray '\377' in program
/usr/include/libio.h:351: error: stray '\312' in program
/usr/include/libio.h:351: error: stray '\373' in program
/usr/include/libio.h:376: error: syntax error before "cookie_read_function_t"
/usr/include/libio.h:384: error: syntax error before "__io_read_fn"
/usr/include/libio.h:388: error: syntax error before '}' token
/usr/include/libio.h:389: error: syntax error before "cookie_io_functions_t"
/usr/include/libio.h:395: error: syntax error before "_IO_cookie_io_functions_t"
In file included from lib/snprintf.c:130:
/usr/include/stdio.h:281: error: syntax error before "_IO_cookie_io_functions_t"
/usr/include/stdio.h:290: error: stray '\252' in program
/usr/include/stdio.h:290: error: syntax error before "o"
/usr/include/stdio.h:290: error: stray '\341' in program
/usr/include/stdio.h:290: error: stray '\377' in program
/usr/include/stdio.h:290: error: stray '\312' in program
/usr/include/stdio.h:290: error: stray '\373' in program
/usr/include/stdio.h:560: error: stray '\252' in program
/usr/include/stdio.h:560: error: syntax error before "o"
/usr/include/stdio.h:560: error: stray '\341' in program
/usr/include/stdio.h:560: error: stray '\377' in program
/usr/include/stdio.h:560: error: stray '\312' in program
/usr/include/stdio.h:560: error: stray '\373' in program
make: *** [lib/snprintf.o] Error 1

!!! ERROR: net-fs/samba-3.0.1-r1 failed.
!!! Function src_compile, Line 169, Exitcode 2
!!! SAMBA pieces
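
For reference, gcc's "stray '\NNN'" diagnostics name the offending byte in octal. A minimal Python sketch, with a made-up helper name (stray_bytes), that extracts those escapes from a few of the lines above and prints them in hex; every value is above 0x7f, so none of them is plain ASCII source text:

```python
import re

# A few diagnostics copied from the failing build output above.
LOG = r"""
/usr/include/stdlib.h:803: error: stray '\252' in program
/usr/include/stdlib.h:803: error: syntax error before "o"
/usr/include/stdlib.h:803: error: stray '\341' in program
/usr/include/stdlib.h:803: error: stray '\377' in program
/usr/include/stdlib.h:803: error: stray '\312' in program
/usr/include/stdlib.h:803: error: stray '\373' in program
"""


def stray_bytes(log):
    """Return the byte values named by gcc's stray '\\NNN' (octal) messages."""
    return [int(m, 8) for m in re.findall(r"stray '\\([0-7]{1,3})'", log)]


values = stray_bytes(LOG)
print(" ".join(f"{v:02x}" for v in values))  # hex view of the stray bytes
```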

The two 'emerge info' outputs:

athlonia root # emerge info
/tmp/.packages
Portage 2.0.50_pre9 (default-x86-1.4, gcc-3.3.2, glibc-2.3.3_pre20031212-r0, 2.4.22-gentoo-r1)
=================================================================
System uname: 2.4.22-gentoo-r1 i686 AMD-K7(tm) Processor
Gentoo Base System version 1.4.3.12
distcc 2.12 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
Autoconf: sys-devel/autoconf-2.58
Automake: sys-devel/automake-1.7.8
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CFLAGS="-march=athlon -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-march=athlon -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4"
DISTDIR="/tmp/distfiles"
FEATURES="autoaddcvs ccache distcc sandbox userpriv usersandbox"
GENTOO_MIRRORS="http://ftp.snt.utwente.nl/pub/os/linux/gentoo ftp://ftp.snt.utwente.nl/pub/os/linux/gentoo"
MAKEOPTS="-j1"
PKGDIR="/tmp/.packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="3dnow X X509 amd apache2 autofs berkdb crypt cscope curl ethereal gd gdbm imlib ipv6 jpeg ldap libg++ libwww md5sum memlimit mmx mpi mysql ncurses nocstrike nodod nojoystick noqmax notfc objc offensive oggvorbis opengl openssh pam parse-clocks passfile pcap pcmcia pda pdflib perl php physfs pic plotutils png python readline ruby sasl sdl skey slang slp snmp spell sse ssl tcltk tcpd tiff truetype usb vim-with-x wmf x86 xml xml2 zlib"

bluebird root # emerge info
/usr/portage/packages
Portage 2.0.50_pre9 (default-x86-1.4, gcc-3.3.2, glibc-2.3.3_pre20031212-r0, 2.6.0-gentoo)
=================================================================
System uname: 2.6.0-gentoo i686 AMD Athlon(tm) processor
Gentoo Base System version 1.4.3.12
distcc 2.12 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
ccache version 2.3 [enabled]
Autoconf: sys-devel/autoconf-2.58
Automake: sys-devel/automake-1.7.8
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="no"
CFLAGS="-march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3.2/share/config /usr/kde/3/share/config /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/env.d"
CXXFLAGS="-march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args"
DISTDIR="/tmp/distfiles"
FEATURES="autoaddcvs buildpkg ccache distcc sandbox userpriv usersandbox"
GENTOO_MIRRORS="http://www.mirror.ac.uk/sites/www.ibiblio.org/gentoo/"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.nl.gentoo.org/gentoo-portage"
USE="3dnow X X509 aalib amd apache2 apm arts artswrappersuid autofs avi berkdb cdr crypt cscope cups curl dga doc encode ethereal faad ffmpeg foomaticdb foreign-sysvinit gd gdbm ggi gif ginac glut gmtfull gmthigh gmtsuppl gmttria gtk gtk2 imap imlib ipv6 jabber java javascript jikes jpeg junit kde ldap libg++ libwww mad md5sum memlimit mmx motif mozilla moznocompose moznoirc moznomail mpeg mpi msn mysql ncurses nocstrike nodod nojoystick noqmax notfc nvidia nviz objc offensive oggvorbis opengl openssh oscar oss pam parse-clocks passfile pcap pcmcia pda pdflib perl php physfs pic plotutils png python qhull qt quicktime readline ruby samba sasl sdk sdl skey slang slp snmp spell sse ssl tcltk tcpd tetex tiff truetype usb v4l vim-with-x wmf x86 xchatnogtk xinerama xml xml2 xmms xosd xv xvid zeo zlib zvbi"


Comment 7 Martin Pool 2004-01-03 19:16:59 UTC
If anyone can reproduce this:

Please set DISTCC_SAVE_TEMPS on both client and server, and post the temporary files corresponding to the run that failed.

Please post the server log messages as well as the client ones.

If you could also include a tcpdump that would be helpful.

Can you reproduce the failures using a kernel.org kernel?

Is everyone seeing this bug using the propolice patches?

Thanks
Comment 8 Peter Bienstman (RETIRED) gentoo-dev 2004-01-04 01:18:31 UTC
Some additional info: my gcc install is 3.2.3-r3 and thus has the propolice patch, but I haven't used the -fstack-protector option anywhere. My client kernel is a stock 2.6.0, server kernel is 2.4.22-gentoo-test-r1  
Comment 9 jan kuipers 2004-01-04 12:40:56 UTC
Created attachment 23153 [details]
tcpdump from compilehost
Comment 10 jan kuipers 2004-01-04 12:47:28 UTC
Created attachment 23154 [details]
source on receiver
Comment 11 jan kuipers 2004-01-04 12:55:20 UTC
Created attachment 23155 [details]
source on sender
Comment 12 jan kuipers 2004-01-04 12:59:26 UTC
Same setup, emerging on bluebird, sending jobs to athlonia: 

Log from bluebird: 

distcc[27816] (dcc_trace_version) distcc 2.12 i686-pc-linux-gnu; built Dec 29 2003 15:47:39
distcc[27816] (dcc_recursion_safeguard) safeguard level=0
distcc[27816] (main) compiler name is "gcc"
distcc[27816] (dcc_set_path) setting PATH=/usr/bin:/usr/sbin:/usr/local/bin:/opt/bin:/usr/i686-pc-linux-gnu/gcc-bin/3.3:/opt/Acrobat5:/usr/X11R6/bin:/opt/blackdown-jdk-1.4.1/bin:/opt/blackdown-jdk-1.4.1/jre/bin:/usr/qt/3/bin:/usr/kde/3.2/sbin:/usr/kde/3.2/bin:/opt/vmware/bin
distcc[27816] (dcc_scan_args) scanning arguments: gcc -qlanglvl=ansi -I. -I. -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -I/usr/include/mysql -mcpu=i686 -pipe -DHAVE_ERRNO_AS_DEFINE=1 -DUSE_OLD_FUNCTIONS -I/usr/include/libxml2 -Iinclude -I./include -I./ubiqx -I./smbwrapper -I. -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -I. -c lib/snprintf.c -o lib/snprintf.o
distcc[27816] (dcc_scan_args) found input file "lib/snprintf.c"
distcc[27816] (dcc_scan_args) found object/output file "lib/snprintf.o"
distcc[27816] compile from snprintf.c to snprintf.o
distcc[27816] (dcc_get_hostlist) not reading /root/.distcc/hosts: No such file or directory
distcc[27816] (dcc_parse_hosts_file) load hosts from /etc/distcc/hosts
distcc[27816] (dcc_parse_hosts) found tcp token "athlonia/5"
distcc[27816] (dcc_lock_host) got cpu lock on athlonia/5 slot 0 as fd4
distcc[27816] (dcc_strip_dasho) result: gcc -qlanglvl=ansi -I. -I. -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -I/usr/include/mysql -mcpu=i686 -pipe -DHAVE_ERRNO_AS_DEFINE=1 -DUSE_OLD_FUNCTIONS -I/usr/include/libxml2 -Iinclude -I./include -I./ubiqx -I./smbwrapper -I. -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -I. -c lib/snprintf.c
distcc[27816] (dcc_spawn_child) forking to execute: gcc -qlanglvl=ansi -I. -I. -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -I/usr/include/mysql -mcpu=i686 -pipe -DHAVE_ERRNO_AS_DEFINE=1 -DUSE_OLD_FUNCTIONS -I/usr/include/libxml2 -Iinclude -I./include -I./ubiqx -I./smbwrapper -I. -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -I. -E lib/snprintf.c
distcc[27817] (dcc_increment_safeguard) setting safeguard: _DISTCC_SAFEGUARD=1
distcc[27816] (dcc_spawn_child) child started as pid27817
distcc[27816] (dcc_strip_local_args) result: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o
distcc[27816] exec on athlonia/5: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o
distcc[27816] (dcc_note_state) note state 2, file "snprintf.c", host "athlonia"
distcc[27816] (dcc_connect_timed) nonblocking connect to 10.0.0.96:3632
distcc[27816] (dcc_select_for_write) select for write on fd5
distcc[27816] (dcc_connect_by_addr) client got connection to athlonia port 3632 on fd5
distcc[27816] (dcc_x_token_int) send DIST00000001
distcc[27816] (dcc_x_token_int) send ARGC00000013
distcc[27816] (dcc_x_token_int) send ARGV00000003
distcc[27816] (dcc_x_token_int) send ARGV0000000e
distcc[27816] (dcc_x_token_int) send ARGV00000013
distcc[27816] (dcc_x_token_int) send ARGV00000003
distcc[27816] (dcc_x_token_int) send ARGV00000005
distcc[27816] (dcc_x_token_int) send ARGV00000005
distcc[27816] (dcc_x_token_int) send ARGV00000007
distcc[27816] (dcc_x_token_int) send ARGV00000014
distcc[27816] (dcc_x_token_int) send ARGV0000000b
distcc[27816] (dcc_x_token_int) send ARGV0000000e
distcc[27816] (dcc_x_token_int) send ARGV0000000c
distcc[27816] (dcc_x_token_int) send ARGV00000013
distcc[27816] (dcc_x_token_int) send ARGV0000001a
distcc[27816] (dcc_x_token_int) send ARGV0000000a
distcc[27816] (dcc_x_token_int) send ARGV00000005
distcc[27816] (dcc_x_token_int) send ARGV00000002
distcc[27816] (dcc_x_token_int) send ARGV0000000e
distcc[27816] (dcc_x_token_int) send ARGV00000002
distcc[27816] (dcc_x_token_int) send ARGV0000000e
distcc[27816] (dcc_note_state) note state 3, file "(NULL)", host "(NULL)"
distcc[27816] (dcc_collect_child) cpp child 27817 terminated with status 0
distcc[27816] (dcc_collect_child) cpp times: user 0.061990s, system 0.010998s, 254 minflt, 545 majflt
distcc[27816] cpp on localhost completed ok
distcc[27816] (dcc_note_state) note state 4, file "(NULL)", host "(NULL)"
distcc[27816] (dcc_x_file) send 62615 byte file /tmp/distcc_bb20772f.i with token DOTI
distcc[27816] (dcc_x_token_int) send DOTI0000f497
distcc[27816] (dcc_send_job) client finished sending request to server
distcc[27816] (dcc_note_state) note state 5, file "(NULL)", host "athlonia"
distcc[27816] (dcc_r_token_int) got DONE00000001
distcc[27816] (dcc_r_result_header) got response header
distcc[27816] (dcc_note_state) note state 6, file "(NULL)", host "(NULL)"
distcc[27816] (dcc_r_token_int) got STAT00000100
distcc[27816] (dcc_r_token_int) got SERR0000256c
distcc[27816] (dcc_r_token_int) got SOUT00000000
distcc[27816] (dcc_r_token_int) got DOTO00000000
distcc[27816] (dcc_unlock) release lock fd4
distcc[27816] ERROR: compile on athlonia/5 failed
distcc[27816] elapsed compilation time 0.303556s
distcc[27816] (dcc_exit) exit: code 1; self: 0.002999 user 0.002999 sys; children: 0.061990 user 0.010998 sys

Log from athlonia: 

distccd[18752] (dcc_check_client) connection from 192.168.1.4:60714
distccd[18752] (dcc_r_token_int) got DIST00000001
distccd[18752] (dcc_r_token_int) got ARGC00000013
distccd[18752] (dcc_r_argv) reading 19 arguments from job submission
distccd[18752] (dcc_r_token_int) got ARGV00000003
distccd[18752] (dcc_r_argv) argv[0] = "gcc"
distccd[18752] (dcc_r_token_int) got ARGV0000000e
distccd[18752] (dcc_r_argv) argv[1] = "-qlanglvl=ansi"
distccd[18752] (dcc_r_token_int) got ARGV00000013
distccd[18752] (dcc_r_argv) argv[2] = "-march=athlon-tbird"
distccd[18752] (dcc_r_token_int) got ARGV00000003
distccd[18752] (dcc_r_argv) argv[3] = "-O3"
distccd[18752] (dcc_r_token_int) got ARGV00000005
distccd[18752] (dcc_r_argv) argv[4] = "-pipe"
distccd[18752] (dcc_r_token_int) got ARGV00000005
distccd[18752] (dcc_r_argv) argv[5] = "-mmmx"
distccd[18752] (dcc_r_token_int) got ARGV00000007
distccd[18752] (dcc_r_argv) argv[6] = "-m3dnow"
distccd[18752] (dcc_r_token_int) got ARGV00000014
distccd[18752] (dcc_r_argv) argv[7] = "-fomit-frame-pointer"
distccd[18752] (dcc_r_token_int) got ARGV0000000b
distccd[18752] (dcc_r_argv) argv[8] = "-ffast-math"
distccd[18752] (dcc_r_token_int) got ARGV0000000e
distccd[18752] (dcc_r_argv) argv[9] = "-funroll-loops"
distccd[18752] (dcc_r_token_int) got ARGV0000000c
distccd[18752] (dcc_r_argv) argv[10] = "-fforce-addr"
distccd[18752] (dcc_r_token_int) got ARGV00000013
distccd[18752] (dcc_r_argv) argv[11] = "-falign-functions=4"
distccd[18752] (dcc_r_token_int) got ARGV0000001a
distccd[18752] (dcc_r_argv) argv[12] = "-maccumulate-outgoing-args"
distccd[18752] (dcc_r_token_int) got ARGV0000000a
distccd[18752] (dcc_r_argv) argv[13] = "-mcpu=i686"
distccd[18752] (dcc_r_token_int) got ARGV00000005
distccd[18752] (dcc_r_argv) argv[14] = "-pipe"
distccd[18752] (dcc_r_token_int) got ARGV00000002
distccd[18752] (dcc_r_argv) argv[15] = "-c"
distccd[18752] (dcc_r_token_int) got ARGV0000000e
distccd[18752] (dcc_r_argv) argv[16] = "lib/snprintf.c"
distccd[18752] (dcc_r_token_int) got ARGV00000002
distccd[18752] (dcc_r_argv) argv[17] = "-o"
distccd[18752] (dcc_r_token_int) got ARGV0000000e
distccd[18752] (dcc_r_argv) argv[18] = "lib/snprintf.o"
distccd[18752] (dcc_r_argv) got arguments: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o
distccd[18752] (dcc_scan_args) scanning arguments: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c lib/snprintf.c -o lib/snprintf.o
distccd[18752] (dcc_scan_args) found input file "lib/snprintf.c"
distccd[18752] (dcc_scan_args) found object/output file "lib/snprintf.o"
distccd[18752] compile from snprintf.c to snprintf.o
distccd[18752] (dcc_run_job) output file lib/snprintf.o
distccd[18752] (dcc_input_tmpnam) input file lib/snprintf.c
distccd[18752] (dcc_r_token_int) got DOTI0000f497
distccd[18752] (dcc_r_file) received 62615 bytes to file /tmp/distccd_254e8503.i
distccd[18752] (dcc_r_file_timed) 62615 bytes received in 0.061983s, rate 987kB/s
distccd[18752] (dcc_set_input) changed input from "lib/snprintf.c" to "/tmp/distccd_254e8503.i"
distccd[18752] (dcc_set_input) command after: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c /tmp/distccd_254e8503.i -o lib/snprintf.o
distccd[18752] (dcc_set_output) changed output from "lib/snprintf.o" to "/tmp/distccd_22ce8503.o"
distccd[18752] (dcc_set_output) command after: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c /tmp/distccd_254e8503.i -o /tmp/distccd_22ce8503.o
distccd[18752] (dcc_check_compiler_masq) Warning: gcc on distccd's path is /usr/lib/distcc/bin/gcc and really a link to /usr/bin/distcc
distccd[18752] (dcc_spawn_child) forking to execute: gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow -fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr -falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe -c /tmp/distccd_254e8503.i -o /tmp/distccd_22ce8503.o
distccd[18752] (dcc_spawn_child) child started as pid18794
distccd[18794] (dcc_increment_safeguard) setting safeguard: _DISTCC_SAFEGUARD=1
distccd[18752] (dcc_collect_child) cc child 18794 terminated with status 0x100
distccd[18752] (dcc_collect_child) cc times: user 0.085000s, system 0.010000s, 687 minflt, 1005 majflt
distccd[18752] (dcc_x_token_int) send DONE00000001
distccd[18752] (dcc_x_token_int) send STAT00000100
distccd[18752] (dcc_x_file) send 9580 byte file /tmp/distcc_a07a8503.stderr with token SERR
distccd[18752] (dcc_x_token_int) send SERR0000256c
distccd[18752] (dcc_x_file) send 0 byte file /tmp/distcc_a1de8503.stdout with token SOUT
distccd[18752] (dcc_x_token_int) send SOUT00000000
distccd[18752] (dcc_x_token_int) send DOTO00000000
distccd[18752] gcc on localhost failed
distccd[18752] job complete
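
Reading the two logs is easier once the tokens are decoded: each one appears in the log as a four-letter name followed by eight hex digits. The Python sketch below decodes a few of the tokens logged above; the helper name parse_token is made up for illustration, and reading STAT as a wait status (exit code in bits 8-15) is an assumption based on the "terminated with status 0x100" line.

```python
def parse_token(token):
    """Split a logged token such as 'DOTI0000f497' into (name, value)."""
    if len(token) != 12:
        raise ValueError(f"expected 12 characters, got {len(token)}")
    return token[:4], int(token[4:], 16)


for tok in ["ARGC00000013", "DOTI0000f497", "STAT00000100", "SERR0000256c"]:
    name, value = parse_token(tok)
    print(f"{name} = {value}")

# Assumption: STAT carries the compiler's wait status, so on Linux the
# exit code sits in bits 8-15.
_, status = parse_token("STAT00000100")
exit_code = (status >> 8) & 0xFF
print("compiler exit code:", exit_code)
```

The decoded values line up with the rest of the log: 19 arguments, a 62615-byte .i file, and 9580 bytes of stderr.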
Comment 13 Martin Pool 2004-01-04 15:22:57 UTC
Your attachment 23155 [details] is just 22 bytes being "/tmp/distcc_bb20772f.i".  Is that *really* what was in the client's temporary file?
Comment 14 Martin Pool 2004-01-04 15:32:24 UTC
Thanks for getting the tcpdump.  Attachment 23153 [details] seems to show that from the point of view of the client's TCP stack, the data was sent out correctly.  From attachment 23154 we can see that the data was corrupt in the server's temporary file.

You don't need to worry about getting the client's tmpfile because the data was correct when it left the client.

It seems like the remaining possibilities are:

 - the data was corrupted in transit and not detected by the TCP stack

 - the data was corrupted by the server's kernel

 - distccd made a mistake when receiving the data

 - distccd scribbled over the data in the temporary file at some later point

It would be helpful to get a tcpdump recorded on the server, so that we can see whether it is coming in to that machine correctly.
Comment 15 Martin Pool 2004-01-04 15:53:58 UTC
A few more observations from comparing the tcpdump and the server's temporary file:

All of the errors in this report seem to be substitutions, rather than insertions or deletions.  In other words all the correct bytes are at the same offset.

All of the error runs are 8 bytes, which would be consistent with some software accidentally writing a uint64 or two ints into the buffer.  That might conceivably be either distcc or the kernel.

Also, there is a common pattern to the bytes that are written in, which perhaps supports the idea that they indicate the buffer being overwritten by some other value.

80 79 6d d7  c7 c6 6e a2
80 79 6d d7  c7 c6 6e a2
80 79 6d d7  c7 c6 6e a2

In fact in the dump we have here that seems to be the same value for every error.

That pattern doesn't look familiar to me.
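
To make the uint64 / two-ints reading concrete, here is a small Python sketch that reinterprets the eight bytes both ways. It assumes little-endian byte order because both machines are x86; nothing here says what the values actually mean.

```python
import struct

# The 8-byte run that replaces the correct data at every error position.
pattern = bytes.fromhex("80 79 6d d7 c7 c6 6e a2")

lo32, hi32 = struct.unpack("<II", pattern)  # two little-endian 32-bit words
(u64,) = struct.unpack("<Q", pattern)       # one little-endian 64-bit word

print(f"two uint32: {lo32:#010x} {hi32:#010x}")
print(f"one uint64: {u64:#018x}")
```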

What is your networking setup?  Are you using any kind of iptables or similar packet-mangling software?

The errors are at offsets within the .i file of 

3085
7181
11277
15373
19469
23565
27661

So they start at a strange offset, but are evenly spaced every 4096 bytes!

Counting within the client's TCP stream, the first offset is 3554 which does not seem very meaningful either.
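
The spacing can be checked mechanically, and the stream offset can be accounted for as well. The Python sketch below assumes that each token takes 12 bytes on the wire (as the logged tokens suggest), that argument strings are sent without a terminator, and that the job logged in comment 12 is the one in the tcpdump. Under those assumptions the request header is 469 bytes, so 3554 would simply be 3085 plus the header.

```python
# Error offsets within the .i file, as listed above.
offsets = [3085, 7181, 11277, 15373, 19469, 23565, 27661]

gaps = [b - a for a, b in zip(offsets, offsets[1:])]
print("gaps:", gaps)
print("offset mod 4096:", sorted({o % 4096 for o in offsets}))

# Arguments the client sent, copied from the distccd log in comment 12.
argv = (
    "gcc -qlanglvl=ansi -march=athlon-tbird -O3 -pipe -mmmx -m3dnow "
    "-fomit-frame-pointer -ffast-math -funroll-loops -fforce-addr "
    "-falign-functions=4 -maccumulate-outgoing-args -mcpu=i686 -pipe "
    "-c lib/snprintf.c -o lib/snprintf.o"
).split()

TOKEN_LEN = 12                    # assumed: 4-letter name + 8 hex digits
n_tokens = 1 + 1 + len(argv) + 1  # DIST, ARGC, one ARGV per argument, DOTI
header_len = n_tokens * TOKEN_LEN + sum(len(a) for a in argv)

print("bytes before file data:", header_len)
print("first error in stream:", header_len + offsets[0])
```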
Comment 16 Martin Pool 2004-01-04 16:11:50 UTC
Let's see what code in distccd could be causing this:

distccd is not using mmap to receive the file, because it's less than 64kB.   It is not compressed.  Therefore it goes through the alternative path in dcc_r_bulk_plain, which basically reads the whole thing into a big malloc'd buffer and then writes it out again.

I don't see any buffer sizes or steps of 4kB, except that of course that is the kernel page size.
Comment 17 jan kuipers 2004-01-04 16:18:29 UTC
Sorry for the missing/22-byte file from the client, I should have checked whether it actually uploaded the file.

The tcpdump was done on the server (10.0.0.96), so I assume the transmission over the wire was successful and my network setup hasn't affected it.
Comment 18 jan kuipers 2004-01-04 16:21:58 UTC
Created attachment 23170 [details]
source on sender
Comment 19 Martin Pool 2004-01-04 16:27:15 UTC
OK, some things to do:

Could you try running the daemon under valgrind?

Could you get a tcpdump recorded on the server machine?

Thanks for your help.
Comment 20 Martin Pool 2004-01-04 16:32:28 UTC
Oh, OK.  I misunderstood your previous comment about the tcpdump.

I guess now we're just down to a few options

 1- run distccd under valgrind to try to find a memory corruption bug that's scribbling over the buffer

 2- run distccd under strace with -s 65536 to see what it's getting from the network and writing to the file

 3- try stock 2.4.23 kernels if possible.
Comment 22 jan kuipers 2004-01-06 09:18:43 UTC
When running distccd under valgrind with various options, there is no sign of corruption, and all compiles fine.

When running distccd under strace, I'm getting various results depending on the strace options: more overhead seems to increase the chance of a successful compile, so I suppose this indicates a race condition somewhere.


Going by log entries like "distcc[21143] (dcc_unlock) release lock fd4", I took a wild stab, assumed that fd4 is the network socket, and ran
'strace -e read -s 65535 ./distccd --wizard >& /tmp/distccd', which produced this:


read(4, "__rawmemchr (__retval, __reject) : strchr (__retval, __reject)))) != ((void *)0))\n    *(*__s)++ = \'\\0\';\n  return __retval;\n}\n\nextern __inline char *__strsep_2c (char **__s, char __reject1, char __reject2);\nextern __inline char *\n__strsep_2c (char **__s, char __reject1, char __reject2)\n{\n  register char *__retval = *__s;\n  if (__retval != ((void *)0))\n    {\n      register char *__cp = __retval;\n      while (1)\n        {\n          if (*__cp == \'\\0\')\n            {\n              __cp = ((void *)0);\n          break;\n            }\n          if (*__cp == __reject1 || *__cp == __reject2)\n            {\n              *__cp++ = \'\\0\';\n              break;\n     \0\0\0\0\0\0\0\0\n          ++__cp;\n        }\n      *__s = __cp;\n    }\n  return __retval;\n}\n\nextern __inline char *__strsep_3c (char **__s, char __reject1, char __reject2,\n                                   char __reject3);\nextern __inline char *\n__strsep_3c (char **__s, char __reject1, char __reject2, char __reject3)\n{\n  register char *__retval = *__s;\n  if (__retval != ((void *)0))\n    {\n      register char *__cp = __retval;\n      while (1)\n        {\n          if (*__cp == \'\\0\')\n            {\n              __cp = ((void *)0);\n          break;\n            }\n          if (*__cp == __reject1 || *__cp == __reject2 || *__cp == __reject3)\n            {\n              *__cp++ = \'\\0\';\n              break;\n            }\n          ++__cp;\n        }\n      *__s = __cp;\n    }\n  return __retval;\n}\n# 12", 48604) = 1448


But the payload according to my simultaneous tcpdump capture on the server says it should be:


0000  00 60 97 d6 e0 d9 b4 66  c2 68 98 81 08 00 45 00   .`.ÖàÙ´f Âh....E.
0010  05 dc 9d aa 40 00 3f 06  cc 65 c0 a8 01 04 0a 00   .Ü.ª@.?. ÌeÀ¨....
0020  00 60 8b 28 0e 30 64 fd  b7 4c 41 40 e0 6d 80 10   .`.(.0dý ·LA@àm..
0030  16 d0 ee cc 00 00 01 01  08 0a 33 b8 e3 a2 17 db   .ÐîÌ.... ..3¸ã¢.Û
0040  66 67 5f 5f 72 61 77 6d  65 6d 63 68 72 20 28 5f   fg__rawm emchr (_
0050  5f 72 65 74 76 61 6c 2c  20 5f 5f 72 65 6a 65 63   _retval,  __rejec
0060  74 29 20 3a 20 73 74 72  63 68 72 20 28 5f 5f 72   t) : str chr (__r
0070  65 74 76 61 6c 2c 20 5f  5f 72 65 6a 65 63 74 29   etval, _ _reject)
0080  29 29 29 20 21 3d 20 28  28 76 6f 69 64 20 2a 29   ))) != ( (void *)
0090  30 29 29 0a 20 20 20 20  2a 28 2a 5f 5f 73 29 2b   0)).     *(*__s)+
00a0  2b 20 3d 20 27 5c 30 27  3b 0a 20 20 72 65 74 75   + = '\0' ;.  retu
00b0  72 6e 20 5f 5f 72 65 74  76 61 6c 3b 0a 7d 0a 0a   rn __ret val;.}..
00c0  65 78 74 65 72 6e 20 5f  5f 69 6e 6c 69 6e 65 20   extern _ _inline 
00d0  63 68 61 72 20 2a 5f 5f  73 74 72 73 65 70 5f 32   char *__ strsep_2
00e0  63 20 28 63 68 61 72 20  2a 2a 5f 5f 73 2c 20 63   c (char  **__s, c
00f0  68 61 72 20 5f 5f 72 65  6a 65 63 74 31 2c 20 63   har __re ject1, c
0100  68 61 72 20 5f 5f 72 65  6a 65 63 74 32 29 3b 0a   har __re ject2);.
0110  65 78 74 65 72 6e 20 5f  5f 69 6e 6c 69 6e 65 20   extern _ _inline 
0120  63 68 61 72 20 2a 0a 5f  5f 73 74 72 73 65 70 5f   char *._ _strsep_
0130  32 63 20 28 63 68 61 72  20 2a 2a 5f 5f 73 2c 20   2c (char  **__s, 
0140  63 68 61 72 20 5f 5f 72  65 6a 65 63 74 31 2c 20   char __r eject1, 
0150  63 68 61 72 20 5f 5f 72  65 6a 65 63 74 32 29 0a   char __r eject2).
0160  7b 0a 20 20 72 65 67 69  73 74 65 72 20 63 68 61   {.  regi ster cha
0170  72 20 2a 5f 5f 72 65 74  76 61 6c 20 3d 20 2a 5f   r *__ret val = *_
0180  5f 73 3b 0a 20 20 69 66  20 28 5f 5f 72 65 74 76   _s;.  if  (__retv
0190  61 6c 20 21 3d 20 28 28  76 6f 69 64 20 2a 29 30   al != (( void *)0
01a0  29 29 0a 20 20 20 20 7b  0a 20 20 20 20 20 20 72   )).    { .      r
01b0  65 67 69 73 74 65 72 20  63 68 61 72 20 2a 5f 5f   egister  char *__
01c0  63 70 20 3d 20 5f 5f 72  65 74 76 61 6c 3b 0a 20   cp = __r etval;. 
01d0  20 20 20 20 20 77 68 69  6c 65 20 28 31 29 0a 20        whi le (1). 
01e0  20 20 20 20 20 20 20 7b  0a 20 20 20 20 20 20 20          { .       
01f0  20 20 20 69 66 20 28 2a  5f 5f 63 70 20 3d 3d 20      if (* __cp == 
0200  27 5c 30 27 29 0a 20 20  20 20 20 20 20 20 20 20   '\0').           
0210  20 20 7b 0a 20 20 20 20  20 20 20 20 20 20 20 20     {.             
0220  20 20 5f 5f 63 70 20 3d  20 28 28 76 6f 69 64 20     __cp =  ((void 
0230  2a 29 30 29 3b 0a 20 20  20 20 20 20 20 20 20 20   *)0);.           
0240  62 72 65 61 6b 3b 0a 20  20 20 20 20 20 20 20 20   break;.          
0250  20 20 20 7d 0a 20 20 20  20 20 20 20 20 20 20 69      }.           i
0260  66 20 28 2a 5f 5f 63 70  20 3d 3d 20 5f 5f 72 65   f (*__cp  == __re
0270  6a 65 63 74 31 20 7c 7c  20 2a 5f 5f 63 70 20 3d   ject1 ||  *__cp =
0280  3d 20 5f 5f 72 65 6a 65  63 74 32 29 0a 20 20 20   = __reje ct2).   
0290  20 20 20 20 20 20 20 20  20 7b 0a 20 20 20 20 20             {.     
02a0  20 20 20 20 20 20 20 20  20 2a 5f 5f 63 70 2b 2b             *__cp++
02b0  20 3d 20 27 5c 30 27 3b  0a 20 20 20 20 20 20 20    = '\0'; .       
02c0  20 20 20 20 20 20 20 62  72 65 61 6b 3b 0a 20 20          b reak;.  
02d0  20 20 20 20 20 20 20 20  20 20 7d 0a 20 20 20 20              }.    
02e0  20 20 20 20 20 20 2b 2b  5f 5f 63 70 3b 0a 20 20         ++ __cp;.  
02f0  20 20 20 20 20 20 7d 0a  20 20 20 20 20 20 2a 5f         }.       *_
0300  5f 73 20 3d 20 5f 5f 63  70 3b 0a 20 20 20 20 7d   _s = __c p;.    }
0310  0a 20 20 72 65 74 75 72  6e 20 5f 5f 72 65 74 76   .  retur n __retv
0320  61 6c 3b 0a 7d 0a 0a 65  78 74 65 72 6e 20 5f 5f   al;.}..e xtern __
0330  69 6e 6c 69 6e 65 20 63  68 61 72 20 2a 5f 5f 73   inline c har *__s
0340  74 72 73 65 70 5f 33 63  20 28 63 68 61 72 20 2a   trsep_3c  (char *
0350  2a 5f 5f 73 2c 20 63 68  61 72 20 5f 5f 72 65 6a   *__s, ch ar __rej
0360  65 63 74 31 2c 20 63 68  61 72 20 5f 5f 72 65 6a   ect1, ch ar __rej
0370  65 63 74 32 2c 0a 20 20  20 20 20 20 20 20 20 20   ect2,.           
0380  20 20 20 20 20 20 20 20  20 20 20 20 20 20 20 20                    
0390  20 20 20 20 20 20 20 20  20 63 68 61 72 20 5f 5f             char __
03a0  72 65 6a 65 63 74 33 29  3b 0a 65 78 74 65 72 6e   reject3) ;.extern
03b0  20 5f 5f 69 6e 6c 69 6e  65 20 63 68 61 72 20 2a    __inlin e char *
03c0  0a 5f 5f 73 74 72 73 65  70 5f 33 63 20 28 63 68   .__strse p_3c (ch
03d0  61 72 20 2a 2a 5f 5f 73  2c 20 63 68 61 72 20 5f   ar **__s , char _
03e0  5f 72 65 6a 65 63 74 31  2c 20 63 68 61 72 20 5f   _reject1 , char _
03f0  5f 72 65 6a 65 63 74 32  2c 20 63 68 61 72 20 5f   _reject2 , char _
0400  5f 72 65 6a 65 63 74 33  29 0a 7b 0a 20 20 72 65   _reject3 ).{.  re
0410  67 69 73 74 65 72 20 63  68 61 72 20 2a 5f 5f 72   gister c har *__r
0420  65 74 76 61 6c 20 3d 20  2a 5f 5f 73 3b 0a 20 20   etval =  *__s;.  
0430  69 66 20 28 5f 5f 72 65  74 76 61 6c 20 21 3d 20   if (__re tval != 
0440  28 28 76 6f 69 64 20 2a  29 30 29 29 0a 20 20 20   ((void * )0)).   
0450  20 7b 0a 20 20 20 20 20  20 72 65 67 69 73 74 65    {.       registe
0460  72 20 63 68 61 72 20 2a  5f 5f 63 70 20 3d 20 5f   r char * __cp = _
0470  5f 72 65 74 76 61 6c 3b  0a 20 20 20 20 20 20 77   _retval; .      w
0480  68 69 6c 65 20 28 31 29  0a 20 20 20 20 20 20 20   hile (1) .       
0490  20 7b 0a 20 20 20 20 20  20 20 20 20 20 69 66 20    {.           if 
04a0  28 2a 5f 5f 63 70 20 3d  3d 20 27 5c 30 27 29 0a   (*__cp = = '\0').
04b0  20 20 20 20 20 20 20 20  20 20 20 20 7b 0a 20 20                {.  
04c0  20 20 20 20 20 20 20 20  20 20 20 20 5f 5f 63 70                __cp
04d0  20 3d 20 28 28 76 6f 69  64 20 2a 29 30 29 3b 0a    = ((voi d *)0);.
04e0  20 20 20 20 20 20 20 20  20 20 62 72 65 61 6b 3b              break;
04f0  0a 20 20 20 20 20 20 20  20 20 20 20 20 7d 0a 20   .             }. 
0500  20 20 20 20 20 20 20 20  20 69 66 20 28 2a 5f 5f             if (*__
0510  63 70 20 3d 3d 20 5f 5f  72 65 6a 65 63 74 31 20   cp == __ reject1 
0520  7c 7c 20 2a 5f 5f 63 70  20 3d 3d 20 5f 5f 72 65   || *__cp  == __re
0530  6a 65 63 74 32 20 7c 7c  20 2a 5f 5f 63 70 20 3d   ject2 ||  *__cp =
0540  3d 20 5f 5f 72 65 6a 65  63 74 33 29 0a 20 20 20   = __reje ct3).   
0550  20 20 20 20 20 20 20 20  20 7b 0a 20 20 20 20 20             {.     
0560  20 20 20 20 20 20 20 20  20 2a 5f 5f 63 70 2b 2b             *__cp++
0570  20 3d 20 27 5c 30 27 3b  0a 20 20 20 20 20 20 20    = '\0'; .       
0580  20 20 20 20 20 20 20 62  72 65 61 6b 3b 0a 20 20          b reak;.  
0590  20 20 20 20 20 20 20 20  20 20 7d 0a 20 20 20 20              }.    
05a0  20 20 20 20 20 20 2b 2b  5f 5f 63 70 3b 0a 20 20         ++ __cp;.  
05b0  20 20 20 20 20 20 7d 0a  20 20 20 20 20 20 2a 5f         }.       *_
05c0  5f 73 20 3d 20 5f 5f 63  70 3b 0a 20 20 20 20 7d   _s = __c p;.    }
05d0  0a 20 20 72 65 74 75 72  6e 20 5f 5f 72 65 74 76   .  retur n __retv
05e0  61 6c 3b 0a 7d 0a 23 20  31 32                     al;.}.#  12      


This shows an 8-byte difference: strace reports 8x \00 where tcpdump shows 8x \20.
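
For anyone who wants to repeat this comparison without reading the dumps by eye, here is a minimal sketch. It assumes both captures have already been reduced to raw byte strings; the helper name `diff_runs` and the sample data are made up for illustration and are not the actual capture.

```python
def diff_runs(expected: bytes, actual: bytes):
    """Return (offset, expected_bytes, actual_bytes) for each run of differing bytes."""
    runs = []
    start = None
    limit = min(len(expected), len(actual))
    for i in range(limit):
        if expected[i] != actual[i]:
            if start is None:
                start = i  # a new run of mismatches begins here
        elif start is not None:
            runs.append((start, expected[start:i], actual[start:i]))
            start = None
    if start is not None:
        # the mismatch run extends to the end of the shorter input
        runs.append((start, expected[start:limit], actual[start:limit]))
    return runs


# Hypothetical sample: eight spaces on the wire read back as eight NULs.
wire = b"    {\n" + b" " * 8 + b"break;\n"
read_buf = b"    {\n" + b"\x00" * 8 + b"break;\n"

for offset, want, got in diff_runs(wire, read_buf):
    print(f"offset 0x{offset:04x}: wire={want.hex()} read={got.hex()}")
```

Only the overlapping length of the two inputs is compared, so a truncated capture will not raise an error.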

I'll have a stab at trying to recreate these results with a newer kernel later today.
Comment 23 Martin Pool 2004-01-06 16:28:22 UTC
Thanks for the additional info.

That log message refers to fd4 being used for the lock file in the client, but the fd assignments are of course different in distccd.  So I assume you do in fact have the network stream in that strace line.

It certainly does look like some 0x20 (space) characters in the network stream are being transmuted into 0x00 when read by distccd.  That is pretty weird.  Assuming the traces are correct, I think you have a kernel bug on the server (athlonia?), or perhaps a very weird hardware bug.

Is it still running 2.4.22-gentoo-r1?

Your first step should be to try kernel.org 2.4.24 or the Gentoo .24.  Try to work out if it's only present in the Gentoo deltas; if so cc the gentoo kernel maintainer.
Comment 24 Kenneth Rawlings 2004-01-07 20:58:18 UTC
I am having similar problems.  Both machines I am getting distcc up and running on have the exact same motherboard, memory, cpu, CFLAGS, distcc ver, gcc ver.  The CFLAGS are the default CFLAGS that come on the athlon live cd.

software used:

gentoo-sources-2.4.22-r3
gcc-3.2.3-r3
distcc-2.11.1

I have this problem whether I am compiling a C++ or C ebuild.  I can set distcc to just use localhost and I still have problems.  The second I take "distcc" out of FEATURES in /etc/make.conf all my problems go away (well, except for the slow compiles :-) ).

Both machines have been rock solid for months and have *never* had an emerge failure.  I'm not using ~x86 either.

If you need any more information I can try to accommodate your requests.

I really would like to see this working.  I am going to try an older distcc version to see if it fixes anything.
Comment 25 Martin Pool 2004-01-08 16:18:55 UTC
Kenneth, thanks for your report.

What kernel are you using?  

When you say "I can set distcc to just use localhost and I still have problems", do you really mean setting it to "localhost", or setting it to 127.0.0.1?  If the former then it's almost certainly a different problem, since distcc just invokes the compiler directly, and the network protocol is not involved.  

Since this problem seems to only be occurring on Gentoo it seems likely that it is a Gentoo kernel problem.  If you want to try something else the most useful thing would be a kernel.org kernel.

Comment 26 Kenneth Rawlings 2004-01-08 20:57:08 UTC
I am using gentoo-sources-2.4.22-r3 for my kernel.  I have tried (because of your input) both localhost and 127.0.0.1.  The problem occurs with either setting.  I also tried distcc-2.9, without success.  I am going to try a couple of different vanilla kernels (vanilla-sources-2.4.20 and vanilla-sources-2.4.24) and see if there is any success.  I must say I'm a bit skeptical about the kernel being the issue, as I have had no file/network IO corruption in other applications.  I am ready to be surprised though :-P

I would also like to add that despite the errors being mostly random, one of the packages I was using to test out a distcc emerge with would continually break in the same place (alas, I can't remember the package, but I will add a note if I can remember it).
Comment 27 Martin Pool 2004-01-08 21:17:19 UTC
If you can really reproduce this with DISTCC_HOSTS=localhost, then please post a verbose client log of such a failure, plus the compiler error messages.
Comment 28 jan kuipers 2004-01-09 14:08:03 UTC
I'm not having any problems with vanilla linux-2.4.25-pre4 or the gentoo-dev-sources-2.6.0 sources, so my bet is that (at least in my instance) something in the 2.4.22 Gentoo patches is triggering this.  

Thanks Martin for pushing me in the right direction, I had several hours of fun trying out new software =) 
Comment 29 Lisa Seelye (RETIRED) gentoo-dev 2004-01-09 14:13:25 UTC
Can you kernel guys point us in the right place?
Comment 30 Brian Jackson (RETIRED) gentoo-dev 2004-01-09 14:29:19 UTC
What NIC/drivers are being used for the network between hosts? Preferably info about both ends.
Comment 31 Martin Pool 2004-01-09 20:40:29 UTC
For the information of the kernel people:

As you can see earlier in the report, the data seems to be leaving the client correctly.  The problem is somewhere on the read end.

The code here is pretty straightforward: just a big old read(2) loop to pull the data into a memory buffer then write it to disk.  The two main unusual factors are that distcc uses TCP_CORK to form big packets, and that it does very fast network IO compared to other programs.  (It found a bug in 2.5 that only network benchmark programs could reproduce because it flooded so much traffic through.)
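
As a rough illustration of the receive pattern described above (read everything into a memory buffer, then write it to disk), here is a minimal sketch in Python. It is not distcc's actual code: the function names, the 64 KiB chunk size and the file name are assumptions, and it uses a local socket pair instead of a real TCP connection, so TCP_CORK is not involved.

```python
import os
import socket
import tempfile
import threading


def receive_to_file(conn: socket.socket, nbytes: int, path: str) -> bytes:
    """Read exactly nbytes from conn into memory, then write them to path."""
    buf = bytearray()
    while len(buf) < nbytes:
        # recv() may return fewer bytes than requested, so keep looping
        chunk = conn.recv(min(65536, nbytes - len(buf)))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf += chunk
    with open(path, "wb") as f:
        f.write(buf)
    return bytes(buf)


def demo() -> bool:
    """Send a payload over a socket pair and check it arrives unchanged."""
    payload = b"  register char *__retval = *__s;\n" * 1000
    sender, receiver = socket.socketpair()
    # send from a second thread so a full socket buffer cannot deadlock us
    t = threading.Thread(target=sender.sendall, args=(payload,))
    t.start()
    try:
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "received.i")
            got = receive_to_file(receiver, len(payload), path)
            with open(path, "rb") as f:
                on_disk = f.read()
    finally:
        t.join()
        sender.close()
        receiver.close()
    return got == payload and on_disk == payload


if __name__ == "__main__":
    print("payload intact:", demo())
```

On a healthy kernel the bytes returned by the read loop match what was sent; in this bug they apparently did not, even though nothing in user space touches the data between the read and the write.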

The problem seems to be that the data is corrupt before it is returned from read.  However, tcpdump seems to see the data on the receiver correctly.  So perhaps it's somewhere in the TCP stack after the network driver.
Comment 32 Kenneth Rawlings 2004-01-10 10:36:23 UTC
Just as further confirmation, I am now using vanilla 2.4.24 and everything is working great.
Comment 33 jan kuipers 2004-01-11 11:59:08 UTC
Using gentoo-sources-2.4.22-r4 I'm able to recreate the faulty behaviour.
Using vanilla-2.4.22 it works flawlessly.

Server and client use the same NIC: Ethernet controller: 3Com Corporation 3c905 100BaseTX [Boomerang] (rev 0), using the vortex drivers.




Comment 34 jan kuipers 2004-01-12 10:54:52 UTC
Using vanilla-2.4.22 + the 036_fast-csum patch I'm able to recreate the faulty behaviour.

Seeing as this patch 'zeroes' out stuff (according to the comments anyway), I assume there's an off-by-one error there at times, but as it's been 16 years since I did assembler on an m68k, who knows =) 

The pertinent part of dmesg: 

Measuring network checksumming speed
   basic     :   313.600 MB/sec
   simple    :   236.800 MB/sec
   3Dnow!    :   659.200 MB/sec
   AMD-MMX   :   659.200 MB/sec
func SSE1+ skipped: not supported by CPU
csum: using csum function: 3Dnow!
   basic     :   230.400 MB/sec
   simple    :   211.200 MB/sec
   AMD-MMX   :   268.800 MB/sec
func SSE1+ skipped: not supported by CPU
func SSE1 skipped: not supported by CPU
csum: using csum_copy function: AMD-MMX
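
For readers unfamiliar with what a "csum_copy" routine does: it copies packet data while computing the Internet checksum (16-bit ones' complement sum, as in RFC 1071) in the same pass. A bug in the copy half could therefore corrupt the data handed to read() even though the packet looked fine to tcpdump. The sketch below is a plain Python reference version written under that assumption; it is not the patch's code, and the name `csum_copy` is only borrowed from the dmesg output above.

```python
def csum_copy(src: bytes, dst: bytearray) -> int:
    """Copy src into dst and return the folded 16-bit ones' complement sum of src.

    dst must be at least len(src) bytes long.
    """
    total = 0
    n = len(src)
    # process the data one 16-bit big-endian word at a time
    for i in range(0, n - 1, 2):
        dst[i] = src[i]
        dst[i + 1] = src[i + 1]
        total += (src[i] << 8) | src[i + 1]
    if n % 2:
        # odd trailing byte: copy it and treat it as padded with a zero byte
        dst[n - 1] = src[n - 1]
        total += src[n - 1] << 8
    # fold the carries back into the low 16 bits
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


# Sample bytes from the worked example in RFC 1071.
src = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
dst = bytearray(len(src))
print(hex(csum_copy(src, dst)), bytes(dst) == src)
```

An optimised variant has to handle the same head, body and tail cases with wide registers, which is where an off-by-one in the copy could plausibly creep in.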

Comment 35 jan kuipers 2004-01-12 10:59:01 UTC
athlonia root # cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 1
model name      : AMD-K7(tm) Processor
stepping        : 2
cpu MHz         : 704.964
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat mmx syscall mmxext 3dnowext 3dnow
bogomips        : 1405.74
Comment 36 jan kuipers 2004-01-14 04:14:49 UTC
gentoo-sources-2.4.22-r4 minus the 036_fast-csum patch does not show the faulty behaviour. 
Comment 37 Brian Jackson (RETIRED) gentoo-dev 2004-01-14 08:49:59 UTC
Thanks for testing that out for us. I'm preparing to release a -r5; it'll be fixed.
Comment 38 Lisa Seelye (RETIRED) gentoo-dev 2004-01-14 09:02:56 UTC
kernel bug.

you guys can have it and close it. :)
Comment 39 Brian Jackson (RETIRED) gentoo-dev 2004-01-14 09:41:47 UTC
-r5 (just added to CVS, so give it a few minutes) is fixed; update when it shows up.