| Summary: | MLdonkey 2.6.4 keeps crashing without feedback | ||
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | Master One <MasterOne> |
| Component: | Current packages | Assignee: | Gentoo net-p2p team <net-p2p> |
| Status: | RESOLVED FIXED | ||
| Severity: | critical | CC: | gentoo, ikelos, netz |
| Priority: | High | ||
| Version: | unspecified | ||
| Hardware: | x86 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Package list: | Runtime testing required: | --- | |
|
Description
Master One
2005-09-12 00:17:29 UTC
You've written that you lowered the nice level from 19 to 0 and the crashes became rare. Well, in fact you've increased mlnet's priority (-20 is the highest and 19 is the lowest). It may mean that mlnet dies when it does not get a sufficient amount of CPU time (which of course shouldn't happen, but...). Could you try turning off the CPU governor, so the machine always runs at its default 2.4 GHz, and let us know if that changes anything?

@Marcin Kryczek
It may be worth a try, but I don't really think that has something to do with it. The reason is that the machine stays at 300 MHz most of the time, and mlnet then consumes only about 10% CPU and 6% MEM. The ondemand trigger is set to 80%, and as soon as that value is reached, the CPU immediately goes up to 2.4 GHz, so there is nothing maxing out the CPU power at any given time. On the other hand, those crashes just seem to appear totally randomly. ATM the core shows an uptime of 3.5 hours, with the CPU frequency staying at 300 MHz and 15 BT downloads. After the next crash occurs, I will set the CPU governor to "performance" to see what happens then.

I _suppose_ I'm getting mldonkey crashes, too. I say 'suppose', since the error I'm experiencing is a system hang on shutdown while 'Stopping service mldonkey' and a leftover mlnet.pid. Could this be related to bug #103433? I'll try to witness such a crash; right now it's running and I can stop it without problems with /etc/init.d/mldonkey stop.

(In reply to comment #0)
> distcc 2.18.3 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632)
> MAKEOPTS="-j5"
I had some problems on Solaris with distcc, make -j5 and Ocaml applications. Try compiling Ocaml and MLDonkey without distcc and with make -j1. Maybe it helps.

Both the versions in portage and the precompiled cores from http://download.berlios.de/pub/mldonkey/spiralvoice/ crash. The 2.6.4 precompiled core logs
2005/09/13 22:53:39 [cF] Checksum computation failed: Exception: os_read failed: Input/output error
before dying.
@Daniel Vianna
That has to be another problem, because my issue does not result in any error
message.
In the meantime, I tried some different things:
- Recompiled ocaml 3.08.3 and mldonkey 2.6.4 with the following settings:
CFLAGS="-O1 -march=pentium4 -pipe -fomit-frame-pointer"
MAKEOPTS="-j1"
FEATURES="-ccache -distcc"
- Added the following system settings:
/etc/security/limits.conf
* soft nproc 4096
* hard nproc 16384
* soft nofile 4096
* hard nofile 65536
/etc/sysctl.conf
kernel.shmall = 2097152
kernel.shmmax = 2147483648
kernel.shmmni = 4096
kernel.sem = 250 32000 100 128
fs.file-max = 65536
I don't know if any of these measures helped, but it seems to be more stable
again. The current uptime of the core is one day; before that it was about 9
hours (then it crashed again after adding some new torrents).
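To check whether the nproc and nofile limits listed above are actually in effect for a process, something like the following Python sketch could be used. It is only an illustration, not part of the original report: the `EXPECTED` table simply copies the soft and hard values from the limits.conf entries above, and the helper names `meets` and `check_limits` are made up for this example. It reports the limits of the process that runs it, so it would have to be run as the same user (and from the same kind of session) that starts mlnet.

```python
import resource

# Soft/hard values copied from the limits.conf entries quoted in this comment.
EXPECTED = {
    "nproc": (resource.RLIMIT_NPROC, 4096, 16384),
    "nofile": (resource.RLIMIT_NOFILE, 4096, 65536),
}


def meets(actual, wanted):
    """True if the actual limit is unlimited or at least the wanted value."""
    return actual == resource.RLIM_INFINITY or actual >= wanted


def check_limits(expected=EXPECTED, getrlimit=resource.getrlimit):
    """Return {name: (soft_ok, hard_ok, soft, hard)} for each expected limit."""
    report = {}
    for name, (res, want_soft, want_hard) in expected.items():
        soft, hard = getrlimit(res)
        report[name] = (meets(soft, want_soft), meets(hard, want_hard), soft, hard)
    return report


if __name__ == "__main__":
    for name, (soft_ok, hard_ok, soft, hard) in check_limits().items():
        status = "ok" if soft_ok and hard_ok else "below the values above"
        print(f"{name}: soft={soft} hard={hard} ({status})")
```

The `getrlimit` parameter exists only so the check can be exercised with fake values; in normal use the default is enough.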
BTW Since the upgrade to 2.6.4, I (again) have the problem with those
phantom commits. When a file download is finished, committed and moved from the
incoming folder to the final destination, files with the same name and a size
of 0 KB keep showing up in the incoming folder. No idea what that's all
about...
BTW Since last month there is the new ocaml version 3.08.4, which seems to be a bugfix release. Any idea why that one is still not in portage? It may be an idea to re-emerge mldonkey with ocaml 3.08.4 installed.

I forgot to mention that I have set the cpufreq governor to "performance" since the last crash, so maybe all the other settings have no influence at all, and it was all about the P4 frequency throttling. I will do some more tests with the ondemand governor as soon as I find the time (I really would like to have that working; the ondemand governor works really well for all the other stuff, and why let that machine run at 2.4 GHz 24/7 if it can also operate at only 300 MHz when load is low).

I think the problem is solved: it was indeed the "ondemand" CPU governor! I have reversed the mentioned system changes, updated to ocaml 3.08.4 and mldonkey 2.6.4-r1 (both compiled with my systemwide standard settings), and switched to the "performance" CPU governor. Since then, mlnet has been running for days without a crash. Because I used the "ondemand" CPU governor for quite some time and it did not cause any problems at the beginning, I think that something changed with one of the last kernel upgrades. The only remaining problem now is that I still get phantom files with a size of 0 KB in the incoming folder after a commit. That's not really tragic, but nevertheless annoying.

*** Bug 111326 has been marked as a duplicate of this bug. ***

Reopen wrt Bug 111326.

It crashes as hell. It's impossible to use any mlnet >=2.6.5. Maybe they should be masked. 2.6.4-r2 works... well, fine. I don't use any cpufreq program.
Portage 2.0.53_rc7 (default-linux/x86/2005.0, gcc-3.4.4, glibc-2.3.5-r3, 2.6.14-gentoo i686)
=================================================================
System uname: 2.6.14-gentoo i686 AMD Athlon(TM) XP 1800+
Gentoo Base System version 1.12.0_pre9
ccache version 2.4 [enabled]
dev-lang/python: 2.4.2
sys-apps/sandbox: 1.2.13
sys-devel/autoconf: 2.13, 2.59-r7
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r1
sys-devel/binutils: 2.16.1
sys-devel/libtool: 1.5.20-r1
virtual/os-headers: 2.6.11-r2
ACCEPT_KEYWORDS="x86 ~x86"
AUTOCLEAN="yes"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-march=athlon-xp -mmmx -m3dnow -msse -mfpmath=sse,387 -ffast-math -O2 -fomit-frame-pointer -frename-registers -funroll-loops -pipe"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=athlon-xp -mmmx -m3dnow -msse -mfpmath=sse,387 -ffast-math -O2 -fomit-frame-pointer -frename-registers -funroll-loops -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig ccache distlocks sandbox sfperms strict"
GENTOO_MIRRORS="http://linuv.uv.es/mirror/gentoo/ http://www.caliu.info/pub/gentoo/"
LANG="es_ES.UTF-8"
LC_ALL="es_ES.UTF-8"
LDFLAGS="-Wl,-O1"
LINGUAS="es"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="x86 16bit 3dnow 3dnowext 7zip S3TC X a52 aac aalib acpi alsa apache2 audiofile bash-completion berkdb bidi bzip2 cairo cddb cdparanoia cdr chroot cjk clock-screen crypt cscope css cups curl dba dbus dlloader dts dvd dvdr dvdread dynagraph ecc edl eds emboss erandom exif faac faad fam fbcon ffmpeg flac font-server fontconfig foomaticdb foreign-sysvinit ftp gd gdbm gif gimpprint glibc-omitfp glitz gpm graphviz gs gtk2 hal hardened hpn icecast iconv idn imagemagick imlib imlib2 immqt-bc ipv6 irmc ithreads jabber java javascript jbig jce jikes jpeg jpeg2k justify kde kdeenablefinal lcms libcaca libg++ libwww linguas_es live lm_sensors logitech-mouse logrotate lzo lzw-tiff mad matroska md5sum mikmod mmap mmx mmxext mng monkey moznocompose moznoirc moznomail mozsvg mp3 mpeg mpeg4 mpi mplayer msn musepack musicbrainz mysql mysqli ncurses network nls no-old-linux no_wxgtk1 nomac nomalloccheck nomotif nptl nptlonly ogg oggvorbis openexr opengl pam pdflib perl pic png ppds python qt quicktime rdesktop readline rtc ruby sftplogging slp speex spell sse ssl stencil-buffer svg symlink tcpd tga theora threads tiff toolbar truetype truetype-fonts udev unicode urandom usb userlocales utf8 vcd vhosts vim-with-x visualization vorbis win32codecs wmf xine xml2 xpm xprint xrandr xscreensaver xv xvid yv12 zeroconf zip zlib userland_GNU kernel_linux elibc_glibc"
Unset: ASFLAGS, CTARGET, MAKEOPTS

CFLAGS="... -fomit-frame-pointer ..."
see bug #111626 for more details.

ok, it seems my old problem was caused by parallel shutdown in /etc/conf.d/rc, will search or submit this as another bug

marking as closed |
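For anyone retracing the governor diagnosis from this report, the active cpufreq governor can be read from sysfs. The Python sketch below is an illustration only and was not posted in the thread: it assumes the usual Linux location `/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor`, and the function names `read_governors` and `cpus_not_using` are hypothetical. On a machine without cpufreq support it simply finds nothing.

```python
from pathlib import Path

# Assumed sysfs location of the per-CPU cpufreq settings on Linux.
CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def read_governors(root=CPUFREQ_ROOT):
    """Map each CPU directory name (e.g. 'cpu0') to its current governor."""
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for gov_file in sorted(Path(root).glob(pattern)):
        cpu = gov_file.parent.parent.name
        governors[cpu] = gov_file.read_text().strip()
    return governors


def cpus_not_using(governor, root=CPUFREQ_ROOT):
    """List the CPUs whose current governor differs from the given one."""
    return [cpu for cpu, current in read_governors(root).items()
            if current != governor]


if __name__ == "__main__":
    found = read_governors()
    if not found:
        print("no cpufreq information found")
    for cpu, current in found.items():
        print(f"{cpu}: {current}")
```

Running it before and after switching from "ondemand" to "performance" would show whether the change took effect on every CPU, for example via `cpus_not_using("performance")`.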