AMD Athlon64 3000+, Newcastle core
NForce4 Ultra board
2 x 512MB KingstoneVR PC3200 DDR
4 x Seagate SATA drives

dctc (0.85.9) hangs with 100% "system" CPU usage on my server. It had been running without a problem for 24.5 days (the uptime of the server). It was started by cron and killed by cron four times a day, and everything worked fine. After 24.5 days of server uptime it started to use 100% of CPU time, failed to connect to the local HUB, and even failed to start the dctc_master process (which usually starts shortly after dctc).

Endless killing and restarting, deleting the ~/.dctc directory, running as a different user, running as root, and re-emerging made no difference. I've tried compiling older versions of dctc (0.85.6 and 0.83.8) manually, but they gave the same result.

Here are the details:
http://forums.gentoo.org/viewtopic-t-342985.html

I've run "strace dctc..." and within a few seconds got a 200MB log file filled with:

semget(1142054670, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054671, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054672, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054673, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054674, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054675, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054676, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054677, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054678, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054679, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054680, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054681, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054682, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054683, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054684, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)
semget(1142054685, 10, IPC_CREAT|IPC_EXCL|0600) = -1 ENOSPC (No space left on device)

So it seems dctc can't get a semaphore... I don't know anything more. All questions and suggestions are welcome!

Portage 2.0.51.19 (default-linux/amd64/2005.0, gcc-3.4.3-20050110, glibc-2.3.4.20041102-r1, 2.6.11-gentoo-r6 x86_64)
=================================================================
System uname: 2.6.11-gentoo-r6 x86_64 AMD Athlon(tm) 64 Processor 3000+
Gentoo Base System version 1.4.16
Python: dev-lang/python-2.3.5 [2.3.5 (#1, May 12 2005, 02:04:49)]
distcc 2.18.3 x86_64-pc-linux-gnu (protocols 1 and 2) (default port 3632) [disabled]
dev-lang/python: 2.3.5
sys-apps/sandbox: [Not Present]
sys-devel/autoconf: 2.59-r6, 2.13
sys-devel/automake: 1.7.9-r1, 1.8.5-r3, 1.5, 1.4_p6, 1.6.3, 1.9.5
sys-devel/binutils: 2.15.92.0.2-r10
sys-devel/libtool: 1.5.16
virtual/os-headers: 2.6.8.1-r4
ACCEPT_KEYWORDS="amd64"
AUTOCLEAN="yes"
CFLAGS="-O2 -march=athlon64 -pipe -frename-registers -fweb -fomit-frame-pointer"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3/share/config /usr/share/config /var/bind /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-O2 -pipe"
DISTDIR="/users/tnt/distfiles"
FEATURES="autoaddcvs autoconfig ccache distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://mirror.etf.bg.ac.yu/gentoo/ http://ftp-stud.fht-esslingen.de/pub/Mirrors/gentoo/ http://gd.tuwien.ac.at/opsys/linux/gentoo/"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="amd64 acpi apache2 berkdb bitmap-fonts crypt cups curl encode exif extensions font-server fortran gd gif gpm imagemagick imap jabber jp2 jpeg libwww logrotate lzw lzw-tiff maildir mp3 mpeg mysql ncurses nls nptl nptlonly oggvorbis pam pam-mysql perl php png python readline rrdtool samba sasl slang snmp ssl tcpd tiff truetype truetype-fonts type1-fonts unicode usb userlocales wmf xml2 xpm xrandr zlib userland_GNU kernel_linux elibc_glibc"
Unset: ASFLAGS, CBUILD, CTARGET, LANG, LC_ALL, LDFLAGS, LINGUAS
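A 200MB strace log like the one in this report is easier to read once it is reduced to one line per failing call. The following is only a sketch: `strace_error_tally` is a hypothetical helper name, and it assumes the log was written with `strace -o` without `-f`, so each line starts with the syscall name rather than a PID.

```shell
# Hypothetical helper: summarise a strace log as "count syscall errno".
# Reads the log on stdin; lines without a "= -1 ERRNO" result are ignored.
# Assumes each line starts with the syscall name (no PID prefix from -f).
strace_error_tally() {
    awk '
        / = -1 E[A-Z0-9]+/ {
            call = $0
            sub(/\(.*/, "", call)           # keep only the text before the first "("
            match($0, / = -1 E[A-Z0-9]+/)   # locate the failing result
            err = substr($0, RSTART + 6, RLENGTH - 6)   # skip the " = -1 " prefix
            count[call " " err]++
        }
        END {
            for (k in count) print count[k], k
        }
    ' | sort -rn
}
```

Usage would be along the lines of `strace_error_tally < /tmp/strace.log`; for a log consisting only of the semget lines above, the whole file collapses to a single `semget ENOSPC` line with its count.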
Uhm...

>(No space left on device)

obviously suggests that you are out of disk space...
(In reply to comment #1)
> Uhm...
>
> >(No space left on device)
>
> obviously suggests that you are out of disk space...

Partition with the least of space available has 409MB of free space:

titan root # df -h
Filesystem   Size  Used  Avail  Use%  Mounted on
/dev/sda3    4.6G  3.2G  1.5G    68%  /
/dev/sda2    4.2G  507M  3.7G    12%  /var/cache/squid/cache00
/dev/sdb2    4.2G  506M  3.7G    12%  /var/cache/squid/cache01
/dev/sdc2    4.2G  507M  3.7G    12%  /var/cache/squid/cache02
/dev/sdd2    4.2G  507M  3.7G    12%  /var/cache/squid/cache03
/dev/sda6     20G   20G  598M    98%  /users/dooki
/dev/sda7     60G   60G  409M   100%  /users/mica
/dev/sda8     99G   35G   64G    36%  /users/mare
/dev/sda9     27G   33M   27G     1%  /users/jeja
/dev/sda10   9.4G  6.9G  2.5G    74%  /users/ilija
/dev/sda11   9.4G  735M  8.6G     8%  /users/ana
/dev/sdb5     37G   19G   18G    53%  /users/sija
/dev/sdb6     88G   37G   51G    42%  /users/boki
/dev/sdb7    105G   52G   54G    50%  /users/zola
/dev/sdc5     60G   57G  2.5G    96%  /users/marko
/dev/sdc6    4.7G  622M  4.1G    14%  /users/transfer
/dev/sdc7     20G   15G  5.0G    76%  /users/vlada
/dev/sdc8    138G   49G   90G    36%  /users/dome
/dev/sdc9    6.7G   33M  6.6G     1%  /users/dusan
/dev/sdd5     90G   77G   14G    86%  /users/tnt
/dev/sdd6     93G   82G   12G    88%  /users/peleizoki
/dev/sdd7     20G  8.3G   12G    42%  /users/nesha
/dev/sdd8    6.7G  1.4G  5.3G    22%  /users/gergana
none         502M     0  502M     0%  /dev/shm
none         1.5G  4.9M  1.5G     1%  /tmp

And I have the same problem even if I share only two JPEGs on the /tmp partition, and EVEN if I don't share anything (in both cases the /users/* partitions shouldn't be touched).

I've tried starting just:

strace -o /tmp/strace.log dctc -g 10.0.1.33

without any additional arguments and still got the same bug.
(In reply to comment #2)
> Partition with the least of space available has 409MB of free space:

... which is probably reserved for root. Please make some decent space on all partitions where this P2P thing saves data and try again. And don't run this as root, duh!
1. There's reiserfs on all partitions except /boot, so there shouldn't be any space reserved for root.
2. I get the same problem as root, and IF there was space reserved for root, it should be usable without problems when dctc is run by root.
3. My server just serves files. It doesn't download anything through dctc, and it doesn't need disk space for saving downloads (at least not more than 400MB). Files for sharing are copied via samba from window$ boxes.

P.S. Thanks for the advice. I normally run dctc as a non-privileged user, but in this situation I've tried everything to locate the problem, so I ran dctc as root just to see if the problem is permission-related. Unfortunately, I got the same hang-up. :(
UPDATE:

It seems that the problem is semaphore-related (thanks to widan):

>ENOSPC for semget has nothing to do with full disks. From semget's man page:
>Code:
>ENOSPC A semaphore set has to be created but the system limit for
>       the maximum number of semaphore sets (SEMMNI), or the system
>       wide maximum number of semaphores (SEMMNS), would be
>       exceeded.
>
>Could you run those:
>Code:
>ipcs -ls
>ipcs
>Maybe some program (maybe dctc, maybe some other one) created semaphores but
>never deleted them, and the kernel table slowly filled up over time. If ipcs
>returns a very long list under the title "semaphore arrays", there are probably
>too many semaphore sets.

And here we are! User 'titan' is used only for starting dctc, and he is the owner of most of the semaphores:

Code:
titan / # ipcs -ls

------ Semaphore Limits --------
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767

titan / # ipcs

------ Shared Memory Segments --------
key        shmid      owner      perms      bytes      nattch     status
0x10feed01 0          root       644        25656      4

------ Semaphore Arrays --------
key        semid      owner      perms      nsems
0x00feed00 32768      root       644        2
0x00000000 3637249    apache     600        1
0x00000000 3670018    apache     600        1
0x6eab5d4e 131075     titan      600        10
0x73692b8e 163844     titan      600        10
0x096d3adf 196613     titan      600        10
0x6c3a7223 229382     titan      600        10
0x0be0cd95 262151     titan      600        10
0x3735deb4 294920     titan      600        10
0x4a5dbae7 327689     titan      600        10
0x0dd1f3bb 360458     titan      600        10
0x22d456f7 393227     titan      600        10
0x38bcb84b 425996     titan      600        10
0x60010ebe 458765     titan      600        10
0x0b86becb 491534     titan      600        10
0x60e931d2 524303     titan      600        10
0x36a3a232 557072     titan      600        10
0x0b91d49e 589841     titan      600        10
0x6117c0a8 622610     titan      600        10
0x0c22aa5c 655379     titan      600        10
0x0fe20bf4 688148     titan      600        10
0x3785fe61 720917     titan      600        10
0x3b025a35 753686     titan      600        10
0x6277846a 786455     titan      600        10
0x0ceacca9 819224     titan      600        10
0x22844cd2 851993     titan      600        10
0x77c84297 884762     titan      600        10
0x3b906f7d 917531     titan      600        10
0x1100f450 950300     titan      600        10
0x4dc9a6c6 983069     titan      600        10
0x23b467bf 1015838    titan      600        10
0x6c333e51 1048607    titan      600        10
0x66c59b8c 1081376    titan      600        10
0x4e553ab6 1114145    titan      600        10
0x23810e17 1146914    titan      600        10
0x0f02f590 1179683    titan      600        10
0x64a2fddd 1212452    titan      600        10
0x39998e77 1245221    titan      600        10
0x4f2c2bfb 1277990    titan      600        10
0x52d51f8d 1310759    titan      600        10
0x4f92c7f7 1343528    titan      600        10
0x256611c1 1376297    titan      600        10
0x3a5b6ab4 1409066    titan      600        10
0x3e78ecf5 1441835    titan      600        10
0x65ac2fa8 1474604    titan      600        10
0x50a678d2 1507373    titan      600        10
0x25a76d59 1540142    titan      600        10
0x7b9fe085 1572911    titan      600        10
0x50c3913c 1605680    titan      600        10
0x51e54d9f 1638449    titan      600        10
0x54d9ea05 1671218    titan      600        10
0x3c7ef9fa 1703987    titan      600        10
0x528cb509 1736756    titan      600        10
0x27464464 1769525    titan      600        10
0x520d0f86 1802294    titan      600        10
0x67c62b84 1835063    titan      600        10
0x7d3d9eb9 1867832    titan      600        10
0x12f85944 1900601    titan      600        10
0x563ba1d0 1933370    titan      600        10
0x5361de35 1966139    titan      600        10
0x175f2692 1998908    titan      600        10
0x3eba763b 2031677    titan      600        10
0x2920dd9e 2064446    titan      600        10
0x5197c1bd 2097215    titan      600        10
0x2a181cf9 2129984    titan      600        10
0x3f41418d 2162753    titan      600        10
0x1548edbf 2195522    titan      600        10
0x2abb8b33 2228291    titan      600        10
0x14f38722 2261060    titan      600        10
0x2a97c547 2293829    titan      600        10
0x4028d5bc 2326598    titan      600        10
0x03ab7869 2359367    titan      600        10
0x6bef1878 2392136    titan      600        10
0x78dbad7a 2424905    titan      600        10
0x2bd4c0fe 2457674    titan      600        10
0x6ed33ac8 2490443    titan      600        10
0x164f267e 2523212    titan      600        10
0x6c167d48 2555981    titan      600        10
0x3d490107 2588750    titan      600        10
0x6c6b35df 2621519    titan      600        10
0x2feb048a 2654288    titan      600        10
0x56c0a560 2687057    titan      600        10
0x494f0af8 2719826    titan      600        10
0x25cacfb3 2752595    titan      600        10
0x22b3c15f 2785364    titan      600        10
0x37da8a3f 2818133    titan      600        10
0x0dfa2df1 2850902    titan      600        10
0x62cdfae5 2883671    titan      600        10
0x78d64081 2916440    titan      600        10
0x23f120c0 2949209    titan      600        10
0x78f822fd 2981978    titan      600        10
0x4e2f906e 3014747    titan      600        10
0x6382d3c6 3047516    titan      600        10
0x38d7c1be 3080285    titan      600        10
0x7dc60b23 3113054    titan      600        10
0x5293e65d 3145823    titan      600        10
0x7a429154 3178592    titan      600        10
0x0f451b1f 3211361    titan      600        10
0x64f9111d 3244130    titan      600        10
0x27b5e76f 3276899    titan      600        10
0x6557b205 3309668    titan      600        10
0x7a92b297 3342437    titan      600        10
0x10d89ce7 3375206    titan      600        10
0x13d50053 3407975    titan      600        10
0x3b4947eb 3539048    titan      600        10
0x65765f1d 3571817    titan      600        10
0x7ba28d61 3702890    titan      600        10
0x647cfccb 3735659    titan      600        10
0x549b7a16 3768428    titan      600        10
0x6a1c22e3 3801197    titan      600        10
0x552ed449 3833966    titan      600        10
0x3c6afe99 3866735    titan      600        10
0x4076b217 3899504    titan      600        10
0x15e70076 3932273    titan      600        10
0x7d6634c0 3965042    titan      600        10
0x682fc620 3997811    titan      600        10
0x6b64e8c7 4030580    titan      600        10
0x1383bc26 4063349    titan      600        10
0x56a5ea49 4096118    titan      600        10
0x7e73e2ff 4128887    titan      600        10
0x16d23b83 4161656    titan      600        10
0x2729074e 4194425    titan      600        10
0x41b68535 4227194    titan      600        10
0x293db85d 4259963    titan      600        10
0x6ce24989 4292732    titan      600        10
0x17bed7a0 4325501    titan      600        10
0x3fb77fab 4358270    titan      600        10
0x6a6294c3 4391039    titan      600        10

------ Message Queues --------
key        msqid      owner      perms      used-bytes   messages

titan / #

Please see details here:
http://forums.gentoo.org/viewtopic-t-342985.html
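A long `ipcs` listing like this one is easier to compare against the "max number of arrays" limit once it is reduced to a count per owner. This is a minimal sketch, not something taken from the thread: `sem_arrays_by_owner` is a hypothetical helper name, and it assumes the five-column `ipcs -s` layout shown above (key, semid, owner, perms, nsems).

```shell
# Hypothetical helper: count semaphore arrays per owner.
# Expects `ipcs -s` output on stdin (columns: key semid owner perms nsems).
# Header and separator lines are skipped because their first field is not a hex key.
sem_arrays_by_owner() {
    awk '$1 ~ /^0x[0-9a-fA-F]+$/ && NF >= 5 { n[$3]++ }
         END { for (o in n) print n[o], o }' | sort -rn
}
```

Run as `ipcs -s | sem_arrays_by_owner`, the owner with the most arrays comes out on the first line, and the sum of the counts can be checked against the array limit reported by `ipcs -ls`.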
by widan (great linux mag from France):

First, you need to delete the ones that exist, to free up the kernel table entries:

Code:
for i in $(ipcs -s | grep titan | cut -d ' ' -f 1); do ipcrm -S $i; done

As to why it leaks semaphores, it's "normal" if you kill dctc that way: programs killed with SIGKILL don't get a chance to clean up. This is usually not a problem, but it is for IPC objects (semaphores, shared mem space and message queues): they are not associated to a process, so the kernel can't reclaim them.

You could try to kill with SIGTERM:

Code:
killall -TERM dctc
killall -TERM dctc_master

It's a bit less aggressive, and dctc might handle it better. If it fails, you could use a script that does both kills and runs the code above to delete the IPCs.

So, I've deleted all semaphores owned by the 'titan' user and dctc started normally. I couldn't stop dctc and dctc_master by sending the TERM signal, so I had to write a stop script like this:

Code:
#!/bin/bash
killall -9 dctc 2>&1
killall -9 dctc_master 2>&1
sleep 2
for i in $(ipcs -s | grep titan | cut -d ' ' -f 1); do ipcrm -S $i; done
rm -f -r /home/titan/.dctc 2>&1

It works just fine for now, and I have no semaphores owned by 'titan' when I turn off dctc. User 'titan' is not used for anything except starting dctc, so I guess it will not be a problem to delete all his semaphores every time I kill dctc.

http://forums.gentoo.org/viewtopic-t-342985.html
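For comparison, here is a sketch of the "script that does both kills" idea. It is a variation under stated assumptions, not the script used on this server. It sends TERM first and only then KILL, matches the owner column exactly instead of grepping the whole line, and removes arrays by semid (`ipcrm -s`) rather than by key (`ipcrm -S`). The names `semids_of` and `stop_dctc`, the `OWNER` variable, and the 5-second grace period are all hypothetical.

```shell
# Assumed owner name, taken from the thread; adjust as needed.
OWNER=titan

# Print the semid (second column of `ipcs -s`) of every array owned by $1.
# Reads `ipcs -s` output on stdin; header lines are skipped.
semids_of() {
    awk -v owner="$1" '$1 ~ /^0x/ && $3 == owner { print $2 }'
}

# Ask politely first, then force, then remove whatever semaphores are left.
stop_dctc() {
    for name in dctc dctc_master; do
        pkill -TERM -x "$name"      # -x: match the process name exactly
    done
    sleep 5                         # assumed grace period for a clean exit
    for name in dctc dctc_master; do
        pkill -KILL -x "$name"      # no-op if the process already exited
    done
    ipcs -s | semids_of "$OWNER" | while read -r id; do
        ipcrm -s "$id"
    done
}
```

Calling `stop_dctc` from the cron job would take the place of the `killall -9` lines. Since dctc did not react to TERM on this machine, the KILL step and the semaphore cleanup would still be doing the real work here.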