It certainly didn't do this before _p11. Confirmed on multiple systems (three architectures, different kernels, most on stable profiles). The only step to reproduce I can think of is to keep top running all the time. The leak appears to grow as more processes are run, so a system that uses up many process IDs (running many cron jobs, for instance) will probably have top leak more memory. When I recycle a few dozen PIDs a second, a couple of kilobytes pile onto the resident memory size (RES, as reported by top itself).
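For anyone who wants to watch the growth without staring at top's own RES column, here is a minimal sketch that reads the resident set size of a given PID from /proc/<pid>/status. It assumes a Linux /proc and that the VmRSS field is a good enough stand-in for what top reports as RES; the function names are made up for this example and are not part of procps.

```c
#include <stdio.h>
#include <sys/types.h>

/* Parse one line of /proc/<pid>/status.
 * Returns the VmRSS value in kB, or -1 if this is not the VmRSS line.
 * (Hypothetical helper for illustration, not a procps function.) */
long parse_vm_rss_kb(const char *line)
{
    long kb;

    if (sscanf(line, "VmRSS: %ld", &kb) == 1)
        return kb;
    return -1;
}

/* Return the resident set size of `pid` in kB, or -1 if the status
 * file cannot be read or has no VmRSS line (e.g. kernel threads). */
long read_vm_rss_kb(pid_t pid)
{
    char path[64];
    char line[256];
    long kb = -1;
    FILE *fp;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    fp = fopen(path, "r");
    if (!fp)
        return -1;

    /* Stop at the first line that parses as VmRSS. */
    while (kb < 0 && fgets(line, sizeof line, fp))
        kb = parse_vm_rss_kb(line);

    fclose(fp);
    return kb;
}
```

Calling read_vm_rss_kb() on the PID of a long-running top every few minutes and logging the result should show whether the resident size keeps climbing as PIDs are recycled.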
can't reproduce here

one terminal:
$ top

another terminal:
$ while :; do (:); done

memory usage in top stays static
[1986981.196642] Out of memory: Kill process 2588 (top) score 490 or sacrifice child
(In reply to comment #1)
> can't reproduce here

So maybe that was not a good recipe.
(In reply to comment #2)
> [1986981.196642] Out of memory: Kill process 2588 (top) score 490 or sacrifice
> child

That was on an x86, top was killed after 22 days, apparently.

On an amd64 system:

  PID USER     PR  NI  SHR S     TIME+  RES %MEM SWAP %CPU COMMAND
 2959 root     20   0  608 R 156:06.11 1176  0.0  15m  0.2 top

Using 15 megabytes swappable is way too much, so it's clear where this is heading.

On an older x86 system:

  PID USER     GROUP    PR  NI  SHR S %MEM  RES     TIME+ SWAP %CPU COMMAND
12212 root     root     20   0  876 R  8.9  89m  22:49.59 9980  1.3 top

I estimate from the system's uptime that this top has been running for 21 days and amassed 89 megabytes in resident memory.
(In reply to comment #3) i'm not going to just try random crap until i happen to trip over something. produce an actual example or debug it yourself.
==20718==
==20718== HEAP SUMMARY:
==20718==     in use at exit: 18,832,631 bytes in 493,861 blocks
==20718==   total heap usage: 2,072,387 allocs, 1,578,526 frees, 1,213,961,896 bytes allocated
==20718==
==20718== LEAK SUMMARY:
==20718==    definitely lost: 18,743,112 bytes in 493,639 blocks
==20718==    indirectly lost: 240 bytes in 20 blocks
==20718==      possibly lost: 64 bytes in 2 blocks
==20718==    still reachable: 89,215 bytes in 200 blocks
==20718==         suppressed: 0 bytes in 0 blocks
==20718== Rerun with --leak-check=full to see details of leaked memory
==20718==
==20718== For counts of detected and suppressed errors, rerun with: -v
==20718== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 7 from 7)
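A quick sanity check on the numbers in this summary, as a small sketch (the helper names are invented for this comment): the block count in use at exit should equal allocs minus frees, and dividing the "definitely lost" bytes by the block count gives the average size of a leaked block. With the figures above that average is roughly 38 bytes, which would point at a very large number of small allocations rather than a few big ones.

```c
#include <stdio.h>

/* Blocks still allocated at exit = allocations that were never freed. */
long live_blocks(long allocs, long frees)
{
    return allocs - frees;
}

/* Mean size in bytes of a block; 0.0 if there are no blocks. */
double mean_block_bytes(long bytes, long blocks)
{
    return blocks > 0 ? (double)bytes / (double)blocks : 0.0;
}

/* Print both figures for the valgrind summary quoted in this comment. */
void print_leak_stats(void)
{
    printf("blocks in use at exit: %ld\n",
           live_blocks(2072387L, 1578526L));
    printf("mean definitely-lost block: %.1f bytes\n",
           mean_block_bytes(18743112L, 493639L));
}
```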
Created attachment 298501 [details] valgrind log
Created attachment 298521 [details] valgrind log with debug CFLAGS for procps
(In reply to comment #5)
> (In reply to comment #3)
>
> i'm not going to just try random crap until i happen to trip over something.
> produce an actual example or debug it yourself.

I'll just ignore that and reopen, right?
Tue Jan 10 20:15:01 CET 2012
Portage 2.1.10.44 (default/linux/x86/10.0, gcc-4.5.3, glibc-2.13-r4, 3.0.6-gentoo-JeR i686)
=================================================================
System Settings
=================================================================
System uname: Linux-3.0.6-gentoo-JeR-i686-Intel-R-_Pentium-R-_4_CPU_2.60GHz-with-gentoo-2.0.3
Timestamp of tree: Tue, 10 Jan 2012 16:15:01 +0000
distcc 3.1 i686-pc-linux-gnu [disabled]
app-shells/bash: 4.1_p9
dev-java/java-config: 2.1.11-r3
dev-lang/python: 2.7.2-r3
dev-util/pkgconfig: 0.26
sys-apps/baselayout: 2.0.3
sys-apps/openrc: 0.9.4
sys-apps/sandbox: 2.5
sys-devel/autoconf: 2.68
sys-devel/automake: 1.11.1
sys-devel/binutils: 2.21.1-r1
sys-devel/gcc: 4.4.5, 4.5.3-r1
sys-devel/gcc-config: 1.4.1-r1
sys-devel/libtool: 2.4-r1
sys-devel/make: 3.82-r1
sys-kernel/linux-headers: 2.6.39 (virtual/os-headers)
sys-libs/glibc: 2.13-r4
Repositories: gentoo JeR
ACCEPT_KEYWORDS="x86"
ACCEPT_LICENSE="*"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=pentium4 --param l1-cache-size=8 --param l1-cache-line-size=64 --param l2-cache-size=512 -mtune=pentium4 -pipe -Wall"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=pentium4 --param l1-cache-size=8 --param l1-cache-line-size=64 --param l2-cache-size=512 -mtune=pentium4 -pipe -Wall"
DISTDIR="/world/distfiles"
EMERGE_DEFAULT_OPTS="--quiet-build=n"
FEATURES="assume-digests binpkg-logs distlocks ebuild-locks fixlafiles news parallel-fetch protect-owned sandbox sfperms splitdebug strict test-fail-continue unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS=""
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LC_ALL="en_US.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j4"
PKGDIR="/world/gentoo/packages/bastiaan"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/world/gentoo/portage"
PORTDIR_OVERLAY="/keeps/gentoo/local"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="bash-completion berkdb bzip2 cli cracklib crypt cups cxx dri fortran gdbm gpm iconv ipv6 modules mtrr mudflap multislot ncurses nls nptl nptlonly openmp pam pcre pppd readline session sse sse2 ssl sysfs tcpd unicode x86 xorg zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" PHP_TARGETS="php5-3" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga neomagic nouveau nv r128 radeon savage sis tdfx trident vesa via vmware dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset: CPPFLAGS, CTARGET, INSTALL_MASK, LANG, LINGUAS, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

=================================================================
Package Settings
=================================================================

sys-process/procps-3.2.8_p11 was built with the following:
USE="unicode"
CFLAGS="-O0 -ggdb -pipe -Wall"
(In reply to comment #9) quit your childish antics
(In reply to comment #11)
> quit your childish antics

Quid pro quo? :)
(In reply to comment #12) no, it's not. if you can't file/follow up bugs without acting like a child, then find someone capable of filing bugs for you.
Created attachment 298865 [details] top.valgrind-debug-2.log
do you have a /etc/toprc or ~/.toprc ?
Created attachment 298929 [details] .toprc from the worst affected system
i think there's at least partial leakage here:

--- a/top.c
+++ b/top.c
@@ -1154,7 +1154,7 @@
             prochlp(ttsk);
             ++curmax;
          }
-         free(ptsk);  // readproc() proc_t not used
+         freeproc(ptsk);  // readproc() proc_t not used
       }
    }
@@ -1181,7 +1181,7 @@
             prochlp(ttsk);
             table[curmax++] = ttsk;
          }
-         free(ptsk);  // readproc() proc_t not used
+         freeproc(ptsk);  // readproc() proc_t not used
       }
    }
 }
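To illustrate the general idea behind swapping free() for a dedicated release function, here is a small self-contained sketch. It is not procps code: struct task_rec, task_new(), task_free_shallow() and task_free_deep() are names invented for this example, and the real proc_t and freeproc() may differ in what they own and release. The point is only that calling free() on a struct releases the struct itself and nothing it points to.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record, NOT procps's proc_t: a struct that owns a
 * separately allocated string. */
struct task_rec {
    int   pid;
    char *cmdline;   /* heap-allocated, owned by the record */
};

static long live_allocs;   /* bookkeeping so the leak is observable */

static void *counted_alloc(size_t n)
{
    void *p = malloc(n);

    if (p)
        live_allocs++;
    return p;
}

static void counted_free(void *p)
{
    if (p) {
        live_allocs--;
        free(p);
    }
}

/* Allocate a record plus its own copy of the command line. */
struct task_rec *task_new(int pid, const char *cmdline)
{
    struct task_rec *t = counted_alloc(sizeof *t);

    if (!t)
        return NULL;
    t->pid = pid;
    t->cmdline = counted_alloc(strlen(cmdline) + 1);
    if (!t->cmdline) {
        counted_free(t);
        return NULL;
    }
    strcpy(t->cmdline, cmdline);
    return t;
}

/* Shallow release: frees only the struct, so cmdline is leaked. */
void task_free_shallow(struct task_rec *t)
{
    counted_free(t);
}

/* Deep release: frees the owned member first, then the struct. */
void task_free_deep(struct task_rec *t)
{
    if (!t)
        return;
    counted_free(t->cmdline);
    counted_free(t);
}

/* Number of allocations made through this sketch that are still live. */
long task_live_allocs(void)
{
    return live_allocs;
}
```

With the shallow variant, one block per discarded record stays allocated, which is the kind of small, per-process loss that would add up in a top that runs for weeks.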
==4641== HEAP SUMMARY:
==4641==     in use at exit: 93,415 bytes in 166 blocks
==4641==   total heap usage: 126,078 allocs, 125,912 frees, 1,072,674,643 bytes allocated
==4641==
==4641== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 20 of 33
==4641==    at 0x4025103: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==4641==    by 0x416D114: nss_parse_service_list (in /lib/libc-2.13.so)
==4641==    by 0x416D64D: __nss_database_lookup (in /lib/libc-2.13.so)
==4641==    by 0x45ECFFB: ???
==4641==    by 0x45EEA96: ???
==4641==    by 0x4128227: getpwuid_r@@GLIBC_2.1.2 (in /lib/libc-2.13.so)
==4641==    by 0x4127BC2: getpwuid (in /lib/libc-2.13.so)
==4641==    by 0x4030D60: user_from_uid (pwcache.c:42)
==4641==    by 0x4038134: simple_readproc (readproc.c:603)
==4641==    by 0x4038C65: readproc (readproc.c:841)
==4641==    by 0x804C56F: procs_refresh (top.c:1167)
==4641==    by 0x8050E8E: summary_show (top.c:3003)
==4641==
==4641== LEAK SUMMARY:
==4641==    definitely lost: 36 bytes in 1 blocks
==4641==    indirectly lost: 120 bytes in 10 blocks
==4641==      possibly lost: 0 bytes in 0 blocks
==4641==    still reachable: 93,259 bytes in 155 blocks
==4641==         suppressed: 0 bytes in 0 blocks
==4641== Reachable blocks (those to which a pointer was found) are not shown.
==4641== To see them, rerun with: --leak-check=full --show-reachable=yes
==4641==
==4641== For counts of detected and suppressed errors, rerun with: -v
==4641== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 7 from 7)

With the default configuration only a few bytes are lost.
i think most of the leakage occurs from the group usage in the proc_t struct. but the code is horrible, so it's hard to trace where things should be cleaned up without breaking it all.
Created attachment 299159 [details]
with my .toprc, patched as in comment #17, running for ~11 hours

With my .toprc, the patch from comment #17, running for ~11 hours. Still more than 25 megabytes lost.
(In reply to comment #19)
> i think most of the leakage occurs from the group usage in the proc_t struct.
> but the code is horrible, so it's hard to trace where things should be cleaned
> up without breaking it all.

The Debian bug report states that the version 3.* top should be significantly cleaned up/rewritten/etc.
(In reply to comment #21)
> (In reply to comment #19)
> > i think most of the leakage occurs from the group usage in the proc_t struct.
> > but the code is horrible, so it's hard to trace where things should be cleaned
> > up without breaking it all.
>
> The Debian bug report states that the version 3.* top should be significantly
> cleaned up/rewritten/etc.

Other than that, not displaying the Group field fixes most of the leakage.
procps-3.3.2_p2 is in the tree if you want to give that a spin ...
(In reply to comment #23)
> procps-3.3.2_p2 is in the tree if you want to give that a spin ...

Does not appear to leak very much (at least not kilobytes per update). Do note that the old .toprc could not be used, but that with the Group field enabled, nothing extra leaked.

==4996==
==4996== HEAP SUMMARY:
==4996==     in use at exit: 434,813 bytes in 261 blocks
==4996==   total heap usage: 991,086 allocs, 990,825 frees, 315,328,159 bytes allocated
==4996==
==4996== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 33 of 50
==4996==    at 0x4025103: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==4996==    by 0x4168114: nss_parse_service_list (in /lib/libc-2.13.so)
==4996==    by 0x416864D: __nss_database_lookup (in /lib/libc-2.13.so)
==4996==    by 0x47AAFFB: ???
==4996==    by 0x47ACA96: ???
==4996==    by 0x4123227: getpwuid_r@@GLIBC_2.1.2 (in /lib/libc-2.13.so)
==4996==    by 0x4122BC2: getpwuid (in /lib/libc-2.13.so)
==4996==    by 0x402F1F6: user_from_uid (in /lib/libprocps.so.0.0.0)
==4996==    by 0x4030F18: simple_readproc (in /lib/libprocps.so.0.0.0)
==4996==    by 0x40312C1: readproc (in /lib/libprocps.so.0.0.0)
==4996==    by 0x804C49F: procs_refresh (in /usr/bin/top)
==4996==    by 0x804EA0F: frame_make (in /usr/bin/top)
==4996==
==4996== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 34 of 50
==4996==    at 0x4025103: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==4996==    by 0x4168114: nss_parse_service_list (in /lib/libc-2.13.so)
==4996==    by 0x416864D: __nss_database_lookup (in /lib/libc-2.13.so)
==4996==    by 0x47AA16B: ???
==4996==    by 0x47AAFB5: ???
==4996==    by 0x4121CB7: getgrgid_r@@GLIBC_2.1.2 (in /lib/libc-2.13.so)
==4996==    by 0x41214CE: getgrgid (in /lib/libc-2.13.so)
==4996==    by 0x402F2E2: group_from_gid (in /lib/libprocps.so.0.0.0)
==4996==    by 0x4030E38: simple_readproc (in /lib/libprocps.so.0.0.0)
==4996==    by 0x40312C1: readproc (in /lib/libprocps.so.0.0.0)
==4996==    by 0x804C418: procs_refresh (in /usr/bin/top)
==4996==    by 0x804E1A2: frame_make (in /usr/bin/top)
==4996==
==4996== LEAK SUMMARY:
==4996==    definitely lost: 72 bytes in 2 blocks
==4996==    indirectly lost: 240 bytes in 20 blocks
==4996==      possibly lost: 0 bytes in 0 blocks
==4996==    still reachable: 434,501 bytes in 239 blocks
==4996==         suppressed: 0 bytes in 0 blocks
==4996== Reachable blocks (those to which a pointer was found) are not shown.
==4996== To see them, rerun with: --leak-check=full --show-reachable=yes
==4996==
==4996== For counts of detected and suppressed errors, rerun with: -v
==4996== ERROR SUMMARY: 312122 errors from 14 contexts (suppressed: 7 from 7)