Bug 397771 - =sys-process/procps-3.2.8_p11 - top leaks memory
Summary: =sys-process/procps-3.2.8_p11 - top leaks memory
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo's Team for Core System packages
Reported: 2012-01-05 16:23 UTC by Jeroen Roovers (RETIRED)
Modified: 2014-01-16 13:29 UTC


Attachments
valgrind log (top.valgrind.log, 7.93 KB, text/plain)
2012-01-10 14:45 UTC, Jeroen Roovers (RETIRED)
valgrind log with debug CFLAGS for procps (top.valgrind-debug.log, 10.50 KB, text/plain)
2012-01-10 19:11 UTC, Jeroen Roovers (RETIRED)
top.valgrind-debug-2.log (top.valgrind-debug-2.log, 12.23 KB, text/plain)
2012-01-13 17:52 UTC, Jeroen Roovers (RETIRED)
.toprc from the worst affected system (bastiaan.toprc, 617 bytes, text/plain)
2012-01-14 15:53 UTC, Jeroen Roovers (RETIRED)
with my .toprc, patched as in comment #17, running for ~11 hours (top.valgrind-debug-3.log, 6.61 KB, text/plain)
2012-01-17 15:49 UTC, Jeroen Roovers (RETIRED)

Description Jeroen Roovers (RETIRED) gentoo-dev 2012-01-05 16:23:46 UTC
top certainly did not leak memory like this before _p11. Confirmed on multiple systems (three architectures, different kernels, most on stable profiles).

The only step to reproduce that I can think of is to keep top running all the time. The leak appears to grow as more processes are run, so a system that uses up many process IDs (by running many cron jobs, for instance) will probably have top leak more memory. When I recycle a few dozen PIDs per second, a couple of kilobytes pile onto the resident memory size (RES, as reported by top itself).
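
One way to put numbers on the growth described above, without relying on top's own RES column, is to sample the resident set size of the running top from /proc at intervals. The following is only a minimal sketch, assuming a Linux /proc filesystem; the function names, the 60-second interval and the sample count are illustrative choices and are not part of procps.

```python
import time
from pathlib import Path


def parse_vm_rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:      1176 kB".
            return int(line.split()[1])
    return None


def read_rss_kib(pid):
    """Read the current resident set size of a process, in KiB."""
    return parse_vm_rss_kib(Path(f"/proc/{pid}/status").read_text())


def watch_rss(pid, interval_s=60.0, samples=10):
    """Yield (elapsed seconds, RSS in KiB) pairs for a running process."""
    start = time.monotonic()
    for _ in range(samples):
        yield time.monotonic() - start, read_rss_kib(pid)
        time.sleep(interval_s)
```

Iterating over watch_rss() with the PID of a long-running top and printing each pair gives a growth curve that can be compared between systems or between procps versions.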
Comment 1 SpanKY gentoo-dev 2012-01-06 06:07:04 UTC
can't reproduce here

one terminal:
 $ top

another terminal:
 $ while :; do (:); done

memory usage in top stays static
Comment 2 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-06 18:07:13 UTC
[1986981.196642] Out of memory: Kill process 2588 (top) score 490 or sacrifice child
Comment 3 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-06 18:07:31 UTC
(In reply to comment #1)
> can't reproduce here

So maybe that was not a good recipe.
Comment 4 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-06 18:17:15 UTC
(In reply to comment #2)
> [1986981.196642] Out of memory: Kill process 2588 (top) score 490 or sacrifice
> child

That was on an x86 system; top was apparently killed after 22 days.

On an amd64 system:
  PID USER      PR  NI  SHR S    TIME+   RES %MEM SWAP %CPU COMMAND
 2959 root      20   0  608 R 156:06.11 1176  0.0  15m  0.2 top

Having 15 megabytes swapped out is far too much for top, so it's clear where this is heading.

On an older x86 system:
  PID USER     GROUP     PR  NI  SHR S %MEM  RES    TIME+  SWAP %CPU COMMAND
12212 root     root      20   0  876 R  8.9  89m  22:49.59 9980  1.3 top

I estimate from the system's uptime that this top has been running for 21 days and amassed 89 megabytes in resident memory.
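
To make the figures above comparable between systems, the growth can be turned into an average rate. This is a rough sketch only: it assumes the "89m" reported by top means mebibytes, that the 21-day uptime estimate is accurate, and that top started out with negligible resident memory. The helper name is made up for illustration.

```python
def leak_rate_kib_per_hour(leaked_mib, days_running):
    """Average growth in KiB per hour, assuming steady growth from about zero."""
    hours = days_running * 24
    return leaked_mib * 1024 / hours


# Figures from the older x86 system: 89m RES after an estimated 21 days.
rate = leak_rate_kib_per_hour(89, 21)
print(f"{rate:.1f} KiB/hour, {rate / 60:.2f} KiB/minute")
```

Under those assumptions the older x86 system grows by about 181 KiB per hour, or roughly 3 KiB per minute.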
Comment 5 SpanKY gentoo-dev 2012-01-06 20:05:14 UTC
(In reply to comment #3)

i'm not going to just try random crap until i happen to trip over something.  produce an actual example or debug it yourself.
Comment 6 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-09 20:12:38 UTC
==20718== 
==20718== HEAP SUMMARY:
==20718==     in use at exit: 18,832,631 bytes in 493,861 blocks
==20718==   total heap usage: 2,072,387 allocs, 1,578,526 frees, 1,213,961,896 bytes allocated
==20718== 
==20718== LEAK SUMMARY:
==20718==    definitely lost: 18,743,112 bytes in 493,639 blocks
==20718==    indirectly lost: 240 bytes in 20 blocks
==20718==      possibly lost: 64 bytes in 2 blocks
==20718==    still reachable: 89,215 bytes in 200 blocks
==20718==         suppressed: 0 bytes in 0 blocks
==20718== Rerun with --leak-check=full to see details of leaked memory
==20718== 
==20718== For counts of detected and suppressed errors, rerun with: -v
==20718== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 7 from 7)
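
The summary above can be cross-checked mechanically. The sketch below parses the LEAK SUMMARY lines copied from this log and derives the average size of a definitely-lost block; the regular expression and the helper name are illustrative and only cover the summary format shown here, not valgrind output in general.

```python
import re

# Matches lines such as:
# ==20718==    definitely lost: 18,743,112 bytes in 493,639 blocks
LEAK_LINE = re.compile(
    r"==\d+==\s+(definitely lost|indirectly lost|possibly lost|"
    r"still reachable|suppressed):\s+([\d,]+) bytes in ([\d,]+) blocks"
)


def parse_leak_summary(log_text):
    """Map each leak category to a (bytes, blocks) tuple."""
    summary = {}
    for kind, nbytes, nblocks in LEAK_LINE.findall(log_text):
        summary[kind] = (int(nbytes.replace(",", "")),
                         int(nblocks.replace(",", "")))
    return summary


LOG = """\
==20718==    definitely lost: 18,743,112 bytes in 493,639 blocks
==20718==    indirectly lost: 240 bytes in 20 blocks
==20718==      possibly lost: 64 bytes in 2 blocks
==20718==    still reachable: 89,215 bytes in 200 blocks
==20718==         suppressed: 0 bytes in 0 blocks
"""

summary = parse_leak_summary(LOG)
lost_bytes, lost_blocks = summary["definitely lost"]
print(f"average definitely-lost block: {lost_bytes / lost_blocks:.1f} bytes")
print("bytes in use at exit:", sum(b for b, _ in summary.values()))
```

The categories add up to the 18,832,631 bytes and 493,861 blocks reported as in use at exit, and 2,072,387 allocs minus 1,578,526 frees gives the same 493,861 blocks. The definitely-lost blocks average about 38 bytes each, which would be consistent with a small allocation being leaked repeatedly rather than a few large buffers.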
Comment 7 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-10 14:45:48 UTC
Created attachment 298501
valgrind log
Comment 8 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-10 19:11:10 UTC
Created attachment 298521
valgrind log with debug CFLAGS for procps
Comment 9 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-10 19:12:44 UTC
(In reply to comment #5)
> (In reply to comment #3)
> 
> i'm not going to just try random crap until i happen to trip over something. 
> produce an actual example or debug it yourself.

I'll just ignore that and reopen, right?
Comment 10 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-10 19:15:24 UTC
Tue Jan 10 20:15:01 CET 2012
Portage 2.1.10.44 (default/linux/x86/10.0, gcc-4.5.3, glibc-2.13-r4, 3.0.6-gentoo-JeR i686)
=================================================================
                        System Settings
=================================================================
System uname: Linux-3.0.6-gentoo-JeR-i686-Intel-R-_Pentium-R-_4_CPU_2.60GHz-with-gentoo-2.0.3
Timestamp of tree: Tue, 10 Jan 2012 16:15:01 +0000
distcc 3.1 i686-pc-linux-gnu [disabled]
app-shells/bash:          4.1_p9
dev-java/java-config:     2.1.11-r3
dev-lang/python:          2.7.2-r3
dev-util/pkgconfig:       0.26
sys-apps/baselayout:      2.0.3
sys-apps/openrc:          0.9.4
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.68
sys-devel/automake:       1.11.1
sys-devel/binutils:       2.21.1-r1
sys-devel/gcc:            4.4.5, 4.5.3-r1
sys-devel/gcc-config:     1.4.1-r1
sys-devel/libtool:        2.4-r1
sys-devel/make:           3.82-r1
sys-kernel/linux-headers: 2.6.39 (virtual/os-headers)
sys-libs/glibc:           2.13-r4
Repositories: gentoo JeR
ACCEPT_KEYWORDS="x86"
ACCEPT_LICENSE="*"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-O2 -march=pentium4 --param l1-cache-size=8 --param l1-cache-line-size=64 --param l2-cache-size=512 -mtune=pentium4 -pipe -Wall"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/terminfo"
CXXFLAGS="-O2 -march=pentium4 --param l1-cache-size=8 --param l1-cache-line-size=64 --param l2-cache-size=512 -mtune=pentium4 -pipe -Wall"
DISTDIR="/world/distfiles"
EMERGE_DEFAULT_OPTS="--quiet-build=n"
FEATURES="assume-digests binpkg-logs distlocks ebuild-locks fixlafiles news parallel-fetch protect-owned sandbox sfperms splitdebug strict test-fail-continue unknown-features-warn unmerge-logs unmerge-orphans userfetch"
FFLAGS=""
GENTOO_MIRRORS="http://distfiles.gentoo.org"
LC_ALL="en_US.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
MAKEOPTS="-j4"
PKGDIR="/world/gentoo/packages/bastiaan"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/world/gentoo/portage"
PORTDIR_OVERLAY="/keeps/gentoo/local"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="bash-completion berkdb bzip2 cli cracklib crypt cups cxx dri fortran gdbm gpm iconv ipv6 modules mtrr mudflap multislot ncurses nls nptl nptlonly openmp pam pcre pppd readline session sse sse2 ssl sysfs tcpd unicode x86 xorg zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" PHP_TARGETS="php5-3" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="fbdev glint intel mach64 mga neomagic nouveau nv r128 radeon savage sis tdfx trident vesa via vmware dummy v4l" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, INSTALL_MASK, LANG, LINGUAS, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS

=================================================================
                        Package Settings
=================================================================

sys-process/procps-3.2.8_p11 was built with the following:
USE="unicode"
CFLAGS="-O0 -ggdb -pipe -Wall"
Comment 11 SpanKY gentoo-dev 2012-01-10 19:54:46 UTC
(In reply to comment #9)

quit your childish antics
Comment 12 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-10 19:57:48 UTC
(In reply to comment #11)

> quit your childish antics

Quid pro quo? :)
Comment 13 SpanKY gentoo-dev 2012-01-10 20:37:04 UTC
(In reply to comment #12)

no, it's not.  if you can't file/follow up bugs without acting like a child, then find someone capable of filing bugs for you.
Comment 14 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-13 17:52:16 UTC
Created attachment 298865
top.valgrind-debug-2.log
Comment 15 SpanKY gentoo-dev 2012-01-14 10:28:51 UTC
do you have a /etc/toprc or ~/.toprc ?
Comment 16 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-14 15:53:34 UTC
Created attachment 298929
.toprc from the worst affected system
Comment 17 SpanKY gentoo-dev 2012-01-15 02:59:05 UTC
i think there's at least partial leakage here:

--- a/top.c
+++ b/top.c
@@ -1154,7 +1154,7 @@
             prochlp(ttsk);
             ++curmax;
          }
-         free(ptsk);  // readproc() proc_t not used
+         freeproc(ptsk);  // readproc() proc_t not used
       }
    }
 
@@ -1181,7 +1181,7 @@
                prochlp(ttsk);
                table[curmax++] = ttsk;
             }
-            free(ptsk);   // readproc() proc_t not used
+            freeproc(ptsk);   // readproc() proc_t not used
          }
       }
    }
Comment 18 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-15 19:33:40 UTC
==4641== HEAP SUMMARY:
==4641==     in use at exit: 93,415 bytes in 166 blocks
==4641==   total heap usage: 126,078 allocs, 125,912 frees, 1,072,674,643 bytes allocated
==4641== 
==4641== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 20 of 33
==4641==    at 0x4025103: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==4641==    by 0x416D114: nss_parse_service_list (in /lib/libc-2.13.so)
==4641==    by 0x416D64D: __nss_database_lookup (in /lib/libc-2.13.so)
==4641==    by 0x45ECFFB: ???
==4641==    by 0x45EEA96: ???
==4641==    by 0x4128227: getpwuid_r@@GLIBC_2.1.2 (in /lib/libc-2.13.so)
==4641==    by 0x4127BC2: getpwuid (in /lib/libc-2.13.so)
==4641==    by 0x4030D60: user_from_uid (pwcache.c:42)
==4641==    by 0x4038134: simple_readproc (readproc.c:603)
==4641==    by 0x4038C65: readproc (readproc.c:841)
==4641==    by 0x804C56F: procs_refresh (top.c:1167)
==4641==    by 0x8050E8E: summary_show (top.c:3003)
==4641== 
==4641== LEAK SUMMARY:
==4641==    definitely lost: 36 bytes in 1 blocks
==4641==    indirectly lost: 120 bytes in 10 blocks
==4641==      possibly lost: 0 bytes in 0 blocks
==4641==    still reachable: 93,259 bytes in 155 blocks
==4641==         suppressed: 0 bytes in 0 blocks
==4641== Reachable blocks (those to which a pointer was found) are not shown.
==4641== To see them, rerun with: --leak-check=full --show-reachable=yes
==4641== 
==4641== For counts of detected and suppressed errors, rerun with: -v
==4641== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 7 from 7)

With the default configuration only a few bytes are lost.
Comment 19 SpanKY gentoo-dev 2012-01-17 06:52:20 UTC
i think most of the leakage occurs from the group usage in the proc_t struct.  but the code is horrible, so it's hard to trace where things should be cleaned up without breaking it all.
Comment 20 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-17 15:49:25 UTC
Created attachment 299159
with my .toprc, patched as in comment #17, running for ~11 hours

With my .toprc and the patch from comment #17 applied, top ran for ~11 hours and still lost more than 25 megabytes.
Comment 21 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-17 15:50:33 UTC
(In reply to comment #19)
> i think most of the leakage occurs from the group usage in the proc_t struct. 
> but the code is horrible, so it's hard to trace where things should be cleaned
> up without breaking it all.

The Debian bug report states that the version 3.* top should be significantly cleaned up/rewritten/etc.
Comment 22 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-17 16:56:19 UTC
(In reply to comment #21)
> (In reply to comment #19)
> > i think most of the leakage occurs from the group usage in the proc_t struct. 
> > but the code is horrible, so it's hard to trace where things should be cleaned
> > up without breaking it all.
> 
> The Debian bug report states that the version 3.* top should be significantly
> cleaned up/rewritten/etc.

Other than that, not displaying the Group field fixes most of the leakage.
Comment 23 SpanKY gentoo-dev 2012-01-24 06:18:46 UTC
procps-3.3.2_p2 is in the tree if you want to give that a spin ...
Comment 24 Jeroen Roovers (RETIRED) gentoo-dev 2012-01-24 16:43:42 UTC
(In reply to comment #23)
> procps-3.3.2_p2 is in the tree if you want to give that a spin ...

Does not appear to leak very much (at least not kilobytes per update).

Do note that the old .toprc could not be used, but that with the Group field enabled, nothing extra leaked.

==4996== 
==4996== HEAP SUMMARY:
==4996==     in use at exit: 434,813 bytes in 261 blocks
==4996==   total heap usage: 991,086 allocs, 990,825 frees, 315,328,159 bytes allocated
==4996== 
==4996== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 33 of 50
==4996==    at 0x4025103: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==4996==    by 0x4168114: nss_parse_service_list (in /lib/libc-2.13.so)
==4996==    by 0x416864D: __nss_database_lookup (in /lib/libc-2.13.so)
==4996==    by 0x47AAFFB: ???
==4996==    by 0x47ACA96: ???
==4996==    by 0x4123227: getpwuid_r@@GLIBC_2.1.2 (in /lib/libc-2.13.so)
==4996==    by 0x4122BC2: getpwuid (in /lib/libc-2.13.so)
==4996==    by 0x402F1F6: user_from_uid (in /lib/libprocps.so.0.0.0)
==4996==    by 0x4030F18: simple_readproc (in /lib/libprocps.so.0.0.0)
==4996==    by 0x40312C1: readproc (in /lib/libprocps.so.0.0.0)
==4996==    by 0x804C49F: procs_refresh (in /usr/bin/top)
==4996==    by 0x804EA0F: frame_make (in /usr/bin/top)
==4996== 
==4996== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 34 of 50
==4996==    at 0x4025103: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==4996==    by 0x4168114: nss_parse_service_list (in /lib/libc-2.13.so)
==4996==    by 0x416864D: __nss_database_lookup (in /lib/libc-2.13.so)
==4996==    by 0x47AA16B: ???
==4996==    by 0x47AAFB5: ???
==4996==    by 0x4121CB7: getgrgid_r@@GLIBC_2.1.2 (in /lib/libc-2.13.so)
==4996==    by 0x41214CE: getgrgid (in /lib/libc-2.13.so)
==4996==    by 0x402F2E2: group_from_gid (in /lib/libprocps.so.0.0.0)
==4996==    by 0x4030E38: simple_readproc (in /lib/libprocps.so.0.0.0)
==4996==    by 0x40312C1: readproc (in /lib/libprocps.so.0.0.0)
==4996==    by 0x804C418: procs_refresh (in /usr/bin/top)
==4996==    by 0x804E1A2: frame_make (in /usr/bin/top)
==4996== 
==4996== LEAK SUMMARY:
==4996==    definitely lost: 72 bytes in 2 blocks
==4996==    indirectly lost: 240 bytes in 20 blocks
==4996==      possibly lost: 0 bytes in 0 blocks
==4996==    still reachable: 434,501 bytes in 239 blocks
==4996==         suppressed: 0 bytes in 0 blocks
==4996== Reachable blocks (those to which a pointer was found) are not shown.
==4996== To see them, rerun with: --leak-check=full --show-reachable=yes
==4996== 
==4996== For counts of detected and suppressed errors, rerun with: -v
==4996== ERROR SUMMARY: 312122 errors from 14 contexts (suppressed: 7 from 7)