on an embedded superh box, running `emerge gcc -1u` shows the main emerge python process constantly using cpu. this seems to be due to constant polling of the output ? for example, `ps aux` shows: ... root 16561 39.4 1.5 48512 936 pts/4 S+ Aug16 2240:51 /usr/bin/python2.6 /usr/bin/emerge sys-devel/gcc -1u ... root 7976 66.9 66.8 75660 40752 pts/4 D+ 23:42 0:52 /usr/sh4-unknown-linux-gnu/bin/ld --eh-frame-hdr -m shlelf_linux -export-dynamic -dynamic-linker /lib/ld-linux.so.2 -o cc1 /usr/lib/crt1.o ... there's really no reason that emerge process should be sucking up so much time. running strace against it, i see: gettimeofday({1345434286, 75025}, NULL) = 0 poll([{fd=9, events=POLLIN|POLLERR|POLLHUP|POLLNVAL}, {fd=7, events=POLLIN|POLLERR|POLLHUP|POLLNVAL}], 2, 3000) = 0 (Timeout) gettimeofday({1345434289, 89291}, NULL) = 0 gettimeofday({1345434289, 90747}, NULL) = 0 poll([{fd=9, events=POLLIN|POLLERR|POLLHUP|POLLNVAL}, {fd=7, events=POLLIN|POLLERR|POLLHUP|POLLNVAL}], 2, 3000) = 1 ([{fd=9, revents=POLLIN}]) read(9, "/var/tmp/portage/sys-devel/gcc-4"..., 4096) = 1135 write(13, "/var/tmp/portage/sys-devel/gcc-4"..., 1135) = 1135 write(12, "/var/tmp/portage/sys-devel/gcc-4"..., 1135) = 1135 read(9, 0x19a5784, 4096) = -1 EAGAIN (Resource temporarily unavailable) gettimeofday({1345434293, 93010}, NULL) = 0 gettimeofday({1345434293, 96096}, NULL) = 0 poll([{fd=9, events=POLLIN|POLLERR|POLLHUP|POLLNVAL}, {fd=7, events=POLLIN|POLLERR|POLLHUP|POLLNVAL}], 2, 3000) = 0 (Timeout) <lots more gettimeofday/poll calls> looking at the fds of that process, i have: /proc/16561/fd/ 12 -> /var/log/portage/sys-devel:gcc-4.5.3-r2:20120816-065718.log 13 -> /dev/pts/4 2 -> /dev/pts/4 3 -> pipe:[1289435] 4 -> pipe:[1289435] 5 -> /var/tmp/portage/sys-devel/.gcc-4.5.3-r2.portage_lockfile 7 -> /var/tmp/portage/sys-devel/gcc-4.5.3-r2/.ipc_in 9 -> /dev/ptmx so it's polling fd 9 constantly, and whenever it gets output, it writes it to the tty (13) and the log file (12). then restarts the whole process. shouldn't this use a blocking call ? why are we polling with the poll syscall ?
(In reply to comment #0) > shouldn't this use a blocking call ? It does block until it hits the 3 second timeout, which shows as 3000 ms in your strace log. The timeout is for it to update the load avg display. > why are we polling with the poll syscall ? We could use epoll instead, since python has bindings for that too.
(In reply to comment #1) there is no load avg display (i'm not using --jobs or --quiet-build). you can see that there aren't any syscalls made other than the poll/gettimeofday, which means portage isn't querying for loadavg stats here. maybe the timeout logic should be checking to see if there's any load display to be updating in the first place ?
(In reply to comment #2) > maybe the timeout logic should be checking to see if there's any load > display to be updating in the first place ? Yeah, this should do it: http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=31b41fe7a20e6e0d19ff2383a746d61d8aa18b50
(In reply to comment #1) > (In reply to comment #0) > > why are we polling with the poll syscall ? > > We could use epoll instead, since python has bindings for that too. This is implemented now: http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=b986bcdd49c5523ffe6972377071d556a819c776
This is fixed in 2.1.11.11 and 2.2.0_alpha122.
In portage-2.1.11.22 and 2.2.0_alpha133 there's a fix for a related, and much worse bug: http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=2b2580d9dac62aa720e5d996fa5102ee5caeffe7