Summary: | sys-apps/portage: PipeReaderPtyTestCase tests fail intermittently | ||
---|---|---|---|
Product: | Portage Development | Reporter: | Patrick Lauer <patrick> |
Component: | Core | Assignee: | Portage team <dev-portage> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | nikoli, tka |
Priority: | Normal | Keywords: | InVCS |
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: | https://bugs.gentoo.org/show_bug.cgi?id=330937 | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Bug Depends on: | |||
Bug Blocks: | 484436 | ||
Attachments: |
standalone test case demonstrating that EPOLLHUP triggers the issue (attachment 392412)
standalone test case demonstrating that EPOLLHUP triggers the issue (attachment 392414)
standalone test case demonstrating that EPOLLHUP triggers the issue (attachment 392416)
standalone test case (read regardless of EPOLLIN like portage-2.2.17 does) (attachment 396310)
C equivalent standalone test case (read regardless of EPOLLIN like portage-2.2.17 does) (attachment 397280) |
Description
Patrick Lauer
2014-12-05 02:55:54 UTC
portage.tests.process.test_poll.PipeReaderPtyTestCase.testPipeReader() has been failing sporadically for a long time (over a year), so it is not a regression in Portage 2.2.15.
I suspect that this is triggered when os.read raises errno.EIO or errno.EAGAIN inside AbstractPollTask._read_buf. It would be similar to http://bugs.python.org/issue5380, but involving os.read instead of array.fromfile.
Created attachment 392412 [details]
standalone test case demonstrating that EPOLLHUP triggers the issue
This test case shows that the problem is always triggered by an EPOLLHUP event. If I change the test case to ignore EPOLLHUP, then the test always succeeds.
Created attachment 392414 [details]
standalone test case demonstrating that EPOLLHUP triggers the issue
This updated test case fixes faulty fork/pid logic.
(In reply to Zac Medico from comment #3)
> This test case shows that the problem is always triggered by an EPOLLHUP
> event. If I change the test case to ignore EPOLLHUP, then the test always
> succeeds.
After the pid logic fix, making it ignore EPOLLHUP causes the test case to hang in the epoll loop.
Created attachment 392416 [details]
standalone test case demonstrating that EPOLLHUP triggers the issue
This updated test case adds a non-blocking read loop, to read until EAGAIN is raised.
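A non-blocking read loop of the kind described above could look roughly like the following. This is a minimal sketch, not the code from the attachment: the helper names `set_nonblocking` and `drain_fd` and the 4096-byte chunk size are assumptions made for illustration. It treats EAGAIN as "nothing more to read right now" and EIO as end of stream, which is what Linux reports on a pty master once the slave side has been closed.

```python
import errno
import fcntl
import os


def set_nonblocking(fd):
    """Put fd into non-blocking mode so os.read never stalls."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def drain_fd(fd, chunk_size=4096):
    """Read from fd until EAGAIN, EIO or EOF (hypothetical helper).

    Returns (data, eof). eof is True when the stream has ended
    (empty read or EIO) and False when a read would merely block.
    """
    chunks = []
    eof = False
    while True:
        try:
            buf = os.read(fd, chunk_size)
        except OSError as e:
            if e.errno == errno.EAGAIN:
                break          # no more data for now
            if e.errno == errno.EIO:
                eof = True     # pty peer closed (Linux behaviour)
                break
            raise
        if not buf:
            eof = True         # ordinary EOF, e.g. on a pipe
            break
        chunks.append(buf)
    return b"".join(chunks), eof
```

Calling `drain_fd` on every wakeup, instead of issuing a single `os.read`, means that data spanning several chunks is collected before the caller goes back to polling.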
If I modify the test case so that the read loop executes even if the EPOLLIN flag is not set, then the test succeeds reliably! Also, if I try to swap the inner read loop for a single read, the test fails.
I have a patch in the following branch:
https://github.com/zmedico/portage/tree/bug_531724
I've posted it here for review:
http://thread.gmane.org/gmane.linux.gentoo.portage.devel/5042
This is in the master branch now:
https://github.com/gentoo/portage/commit/b3bba4dc8f4f93adeaeaf662870bb00a09bb1de7
======================================================================
FAIL: testPipeReader (portage.tests.process.test_poll.PipeReaderPtyTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/var/tmp/portage/sys-apps/portage-2.2.17/work/portage-2.2.17-pypy/lib/portage/tests/__init__.py", line 222, in run
    testMethod()
  File "/var/tmp/portage/sys-apps/portage-2.2.17/work/portage-2.2.17-pypy/lib/portage/tests/process/test_poll.py", line 68, in testPipeReader
    "x = %s, len(output) = %s" % (x, len(output)))
AssertionError: x = 8192, len(output) = 4095
======================================================================
TODO: testAutounmaskMultilibUse (portage.tests.resolver.test_autounmask_multilib_use.AutounmaskMultilibUseTestCase)
----------------------------------------------------------------------
<bound method AutounmaskMultilibUseTestCase.testAutounmaskMultilibUse of <portage.tests.resolver.test_autounmask_multilib_use.AutounmaskMultilibUseTestCase testMethod=testAutounmaskMultilibUse>>: TODO
======================================================================
TODO: testDirectVirtualCircularDependency (portage.tests.resolver.test_circular_choices.VirtualCircularChoicesTestCase)
----------------------------------------------------------------------
<bound method VirtualCircularChoicesTestCase.testDirectVirtualCircularDependency of <portage.tests.resolver.test_circular_choices.VirtualCircularChoicesTestCase testMethod=testDirectVirtualCircularDependency>>: TODO
======================================================================
TODO: testBacktrackingGoodVersionFirst (portage.tests.resolver.test_slot_conflict_mask_update.SlotConflictMaskUpdateTestCase)
----------------------------------------------------------------------
<bound method SlotConflictMaskUpdateTestCase.testBacktrackingGoodVersionFirst of <portage.tests.resolver.test_slot_conflict_mask_update.SlotConflictMaskUpdateTestCase testMethod=testBacktrackingGoodVersionFirst>>: TODO
----------------------------------------------------------------------
Ran 226 tests in 340.179s
FAILED (failures=1)
Symlinking /var/tmp/portage/sys-apps/portage-2.2.17/work/portage-2.2.17-pypy/cnf -> ../portage-2.2.17/cnf
Traceback (most recent call last):
  File "app_main.py", line 75, in run_toplevel
  File "setup.py", line 708, in <module>
    'Topic :: System :: Installation/Setup'
  File "/usr/lib64/pypy/lib-python/2.7/distutils/core.py", line 151, in setup
    dist.run_commands()
  File "/usr/lib64/pypy/lib-python/2.7/distutils/dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "/usr/lib64/pypy/lib-python/2.7/distutils/dist.py", line 972, in run_command
    cmd_obj.run()
  File "setup.py", line 579, in run
    os.path.join(self.build_lib, 'portage/tests/runTests.py')
  File "/usr/lib64/pypy/lib-python/2.7/subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
CalledProcessError: Command '['/usr/bin/pypy', '-bWd', '/var/tmp/portage/sys-apps/portage-2.2.17/work/portage-2.2.17-pypy/lib/portage/tests/runTests.py']' returned non-zero exit status 1
 * ERROR: sys-apps/portage-2.2.17::gentoo failed (test phase):
 *   (no error message)
Similar but slightly different failure with pypy
(In reply to Patrick Lauer from comment #11)
Can you also reproduce it with the attached standalone test case? I think we should report an upstream python bug with that test case.
Created attachment 396310 [details]
standalone test case (read regardless of EPOLLIN like portage-2.2.17 does)
Please use this version for testing.
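For readers without access to the attachment, the idea of reading regardless of EPOLLIN can be sketched as follows. This is an illustration under stated assumptions, not the attached test case and not Portage's actual code: the function name `read_all_via_epoll`, the chunk size and the timeout are hypothetical, and the function expects a file descriptor whose writing side will eventually be closed.

```python
import errno
import os
import select


def read_all_via_epoll(fd, chunk_size=4096, timeout=5.0):
    """Collect everything readable from fd until end of stream.

    On every event (EPOLLIN, EPOLLHUP or EPOLLERR) the fd is drained
    until EAGAIN, so data still buffered when the hangup is reported
    is not dropped. Hypothetical helper for illustration only.
    """
    os.set_blocking(fd, False)
    ep = select.epoll()
    # EPOLLHUP and EPOLLERR are reported even though only EPOLLIN is requested.
    ep.register(fd, select.EPOLLIN)
    chunks = []
    done = False
    try:
        while not done:
            events = ep.poll(timeout)
            if not events:
                raise TimeoutError("no events within %s seconds" % timeout)
            for _fd, _mask in events:
                # Deliberately no check for EPOLLIN in _mask here.
                while True:
                    try:
                        buf = os.read(fd, chunk_size)
                    except OSError as e:
                        if e.errno == errno.EAGAIN:
                            break        # drained for now, poll again
                        if e.errno == errno.EIO:
                            done = True  # pty slave closed (Linux behaviour)
                            break
                        raise
                    if not buf:
                        done = True      # ordinary EOF
                        break
                    chunks.append(buf)
    finally:
        ep.close()
    return b"".join(chunks)
```

A quick deterministic check can use a plain pipe in place of the pty: write 8192 bytes, close the write end, and confirm that `read_all_via_epoll` returns all 8192 bytes. With a pty master, the EIO branch would be expected to end the loop instead of the empty read.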
Released in portage-2.2.16.
Is this one still a problem?
If it's still reproducible with portage-2.2.17, as reported in comment #11, then we've still got a problem. I suppose we could write an equivalent test case in C, in order to show whether or not python is at fault.
*** Bug 497504 has been marked as a duplicate of this bug. ***
Created attachment 397280 [details]
C equivalent standalone test case (read regardless of EPOLLIN like portage-2.2.17 does)
build: gcc -lutil test_read_pty.c -o test_read_pty
usage: for x in {1..100} ; do ./test_read_pty || break ; done
If the C implementation succeeds while the equivalent python implementation (from comment #13) fails, then we can report it as an upstream python issue.
Released in portage-2.2.16 |