89% tests passed, 1 tests failed out of 9 Total Test time (real) = 15.69 sec The following tests FAILED: 9 - XRootD::smoke-test (Failed) Errors while running CTest Output from these tests are in: /var/tmp/portage/net-libs/xrootd-5.6.2/work/xrootd-5.6.2_build/Testing/Temporary/LastTest.log ------------------------------------------------------------------- This is an unstable amd64 chroot image at a tinderbox (==build bot) name: 17.1_desktop_gnome_test-20230927-065005 ------------------------------------------------------------------- gcc-config -l: [1] x86_64-pc-linux-gnu-13 * clang/llvm (if any): clang version 17.0.1 Target: x86_64-pc-linux-gnu Thread model: posix InstalledDir: /usr/lib/llvm/17/bin Configuration file: /etc/clang/x86_64-pc-linux-gnu-clang.cfg /usr/lib/llvm/17 17.0.1 Python 3.11.5 Available Ruby profiles: [1] ruby31 (with Rubygems) * Available Rust versions: [1] rust-bin-1.72.0 [2] rust-1.72.0 * The following VMs are available for generation-2: 1) OpenJDK 11.0.20.1_p1 [openjdk-11] *) OpenJDK 17.0.8.1_p1 [openjdk-17] 3) OpenJDK 8.382_p05 [openjdk-8] 4) Eclipse Temurin JDK 11.0.20.1_p1 [openjdk-bin-11] 5) Eclipse Temurin JDK 8.382_p05 [openjdk-bin-8] Available Java Virtual Machines: [1] openjdk-8 [2] openjdk-11 [3] openjdk-17 system-vm [4] openjdk-bin-8 [5] openjdk-bin-11 php cli (if any): go version go1.21.1 linux/amd64 HEAD of ::gentoo commit ba7aac97a4d22c7fc6770de8099f72c0daa3befc Author: Repository mirror & CI <repomirrorci@gentoo.org> Date: Sun Oct 1 20:01:36 2023 +0000 2023-10-01 20:01:36 UTC emerge -qpvO net-libs/xrootd [ebuild N ] net-libs/xrootd-5.6.2 USE="libxml2 python readline server test -ceph -debug -doc -examples -fuse -http -kerberos -macaroons -scitokens -systemd -xrdec" PYTHON_TARGETS="python3_11 -python3_10 -python3_12"
Created attachment 872005 [details] emerge-info.txt
Created attachment 872006 [details] LastTest.log
Created attachment 872007 [details] emerge-history.txt
Created attachment 872008 [details] environment
Created attachment 872009 [details] etc.clang.tar.xz
Created attachment 872010 [details] etc.portage.tar.xz
Created attachment 872011 [details] logs.tar.xz
Created attachment 872012 [details] net-libs:xrootd-5.6.2:20231001-205413.log.xz
Created attachment 872013 [details] temp.tar.xz
Looks like a problem in dependencies (linking with zlib). I will have a look at it.
I cannot reproduce this on my machine. Does this happen consistently in the tinkerbox? My guess is that the server cannot read the file for which the checksum is requested for whatever reason, but I think I need your help to understand why this happens inside the tinderbox. Maybe you could run again with "export XRD_LOGLEVEL=Debug". This should give enough information to get a better idea of what went wrong.
Created attachment 872101 [details] LastTest.log
Not sure if [2023-10-03 14:17:21.210031 +0000][Error ][App ] Unable run query checksum: [ERROR] Server responded with an error: [3013] Unable to checksum /tmp/xrdfs-test-oMG06O/01.ref; operation not supported [ERROR] Server responded with an error: [3013] Unable to checksum /tmp/xrdfs-test-oMG06O/01.ref; operation not supported is relevant or not,
there's also a remote chance it's zlib-1.3 related...
The test (which I wrote :-) ) starts an XRootD local server, and tries several operations on it. What I think is happening is that a plugin which is necessary to calculate the checksum is not available, so the server is returning operation not supported. Why the plugin is not loaded is still not clear to me. For the record, here is what happens on the tinderbox(-) (excerpt from the attached log) and my machine(+), in diff format: -[Debug ][App ] Executing: query checksum /tmp/xrdfs-test-oMG06O/01.ref?cks.type=crc32c +[Debug ][App ] Executing: query checksum /tmp/xrdfs-test-YfxdPb/01.ref?cks.type=crc32c [Debug ][Utility ] Env: trying to get a non-existent string entry: pollerpreference [Debug ][Poller ] Available pollers: built-in [Debug ][Poller ] Attempting to create a poller according to preference: built-in @@ -2600,39 +2586,35 @@ [Debug ][TaskMgr ] Task manager started [Debug ][JobMgr ] Starting the job manager... [Debug ][JobMgr ] Job manager started, 3 workers -[Debug ][TaskMgr ] Registering task: "FileTimer task" to be run at: [2023-10-03 14:17:21 +0000] -[Debug ][ExDbgMsg ] [localhost:10940] MsgHandler created: 0x87d06360 (message: kXR_query (code: kXR_Qcksum, arg length: 46) ). +[Debug ][TaskMgr ] Registering task: "FileTimer task" to be run at: [2023-10-03 17:26:33 +0200] +[Debug ][ExDbgMsg ] [localhost:10940] MsgHandler created: 0xcd24b010 (message: kXR_query (code: kXR_Qcksum, arg length: 46) ). [Debug ][PostMaster ] Creating new channel to: root://localhost:10940/ [Debug ][PostMaster ] [localhost:10940] Stream parameters: Network Stack: IPAuto, Connection Window: 120, ConnectionRetry: 5, Stream Error Window: 1800 -[Debug ][TaskMgr ] Registering task: "TickGeneratorTask for: root://localhost:10940/" to be run at: [2023-10-03 14:17:36 +0000] +[Debug ][TaskMgr ] Registering task: "TickGeneratorTask for: root://localhost:10940/" to be run at: [2023-10-03 17:26:48 +0200] [Debug ][PostMaster ] [localhost:10940] Found 2 address(es): [::ffff:127.0.0.1]:10940, [::1]:10940 [Debug ][AsyncSock ] [localhost:10940.0] Attempting connection to [::1]:10940 -[Debug ][Poller ] Adding socket 0x87cc2d70 to the poller +[Debug ][Poller ] Adding socket 0xcd1fde70 to the poller [Debug ][AsyncSock ] [localhost:10940.0] Async connection call returned [Debug ][XRootDTransport ] [localhost:10940.0] Sending out the initial hand shake + kXR_protocol [Debug ][XRootDTransport ] [localhost:10940.0] Got the server hand shake response (type: server [], protocol version 511) [Debug ][XRootDTransport ] [localhost:10940.0] kXR_protocol successful (type: server [], protocol version 511) -[Debug ][XRootDTransport ] [localhost:10940.0] Sending out kXR_login request, username: portage, cgi: xrd.cc=us&xrd.tz=0&xrd.appname=xrdfs&xrd.info=&xrd.hostname=localhost&xrd.rn=v5.6.2, dual-stack: true, private IPv4: true, private IPv6: true +[Debug ][XRootDTransport ] [localhost:10940.0] Sending out kXR_login request, username: portage, cgi: xrd.cc=us&xrd.tz=1&xrd.appname=xrdfs&xrd.info=&xrd.hostname=localhost&xrd.rn=v5.6.2, dual-stack: true, private IPv4: true, private IPv6: true [Debug ][XRootDTransport ] [localhost:10940.0] Logged in, session: 1f0000002c000000180000001f000000 [Debug ][PostMaster ] [localhost:10940] Stream 0 connected (IPv6). [Debug ][Utility ] Monitor library name not set. No monitoring -[Debug ][ExDbgMsg ] [localhost:10940] Moving MsgHandler: 0x87d06360 (message: kXR_query (code: kXR_Qcksum, arg length: 46) ) from out-queu to in-queue. -[Debug ][ExDbgMsg ] [msg: 0xc4002640] Assigned MsgHandler: 0x87d06360. -[Debug ][ExDbgMsg ] [handler: 0x87d06360] Removed MsgHandler: 0x87d06360 from the in-queue. -[Debug ][XRootD ] [localhost:10940] Handling error while processing kXR_query (code: kXR_Qcksum, arg length: 46): [ERROR] Error response: operation not supported. -[Debug ][ExDbgMsg ] [localhost:10940] Calling MsgHandler: 0x87d06360 (message: kXR_query (code: kXR_Qcksum, arg length: 46) ) with status: [ERROR] Error response: operation not supported. -[Debug ][ExDbgMsg ] [localhost:10940] Destroying MsgHandler: 0x87d06360. -[Error ][App ] Unable run query checksum: [ERROR] Server responded with an error: [3013] Unable to checksum /tmp/xrdfs-test-oMG06O/01.ref; operation not supported - Server responded with an error: [3013] Unable to checksum /tmp/xrdfs-test-oMG06O/01.ref; operation not supported - - +[Debug ][ExDbgMsg ] [localhost:10940] Moving MsgHandler: 0xcd24b010 (message: kXR_query (code: kXR_Qcksum, arg length: 46) ) from out-queu to in-queue. +[Debug ][ExDbgMsg ] [msg: 0x88002790] Assigned MsgHandler: 0xcd24b010. +[Debug ][ExDbgMsg ] [msg: 0x88002790] Assigned MsgHandler: 0xcd24b010. +[Debug ][ExDbgMsg ] [handler: 0xcd24b010] Removed MsgHandler: 0xcd24b010 from the in-queue. +[Debug ][ExDbgMsg ] [localhost:10940] Calling MsgHandler: 0xcd24b010 (message: kXR_query (code: kXR_Qcksum, arg length: 46) ) with status: [SUCCESS] . +[Debug ][ExDbgMsg ] [localhost:10940] Destroying MsgHandler: 0xcd24b010. [Debug ][JobMgr ] Stopping the job manager... [Debug ][JobMgr ] Job manager stopped [Debug ][TaskMgr ] Stopping the task manager... [Debug ][TaskMgr ] Task manager stopped [Debug ][Poller ] Stopping the poller... [Debug ][AsyncSock ] [localhost:10940.0] Closing the socket -[Debug ][Poller ] <[::1]:58516><--><[::1]:10940> Removing socket from the poller +[Debug ][Poller ] <[::1]:51046><--><[::1]:10940> Removing socket from the poller [Debug ][PostMaster ] [localhost:10940] Destroying stream [Debug ][AsyncSock ] [localhost:10940.0] Closing the socket So I guess there's some strangeness going on with xrd.tz=0 vs xrd.tz=1. I will investigate this and file an upstream issue if this turns out to be a problem on our side.
Created a ticket upstream for this: https://github.com/xrootd/xrootd/issues/2096
I also ran inside a fresh container and could not reproduce this problem. There's a higher debug level with XRD_LOGLEVEL=Dump. Toralf, could you please re-run with that and also upload the log file from the server at /var/tmp/portage/net-libs/xrootd-5.6.2/work/xrootd-5.6.2_build/tests/XRootD/xrootd.log? Also please check that the timezone and clock are properly set on the machine that is running this. Is it running in a container or bare metal? Thanks for the help!
(In reply to Guilherme Amadio from comment #17) > Is it running in a container or bare metal? bar metal but bubblewrap'ed [1] [1] https://zwiebeltoralf.de/tinderbox.html
Created attachment 872135 [details] LastTest.log.xz dump
(In reply to Guilherme Amadio from comment #17) Also please check that the timezone and clock are properly set > on the machine that is running this. Is it running in a container FWIW each tinderbox image runs under a cpu group v1 - and yes, sometimes the load here is high.
Please attach also the server log as I mentioned before (tests/XRootD/xrootd.log in the build directory). Thank you!
(In reply to Guilherme Amadio from comment #21) > Please attach also the server log as I mentioned before > (tests/XRootD/xrootd.log in the build directory). Thank you! Will do (for completeness) - but FWIW I accidentally stumbled over a possible root cause: When I switch within bubblewrap from a tmpfs (/var/tmp/portage) to a persistent location then the package emerges fine.
Created attachment 872148 [details] xrootd.log dump
Created attachment 872149 [details] LastTest.log.xz
Hi Toralf, thank you very much. Indeed, the problem is that tmpfs does not support user extended file attributes (see tmpfs(5)): 231005 07:52:48 175 ofs_FAttr: Unable to get attr XrdCks.crc32c from /var/tmp/portage/net-libs/xrootd-5.6.2/work/xrootd-5.6.2_build/tests/XRootD/data/tmp/xrdfs-test-wiBUcb/01.ref; operation not supported 231005 07:52:48 175 ofs_chksum: portage.597:24@localhost Unable to checksum /tmp/xrdfs-test-wiBUcb/01.ref; operation not supported What is happening is the server is launching the checksum calculation, then reading it back from the extended attributes it sets, and that fails on tmpfs, which is expected. I think we can close this bug.
(In reply to Guilherme Amadio from comment #25) > Hi Toralf, thank you very much. Indeed, the problem is that tmpfs does not > support user extended file attributes (see tmpfs(5)): > > 231005 07:52:48 175 ofs_FAttr: Unable to get attr XrdCks.crc32c from > /var/tmp/portage/net-libs/xrootd-5.6.2/work/xrootd-5.6.2_build/tests/XRootD/ > data/tmp/xrdfs-test-wiBUcb/01.ref; operation not supported > 231005 07:52:48 175 ofs_chksum: portage.597:24@localhost Unable to checksum > /tmp/xrdfs-test-wiBUcb/01.ref; operation not supported > > What is happening is the server is launching the checksum calculation, then > reading it back from the extended attributes it sets, and that fails on > tmpfs, which is expected. I think we can close this bug. Can't you make it XFAIL or something if the filesystem is at least a known-bad one? Note that other filesystems may not support xattrs depending on kernel config and such too.
(In reply to Sam James from comment #26) > (In reply to Guilherme Amadio from comment #25) > > What is happening is the server is launching the checksum calculation, then > > reading it back from the extended attributes it sets, and that fails on > > tmpfs, which is expected. I think we can close this bug. > > Can't you make it XFAIL or something if the filesystem is at least a > known-bad one? > > Note that other filesystems may not support xattrs depending on kernel > config and such too. Not sure what you meant with XFAIL, but in any case, it does not make sense to run the server on a filesystem that does not support xattrs (see latest comments in upstream issue). What I will try to do is to check before the test runs if the filesystem supports xattrs and skip the checksum tests if it does not. In any case, there is no bug to be fixed, so I'm just closing this.
(In reply to Guilherme Amadio from comment #27) > In any case, there is no bug to be fixed, so I'm just closing this. IMO it is a bug if a test suite fails whilst it is known that a particular test should not be run. This renders the idea of running the ebuild phase "test" and trust that output being useless.
As I said, I will make the test not run if tmpfs is detected. I will just fix this upstream and make another release next week. Sorry if that was not clear.
FYI, this has been addressed upstream with the commit below: https://github.com/xrootd/xrootd/commit/8eff8a4ccd2bfc5abb6a9fd215d43235eecd95f1 Cheers,