Bug 911977 - app-containers/lxc-5.0.2[io-uring]: terminal sessions stall with sys-libs/liburing
Status: UNCONFIRMED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: AMD64 Linux
Importance: Normal critical
Assignee: Joonas Niilola
Reported: 2023-08-09 17:30 UTC by zen
Modified: 2023-08-22 06:21 UTC


Attachments
Emerge --info, and gdb data, in order (data.txt, 40.58 KB, text/plain)
2023-08-12 14:03 UTC, zen

Description zen 2023-08-09 17:30:56 UTC
With the io-uring USE flag enabled, an LXC container whose init is set to something like bash will stall. strace shows:
read(5, 0x7ffd4fc96a20, 1024)           = ? ERESTARTSYS (To be restarted if SA_RESTART is set)

Reproducible: Always

Steps to Reproduce:
1. Install lxc with io-uring enabled
2. Import the stage3 tarball as described at https://wiki.gentoo.org/wiki/LXC#Local_template:
  touch config 
  tar -cJf metadata.tar.xz config
  lxc-create -t local -n gentoo-guest -- --fstree stage3.tar.xz --metadata metadata.tar.xz
3. Set 'lxc.init.cmd = /bin/bash' in the container config (/var/lib/lxc/gentoo-guest/config)
4. Run "lxc-start gentoo-guest -F"
5. Run a few commands until it stalls
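
Step 3 can also be scripted. The following is a minimal sketch in POSIX sh, not part of the original report. `set_init_cmd` is a hypothetical helper name (it is not an LXC tool), and the sketch assumes the config file already exists and is writable.

```shell
# Hypothetical helper: set lxc.init.cmd in an existing container config file.
# Usage: set_init_cmd CONFIG_FILE INIT_COMMAND
set_init_cmd() {
    config_file=$1
    init_cmd=$2

    # Refuse to run against a missing config file.
    [ -f "$config_file" ] || return 1

    tmp=$(mktemp) || return 1

    # Copy every line except an existing lxc.init.cmd setting.
    # grep exits non-zero when it prints nothing, which is fine here.
    grep -v '^[[:space:]]*lxc\.init\.cmd[[:space:]]*=' "$config_file" > "$tmp" || true

    # Append the requested init command.
    printf 'lxc.init.cmd = %s\n' "$init_cmd" >> "$tmp"

    # Write back in place so the file keeps its owner and mode.
    cat "$tmp" > "$config_file"
    rm -f "$tmp"
}
```

With the path from step 3, the call would be `set_init_cmd /var/lib/lxc/gentoo-guest/config /bin/bash`.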
Actual Results:
Bash eventually stalls; strace shows:
read(5, 0x7ffd4fc96a20, 1024)           = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-08-09 17:32:17 UTC
Please give a gdb backtrace when stuck (gdb -p PID, then ^C, then bt).

Also, do older versions work, and emerge --info please?
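
For reference, the backtrace can also be captured non-interactively, which produces a file that is easy to attach. This is a sketch only: `capture_bt` and the default output file name are hypothetical, and it assumes gdb is installed and that you have permission to attach to the process. It uses gdb's `-p`, `-batch`, and `-ex` options.

```shell
# Hypothetical helper: dump a backtrace of all threads of a running process.
# Usage: capture_bt PID [OUTFILE]
capture_bt() {
    pid=$1
    out=${2:-backtrace.txt}

    # Accept only a non-empty, purely numeric PID.
    case $pid in
        ''|*[!0-9]*)
            echo "usage: capture_bt PID [OUTFILE]" >&2
            return 2
            ;;
    esac

    # Attach, print the backtrace of every thread, then detach and exit.
    gdb -p "$pid" -batch -ex 'thread apply all bt' > "$out" 2>&1
}
```

Run it against the PID of the stalled process while the stall is happening, for example `capture_bt 1234 lxc-bt.txt`, where 1234 is a placeholder.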
Comment 2 zen 2023-08-11 13:03:55 UTC
I just tested on my desktop and it happens there too. System info:
https://bpa.st/WWYQ

Original system info:
https://bpa.st/7A3A

I rebuilt with debug syms:
https://bpa.st/KDDA

This was reproduced on my desktop. I don't think it matters much, but my desktop is using the dist kernel, unlike the original system.
Comment 3 zen 2023-08-11 13:56:59 UTC
(In reply to zen from comment #2)
> I just tested on my desktop and it happens on there too, System info:
> https://bpa.st/WWYQ
> 
> Original system info:
> https://bpa.st/7A3A
> 
> I rebuilt with debug syms:
> https://bpa.st/KDDA
> 
> This was reproduced on my desktop, I don't think it matters much, but my
> desktop is using the dist kernel, unlike the original system

Here is another backtrace. The first one stalled shortly after I attached, but resumed after I detached GDB. I've found that opening nano reliably breaks it, so I opened that, then attached and ran bt.

https://bpa.st/BT4A
Comment 4 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2023-08-12 04:52:32 UTC
Please always attach data, no pastebins, as they expire.
Comment 5 zen 2023-08-12 14:03:37 UTC
Created attachment 867719
Emerge --info, and gdb data, in order
Comment 6 Joonas Niilola gentoo-dev 2023-08-13 07:35:38 UTC
*sigh* not io-uring again. Would you happen to have any older kernels lying around to test with?
Comment 7 zen 2023-08-13 14:04:24 UTC
(In reply to Joonas Niilola from comment #6)
> *sigh* not io-uring again. Would you happen to have any older kernels lying
> around to test with?

I do not. Is there a specific version you have in mind? It takes around 10 minutes per reboot on that first system, and I use it as a router, so I would prefer not to be offline most of the time. The second system is my desktop, and audio doesn't work below 6.3 (onboard audio).
Comment 8 Joonas Niilola gentoo-dev 2023-08-22 06:20:33 UTC
So upstream did have some issues with io-uring; apparently they're supposed to be fixed in 6.4.8, which you're running. Did the issues start happening after an update to 6.4.8?

I can't really say what kernel version to test; apparently the issue is present in 6.1-LTS too, and fixed in 6.1.45. So if you _can_, testing e.g. 6.1.41 and 6.1.45 would be valuable. But I guess if you can connect the issues to starting after 6.4, or 6.4.8, that's a hint too.
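
To check where a running 6.1-series kernel sits relative to the 6.1.45 release named above, a comparison along these lines may help. It is a sketch, not a definitive test: `version_ge` is a hypothetical helper, it relies on `sort -V` (available in GNU coreutils), and the only threshold it knows is the 6.1.45 figure from this comment, which is itself reported second-hand.

```shell
# Hypothetical helper: succeed when version $1 is >= version $2.
version_ge() {
    # sort -V orders version strings; if the smaller one is $2, then $1 >= $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip any local suffix such as "-gentoo-dist" from the running release.
running=$(uname -r)
running=${running%%-*}

# Only the 6.1 series has an explicit "fixed in" release in this report.
case $running in
    6.1.*) fixed=6.1.45 ;;
    *)     fixed= ;;
esac

if [ -n "$fixed" ]; then
    if version_ge "$running" "$fixed"; then
        echo "Kernel $running is at or after $fixed"
    else
        echo "Kernel $running is older than $fixed"
    fi
else
    echo "No threshold known for kernel $running"
fi
```

Testing on both 6.1.41 and 6.1.45, as suggested, should give one kernel on each side of the threshold.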