Bug 703278 - sys-apps/portage pid-sandbox causes qemu-user failure launch process in PID namespace (qemu_thread_create: Invalid argument)
Status: CONFIRMED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core
Hardware: All Linux
Importance: Normal major with 1 vote
Assignee: Portage team
Reported: 2019-12-18 10:46 UTC by anonymous
Modified: 2023-09-15 18:51 UTC
CC List: 10 users

Description anonymous 2019-12-18 10:46:38 UTC
If pid-sandbox is enabled in FEATURES, emerge fails in a qemu-aarch64 chroot.

The pid-sandbox feature was turned on by default in commit 55a9d4ccc5ac90b454638f9205f8a5d20ca8b47a.

Reproducible: Always

Steps to Reproduce:
1. Try to build any package with emerge in a qemu-aarch64 chroot.
Actual Results:  
qemu: qemu_thread_create: Invalid argument
...
...
>>> Failed to emerge xxxx


https://bugs.launchpad.net/qemu/+bug/1829459
https://forums.gentoo.org/viewtopic-t-1092058.html
Comment 1 Mike Gilbert gentoo-dev 2019-12-18 16:49:57 UTC
I would guess qemu-user does not support PID namespaces, so there is nothing we can do about it in Portage.
Comment 2 anonymous 2021-10-22 13:53:43 UTC
I stopped relying on cross compilation except for bootstrapping the Linux kernel for an ARM single-board computer. I got tired of the complexity of cross compilation.

Simplicity is the highest form of sophistication. I just compile locally.

Let's say it's obsolete.
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2021-10-22 15:16:14 UTC
The issue does still exist though.
Comment 4 onkobu 2022-03-07 19:45:05 UTC
(In reply to crocket from comment #2)
> I stopped relying on cross compilation except for bootstrapping linux kernel
> for an ARM single board computer. I got tired of the complexity of cross
> compilation.
So much for your personal opinion. I'll add that, from my point of view, there isn't that much complexity: only a crossdev environment per target system, plus qemu or distcc.

> 
> Simplicity is the highest form of sophistication. I just compile locally.
What does »locally« mean? I assume a single-core ARM single-board computer like a Raspberry Pi Zero literally takes weeks to compile WebKit, Qt, or the clang/LLVM/Rust stack.

> 
> Let's say it's obsolete.
Fully disagree. Going from your personal opinion to a general decision needs arguments, and none were given. By the same logic it would be just as valid for me to suggest switching to Buildroot.

Regarding the error message: it also occurs on 32-bit ARMv7.
Comment 5 onkobu 2022-03-07 19:50:36 UTC
No progress upstream: https://gitlab.com/qemu-project/qemu/-/issues/172
Comment 6 anonymous 2022-03-08 02:46:38 UTC
If you want to tackle this personally, feel free to do so.
Personally, I just decided to wait until I can buy an ARM computer with powerful CPUs and at least 16GB RAM. Basically, something like an ARM desktop computer.
Comment 7 Alec Warner (RETIRED) archtester gentoo-dev Security 2022-03-08 02:52:14 UTC
Isn't the workaround simply to turn the PID sandbox off? Is that not working?
Comment 8 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-03-08 16:01:23 UTC
(In reply to Alec Warner from comment #7)
> Isn't the workaround simply to turn the PID sandbox off? Is that not working?

Turning it off in such chroots works fine here and always has.
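
The workaround discussed in the two comments above is usually applied by adding `FEATURES="-pid-sandbox"` to `/etc/portage/make.conf` inside the affected chroot. For scripted builds the same setting can be passed through the environment instead. A minimal sketch, assuming a wrapper script drives emerge; the helper name `env_without_pid_sandbox` is hypothetical and not part of Portage:

```python
import os


def env_without_pid_sandbox(base_env=None):
    """Return a copy of the environment with pid-sandbox disabled in FEATURES."""
    env = dict(os.environ if base_env is None else base_env)
    # Drop any existing pid-sandbox toggle, then append the disabling form.
    features = [
        f for f in env.get("FEATURES", "").split()
        if f not in ("pid-sandbox", "-pid-sandbox")
    ]
    features.append("-pid-sandbox")
    env["FEATURES"] = " ".join(features)
    return env


if __name__ == "__main__":
    # Print the FEATURES value a wrapper would hand to emerge through the
    # env= argument of subprocess.run.
    print(env_without_pid_sandbox()["FEATURES"])
```

Other FEATURES entries already present in the environment are preserved; only the pid-sandbox toggle is rewritten.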
Comment 9 Mike Gilbert gentoo-dev 2022-03-08 18:31:03 UTC
I think we can work around this in Portage, so assigning it back to us.

The failure seems to be triggered when a process running under qemu-user calls unshare(CLONE_NEWPID), followed by execve. This causes qemu-user to spawn a new thread, and the thread creation fails with EINVAL.

I think we can avoid that by not calling execve in the same process as unshare. We currently do that here:

https://gitweb.gentoo.org/proj/portage.git/tree/lib/portage/process.py?h=portage-3.0.30#n782

We could probably import the relevant code from pid-ns-init and execute it directly.
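
The idea in this comment, keeping unshare and execve in different processes, can be sketched as follows. This is an illustration only and not Portage's code: `run_in_new_pid_ns`, `UNSHARE_FAILED`, and the exit-code convention are hypothetical names, unshare is reached through ctypes, Python 3.9 or later is assumed for `os.waitstatus_to_exitcode`, and the sketch has not been tested under qemu-user.

```python
import ctypes
import os
import sys

CLONE_NEWPID = 0x20000000  # flag value from <linux/sched.h>
UNSHARE_FAILED = 125  # hypothetical exit code: namespace could not be created


def _unshare_pid_ns():
    """Call unshare(CLONE_NEWPID) through libc; return True on success."""
    try:
        libc = ctypes.CDLL(None, use_errno=True)
        return libc.unshare(CLONE_NEWPID) == 0
    except (OSError, AttributeError):
        return False  # libc or its unshare symbol is unavailable


def _fork_exec_wait(argv):
    """Fork, exec argv in the child, and return the child's exit code."""
    child = os.fork()
    if child == 0:
        # First child after unshare: PID 1 of the new namespace. It did not
        # call unshare itself, so this is where execve happens.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if exec failed
    _, status = os.waitpid(child, 0)
    return os.waitstatus_to_exitcode(status) & 0xFF


def run_in_new_pid_ns(argv):
    """Run argv in a new PID namespace; return its exit code or UNSHARE_FAILED."""
    helper = os.fork()
    if helper == 0:
        # Helper: the only process that calls unshare. It never execs; it
        # only forks, waits, and reports the result.
        code = UNSHARE_FAILED
        try:
            if _unshare_pid_ns():
                code = _fork_exec_wait(argv)
        finally:
            os._exit(code)
    _, status = os.waitpid(helper, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    print(run_in_new_pid_ns([sys.executable, "-c", "pass"]))
```

Creating a PID namespace normally requires root (CAP_SYS_ADMIN), so an unprivileged run returns `UNSHARE_FAILED`. The unshare call is made in a forked helper so that the calling process keeps its own namespace settings. A real implementation would also need the signal forwarding and child reaping that pid-ns-init performs.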
Comment 10 onkobu 2022-03-10 21:26:32 UTC
Sounds like a plan. I could test a patch or edit process.py directly…even without any Python knowledge. (I had to look up execve, which replaces the current process with the one derived from its arguments. Now it makes sense to me too. And I am full of respect for the tamers of this strange abstraction.)

Am I right that pid-ns-init, as invoked at line 782 (where the call is prepared), is used with len(argv) == 2 and thus skips the large else block? Is it also right that pid-ns-init isn't called from anywhere except the two occurrences in process.py? Would that justify some refactoring that leaves only the else branch in pid-ns-init active? And is the conditional signal forwarding with setsid unnecessary when len(argv) is not 2?
Comment 11 anonymous 2022-07-28 00:40:09 UTC
https://gitlab.com/qemu-project/qemu/-/issues/172 should fix this.
Comment 12 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-07-28 00:46:15 UTC
(In reply to crocket from comment #11)
> https://gitlab.com/qemu-project/qemu/-/issues/172 should fix this.

Upstream clearly think it's for us to do and the upstream bug isn't fixed. If you don't want emails about this bug anymore, tick the box on the bug.

(In reply to onkobu from comment #10)
> Sounds like a plan. I could test a patch or edit process.py directly…even
> without any Python knowledge. (Had to look up execve which replaces the
> current process with the process derived from its arguments. Now it makes
> sense also to me. And I am full of respect for the tamers of this strange
> abstraction.)
> 

That would be really helpful, thank you.

> Am I right that pid-ns-init at 782 (where the call is prepared) is used with
> len(argv) == 2 thus skips the large else block? Is it also right that
> pid-ns-init isn't called from anywhere else except the two occurrences in
> process.py? Does it justify some refactoring then leaving only the else in
> pid-ns-init active? Also the conditional signal forwarding with setsid is
> unnecessary when not == 2?


1. I see it invoked with one argument and then further on:

root     1658111  0.0  0.0  17824 10856 pts/15   SN+  01:45   0:00  |           \_ /usr/bin/python3.11 /var/tmp/portage/._portage_reinstall_.5kwgew0j/bin/pid-ns-init 1658115
root     1658115  0.0  0.0  17828 11372 pts/18   SNs+ 01:45   0:00  |           |   \_ /usr/bin/python3.11 /var/tmp/portage/._portage_reinstall_.5kwgew0j/bin/pid-ns-init     0,1,2 /bin/bash [sys-apps/portage-9999] bash -c /var/tmp/portage/._portage_reinstall_.5kwgew0j/bin/ebuild.sh pretend

2. Yes.

3. I'd need to look into why it got added in the first place.