Summary: | sys-kernel/installkernel: improve external kernel module rebuilding | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Vladimir Varlamov <bes.internal> |
Component: | Current packages | Assignee: | Distribution Kernel Project <dist-kernel> |
Status: | CONFIRMED --- | ||
Severity: | normal | CC: | Adrian.Bassett, alexander, andrewammerlaan, anton.gubarkov, bes.internal, darkdefende, gentoo, ionen, kernel, marek.bartosiewicz, n-roeser, prometheanfire, root |
Priority: | Normal | Keywords: | PullRequest |
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
See Also: |
https://bugs.gentoo.org/show_bug.cgi?id=923179 https://github.com/gentoo/gentoo/pull/35066 https://bugs.gentoo.org/show_bug.cgi?id=922225 https://bugs.gentoo.org/show_bug.cgi?id=928271 https://github.com/gentoo/gentoo/pull/36597 |
||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: | emerge --info |
(In reply to Vladimir Varlamov from comment #0)
> Emerge does not complete after running the internal emerge script
> /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install.

Using installkernel-18 on the same kernel version (gentoo-kernel-bin-6.7.2-r1) on two separate systems, I have today seen one fail in the way described, whilst the other completes without problem.

1/ The failing system is OpenRC-based ~amd64 and installkernel is installed as follows:

# emerge -1pv --nodeps installkernel
These are the packages that would be merged, in order:
[ebuild R ] sys-kernel/installkernel-18::gentoo USE="dracut grub module-rebuild -systemd -uki -ukify" 0 KiB

Interestingly, this system actually doesn't have any external modules that need re-building... (The kernel install can in fact be manually completed via a 'make install' from within /usr/src/linux.)

2/ The other system is systemd-based ~amd64 and installkernel is there installed as follows:

# emerge -1pv --nodeps installkernel
These are the packages that would be merged, in order:
[ebuild R ] sys-kernel/installkernel-18::gentoo USE="dracut grub module-rebuild systemd -uki -ukify" 0 KiB

i.e. the systemd USE flag is active. This system does have external modules that need re-building and this completes without problem. There are various 'stty: 'standard input': Inappropriate ioctl for device' lines in the log file but no infinite waits.

(In reply to Adrian Bassett from comment #1)
> Interestingly, this system actually doesn't have any external modules that
> need re-building...

But this doesn't appear to be the root cause: I installed a package with an external module and tried re-emerging the kernel again, but without success.

Could I get an emerge --info of both machines? It's slightly confusing that it waits for the pkg lockfile when we explicitly disable binpkgs for rebuilding the kernel modules. You can try adding "-distlocks" to the list of disabled FEATURES in the plugin, but this shouldn't be necessary. Maybe something in one of your setups is enabling binpkgs for this emerge again.

> 2/ The other system is systemd-based ~amd64 and installkernel is there installed as follows:
> i.e. the systemd USE flag is active.

Can you reproduce the issue on your systemd machine if you use installkernel without the systemd flag? A slightly different version of the plugin is used with and without the systemd flag; perhaps the problem lies there.

I also confirm this issue. OpenRC, sys-kernel/gentoo-kernel-bin, nvidia-drivers in modules. For now I have removed the executable flag from /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install. After installing I run emerge @module-rebuild and dracut -f. No other method or change of USE flags works for me.

[ebuild R ] sys-kernel/installkernel-18::gentoo USE="dracut grub module-rebuild -systemd -uki -ukify" 0 KiB
[ebuild R ] sys-kernel/gentoo-kernel-bin-6.6.13:6.6.13::gentoo USE="initramfs (-generic-uki) -modules-compress -test" 0 KiB

(In reply to Anton Kropachev from comment #5)
> I also confirm this issue.

emerge --info please

Created attachment 883330
emerge --info
Does temporarily removing FEATURES="buildpkg-live" from your make.conf resolve your issue?

(In reply to Andrew Ammerlaan from comment #8)
> Does temporarily removing FEATURES="buildpkg-live" from your make.conf
> resolve your issue?

No. Still hangs on
* waiting for lock on /var/db/.pkg.portage_lockfile ...

The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=75455f3df748cb46221dcc8beadc241b4534e0fe

commit 75455f3df748cb46221dcc8beadc241b4534e0fe
Author: Andrew Ammerlaan <andrewammerlaan@gentoo.org>
AuthorDate: 2024-01-27 21:07:23 +0000
Commit: Andrew Ammerlaan <andrewammerlaan@gentoo.org>
CommitDate: 2024-01-27 21:07:53 +0000

    sys-kernel/installkernel: drop USE=module-rebuild

    Closes: https://bugs.gentoo.org/923025
    Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>

 .../{installkernel-18.ebuild => installkernel-18-r1.ebuild} | 4 +---
 sys-kernel/installkernel/metadata.xml | 1 -
 2 files changed, 1 insertion(+), 4 deletions(-)

After discussion on IRC we have decided to drop this feature. Calling emerge in emerge is a bit messy. So I'm going back to the drawing board, because I still feel the 'rebuilding external modules when rebuilding the kernel' situation could be better than what we currently have with USE=dist-kernel.

Actually, let's keep this open to keep track of improving external module building. As I see it, USE=dist-kernel has 3 problems:
- Modules are rebuilt after initramfs generation (and the current solution for zfs requires doing things twice, which is suboptimal)
- When a rebuild is triggered, external kernel modules are built against the eselected kernel version (i.e. /usr/src/linux), but there is no guarantee this actually matches the slot version of virtual/dist-kernel
- It only works for upgrades; there are no rebuilds for downgrades.

I just converted my machines to use module-rebuild so I'm a bit sad to see it go. I also had the lock issue, but it can be worked around by using "FEATURES=parallel-install".

Of course I understand that the solution was quite hacky (because it was running two emerge instances at the same time), but at least it was working for me. Having this run before the kernel had completed the ebuild "merge" step was useful to me, as it prevented machines from updating the kernel and the boot entry without having the modules rebuilt. So it acted as a sort of fail-safe: if an update errored out or was canceled, one could assume that rebooting would not run the risk of booting into a half-finished kernel upgrade.

What I did in my scripts to ensure that I didn't run into the lock issue by mistake again was to query `emerge --info` to see if parallel-install was enabled, and die with an error message if this wasn't the case:

if ! emerge --info | grep ^FEATURES= | grep parallel-install; then
    die "You need to have FEATURES=parallel-install enabled. Otherwise this post install script will deadlock."
fi

Perhaps something similar could be added here to bring this back?

It's very much UB for the PM to call the PM, unfortunately.

(In reply to Sebastian Parborg from comment #13)
> Of course I understand that the solution was quite hacky (because it was
> running two emerge instances at the same time), but at least it was working
> for me.

For what it's worth, the plugin script is still in the repository[1], so if you really really want to use it you can still download it manually (just make sure to get the correct version depending on USE=+/-systemd). I have also added the FEATURES=parallel-install fix, thanks for that, so hopefully it will work properly now.

[1] https://github.com/projg2/installkernel-gentoo/tree/master/hooks

That all being said, as sam mentioned, emerging in an emerge is uncharted territory, and for this reason it was decided to not install this plugin script in the ebuild.
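The FEATURES check quoted in comment #13 could also be written as a small helper that matches the feature name exactly instead of as a substring. This is only a minimal sketch: `has_portage_feature` is a hypothetical name (not part of installkernel or Portage), and it assumes the text it reads contains a `FEATURES="..."` line as printed by `emerge --info`.

```shell
#!/bin/sh
# Hypothetical helper, not part of installkernel or Portage.
# Reads text on stdin (e.g. the output of `emerge --info`) and returns 0
# if the FEATURES="..." line contains the given feature as a whole word.
has_portage_feature() {
    feature=$1
    # Keep only the value of the FEATURES line, put one feature per line,
    # then look for an exact, fixed-string, whole-line match.
    sed -n 's/^FEATURES="\(.*\)"$/\1/p' | tr ' ' '\n' | grep -qxF -- "$feature"
}

# Intended use inside a hook script (not executed here):
#   if ! emerge --info | has_portage_feature parallel-install; then
#       echo "FEATURES=parallel-install must be enabled; otherwise this hook may deadlock." >&2
#       exit 1
#   fi
```

Unlike the original one-liner, this prints nothing on success and does not treat a longer feature name that merely contains the string as a match.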
(In reply to Andrew Ammerlaan from comment #16)
> For what it's worth, the plugin script is still in the repository[1] so if
> you really really want to use it you can still download it manually (just
> make sure to get the correct version depending on USE=+/-systemd). I have
> also added the FEATURES=parallel-install fix, thanks for that, so hopefully
> it will work properly now.

Ah, I should clarify: adding it to the scripts provided by installkernel will not work. "parallel-install" has to be enabled on the topmost emerge for this to work. I tried adding "parallel-install" in the 30-emerge-kernel-module-rebuild.install scripts myself, but that will still deadlock if you only do it there.

That is why I did the "if die" check, as you have to inform the user to enable it when emerging the kernel.

> [1] https://github.com/projg2/installkernel-gentoo/tree/master/hooks
>
> That all being said, as sam mentioned, emerging in an emerge is uncharted
> territory and for this reason it was decided to not install this plugin
> script in the ebuild.

I don't disagree at all! It was a hacky solution that required the end user to abuse some portage features, no question about that. Regardless, it was really useful for me, so I hope that we can come up with a better solution eventually.

Perhaps one could postpone updating the boot entries until after the last module package has successfully been emerged?

(In reply to Sebastian Parborg from comment #17)
> (In reply to Andrew Ammerlaan from comment #16)
> > For what it's worth, the plugin script is still in the repository[1] so if
> > you really really want to use it you can still download it manually (just
> > make sure to get the correct version depending on USE=+/-systemd). I have
> > also added the FEATURES=parallel-install fix, thanks for that, so hopefully
> > it will work properly now.
> >
>
> Ah, I should clarify:
> Adding it to the scripts provided by installkernel will not work.
> "parallel-install" has to be enabled on the topmost emerge for this to work.
> I tried myself to add "parallel-install" in the
> 30-emerge-kernel-module-rebuild.install scripts, but that will still
> deadlock if you only do it there.
>
> That is why I did the "if die" check as you have to inform the user to
> enable it when emerging the kernel.

Right, that makes sense. At least I now understand why it worked for me, while others ran into this locking issue.

> Perhaps one could postpone the updating the boot entries until after the
> last module package has successfully been emerged?

I had been thinking of some sort of virtual/dist-kernel-install which would do basically what kernel-install.eclass does, but is emerged after all the external modules are compiled. However, the problems are:
- how to ensure that it really is emerged last (without something really ugly like optionally depending on every single possible external kernel module and controlling this with a lot of USE flags)
- this works only for dist-kernels; not a huge issue, since users who know how to manually configure and compile the kernel can just insert the emerge @module-rebuild between the make modules_install and make install steps.

(In reply to Andrew Ammerlaan from comment #18)
> - this works only for dist-kernels; not a huge issue, since users who know
> how to manually configure and compile the kernel can just insert the emerge
> @module-rebuild between the make modules_install and make install steps.

Actually I'm using this with gentoo-sources, as I use some customized kernel configs for my installs. By using the post_pkg_postinst function to configure, compile and install the new kernels I could automate nearly all updates, which was nice. However, I couldn't figure out how to get the module rebuilds to work reliably.

At first I just wrote a file to /tmp/ that I checked at the end of my portage update script to see if I should run emerge @module-rebuild. However, this of course had the issue of incomplete kernel upgrades when something errored out or I manually canceled the update. For me, calling "emerge @module-rebuild" in post_pkg_postinst would just deadlock even with parallel-install. So I'm really happy you helped me to get it working!

> I had been thinking of some sort of virtual/dist-kernel-install which would do
> basically what kernel-install.eclass does, but is emerged after all the
> external modules are compiled.

Perhaps doing something similar to "emerge @preserved-rebuild" would work? I.e. when installing a kernel, we would not reinstall the modules or update the boot entries at the end of the ebuild. Instead we would tag all packages that need to have their modules rebuilt, and perhaps add an "update boot entry" dummy package at the end of this emerge queue. So when users update their kernel, they would get a message that they need to run an additional emerge command.

Of course this is not as convenient as just doing "emerge <kernel>" and having everything happen automagically in one command. But I think it might be good enough?

The difference between this new @ command and @module-rebuild is that this would only populate the package list after a kernel update, whereas @module-rebuild rebuilds all module packages regardless of any kernel updates. So if I call this when no new kernel has been installed, it would neither emerge any packages nor update the boot entries.

The discussion has changed a little, but as the person who created the bug, I dare to report my case.
Now I have added "parallel-install" to FEATURES.
The post-install script now runs, but fails with:
```
--- cfgpro dir /lib/modules/5.10.209-gentoo-dist/kernel/arch/x86
--- cfgpro dir /lib/modules/5.10.209-gentoo-dist/kernel/arch
--- cfgpro dir /lib/modules/5.10.209-gentoo-dist/kernel
--- cfgpro sym /lib/modules/5.10.209-gentoo-dist/build
--- cfgpro dir /lib/modules/5.10.209-gentoo-dist
--- replaced dir /lib/modules
--- replaced dir /lib
* Removing initramfs ... [ ok ]
>>> Original instance of package unmerged safely.
* Updating /usr/src/linux symlink ... [ ok ]
* Assuming you do not have a separate /boot partition.
* Installing the kernel via installkernel ...
run-parts: executing /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install 5.10.209-gentoo-dist /usr/src/linux-5.10.209-gentoo-dist/arch/x86/boot/bzImage
* Using kernel sources directory: /lib/modules/5.10.209-gentoo-dist/build
Calculating dependencies... done!
Dependency resolution took 5.77 s (backtrack: 0/20).
emerge: there are no ebuilds to satisfy "sys-kernel/gentoo-kernel-bin:5.10.203".
(dependency required by "@module-rebuild" [argument])
stty: 'standard input': Inappropriate ioctl for device
*
run-parts: /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install exited with return code 1 [ !! ]
* Installing the kernel failed
```
# eix sys-kernel/gentoo-kernel-bin
[U] sys-kernel/gentoo-kernel-bin
Available versions:
(5.10.208) 5.10.208^tu
(5.10.209) (~)5.10.209^tu
(5.15.147) [m]5.15.147^tu
(5.15.148) [m](~)5.15.148^tu
(6.1.74) [m]6.1.74^tu
(6.1.75) [m](~)6.1.75^tu
(6.6.13) [m]6.6.13^tu
(6.6.14) [m](~)6.6.14^tu
(6.7.1) [m](~)6.7.1^tu
(6.7.2) [m](~)6.7.2^tu
{generic-uki +initramfs modules-compress test}
Installed versions:
5.10.203(5.10.203)^t(13:04:16 28/12/23)(initramfs -test) 5.10.205(5.10.205)^t(16:17:12 06/01/24)(initramfs -test) 5.10.208(5.10.208)^t(18:28:03 23/01/24)(initramfs -test) 5.10.209(5.10.209)^t(19:28:02 29/01/24)(initramfs -test)
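For manually built kernels, the ordering suggested earlier in this thread (rebuilding external modules after the in-tree modules are installed but before the kernel itself is installed) can be sketched as a small script. This is an illustration under stated assumptions, not a tested workflow: `run`, `update_kernel`, `DRY_RUN` and `KERNEL_DIR` are hypothetical names, and by default the function only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical sketch; DRY_RUN and KERNEL_DIR are illustrative variables,
# not settings understood by installkernel or Portage.
DRY_RUN=${DRY_RUN:-1}                     # 1 = print commands only (default)
KERNEL_DIR=${KERNEL_DIR:-/usr/src/linux}  # kernel source tree to build

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Each step runs only if the previous one succeeded, so a failed module
# rebuild stops the sequence before `make install` is reached.
update_kernel() {
    run make -C "$KERNEL_DIR" &&
    run make -C "$KERNEL_DIR" modules_install &&
    run emerge --oneshot @module-rebuild &&
    run make -C "$KERNEL_DIR" install
}

# Example (prints the four commands without executing them):
#   DRY_RUN=1 update_kernel
```

Because the steps are chained with `&&`, the final `make install` step (which is where the kernel image is handed to installkernel) is skipped if the external module rebuild fails.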
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=b1f74da11016a3c872f250983cd5f9d06f181708

commit b1f74da11016a3c872f250983cd5f9d06f181708
Author: Andrew Ammerlaan <andrewammerlaan@gentoo.org>
AuthorDate: 2024-01-29 09:58:11 +0000
Commit: Andrew Ammerlaan <andrewammerlaan@gentoo.org>
CommitDate: 2024-01-30 11:08:45 +0000

    linux-mod-r1.eclass: warn if KV does not match virtual/dist-kernel

    We have no mechanism to ensure that we build the kernel modules for the same kernel version as the version we will record in the virtual/dist-kernel subslot dependency.

    This does not fix this problem, but it does add a warning to ensure users are aware that, for example, built binpkgs are going to have wrong dependency metadata.

    Bug: https://bugs.gentoo.org/923025
    Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>
    Closes: https://github.com/gentoo/gentoo/pull/35066
    Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>

 eclass/linux-mod-r1.eclass | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

May I suggest that USE=module-rebuild should conflict with dist kernels only? The source-based kernels are installed without any building, so there is no deadlock in this case: emerge @module-rebuild is called during the kernel's make install (i.e. there are no two parallel emerges).

The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=f439b4ec05b1982f06f67fbf39a46ae0db187a76

commit f439b4ec05b1982f06f67fbf39a46ae0db187a76
Author: Andrew Ammerlaan <andrewammerlaan@gentoo.org>
AuthorDate: 2024-05-08 06:02:53 +0000
Commit: Andrew Ammerlaan <andrewammerlaan@gentoo.org>
CommitDate: 2024-05-17 12:06:42 +0000

    linux-mod-r1.eclass: add USE=initramfs

    Adds a new variable that adds the "initramfs" flag when set. This new flag controls whether or not the modules that were built should be included in the initramfs. If the modules should be included, then we also rebuild the initramfs/uki in post_install using installkernel.

    Bug: https://bugs.gentoo.org/923025
    Bug: https://bugs.gentoo.org/928271
    Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>

 eclass/linux-mod-r1.eclass | 54 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)
Emerge does not complete after running the internal emerge script /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install. The Wiki mentions this USE flag combination, but only as an "info" note about the double emerge.

Reproducible: Always

Steps to Reproduce:
1. Add USE "dist-kernel module-rebuild" to /etc/portage/make.conf
2. Re-emerge sys-kernel/gentoo-kernel-bin

Actual Results:
# emerge sys-kernel/gentoo-kernel-bin
[...]
>>> /lib/modules/5.10.209-gentoo-dist/modules.alias
>>> /lib/modules/5.10.209-gentoo-dist/modules.order
>>> /lib/modules/5.10.209-gentoo-dist/modules.builtin.modinfo
>>> /lib/modules/5.10.209-gentoo-dist/modules.builtin
* Updating /usr/src/linux symlink ... [ ok ]
* Assuming you do not have a separate /boot partition.
* Installing the kernel via installkernel ...
run-parts: executing /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install 5.10.209-gentoo-dist /usr/src/linux-5.10.209-gentoo-dist/arch/x86/boot/bzImage
* Using kernel sources directory: /lib/modules/5.10.209-gentoo-dist/build
stty: 'standard input': Inappropriate ioctl for device
* waiting for lock on /var/db/.pkg.portage_lockfile ...