Bug 923025

Summary: sys-kernel/installkernel: improve external kernel module rebuilding
Product: Gentoo Linux Reporter: Vladimir Varlamov <bes.internal>
Component: Current packages Assignee: Distribution Kernel Project <dist-kernel>
Status: CONFIRMED ---    
Severity: normal CC: Adrian.Bassett, alexander, andrewammerlaan, anton.gubarkov, bes.internal, darkdefende, gentoo, ionen, kernel, marek.bartosiewicz, n-roeser, prometheanfire, root
Priority: Normal Keywords: PullRequest
Version: unspecified   
Hardware: All   
OS: Linux   
See Also: https://bugs.gentoo.org/show_bug.cgi?id=923179
https://github.com/gentoo/gentoo/pull/35066
https://bugs.gentoo.org/show_bug.cgi?id=922225
https://bugs.gentoo.org/show_bug.cgi?id=928271
https://github.com/gentoo/gentoo/pull/36597
Whiteboard:
Package list:
Runtime testing required: ---
Attachments: emerge --info

Description Vladimir Varlamov 2024-01-27 13:25:18 UTC
Emerge does not complete after running the internal emerge script /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install.
The Wiki mentions this USE flag combination, but only as an "info" note about the double emerge.

Reproducible: Always

Steps to Reproduce:
1. Add USE="dist-kernel module-rebuild" to /etc/portage/make.conf
2. Re-emerge sys-kernel/gentoo-kernel-bin

Actual Results:  
# emerge sys-kernel/gentoo-kernel-bin
[...]
>>> /lib/modules/5.10.209-gentoo-dist/modules.alias
>>> /lib/modules/5.10.209-gentoo-dist/modules.order
>>> /lib/modules/5.10.209-gentoo-dist/modules.builtin.modinfo
>>> /lib/modules/5.10.209-gentoo-dist/modules.builtin
 * Updating /usr/src/linux symlink ...                                                                                                                                                     [ ok ]
 * Assuming you do not have a separate /boot partition.
 * Installing the kernel via installkernel ...
run-parts: executing /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install 5.10.209-gentoo-dist /usr/src/linux-5.10.209-gentoo-dist/arch/x86/boot/bzImage
 * Using kernel sources directory: /lib/modules/5.10.209-gentoo-dist/build
stty: 'standard input': Inappropriate ioctl for device
 * waiting for lock on /var/db/.pkg.portage_lockfile ...
Comment 1 Adrian Bassett 2024-01-27 17:19:16 UTC
(In reply to Vladimir Varlamov from comment #0)
> Emerge does not complete after running the internal emerge script
> /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install.
Using installkernel-18 with the same kernel version (gentoo-kernel-bin-6.7.2-r1) on two separate systems, I have today seen one fail in the way described, whilst the other completes without problem.

1/ The failing system is OpenRC-based ~amd64 and installkernel is installed as follows:

# emerge -1pv --nodeps installkernel

These are the packages that would be merged, in order:

[ebuild   R    ] sys-kernel/installkernel-18::gentoo  USE="dracut grub module-rebuild -systemd -uki -ukify" 0 KiB

Interestingly, this system actually doesn't have any external modules that need re-building...

(The kernel install can in fact be manually completed via a 'make install' from within /usr/src/linux.)

2/ The other system is systemd-based ~amd64 and installkernel is there installed as follows:

# emerge -1pv --nodeps installkernel

These are the packages that would be merged, in order:

[ebuild   R    ] sys-kernel/installkernel-18::gentoo  USE="dracut grub module-rebuild systemd -uki -ukify" 0 KiB

i.e. the systemd USE flag is active.

This system does have external modules that need re-building and this completes without problem.

There are various 'stty: 'standard input': Inappropriate ioctl for device' lines in the log file but no infinite waits.
Comment 2 Adrian Bassett 2024-01-27 18:04:16 UTC
(In reply to Adrian Bassett from comment #1)

> Interestingly, this system actually doesn't have any external modules that
> need re-building...
But this doesn't appear to be the root cause: I installed a package with an external module and tried re-emerging the kernel, again without success.
Comment 3 Andrew Nowa Ammerlaan gentoo-dev 2024-01-27 19:10:50 UTC
Could I get an emerge --info of both machines?

It's slightly confusing that it waits for the pkg lockfile when we explicitly disable binpkgs for rebuilding the kernel modules.

You can try adding "-distlocks" to the list of disabled FEATURES in the plugin, but this shouldn't be necessary. Maybe something in one of your setups is enabling binpkgs for this emerge again.
Comment 4 Andrew Nowa Ammerlaan gentoo-dev 2024-01-27 19:48:02 UTC
> 2/ The other system is systemd-based ~amd64 and installkernel is there installed as follows:

> i.e. the systemd USE flag is active.

Can you reproduce the issue on your systemd machine if you use installkernel without the systemd flag?

A slightly different version of the plugin is used depending on whether the systemd flag is set, so perhaps the problem lies there.
Comment 5 Anton Kropachev 2024-01-27 20:21:07 UTC
I also confirm this issue.

OpenRC, sys-kernel/gentoo-kernel-bin, nvidia-drivers in modules.

For now I have removed the executable bit from /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install. After installing the kernel I run emerge @module-rebuild and dracut -f.

No other method works for me, including any change of USE flags.

[ebuild   R    ] sys-kernel/installkernel-18::gentoo  USE="dracut grub module-rebuild -systemd -uki -ukify" 0 KiB
[ebuild   R    ] sys-kernel/gentoo-kernel-bin-6.6.13:6.6.13::gentoo  USE="initramfs (-generic-uki) -modules-compress -test" 0 KiB
Comment 6 Andrew Nowa Ammerlaan gentoo-dev 2024-01-27 20:22:06 UTC
(In reply to Anton Kropachev from comment #5)
> I also confirm this issue.

emerge --info please
Comment 7 Anton Kropachev 2024-01-27 20:24:29 UTC
Created attachment 883330 [details]
emerge --info
Comment 8 Andrew Nowa Ammerlaan gentoo-dev 2024-01-27 20:27:21 UTC
Does temporarily removing FEATURES="buildpkg-live" from your make.conf resolve your issue?
Comment 9 Anton Kropachev 2024-01-27 20:37:45 UTC
(In reply to Andrew Ammerlaan from comment #8)
> Does temporarily removing FEATURES="buildpkg-live" from your make.conf
> resolve your issue?

No. 
Still hangs on  * waiting for lock on /var/db/.pkg.portage_lockfile ...
Comment 10 Larry the Git Cow gentoo-dev 2024-01-27 21:08:27 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=75455f3df748cb46221dcc8beadc241b4534e0fe

commit 75455f3df748cb46221dcc8beadc241b4534e0fe
Author:     Andrew Ammerlaan <andrewammerlaan@gentoo.org>
AuthorDate: 2024-01-27 21:07:23 +0000
Commit:     Andrew Ammerlaan <andrewammerlaan@gentoo.org>
CommitDate: 2024-01-27 21:07:53 +0000

    sys-kernel/installkernel: drop USE=module-rebuild
    
    Closes: https://bugs.gentoo.org/923025
    Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>

 .../{installkernel-18.ebuild => installkernel-18-r1.ebuild}           | 4 +---
 sys-kernel/installkernel/metadata.xml                                 | 1 -
 2 files changed, 1 insertion(+), 4 deletions(-)
Comment 11 Andrew Nowa Ammerlaan gentoo-dev 2024-01-27 21:10:51 UTC
After discussion on IRC we have decided to drop this feature. Calling emerge in emerge is a bit messy.

So I'm going back to the drawing board because I still feel the 'rebuilding external modules when rebuilding the kernel' situation could be better than what we currently have with USE=dist-kernel.
Comment 12 Andrew Nowa Ammerlaan gentoo-dev 2024-01-27 21:15:00 UTC
Actually, let's keep this open to keep track of improving external module building.

As I see it, USE=dist-kernel has 3 problems:
- Modules are rebuilt after initramfs generation (and the current solution for zfs requires doing things twice which is suboptimal)
- When a rebuild is triggered, external kernel modules are built against the eselected kernel version (i.e. /usr/src/linux) but there is no guarantee this actually matches the slot version of virtual/dist-kernel
- It only works for upgrades, there are no rebuilds for downgrades.
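
The second point can be checked by hand. The following is only a minimal sketch, not what any eclass does: it assumes the sources directory is named linux-<kernel release> (as in /usr/src/linux-5.10.209-gentoo-dist from the log above) and that the expected version is the plain upstream version such as 5.10.209. The function names kv_from_srcdir and kv_matches are hypothetical.

```shell
# Print the kernel release encoded in a sources directory name,
# e.g. /usr/src/linux-5.10.209-gentoo-dist -> 5.10.209-gentoo-dist.
# Assumption: the path has no trailing slash and follows linux-<release>.
kv_from_srcdir() {
    name=${1##*/}
    printf '%s\n' "${name#linux-}"
}

# Succeed if the sources directory $1 belongs to version $2.
# The release may carry a suffix such as -gentoo-dist after the version.
kv_matches() {
    kv=$(kv_from_srcdir "$1")
    case $kv in
        "$2"|"$2"-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use on a live system (not run here):
#   kv_matches "$(readlink -f /usr/src/linux)" 5.10.209 ||
#       echo "eselected kernel does not match the expected version" >&2
```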
Comment 13 Sebastian Parborg 2024-01-29 13:22:05 UTC
I just converted my machines to use module-rebuild so I'm a bit sad to see it go.
I also had the lock issue but it can be worked around by using "FEATURES=parallel-install".

Of course I understand that the solution was quite hacky (because it was running two emerge instances at the same time), but at least it was working for me.

Having this run before the kernel had completed the ebuild "merge" step was useful to me, as it prevented machines from ending up with an updated kernel and boot entry but without rebuilt modules.
So it acted as a kind of fail-safe: if an update errored out or was canceled, one could assume that rebooting would not risk booting the system into a half-finished kernel upgrade.
Comment 14 Sebastian Parborg 2024-01-29 13:29:42 UTC
What I did in my scripts to make sure I didn't run into the lock issue by mistake again was to query `emerge --info` to see if parallel-install was enabled, and die with an error message if it wasn't:

if ! emerge --info | grep '^FEATURES=' | grep -q parallel-install; then
      die "You need to have FEATURES=parallel-install enabled. Otherwise this post install script will deadlock."
fi

Perhaps something similar could be added here to bring this back?
Comment 15 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2024-01-29 13:31:22 UTC
It's very much UB for the PM to call the PM, unfortunately.
Comment 16 Andrew Nowa Ammerlaan gentoo-dev 2024-01-29 13:35:58 UTC
(In reply to Sebastian Parborg from comment #13)
> Of course I understand that the solution was quite hacky (because it was
> running two emerge instances at the same time), but at least it was working
> for me.

For what it's worth, the plugin script is still in the repository[1], so if you really want to use it you can still download it manually (just make sure to get the correct version depending on USE=+/-systemd). I have also added the FEATURES=parallel-install fix, thanks for that, so hopefully it will work properly now.

[1] https://github.com/projg2/installkernel-gentoo/tree/master/hooks

That all being said, as sam mentioned, emerging in an emerge is uncharted territory and for this reason it was decided to not install this plugin script in the ebuild.
Comment 17 Sebastian Parborg 2024-01-29 13:49:49 UTC
(In reply to Andrew Ammerlaan from comment #16)
> For what it's worth, the plugin script is still in the repository[1] so if
> you really really want to use it you can still download it manually (just
> make sure to get the correct version depending on USE=+/-systemd). I have
> also added the FEATURES=parallel-install fix, thanks for that, so hopefully
> it will work properly now.
> 

Ah, I should clarify:
Adding it to the scripts provided by installkernel will not work.
"parallel-install" has to be enabled on the top most emerge for this to work.
I tried adding "parallel-install" in the 30-emerge-kernel-module-rebuild.install scripts myself, but that will still deadlock if you only do it there.

That is why I did the "if die" check as you have to inform the user to enable it when emerging the kernel.
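
A variant of that check which matches the feature as a whole word could look like the sketch below. It assumes the FEATURES line in `emerge --info` output has the form FEATURES="a b c"; the function name has_feature is hypothetical. As noted above, the check only helps if it inspects the configuration of the top-most emerge.

```shell
# Succeed if feature $1 is listed in the FEATURES="..." line read from stdin.
# stdin is expected to be text in `emerge --info` format.
has_feature() {
    want=$1
    # Extract the space-separated list between the double quotes.
    features=$(sed -n 's/^FEATURES="\(.*\)"$/\1/p')
    for f in $features; do
        if [ "$f" = "$want" ]; then
            return 0
        fi
    done
    return 1
}

# Possible use on a live system (not run here):
#   emerge --info | has_feature parallel-install ||
#       { echo "Enable FEATURES=parallel-install for the outer emerge" >&2; exit 1; }
```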

> [1] https://github.com/projg2/installkernel-gentoo/tree/master/hooks
> 
> That all being said, as sam mentioned, emerging in an emerge is uncharted
> territory and for this reason it was decided to not install this plugin
> script in the ebuild.

I don't disagree at all! It was a hacky solution that required the end user to abuse some portage features, no question about that.
Regardless, it was really useful for me so I hope that we can come up with a better solution eventually.

Perhaps one could postpone updating the boot entries until after the last module package has successfully been emerged?
Comment 18 Andrew Nowa Ammerlaan gentoo-dev 2024-01-29 13:59:36 UTC
(In reply to Sebastian Parborg from comment #17)
> (In reply to Andrew Ammerlaan from comment #16)
> > For what it's worth, the plugin script is still in the repository[1] so if
> > you really really want to use it you can still download it manually (just
> > make sure to get the correct version depending on USE=+/-systemd). I have
> > also added the FEATURES=parallel-install fix, thanks for that, so hopefully
> > it will work properly now.
> > 
> 
> Ah, I should clarify:
> Adding it to the scripts provided by installkernel will not work.
> "parallel-install" has to be enabled on the top most emerge for this to work.
> I tried myself to add "parallel-install" in the
> 30-emerge-kernel-module-rebuild.install scripts, but that will still
> deadlock if you only do it there.
> 
> That is why I did the "if die" check as you have to inform the user to
> enable it when emerging the kernel.

Right, that makes sense. At least I now understand why it worked for me, while others ran into this locking issue.
 

> Perhaps one could postpone updating the boot entries until after the
> last module package has successfully been emerged?

I had been thinking of some sort of virtual/dist-kernel-install which would do basically what kernel-install.eclass does, but is emerged after all the external modules are compiled. However the problem is:
- how to ensure that it really is emerged last (without something really ugly like optionally depending on every single possible external kernel module and controlling this with a lot of USE flags)
- this works only for dist-kernels. That is not a huge issue, since users who know how to manually configure and compile the kernel can just insert emerge @module-rebuild between the make modules_install and make install steps.
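
For a manually built kernel, that ordering can be sketched as follows. This is an illustration only: the run wrapper, the DRY_RUN variable and the function name manual_kernel_install are hypothetical, and by default the commands are only printed rather than executed.

```shell
# Print commands when DRY_RUN=1 (the default here), execute them otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Install an already compiled kernel from source directory $1
# (default /usr/src/linux), rebuilding external modules before
# the kernel image and boot entry are installed.
manual_kernel_install() {
    src=${1:-/usr/src/linux}
    run make -C "$src" modules_install &&
    run emerge @module-rebuild &&
    run make -C "$src" install
}
```

Because the steps are chained with &&, a failed module rebuild stops the sequence before make install runs, so the boot entry is not updated for a kernel whose external modules are missing.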
Comment 19 Sebastian Parborg 2024-01-29 15:06:46 UTC
(In reply to Andrew Ammerlaan from comment #18)
> - this works only for dist-kernels. That is not a huge issue, since users who
> know how to manually configure and compile the kernel can just insert emerge
> @module-rebuild between the make modules_install and make install steps.

Actually I'm using this with gentoo-sources, as I use some customized kernel configs for my installs. By using the post_pkg_postinst function to configure, compile and install the new kernels I could automate nearly all updates, which was nice. However, I couldn't figure out how to get the module rebuilds to work reliably.

At first I just wrote a file to /tmp/ that I checked at the end of my portage update script to see if I should run emerge @module-rebuild. However this of course had the issue of incomplete kernel upgrades when something errored out or I manually canceled the update.
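
That marker-file approach might look roughly like the sketch below. The marker path and both function names are hypothetical, and the drawback just described still applies: if the update is interrupted before the final check, the modules stay unbuilt.

```shell
# Hypothetical marker location; override MARKER to use another path.
MARKER=${MARKER:-/tmp/kernel-updated}

# Called from the kernel install step to record that a rebuild is pending.
mark_kernel_updated() {
    : > "$MARKER"
}

# Called at the end of the update script. Runs the given command only if
# the marker exists, and removes the marker only when the command succeeds.
rebuild_if_marked() {
    [ -e "$MARKER" ] || return 0
    "$@" && rm -f "$MARKER"
}

# Possible use on a live system (not run here):
#   rebuild_if_marked emerge @module-rebuild
```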

For me, calling "emerge @module-rebuild" in post_pkg_postinst would just deadlock, even with parallel-install. So I'm really happy you helped me get it working!

> I had been thinking of some sort of virtual/dist-kernel-install which would do
> basically what kernel-install.eclass does, but is emerged after all the
> external modules are compiled.

Perhaps doing something similar to "emerge @preserved-rebuild" would work?
I.e. when installing a kernel, we would not reinstall the modules or update the boot entries at the end of the ebuild. Instead we would tag all packages that need to have their modules rebuilt, and perhaps add an "update boot entry" dummy package at the end of this emerge queue.

So when users update their kernel they will get a message that they need to run an additional emerge command.

Of course this is not as convenient as just doing "emerge <kernel>" and having everything happen automagically in one command. But I think it might be good enough?

The difference between this new @ command and @module-rebuild is that this would only populate the package list after a kernel update. @module-rebuild rebuilds all module packages regardless of any kernel updates. So if I call this when no new kernel has been installed, it would not emerge any packages nor update the boot entries.
Comment 20 Vladimir Varlamov 2024-01-29 16:49:37 UTC
The discussion has moved on a little, but as the person who created the bug, I would like to report on my case.
I have now added "parallel-install" to FEATURES.
The script now runs, but fails with:

```
--- cfgpro   dir /lib/modules/5.10.209-gentoo-dist/kernel/arch/x86
--- cfgpro   dir /lib/modules/5.10.209-gentoo-dist/kernel/arch
--- cfgpro   dir /lib/modules/5.10.209-gentoo-dist/kernel
--- cfgpro   sym /lib/modules/5.10.209-gentoo-dist/build
--- cfgpro   dir /lib/modules/5.10.209-gentoo-dist
--- replaced dir /lib/modules
--- replaced dir /lib
 * Removing initramfs ...                                                                                                                                                                               [ ok ]
>>> Original instance of package unmerged safely.
 * Updating /usr/src/linux symlink ...                                                                                                                                                                  [ ok ]
 * Assuming you do not have a separate /boot partition.
 * Installing the kernel via installkernel ...
run-parts: executing /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install 5.10.209-gentoo-dist /usr/src/linux-5.10.209-gentoo-dist/arch/x86/boot/bzImage
 * Using kernel sources directory: /lib/modules/5.10.209-gentoo-dist/build
Calculating dependencies... done!
Dependency resolution took 5.77 s (backtrack: 0/20).

emerge: there are no ebuilds to satisfy "sys-kernel/gentoo-kernel-bin:5.10.203".

(dependency required by "@module-rebuild" [argument])
stty: 'standard input': Inappropriate ioctl for device
 * 
run-parts: /etc/kernel/preinst.d/30-emerge-kernel-module-rebuild.install exited with return code 1                                                                                                      [ !! ]
 * Installing the kernel failed
```

# eix sys-kernel/gentoo-kernel-bin
[U] sys-kernel/gentoo-kernel-bin
     Available versions:  
     (5.10.208) 5.10.208^tu
     (5.10.209) (~)5.10.209^tu
     (5.15.147) [m]5.15.147^tu
     (5.15.148) [m](~)5.15.148^tu
     (6.1.74) [m]6.1.74^tu
     (6.1.75) [m](~)6.1.75^tu
     (6.6.13) [m]6.6.13^tu
     (6.6.14) [m](~)6.6.14^tu
     (6.7.1) [m](~)6.7.1^tu
     (6.7.2) [m](~)6.7.2^tu
       {generic-uki +initramfs modules-compress test}
     Installed versions:  
5.10.203(5.10.203)^t(13:04:16 28/12/23)(initramfs -test) 5.10.205(5.10.205)^t(16:17:12 06/01/24)(initramfs -test) 5.10.208(5.10.208)^t(18:28:03 23/01/24)(initramfs -test) 5.10.209(5.10.209)^t(19:28:02 29/01/24)(initramfs -test)
Comment 21 Larry the Git Cow gentoo-dev 2024-01-30 11:09:11 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=b1f74da11016a3c872f250983cd5f9d06f181708

commit b1f74da11016a3c872f250983cd5f9d06f181708
Author:     Andrew Ammerlaan <andrewammerlaan@gentoo.org>
AuthorDate: 2024-01-29 09:58:11 +0000
Commit:     Andrew Ammerlaan <andrewammerlaan@gentoo.org>
CommitDate: 2024-01-30 11:08:45 +0000

    linux-mod-r1.eclass: warn if KV does not match virtual/dist-kernel
    
    We have no mechanism to ensure that we build the kernel modules for
    the same kernel version as the version we will record in the virtual/dist-kernel
    subslot dependency. This does not fix this problem, but it does add a warning
    to ensure users are aware that, for example, built binpkgs are going to have
    wrong dependency metadata.
    
    Bug: https://bugs.gentoo.org/923025
    Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>
    Closes: https://github.com/gentoo/gentoo/pull/35066
    Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>

 eclass/linux-mod-r1.eclass | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
Comment 22 Anton Gubarkov 2024-02-11 10:21:22 UTC
May I suggest that USE=module-rebuild should conflict with dist-kernels only?
Source-based kernel packages are installed without any building, so there is no deadlock in that case.

There, emerge @module-rebuild is called during the kernel's make install (i.e. there are no two parallel emerges).
Comment 23 Larry the Git Cow gentoo-dev 2024-05-17 12:07:29 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=f439b4ec05b1982f06f67fbf39a46ae0db187a76

commit f439b4ec05b1982f06f67fbf39a46ae0db187a76
Author:     Andrew Ammerlaan <andrewammerlaan@gentoo.org>
AuthorDate: 2024-05-08 06:02:53 +0000
Commit:     Andrew Ammerlaan <andrewammerlaan@gentoo.org>
CommitDate: 2024-05-17 12:06:42 +0000

    linux-mod-r1.eclass: add USE=initramfs
    
    Adds a new variable that adds the "initramfs" flag when set. This new
    flag controls whether or not the modules that were built should be
    included in the initramfs. If the modules should be included, then we
    also rebuild the initramfs/uki in post_install using installkernel.
    
    Bug: https://bugs.gentoo.org/923025
    Bug: https://bugs.gentoo.org/928271
    Signed-off-by: Andrew Ammerlaan <andrewammerlaan@gentoo.org>

 eclass/linux-mod-r1.eclass | 54 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)