Bug 868306 - sys-libs/pam-1.5.2-r1 leaves kernel boot process unable to load most modules
Summary: sys-libs/pam-1.5.2-r1 leaves kernel boot process unable to load most modules
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: Normal major
Assignee: Mikle Kolyada (RETIRED)
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 867589
Reported: 2022-09-03 23:05 UTC by Sam Handel
Modified: 2024-02-06 22:41 UTC

See Also:
Package list:
Runtime testing required: ---


Attachments
emerge --info from affected system (lt1_emerge_info.out,6.20 KB, text/plain)
2022-09-03 23:05 UTC, Sam Handel

Description Sam Handel 2022-09-03 23:05:06 UTC
Created attachment 803077 [details]
emerge --info from affected system

The prior version of the package on the working system is sys-libs/pam-1.5.1_p20210622-r1.
The upgrade attempts to move to sys-libs/pam-1.5.2-r1.

After the upgrade, system startup fails with only some tasks completed. Most kernel modules are not loaded, so dependent services fail. Local mounts succeed.

System is openrc with elogind used. Profile is amd64 17.1

Problem replicated on two different systems. Attached emerge --info shown is from test system.

Process to replicate:
1. Bring the system current with emerge @world, excluding sys-libs/pam-1.5.2-r1.
2. Verify the system reboots and starts up properly.
3. Back up the system.
4. emerge =sys-libs/pam-1.5.2-r1
5. Verify the system reboots with failed services due to unloaded kernel modules.

At this point a restore of the system is required in order to proceed with another test.
Comment 1 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-09-03 23:06:25 UTC
It shouldn't touch kernel module loading at all. I assume some service fails which then cascades. Which one originally fails? Can you record it booting up or something?
Comment 2 Sam Handel 2022-09-03 23:09:37 UTC
(In reply to Sam James from comment #1)
> It shouldn't touch kernel module loading at all. I assume some service fails
> which then cascades. Which one originally fails? Can you record it booting
> up or something?

Thanks for the quick response. I will attempt to capture more info. It may take me a little while to return with results.
Comment 3 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-09-03 23:11:22 UTC
(In reply to Sam Handel from comment #2)
> (In reply to Sam James from comment #1)
> > It shouldn't touch kernel module loading at all. I assume some service fails
> > which then cascades. Which one originally fails? Can you record it booting
> > up or something?
> 
> Thanks for the quick response. I will attempt to capture more info. It may
> take me a little while to return with results.

No worries. It may be easier to troubleshoot a bit on IRC (libera.chat, #gentoo). Happy to try there if you want.
Comment 4 dwfreed 2022-09-03 23:23:00 UTC
Note that you can use rc_logger in /etc/rc.conf to log the openrc boot process.
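As a rough illustration of the setting mentioned above, the sketch below reads rc.conf-style text and reports whether rc_logger is switched on. It assumes the usual shell-style KEY="value" lines. The function names and the set of accepted truthy spellings are assumptions for this example, not OpenRC's own parser.

```python
import shlex

# Spellings treated as "enabled" in this sketch (an assumption, not OpenRC's exact rule).
TRUTHY = {"yes", "true", "on", "1"}


def rc_conf_settings(text):
    """Collect KEY=value assignments from rc.conf-style text, ignoring comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        # shlex strips the quotes and any trailing "# comment" from the value.
        parts = shlex.split(raw, comments=True)
        settings[key.strip()] = parts[0] if parts else ""
    return settings


def logger_enabled(text):
    """Return True if rc_logger is set to a truthy value."""
    return rc_conf_settings(text).get("rc_logger", "no").lower() in TRUTHY


if __name__ == "__main__":
    # Sample text standing in for the contents of /etc/rc.conf.
    sample = 'rc_logger="YES"\nrc_log_path="/var/log/rc.log"\n'
    print(logger_enabled(sample))
```

On a real system the text would come from /etc/rc.conf. With the logger enabled, the boot log should end up at the path given by rc_log_path.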
Comment 5 Sam Handel 2022-09-04 00:38:14 UTC
Here's the deal.

During boot, starting the udev service fails because libpam.so.0 is not found.

After the boot is as complete as it gets, I manually start udev and udev-trigger, which then succeed. Modules load. I can piece together the rest of the startup manually.

Prior to the sys-libs/pam-1.5.2-r1 upgrade, libpam.so.0 is in /lib64. After the upgrade, libpam.so.0 is found in /usr/lib64.

This is an openrc system, not systemd. Since /usr is not mounted during the sysinit runlevel, libpam.so.0 cannot be found.

I think what has happened here is that the location of libpam.so.0 changed with the sys-libs/pam-1.5.2-r1 upgrade.

Is there any way to change the upgrade behavior to use /lib64 instead of /usr/lib64? systemd is not an option (big rebuild headaches; this /usr dependency is one reason I choose not to use it).
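The failure mode described here can be sketched as a plain search-path lookup: the same library name resolves or not depending on which directories are available at that point in the boot. This is a deliberate simplification (the real dynamic loader also consults its cache and other paths), and the function name and the injected exists callback are assumptions made for the example.

```python
import posixpath


def locate_library(soname, search_dirs, exists):
    """Return the first path in search_dirs where soname exists, else None."""
    for directory in search_dirs:
        candidate = posixpath.join(directory, soname)
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Simulated filesystem after the upgrade: the library lives only under /usr.
    installed = {"/usr/lib64/libpam.so.0"}

    # During sysinit on this setup /usr is not mounted, so only /lib64 is searched.
    print(locate_library("libpam.so.0", ["/lib64"], installed.__contains__))
    # Once /usr is mounted, the same lookup succeeds.
    print(locate_library("libpam.so.0", ["/lib64", "/usr/lib64"], installed.__contains__))
```

This matches the observation that udev fails during boot but starts fine when launched manually later.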
Comment 6 Sam Handel 2022-09-04 01:03:19 UTC
To confirm the /lib64 vs. /usr/lib64 issue, I copied the libpam* files/links from /usr/lib64 to /lib64. The system then boots normally.

This seems like a terrible workaround and I don't trust it.
Comment 7 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-09-04 01:04:18 UTC
(In reply to Sam Handel from comment #5)
> Here's the deal.
> 
> During boot starting the udev service fails as libpam.so.0 is not found. 
> 
> after the boot is as complete as it gets, I manually start udev and
> udev-trigger which then succeeds. Modules load. I can piece together the
> rest of the startup manually.
> 
> Prior to the sys-libs/pam-1.5.2-r1 upgrade, libpam.so.0 is in /lib64. After
> the upgrade libpam.so.0 is found in /usr/lib64.
> 
> This is an openrc system, not systemd. since /usr is not mounted during
> sysinit runlevel the libpam.so.0 cannot be found.
> 
> I think what has happened here is that the location for libpam.so.0 changed
> with the sys-libs/pam-1.5.2-r1 upgrade.
> 
> Is there any way to change the upgrade behavior to use /lib64 instead of
> /usr/lib64 ? systemd is not an option ( big rebuild headaches, this /usr
> dependency is one reason I choose to not use it ).

Your system is split-usr? This is barely considered supported at this point, but it shouldn't get dropped unceremoniously either.

If you do want to continue using split-usr *without* an initramfs, I'd suggest getting involved with bugs like bug 443590 to avoid more things like this happening.

I barely remember it now, but I remember doing a bit of a fixup:
```
commit aeb526aa3b0875745fa0af6c754ded21af68658b
Author: Sam James <sam@gentoo.org>
Date:   Sat Nov 6 02:28:55 2021 +0000

    sys-libs/pam: drop usrscript

    This shouldn't be necessary anymore but let's do it in a new revision
    in ~arch to be safe.

    See: 2ff9dcc3275e4f37a44eaf707fce9f53c13c2e82
    Signed-off-by: Sam James <sam@gentoo.org>

commit 2ff9dcc3275e4f37a44eaf707fce9f53c13c2e82
Author: Mikle Kolyada <zlogene@gentoo.org>
Date:   Fri Nov 5 21:50:59 2021 +0300

    sys-libs/pam: rop usr-ldscript

    Package-Manager: Portage-3.0.20, Repoman-3.0.3

    Signed-off-by: Mikle Kolyada <zlogene@gentoo.org>
```

Really, dropping it from PAM is tantamount to dropping it entirely, hence it's not something we can just *do* without a news item and proper warning.

I'll sort it now.
Comment 8 Larry the Git Cow gentoo-dev 2022-09-04 01:11:11 UTC
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=313ec7f2df4e9ee5560f9bedd739223633e405b2

commit 313ec7f2df4e9ee5560f9bedd739223633e405b2
Author:     Sam James <sam@gentoo.org>
AuthorDate: 2022-09-04 01:08:15 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2022-09-04 01:08:15 +0000

    sys-libs/pam: [QA] restore split-usr
    
    While split-usr support remains tenuous, dropping it from
    PAM is tantamount to removing it from Gentoo entirely and
    requires something more like a news item and a lot of
    planning.
    
    Also, really, the resultant ebuild cleanup from
    dropping it doesn't justify the gratuitous breakage:
    cost & reward.
    
    That said, I would strongly recommend at this
    point that split-usr users use an initramfs
    or actively participate in helping to solve
    split-usr bugs (see e.g. bug 443590) as at
    some point, the dam is going to break and
    maintainers may get fed up. It's already
    a barely-supported situation.
    
    Obligatory: none of this has anything
    to do with "merged /usr".
    
    Bug: https://bugs.gentoo.org/443590
    Closes: https://bugs.gentoo.org/868306
    See: 2ff9dcc3275e4f37a44eaf707fce9f53c13c2e82
    See: aeb526aa3b0875745fa0af6c754ded21af68658b
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-libs/pam/{pam-1.5.2-r1.ebuild => pam-1.5.2-r2.ebuild} | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
Comment 9 Sam Handel 2022-09-04 02:20:37 UTC
I don't recall how the split-usr thing is supposed to work, I will research further. You say that it is "barely supported", what does that imply?

Just to add to my deviance, I am also booting without an initramfs by linking the SATA disk driver and ext4 into the kernel image rather than loading them as modules. Not using an initramfs is an old habit; I've done this since sometime in the early 1990s. The Gentoo writeup on early mounting of /usr uses an initramfs. I don't fully understand it yet.

I've also considered just moving /usr into my root partition; I currently have /usr mounted on / as a separate file system. I'm not certain this file system consolidation would resolve any issues.

So far, my options seem to be:
1. Copy files ( so far only libpam* ) from /usr/lib64 to /lib64. Yechh, sounds made to be broken down the road.
2. Implement the scheme for early mount of /usr which seems to require initramfs.
3. Consolidate my / and /usr file system partitions into a single / file system partition and continue to boot without initramfs.

Any thoughts on these options?

Thanks for your outstanding response and insights.
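Whether a machine falls into the separate-/usr case discussed here can be checked from the mount table. A minimal sketch, assuming input in the /proc/mounts layout (device, mount point, filesystem type, options, and so on); the function name and the sample device names are made up for the example.

```python
def usr_is_separate_mount(mounts_text):
    """Return True if some filesystem is mounted exactly at /usr."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 0 is the device, field 1 is the mount point.
        if len(fields) >= 2 and fields[1] == "/usr":
            return True
    return False


if __name__ == "__main__":
    # Illustrative mount table; on a real system read /proc/mounts instead.
    sample = (
        "/dev/sda2 / ext4 rw,relatime 0 0\n"
        "/dev/sda3 /usr f2fs ro,relatime 0 0\n"
    )
    print(usr_is_separate_mount(sample))
```

After consolidating / and /usr into one filesystem (option 3), the same check would report no separate mount.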
Comment 10 Sam James archtester Gentoo Infrastructure gentoo-dev Security 2022-09-04 02:57:45 UTC
(In reply to Sam Handel from comment #9)
> I don't recall how the split-usr thing is supposed to work, I will research
> further. You say that it is "barely supported", what does that imply?
> 
> Just to add to my deviance, I also am booting without an initramfs by
> linking the sata disk driver and ext4 into the kernel image rather than
> loading them as modules. Not using initramfs is an old habit, I've done this
> since sometime in the early 1990s. The gentoo writeup on early mounting of
> /usr uses initramfs. I don't fully understand it yet.

Okay, so, the whole thing revolves around "ebuilds placing enough binaries and libraries in / to allow mounting /usr on a separate partition".

The problem is, very few people (relatively) test with this configuration, and I'm not sure there are any Gentoo developers who have such a setup now. And while people don't seek to break this configuration, it's very *very* easy to do so accidentally.

Imagine say, a coreutils program (which an init script needed to mount /usr calls) gets a new dependency on a pretty common system library, like dev-libs/gmp. The maintainer will surely notice that and update the ebuild, fine. But who verifies that *gmp's* libraries are then also in /, ready for the binary to call?

(This actually happened: bug 398053, but the point is, this can happen with anything, and has happened plenty of times).
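The gmp scenario is a transitive-dependency problem, which the following sketch makes concrete. The dependency map, the install locations, and the name /bin/tool are hypothetical stand-ins, not data from bug 398053.

```python
def libs_outside_root(binary, needs, location):
    """Return the transitive dependencies of binary that are installed under /usr."""
    seen = set()
    stack = [binary]
    offenders = []
    while stack:
        item = stack.pop()
        for dep in needs.get(item, ()):
            if dep in seen:
                continue
            seen.add(dep)
            # A dependency under /usr is unavailable before /usr is mounted.
            if location.get(dep, "").startswith("/usr/"):
                offenders.append(dep)
            stack.append(dep)
    return sorted(offenders)


if __name__ == "__main__":
    # Hypothetical: a tool in /bin gains a dependency on gmp, which lives in /usr.
    needs = {
        "/bin/tool": ["libgmp.so.10", "libc.so.6"],
        "libgmp.so.10": ["libc.so.6"],
    }
    location = {
        "libgmp.so.10": "/usr/lib64/libgmp.so.10",
        "libc.so.6": "/lib64/libc.so.6",
    }
    print(libs_outside_root("/bin/tool", needs, location))
```

The point of walking the whole graph is that the binary's own ebuild can be correct while a library further down the chain is still in the wrong place.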

The only people who notice such a problem are the relatively small pool of split-/usr users. And they only notice when it's too late usually.

One thing we _could_ do is implement bug 443590 which would check if binaries installed in /bin (and also if libraries in /lib*) have their dependencies also in / rather than /usr. But nobody's worked on that.
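A check along the lines described for bug 443590 could start from something like the sketch below, which scans ldd-style output for libraries that resolve under /usr. It is not the proposed QA check itself; the function name is an assumption and the sample output is illustrative rather than captured from a real system.

```python
import re

# Matches lines of the form "libname => /resolved/path (address)".
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)")


def usr_dependencies(ldd_output):
    """Return (library, path) pairs from ldd-style output that resolve under /usr."""
    found = []
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match and match.group(2).startswith("/usr/"):
            found.append((match.group(1), match.group(2)))
    return found


if __name__ == "__main__":
    # Illustrative output for a binary in / that links against libpam.
    sample = (
        "\tlinux-vdso.so.1 (0x00007ffc00000000)\n"
        "\tlibpam.so.0 => /usr/lib64/libpam.so.0 (0x00007f0000000000)\n"
        "\tlibc.so.6 => /lib64/libc.so.6 (0x00007f0000200000)\n"
        "\t/lib64/ld-linux-x86-64.so.2 (0x00007f0000400000)\n"
    )
    for name, path in usr_dependencies(sample):
        print(name, path)
```

Run against every binary in /bin and /sbin and every library in /lib*, a non-empty result would flag exactly the kind of breakage reported in this bug.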

Something which muddies the water a little bit (and it could be revisited in theory) is that supposedly the policy right now is maintainers are free to drop split-usr support from their ebuilds at will and aren't "forced" to worry about split-usr. But that's really true for anything and as long as users give patches for odd or obscure use cases, usually the maintainer can be persuaded. I don't see a reason to view split-usr as different there.

(There is also https://freedesktop.org/wiki/Software/systemd/separate-usr-is-broken/ which covers some of the issues with a separate /usr, but it's not really the core of my point which is far more fundamental about libraries being straight-up missing).

> 
> I've also considered just moving /usr into my root partition, I currently
> have /usr mounted on / as a separate file system. I'm not certain this file
> system consolidation would resolve any issues.
> 
> So far, my options seem to be:
> 1. Copy files ( so far only libpam* ) from /usr/lib64 to /lib64. Yechh,
> sounds made to be broken down the road.

If you have to do this, something's gone wrong. The ebuild should have split-usr support (we have an eclass for it: usr-ldscript, and as you can see from the commit to restore it to pam, it was trivial pretty much).

> 2. Implement the scheme for early mount of /usr which seems to require
> initramfs.

This would be what I'd do, as it should be far less invasive than rejigging your whole system, and would ensure reliability.

> 3. Consolidate my / and /usr file system partitions into a single / file
> system partition and continue to boot without initramfs.

I personally do this on new installs, but I wouldn't want to suggest you uproot your whole install to do it now.

> 
> Any thoughts on these options?
> 
> Thanks for your outstanding response and insights.

Thank you for your patience and understanding. I'm sorry for the frustration this will have caused.
Comment 11 Sam Handel 2022-09-04 03:43:43 UTC
Thanks so much for your time and engagement. 

I'm leaning towards the consolidation of the / and /usr filesystems into a single / filesystem, as it can be done in place with minimal changes to my existing installations, backups, etc. I have been using ext2 or ext4 on the / filesystem, f2fs on /usr, and mounting /usr read-only, but I'm not deeply committed to these practices; they are more "nice to have" if anything. I have unintentionally run with f2fs on /, and it seems to work fine.

Our discussion has set me off on a great deal of reading, it looks like I missed out on some hearty religious wars a few years back ;-D

I need to take it all in and sort it out but for now I think I'll set up my test system here with the partition consolidation approach.

Once again, thanks for your guidance, you fast tracked me into understanding the architectural issues. Enjoy the rest of your weekend.
Comment 12 Larry the Git Cow gentoo-dev 2024-01-05 04:10:41 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/data/gentoo-news.git/commit/?id=114a15884faf88f202073de48812613b264f49e0

commit 114a15884faf88f202073de48812613b264f49e0
Author:     Eli Schwartz <eschwartz93@gmail.com>
AuthorDate: 2024-01-02 04:04:32 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2024-01-05 04:10:37 +0000

    2024-01-05-usr-initramfs: add news item
    
    Revival of commit a79dd69b0cca439bc0c483c9193c79e0554819d0.
    
    Bug: https://bugs.gentoo.org/868306#c10
    Bug: https://bugs.gentoo.org/902829
    Bug: https://bugs.gentoo.org/915379
    Bug: https://bugs.gentoo.org/825078
    Signed-off-by: Eli Schwartz <eschwartz93@gmail.com>
    Signed-off-by: Sam James <sam@gentoo.org>

 .../2024-01-05-usr-initramfs.en.txt                | 46 ++++++++++++++++++++++
 1 file changed, 46 insertions(+)
Comment 13 Larry the Git Cow gentoo-dev 2024-02-06 22:41:10 UTC
The bug has been referenced in the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=72d0c560a13563ebd6e7b010cc5ab169fb2efc8b

commit 72d0c560a13563ebd6e7b010cc5ab169fb2efc8b
Author:     Eli Schwartz <eschwartz93@gmail.com>
AuthorDate: 2024-02-06 04:50:41 +0000
Commit:     Sam James <sam@gentoo.org>
CommitDate: 2024-02-06 22:40:58 +0000

    sys-libs/pam: remove usr-ldscript support
    
    Per news item 2024-01-05-usr-initramfs, we no longer support this use
    case. It is fragile and hacky and leads to bizarre forms of load errors.
    
    The functionality is, despite being called "split-usr", not really about
    split-usr at all.
    
    [sam: add bug #868306 ref.]
    
    Bug: https://bugs.gentoo.org/825078
    Bug: https://bugs.gentoo.org/825758
    Bug: https://bugs.gentoo.org/868306
    Signed-off-by: Eli Schwartz <eschwartz93@gmail.com>
    Signed-off-by: Sam James <sam@gentoo.org>

 sys-libs/pam/pam-1.5.3-r1.ebuild | 153 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 153 insertions(+)