| Summary: | net-fs/nfs-utils nfs fails to stop when running 6.6 kernels | | |
|---|---|---|---|
| Product: | Gentoo Linux | Reporter: | Vjaceslavs Klimovs <vklimovs> |
| Component: | Current packages | Assignee: | Gentoo's Team for Core System packages <base-system> |
| Status: | RESOLVED FIXED | | |
| Severity: | normal | CC: | digitalaudiorock, eschwartz93, hydrapolic, lvd.mhm, vklimovs |
| Priority: | Normal | | |
| Version: | unspecified | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Package list: | Runtime testing required: | --- | |
| Attachments: | Emerge info | | |

Description
Vjaceslavs Klimovs
2024-02-11 20:31:58 UTC
Could you try running strace on the stuck nfsd process and also get a gdb backtrace from it? (Attach gdb with gdb -p, then ^C, then bt.)

Just to add to this one: What he's reporting is exactly the same as what I reported in bug 924178, which, for reasons I don't really understand, was resolved as a duplicate of bug 916947. The only notable difference, as I stated in my original bug, is that I'm running an old version of openrc (0.17).

I already explained at https://forums.gentoo.org/viewtopic-p-8816463.html#8816463. There's no interest in debugging old OpenRC versions where they've changed relevant code.

Tom,

Can you explain why you are using an ancient and broken version of openrc that is guaranteed to not work?

Because if you can't reproduce the issue with current versions of openrc then your system is broken and nfs-utils is not.

As such, you're distracting and confusing the very real attempts to debug a real problem by the issue reporter for this bug report, which is totally unrelated to your issue.

So the nfsd init script does this in stop():

    # nfsd sets its process name to [nfsd] so don't look for $nfsd
    ebegin "Stopping NFS daemon"
    start-stop-daemon --stop --name nfsd --user root --signal 2
    eend $?
    ret=$((ret + $?))

    # in case things don't work out ... #228127
    rpc.nfsd 0

I think that start-stop-daemon call is sending SIGINT to kernel nfsd threads.

Since this Linux commit, we cannot signal nfsd kernel threads directly:

https://github.com/torvalds/linux/commit/3903902401451b1cd9d797a8c79769eb26ac7fe5

I think we should just update the init script to stop sending SIGINT via start-stop-daemon and just jump directly to calling rpc.nfsd 0.

(In reply to Mike Gilbert from comment #5)
Sounds good.

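For readers following along, the change proposed in comment #5 can be sketched roughly as below. This is only an illustration under stated assumptions, not the committed nfs.initd: the function name `stop_nfsd_sketch` and the `NFSD_CMD` override are hypothetical and exist only so the logic can be tried without a real NFS server. The one real piece is the `rpc.nfsd 0` call that the original stop() already used as its fallback.

```shell
#!/bin/sh
# Hypothetical sketch of a stop step that no longer signals kernel threads.
# NFSD_CMD is an illustrative override (not part of nfs-utils or OpenRC);
# it defaults to the real rpc.nfsd binary.
NFSD_CMD=${NFSD_CMD:-rpc.nfsd}

stop_nfsd_sketch() {
    # No "start-stop-daemon --signal 2" here: the kernel nfsd threads
    # cannot be signalled, so request zero nfsd threads instead.
    "$NFSD_CMD" 0
}
```

In a real OpenRC script this call would sit between ebegin and eend as in the quoted stop(); those helpers are left out here so the sketch stays self-contained.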
The bug has been closed via the following commit(s):

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=7aa183ae8073593cab6d3f012a981a6e6712ffc2

commit 7aa183ae8073593cab6d3f012a981a6e6712ffc2
Author:     Mike Gilbert <floppym@gentoo.org>
AuthorDate: 2024-02-16 19:23:48 +0000
Commit:     Mike Gilbert <floppym@gentoo.org>
CommitDate: 2024-02-16 19:32:49 +0000

    net-fs/nfs-utils: stop sending signals to kernel nfsd threads

    Closes: https://bugs.gentoo.org/924309
    Signed-off-by: Mike Gilbert <floppym@gentoo.org>

 net-fs/nfs-utils/files/nfs.initd                               | 10 ++++------
 .../{nfs-utils-2.6.3-r2.ebuild => nfs-utils-2.6.3-r3.ebuild}   |  2 +-
 .../{nfs-utils-2.6.4-r3.ebuild => nfs-utils-2.6.4-r10.ebuild}  |  0
 .../{nfs-utils-2.6.4-r1.ebuild => nfs-utils-2.6.4-r4.ebuild}   |  0
 4 files changed, 5 insertions(+), 7 deletions(-)

*** Bug 924178 has been marked as a duplicate of this bug. ***

*** Bug 920816 has been marked as a duplicate of this bug. ***

(In reply to Eli Schwartz from comment #4)
> Tom,
>
> Can you explain why you are using an ancient and broken version of openrc
> that is guaranteed to not work?
>
> Because if you can't reproduce the issue with current versions of openrc
> then your system is broken and nfs-utils is not.
>
> As such, you're distracting and confusing the very real attempts to debug a
> real problem by the issue reporter for this bug report, which is totally
> unrelated to your issue.

Well...That's sort of moot, as I actually just updated openrc and rebooted:

    equery list openrc
     * Searching for openrc ...
    [IP-] [  ] sys-apps/openrc-0.53:0

...and still have the issue:

    /etc/init.d/nfs stop
     * Stopping NFS mountd ...
     * start-stop-daemon: no matching processes found  [ ok ]
     * Stopping NFS daemon ...
     * start-stop-daemon: no matching processes found  [ ok ]
     * Unexporting NFS directories ...

Also note that with both the service start and the failed stop I notice that the init script continues to run afterwards for as much as a minute or more:

    ps auxw| grep nfs
    root       117  0.0  0.0      0     0 ?        I<   09:43   0:00 [kworker/R-nfsio]
    root      3239  0.0  0.0   7912  2360 pts/4    S    09:51   0:00 /bin/sh /lib/rc/sh/openrc-run.sh /etc/init.d/nfs stop
    root      3243  0.0  0.0   6332  2176 pts/4    S+   09:51   0:00 grep --colour=auto nfs

Tom

Created attachment 885225
Emerge info

So I just updated to that new net-fs/nfs-utils-2.6.4-r10 from my overlay:

    equery list nfs-utils
     * Searching for nfs-utils ...
    [I-O] [  ] net-fs/nfs-utils-2.6.4-r10:0

However the service still fails to stop:

    /etc/init.d/nfs stop
     * Stopping NFS mountd ...  [ ok ]
     * Stopping NFS daemon ...
     * start-stop-daemon: 8 process(es) refused to stop  [ !! ]
     * Unexporting NFS directories ...  [ ok ]
     * ERROR: nfs failed to stop

As you see I've posted an attachment with my emerge --info.

Tom

Never mind. I clearly did that upgrade all wrong as all I got was the ebuild. I will try that correctly.

Question...and I've run into this before: If I go to that commit:

https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=7aa183ae8073593cab6d3f012a981a6e6712ffc2

How do I get a raw unified diff? I want to get those patches and for the life of me I can't figure out how.

Wow...OK. I was able to apply that commit patch to my existing nfs-utils-2.6.4-r3.ebuild and the service stop is still failing:

    /etc/init.d/nfs stop
     * Stopping NFS mountd ...  [ ok ]
     * Stopping NFS daemon ...
     * start-stop-daemon: 8 process(es) refused to stop  [ !! ]
     * Unexporting NFS directories ...  [ ok ]
     * ERROR: nfs failed to stop

Tom

Is there a reason to not just emerge --sync and update normally?

Anyway, please check what the contents of the init script are.

OK...Sorry for all the noise. I finally got that init script to patch correctly, and the stop DOES work now.

Thanks!!...and sorry for the confusion.

Tom

Excellent! Thanks!

Thank you!
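For anyone else who ends up with a new ebuild but an old init script, one way to check the installed script's contents is sketched below. It is only an illustration: the helper name `has_old_nfsd_signal` is hypothetical, and the default path /etc/init.d/nfs is assumed from the commands shown in this report. It looks for the `start-stop-daemon ... --name nfsd` call quoted from the old stop().

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if the given init script still contains the
# old start-stop-daemon call that targets nfsd, non-zero otherwise
# (including when the file cannot be read).
has_old_nfsd_signal() {
    script=${1:-/etc/init.d/nfs}
    grep -Eq 'start-stop-daemon.*--name[ =]nfsd' "$script"
}
```

A possible use: `has_old_nfsd_signal && echo "init script still signals nfsd threads"`. If that message appears after an upgrade, the init script on disk was most likely not replaced.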