Bug 356393 - sys-apps/openrc savecache/mount-ro interaction
Summary: sys-apps/openrc savecache/mount-ro interaction
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] baselayout
Hardware: All Linux
Importance: High normal
Assignee: OpenRC Team
Reported: 2011-02-25 09:36 UTC by Duncan
Modified: 2011-03-06 18:03 UTC

Description Duncan 2011-02-25 09:36:33 UTC
There's currently a problem with the interaction between the savecache and mount-ro services in the shutdown runlevel. (FWIW, this is openrc-0.7.0, but the problem has been around for a while; the 6.3 binpkg, the earliest I still have around, appears to have the same issue.)

1) mount-ro "needs" savecache, as savecache obviously won't work after mount-ro does its thing.

2) Problem: If savecache fails for whatever reason (see #3), mount-ro fails to run because it "needs" savecache. This is unfortunate, as the shutdown integrity of the root partition provided by mount-ro is arguably *WAY* more important than caching (or failing to cache) a few seconds' worth of dependency calculation info.

3) Trigger: The trigger I see here is rebooting within seven hours of updating a system service (as in rebooting to check that I can still boot after the updates). The system is set to UTC-7 (America/Phoenix), which triggers the clock-skew warning, which in turn causes savecache to exit with an error. However, the problem would occur with anything that caused savecache to fail, since mount-ro "needs" it.

4) Solution: The simplest solution is probably to have savecache unconditionally return 0, regardless of whether it could actually save the cache or not. Then mount-ro would run and do its critical mount-ro thing regardless of what savecache had done.

Another possible option might be to put a before: mount-ro dependency in savecache and remove the needs: savecache in mount-ro (or simply make the needs an after). I'm not sure whether that really does what is wanted, though, as I've had trouble with service dependency nuances before and make no claim to having them straight now.
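
The dependency change suggested above could look roughly like the following. This is only a sketch: the need() and after() functions defined here are stand-ins that record their arguments so the snippet runs on its own (under OpenRC the framework supplies these keywords), and depend_current/depend_proposed are hypothetical names used to show both variants side by side.

```shell
#!/bin/sh
# Stand-ins for OpenRC's dependency keywords so this sketch runs on its own.
# Under OpenRC these come from the init framework and are not defined here.
deps=""
need()  { deps="$deps need:$*"; }
after() { deps="$deps after:$*"; }

# Current mount-ro behaviour as described in this report: a hard dependency.
depend_current() {
    need killprocs savecache
}

# Hypothetical alternative: keep the ordering, drop the hard requirement.
depend_proposed() {
    need killprocs
    after savecache
}

deps=""; depend_current;  current_deps=$deps
deps=""; depend_proposed; proposed_deps=$deps
echo "current: $current_deps"
echo "proposed:$proposed_deps"
```

With the second form, savecache would still be ordered before mount-ro, but a savecache failure would no longer be treated as a reason to skip mount-ro.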

(I'm setting this as blocking the stable-tracker, also, as it opens up anyone running a localzone hardware clock, among others, to potential data loss on the root filesystem, a "feature" stable shouldn't have to deal with. It should be reasonably easy to fix since, unlike some of the others, this one's all openrc, and it looks straightforward enough, so I doubt it'll be blocking the stabilization for long. =:^)
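
The clock-skew trigger from item 3 can be illustrated with a small sketch. It is not OpenRC's actual check (the report does not show that code); it only demonstrates the general idea that a file stamped later than the current clock reading looks skewed. The helper name is_from_future and the 2037 timestamp are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical helper, not OpenRC code: succeeds if file $1 has a
# modification time later than the current clock reading.
is_from_future() {
    ref=$(mktemp) || return 2    # reference file stamped with the current time
    # find prints the path only if it is strictly newer than the reference
    out=$(find "$1" -newer "$ref" 2>/dev/null)
    rm -f "$ref"
    [ -n "$out" ]
}

demo=$(mktemp)
touch -t 203701010000 "$demo"    # pretend a clock running ahead wrote this file
if is_from_future "$demo"; then
    skew_result="skewed"
else
    skew_result="ok"
fi
echo "$skew_result"
rm -f "$demo"
```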
Comment 1 William Hubbs gentoo-dev 2011-02-25 20:46:24 UTC
Roy,

Can you give me your advice on this bug?

Commit 3d3700 sets up the savecache service so that it warns about clock skews. However, it also sets things up so that a clock skew causes the cache to not be saved and causes the savecache service to return failure by default.

Should the savecache service return a failure in this situation?

Also, I see that there is a variable set up so that the user could configure whether or not the cache is saved if the clock is skewed, but the variable itself is not set to a default value in /etc/conf.d/savecache, and there is no documentation for it anywhere.

Should we set up a /etc/conf.d/savecache and document this feature, or make the service always save the cache?
Comment 2 Roy Marples 2011-02-25 22:47:56 UTC
It's a two-fold problem:

1) I firmly believe that if a service fails, it should return a failure.
2) Some services need to stay up if a dependant fails to stop, but equally some services need to go down in a critical state like this one.

Now, this is only a *real* problem when it is critical that the system moves from one state to another (up -> shutdown). It's a pain moving from default -> foo, but not (or should not be) critical.

I would add a toggle to each runlevel allowing it to ignore failed services at stop so that others can continue.

Comment 3 William Hubbs gentoo-dev 2011-02-26 01:15:18 UTC
(In reply to comment #2)
> It's a two-fold problem:
> 1) I firmly believe that if a service fails, it should return a failure.
> 2) Some services need to stay up if a dependant fails to stop, but equally
> some services need to go down in a critical state like this one.
> Now, this is only a *real* problem when it is critical that the system moves
> from one state to another (up -> shutdown). It's a pain moving from
> default -> foo, but not (or should not be) critical.
> I would add a toggle to each runlevel allowing it to ignore failed services at
> stop so that others can continue.

I'm not sure I'm following you. The issue here isn't to do with stopping services, but starting them.

mount-ro's depend function has:

need killprocs savecache

Killprocs always returns success, so there isn't an issue with it. Savecache, though, returns failure if it doesn't save the cache due to clock skew. If that happens, mount-ro doesn't get executed because it has a need dependency on savecache, and this could be a bad situation during shutdown.
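
To make the failure chain concrete, here is a simplified model of a hard need dependency. It is not OpenRC's implementation; start_with_needs and the *_start functions are hypothetical stand-ins, and savecache_start is hard-wired to fail as if the clock-skew check had tripped.

```shell
#!/bin/sh
# Simplified model of a hard "need" dependency; not OpenRC's implementation.
savecache_start() { return 1; }   # pretend the clock-skew check failed
killprocs_start() { return 0; }   # always succeeds, as noted above

ran=""
# Start service $1 only if every service it needs started successfully.
start_with_needs() {
    svc=$1; shift
    for dep in "$@"; do
        if ! "${dep}_start"; then
            echo "$svc: not started, needed service $dep failed"
            return 1
        fi
    done
    ran="$ran $svc"
    echo "$svc: started"
}

if start_with_needs mount-ro killprocs savecache; then
    mount_ro_status=0
else
    mount_ro_status=1
fi
```

In this model mount-ro is never reached, which mirrors the situation described above: one failed needed service is enough to skip the dependent service.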

If you look at line 11 of savecache.in, whether or not it returns a failure is controlled by a conditional that will never be false.

Here are my questions:

1) Why do we have a conditional that is never false? Were you going to make that user configurable at some point?
2) Is the need dependency in mount-ro too strong for this situation? Should it possibly be after instead?
Comment 4 William Hubbs gentoo-dev 2011-03-06 18:03:11 UTC
I spoke with our base-system lead about this, and the approach we came up with was to have savecache always report success when the system is in the process of shutting down. I implemented this in git, commit 8730248.
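
The behaviour described here could be sketched as follows. This is not the code from commit 8730248; SHUTTING_DOWN and CLOCK_SKEWED are illustrative variables chosen for this example, and the sketch only models the return status, not whether the cache is actually written.

```shell
#!/bin/sh
# Hypothetical stand-ins: SHUTTING_DOWN and CLOCK_SKEWED are illustrative
# variables, not names taken from the OpenRC source or from the commit.
savecache_start() {
    if [ "$CLOCK_SKEWED" = "yes" ]; then
        echo "WARNING: clock skew detected"
        # During shutdown, report success so dependent services still run.
        [ "$SHUTTING_DOWN" = "yes" ] && return 0
        return 1
    fi
    echo "cache saved"
    return 0
}

CLOCK_SKEWED=yes
SHUTTING_DOWN=yes
if savecache_start; then shutdown_rc=0; else shutdown_rc=1; fi
SHUTTING_DOWN=no
if savecache_start; then normal_rc=0; else normal_rc=1; fi
echo "shutdown: $shutdown_rc, otherwise: $normal_rc"
```

Outside of shutdown the failure is still reported, so a failing service still returns a failure in the normal case.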