Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 597390 - sys-apps/openrc (Silently fails when POSIX-File-locking) is not available/working
Summary: sys-apps/openrc (Silently fails when POSIX-File-locking) is not available/wor...
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Hosted Projects
Classification: Unclassified
Component: OpenRC (show other bugs)
Hardware: All Linux
: Normal normal (vote)
Assignee: OpenRC Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-10-17 21:40 UTC by Sven E.
Modified: 2016-11-01 22:40 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
Screenshot of failures after patch (VirtualBox_Gentoo_24_10_2016_23_55_25.png,27.02 KB, image/png)
2016-10-24 22:47 UTC, Sven E.
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Sven E. 2016-10-17 21:40:48 UTC
When openrc tries to lock files and locking fails, openrc fails to run the services and bring up the system. Unfortunately openrc will stay absolutely mute about this.

Reproducible: Always

Steps to Reproduce:
1. Try runnign openrc on a system without filelocking.
2. openrc will not start any services from sysinit/boot etc.

Actual Results:  
Failure to bring up the system and no information on what's happening

Expected Results:  
Ideally the ability to bring up the system without locking, but at least some sort of error message or information what's going wrong and hints on what to do.

Additional information will be following up
Comment 1 Sven E. 2016-10-17 22:20:42 UTC
in rc-misc.c svc_lock():

        if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
                close(fd);
                return -1;
        }

Since the function is supposed to return a fd, the specific failure of locking can only be addressed directly in this function:

if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
   if (errno==ENOSYS) eerror("POSIX Locking support is missing, please check your kernel config\n");
   else eerror("Locking failed with errno: %d\n", errno);
   close(fd);
   return -1;
}

(I think I read something about eerror otherwise use some other output routine)

in exec_service():

        fd = svc_lock(basename_c(service));
        if (fd == -1)
                return -1;

As one can see, if svc_lock fails, the functions returns -1, to signal an error, no further info here either.

And in rc.c do_start_services():

                pid = service_start(service->value);
                /* Remember the pid if we're running in parallel */
                if (pid > 0) {
                        add_pid(pid);
                        if (!parallel) {
                                rc_waitpid(pid);
                                remove_pid(pid);
                        }
                }
        }

The error case is not even checked when pid==-1.

Since during the calls different returns value types are in use it seems impossible to pass the error upwards.

And of course pid == -1 should be handled somehow, instead of silently ignoring it. (Other things might go wrong aswell)
Comment 2 William Hubbs gentoo-dev 2016-10-24 18:01:50 UTC
(In reply to Sven E. from comment #1)
> in rc-misc.c svc_lock():
> 
>         if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
>                 close(fd);
>                 return -1;
>         }
> 
> Since the function is supposed to return a fd, the specific failure of
> locking can only be addressed directly in this function:
> 
> if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
>    if (errno==ENOSYS) eerror("POSIX Locking support is missing, please check
> your kernel config\n");
>    else eerror("Locking failed with errno: %d\n", errno);
>    close(fd);
>    return -1;
> }
> 
> (I think I read something about eerror otherwise use some other output
> routine)
> 
> in exec_service():
> 
>         fd = svc_lock(basename_c(service));
>         if (fd == -1)
>                 return -1;
> 
> As one can see, if svc_lock fails, the functions returns -1, to signal an
> error, no further info here either.

I think these are both covered by making svc_lock report an error when flock fails:

https://github.com/openrc/openrc/commit/4fd144c

> And in rc.c do_start_services():
> 
>                 pid = service_start(service->value);
>                 /* Remember the pid if we're running in parallel */
>                 if (pid > 0) {
>                         add_pid(pid);
>                         if (!parallel) {
>                                 rc_waitpid(pid);
>                                 remove_pid(pid);
>                         }
>                 }
>         }
> 
> The error case is not even checked when pid==-1.
> 
> Since during the calls different returns value types are in use it seems
> impossible to pass the error upwards.
> 
> And of course pid == -1 should be handled somehow, instead of silently
> ignoring it. (Other things might go wrong aswell)

service_start() is a macro that calls exec_service(), which seems to have error messages for all cases of the pid return value. Can you confirm this?
Comment 3 Sven E. 2016-10-24 22:47:52 UTC
Created attachment 451356 [details]
Screenshot of failures after patch

I am not sure about what I should confirm, however, I think the error handling in general is a little bit broken.

If flock()ing is not supported and openrc insists on using it, the error needs to be considered fatal. No need in trying further flock()ing, spilling more message onto the screen, letting init switch runlevels etc. .

On the other hand, a service is not started, even when flock()ing fails only temporarily.

Considering the non handling of PID=-1 - As I see it, -1 is returned if either flock()ing or fork()ing fails. While there are error messages (now), if PID=-1 is a (fatal) error situation in start_service (exec_service for that matter) then apropriate error handling sould (probably) tell the user that something fatal has happened and then bail out immediately. At least for starting that would make more sense, during stopping/shutdown it seems legit to keep going, no matter what.

Just my 2 cents. Attached a screenshot from my VM for convenience.
Comment 4 Sven E. 2016-10-25 14:29:19 UTC
I currently can't push the patch in as attachment, but I am suggesting the following change:

--- openrc-0.22.2-orig/src/rc/rc.c	2016-10-06 18:55:31.000000000 +0200
+++ openrc-0.22.2-new/src/rc/rc.c	2016-10-25 16:08:27.353334440 +0200
@@ -683,6 +683,7 @@
 		}
 
 		pid = service_start(service->value);
+		if (pid == -1) break;
 		/* Remember the pid if we're running in parallel */
 		if (pid > 0) {
 			add_pid(pid);

This will cut the error messages down to 3 (one for sysinit, boot, default), as openrc gives up on all the other services in the (same) runlevel. As long as errors during lock and fork are fatal we can aswell give up, I think.
Comment 5 William Hubbs gentoo-dev 2016-11-01 22:40:54 UTC
https://github.com/openrc/openrc/commit/be06cd2

This bails if exec_service returns an error. it will be included in
0.23.

Thanks for the report and suggested changes.

William