The updated ebuild for this package installed a new version of the xenstore init script which contains this fragment of code: until xenstore-exists / do echo -n . sleep .5 done While this is fine when booting with xen it results in an infinite loop in the init script when booting without xen. Reproducible: Always Steps to Reproduce: 1. Install xen-tools 2. Boot system without Xen 3. Wait forever to get a login prompt
thx Spooky. I think I missed by way of a single . on applying a patch in a prior bug. Try with sleep 5 not sleep .5 20 Jul 2013; Ian Delaney <idella4@gentoo.org> +files/xen-4.3-fix_dotconfig-gcc.patch, +files/xen-4.3.0-anti-download.patch, +xen-tools-4.3.0.ebuild, files/xen-4-fix_dotconfig-gcc.patch, files/xen-4.2.0-anti-download.patch, files/xenstored.initd: bump; required culling sec patches, edit to configure and upgrading 2 patches, correction to xenstored.initd wrt Bug #476572 by Spooky Ghost
The length of the sleep is not important. When you do not boot with the xen hypervisor, i.e. bare-metal Linux, then xenstore does not start. In this case you are stuck within this until statement when start() is called as xenstore-exists will never be true which prevents a boot of the system.
Perhaps in start() you need something like: is_privileged_domain() { grep -qsE '^control_d$' /proc/xen/capabilities return $? } start() { ... if is_privileged_domain ; then until xenstore-exists / do ... done else echo "Xenstore cannot be started, not a Xen dom0" return 1 fi ... }
(In reply to Spooky Ghost from comment #0) > > Steps to Reproduce: > 1. Install xen-tools > 2. Boot system without Xen > 3. Wait forever to get a login prompt While I can confirm that there *is* a potential for problems here, I cannot resist pointing out that the above steps do not reproduce the infinite loop behavior. For that to occur, one needs to also perform the 'shoot-myself-in-the-foot' step of rc-update add xenstored default Let's just say this is not exactly the smartest thing to do if one intends to boot both with and without xen, but I'd certainly agree that fixing an initscript is much easier than fixing PEBCAK issues. Just for the record, I'd recommend the following approach (that also works with the current initscript): mkdir /etc/runlevels/xen rc-update -s default xen rc-update add xenstored xen Then, append 'softlevel=xen' to the linux kernel command line for the entry corresponding to booting *with* Xen in the bootloader configuration.. Now, having gotten that out of my system, let's return to fixing the initscript.. I'm no Xen guru; so, I could easily be wrong, but I thought that to support "disaggregation" xenstored could also be run in a non-privileged domain, in which case checking for dom0 may not be desirable. Nevertheless, the error I'm getting from xenstored when starting non-Xen *is* about 'privileged command interface'; so, let's just keep the dom0 assumption. Then, I'd prefer to check for the existence of /proc/xen/privcmd (which *is* the "privileged command interface" libxc is trying to open [see bove]) instead of peeking inside /proc/xen/capabilities. I'd also place the check as the very first thing in 'start()' to avoid unnecessary work and to get rid of that ugly error message mentioned above. I'll attach a proposed patch.
Created attachment 354086 [details, diff] bail out if not in dom0
Please ignore this patch. I've just gotten around to testing it in a domU, and -of course- /proc/xen/privcmd exists there as well. I'm thinking that making the waiting conditional on that xenstored proper had started successfully is probably most fool-proof (also accounting for the brain-dead corner cases of disabling xenfs or the /sys/hypervisor entries in the kernel configuration). The only problem is that xenstored has 0 exit status even on error. (--somewhat surprisingly, considering that 'barf' ends with exit(1); but that's hit _after_ daemonize with a few exit(0) calls..) So, kludgily parsing xenstored's standard error would seem necessary. Not a nice option. Perhaps the easiest fix is to not wait more than 15 seconds total, which falls back to a behaviour largely equivalent to the original sleep 15 command. Patch coming soon..
Created attachment 354158 [details, diff] Restore 'sleep 15' behaviour on non-dom0 systems Alternatively, the dom0 check based on /proc/xem/capabilites proposed by Sppoky Ghost could also be considered.. --- /etc/init.d/xenstored.orig 2013-07-24 14:10:12.678000000 +0200 +++ /etc/init.d/xenstored 2013-07-24 14:12:29.456000000 +0200 @@ -8,6 +8,7 @@ } start() { + [ -r /proc/xen/capabilities ] && [ 'control_d' != "$(</proc/xen/capabilities)" && ewarn "Not a Xen dom0." && return 1 ! [ -x /run/xen ] && mkdir -p /run/xen ebegin "Starting xenstored daemon" start-stop-daemon --start --exec /usr/sbin/xenstored \
(In reply to Al Mighty from comment #4) > While I can confirm that there *is* a potential for problems here, > I cannot resist pointing out that the above steps do not reproduce > the infinite loop behavior. For that to occur, one needs to also > perform the 'shoot-myself-in-the-foot' step of > Big Al, I like it, the metaphor and turn of phrase in general. > rc-update add xenstored default > So Spooky, thou shouldst unset rc-update xenstored default, it seems > Let's just say this is not exactly the smartest thing to do if one > intends to boot both with and without xen > Let's > Now, having gotten that out of my system, > let's return to fixing the initscript.. > (In reply to Al Mighty from comment #7) > Created attachment 354158 [details, diff] [details, diff] > Restore 'sleep 15' behaviour on non-dom0 systems > > Alternatively, the dom0 check based on /proc/xem/capabilites > proposed by Sppoky Ghost could also be considered.. > > --- /etc/init.d/xenstored.orig 2013-07-24 14:10:12.678000000 +0200 > +++ /etc/init.d/xenstored 2013-07-24 14:12:29.456000000 +0200 > @@ -8,6 +8,7 @@ > } > > start() { > + [ -r /proc/xen/capabilities ] && [ 'control_d' != > "$(</proc/xen/capabilities)" ]; && ewarn "Not a Xen dom0." && return 1 > ! [ -x /run/xen ] && mkdir -p /run/xen > ebegin "Starting xenstored daemon" > start-stop-daemon --start --exec /usr/sbin/xenstored \ The luxury of choice. Firstly, Spooky ghost can you report as to whether you actually did the rc-update add xenstored xen or not. Secondly, once verified, it seems the eternal loop is an unfortunate accident that occurred because it could. Thirdly, while I find the || ((15 < ++i)) somewhat equiv. to what we started with (a hard coded sleep 15), I'll go for humouring bug Al and try it, EXCEPT I believe "/ || ((15 < ++i))" need be "|| ((15 < ++i)) \". So; 25 Jul 2013; Ian Delaney <idella4@gentoo.org> files/xenstored.initd: Edit to xenstored.initd wrt Bug #476572 & #475204 Let's get some test runs and reports.
(In reply to Ian Delaney from comment #8) > Thirdly, while I find the || ((15 < ++i)) somewhat equiv. to what we > started with (a hard coded sleep 15), I'll go for humouring bug Al and try > it, EXCEPT I believe "/ || ((15 < ++i))" need be "|| ((15 < ++i)) \". Actually, no. Try this tokenization 'xenstore-exists /' || '((15 < ++i))'. The xenstore-exists command wants a path argument and '/' was shortest… (-; BTW, the result is pretty ugly: 17+ error messages and a 15-second wait. ----- 8< ----- 8< ----- 8< ----- 8< ----- 8< ----- 8< ----- # /etc/init.d/xenstored start * Caching service dependencies ... [ ok ] * Starting xenstored daemon ... xc: error: Could not obtain handle on privileged command interface (2 = No such file or directory): Internal error FATAL: Failed to open connection to hypervisor: No such file or directory xenstore-exists: couldn't start transaction .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory .xenstore-exists: xs_open: No such file or directory * Setting domain0 name record xenstore-write: xs_open: No such file or directory [ !! ] * ERROR: xenstored failed to start ----- 8< ----- 8< ----- 8< ----- 8< ----- 8< ----- 8< ----- On the other hand, this "only" inconveniences those that run /etc/init.d/xenstored on non-dom0 systems. (-; Doing a dom0-check first would be esthetically much more pleasing...
(In reply to Al Mighty from comment #9) > (In reply to Ian Delaney from comment #8) > > Thirdly, while I find the || ((15 < ++i)) somewhat equiv. to what we > > started with (a hard coded sleep 15), I'll go for humouring bug Al and try > > it, EXCEPT I believe "/ || ((15 < ++i))" need be "|| ((15 < ++i)) \". > > Actually, no. Try this tokenization 'xenstore-exists /' || '((15 < ++i))'. > The xenstore-exists command wants a path argument and '/' was shortest… (-; > Hmm Big Al, you're right. However, can you answer these 2? 1. the result is pretty ugly: 17+ error messages and a 15-second wait. Are these non-fatal and merely ugly? 2. Your alternative does need a cleanup. I'd go for; if [ -r /proc/xen/capabilities ] && [[ 'control_d' != "$(</proc/xen/capabilities)" ]]; then echo "Not a Xen dom0." return 1 fi a) 'control_d' != warrants the double brackets since the values are being compared, and you missed closing the single bracket @ ')" && ewarn' b) ewarn is an ebuild portage wrapper not a bashism, so it ought be plain echo. I won't change it for now seeing you appear on-line now and it needs changing again at any rate
Had to commit a change for https://bugs.gentoo.org/show_bug.cgi?id=478064 anyway so change of mind, corrected the loop
(In reply to Ian Delaney from comment #10) > 1. the result is pretty ugly: 17+ error messages and a 15-second wait. Are > these non-fatal and merely ugly? In the sanse that nothing actually breaks, they are non-fatal. Although xenstored insists on printing "FATAL", as you can see in Comment#9. > 2. Your alternative does need a cleanup. I'd go for; > if [ -r /proc/xen/capabilities ] && [[ 'control_d' != > "$(</proc/xen/capabilities)" ]]; then > echo "Not a Xen dom0." > return 1 > fi Sure, that looks more proper. > a) 'control_d' != warrants the double brackets since the values are being > compared, and you missed closing the single bracket @ ')" && ewarn' I don't think you need double brackets. (No 'pattern mathing' going on, just plain string compare). My mistake was adding a ';' after ']' before '&& ewarn'. > b) ewarn is an ebuild portage wrapper not a bashism, so it ought be plain > echo. I put 'ewarn' on purpose (to get the fancy colour warning ;-)), but runscript will report that xenstored failed to start, anyway; so, a plain echo will also work.
portage wants the POSIX compliant, however let's await Spook Ghost to reply and test the other if needed.
I'm not that well-versed in POSIX; so, this all could be wrong, but for compliance, the following _may_ also need to be changed: ((15 < i)) to [ 15 -lt $i ] and $(</proc/xen/capabilities) to $(cat /proc/xen/capabilities) As for the ugly error messages, adding 'exec 2> /dev/null' as the first statement in 'start()' will make them go away, (-; except that one might miss other important error messages in other situations.. I'd certainly opt for redirecting stderr for the 'xenstore-exist /' command, though (i.e. change from the current version to 'until xenstore-exists / 2> /dev/null || [ 15 -lt $i ]').
Created attachment 354722 [details, diff] Typos fixed, redirection added
(In reply to Al Mighty from comment #15) > Created attachment 354722 [details, diff] [details, diff] > Typos fixed, redirection added hasufell can you run the adjusted patch here thru your POSIX checker and inform us if any edit required?
> hasufell can you run the adjusted patch here thru your POSIX checker and > inform us if any edit required? seems I needed to Save first with updated CC list before bothering to add this
I believe this problem is already fixed in later versions.. try 4.2.4, or 4.3.x closed, and feel free to re-open if still have problem, thanks