Bug 157547 - sys-process/cronbase: run-crons: any job can starve global lock indefinitely preventing any new job from starting
Summary: sys-process/cronbase: run-crons: any job can starve global lock indefinitely ...
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Other
Importance: Highest critical with 1 vote
Assignee: Cron Team
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 555756 169449
Reported: 2006-12-08 16:18 UTC by Radoslaw Stachowiak (RETIRED)
Modified: 2015-07-24 05:53 UTC (History)

See Also:
Package list:
Runtime testing required: ---


Attachments
First draft of patch for per-base lockfiles (run-crons-0.3.3.patch,2.96 KB, patch)
2011-07-18 23:22 UTC, Dan Wallis

Description Radoslaw Stachowiak (RETIRED) gentoo-dev 2006-12-08 16:18:24 UTC
01:11 <@radek> looks like that our /usr/sbin/run-crons does locking in a way that if for example anything from
               cron.daily/* runs longer than 60 minutes, then subsequent cron.hourly/* scripts are not being run.
01:12 <@radek> in other words, as long as cron.daily runs, no other cron.hourly (weekly or monthly) will be run. this
               applies to all of them.
01:12 <@radek> should we consider it a bug?
01:12 <@radek> or feature? or maybe my conclusion is wrong?
01:12 <@beu> i would say that this is definitely bug worthy.
01:13 <@radek> for me it is. it's caused by fact that locking is global for whole run-crons while it should be
               separated for all classes (hourly,weekly, etc).
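
A minimal sketch of the difference between one global lock and one lock per class, for illustration only. The `mkdir`-based locking, `LOCKDIR`, and all function names here are assumptions made for the demo; they are not taken from run-crons itself.

```shell
#!/bin/sh
# Hypothetical demo: LOCKDIR and every function name below are illustrative
# assumptions, not code from /usr/sbin/run-crons.
LOCKDIR=$(mktemp -d)

# mkdir is atomic, so it fails if the lock already exists.
acquire() { mkdir "$LOCKDIR/$1" 2>/dev/null; }
release() { rmdir "$LOCKDIR/$1"; }

# Global scheme: every class maps to the same lock name.
global_lock_name() { echo "lock"; }
# Per-class scheme: the lock name is derived from the class.
class_lock_name() { echo "lock.$1"; }

# Simulate a daily job that still holds its lock while an hourly run starts.
try_hourly() {
    scheme=$1
    acquire "$($scheme daily)" || return 2
    if acquire "$($scheme hourly)"; then
        result="hourly runs"
        release "$($scheme hourly)"
    else
        result="hourly skipped"
    fi
    release "$($scheme daily)"
    echo "$result"
}

try_hourly global_lock_name   # prints "hourly skipped"
try_hourly class_lock_name    # prints "hourly runs"
rmdir "$LOCKDIR"
```

With the shared lock name the hourly run finds the lock taken by the daily run and gives up; with per-class names the two runs never contend.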
Comment 1 Radoslaw Stachowiak (RETIRED) gentoo-dev 2007-12-25 22:14:11 UTC
Guys, any progress on it ?
Comment 2 SpanKY gentoo-dev 2007-12-27 21:31:15 UTC
i had rewritten it on my machine when you bugged me on irc, but my src hard drive crashed taking the changes with it :/
Comment 3 Radoslaw Stachowiak (RETIRED) gentoo-dev 2009-01-25 00:37:34 UTC
Ouch, I missed two year anniversary ;-)
Comment 4 Thilo Bangert (RETIRED) (RETIRED) gentoo-dev 2009-04-13 20:43:31 UTC
vapier: could you outline your new implementation? maybe i'll get around to code it...
Comment 5 Wolfram Schlich (RETIRED) gentoo-dev 2009-11-13 13:06:44 UTC
ping
Comment 6 Radoslaw Stachowiak (RETIRED) gentoo-dev 2009-12-23 16:40:11 UTC
Guys, this is a CRITICAL bug. It results in ANY script in cron.hourly not being executed as long as a script from cron.daily is being executed (and takes longer than an hour). Other cross-locking issues are also caused by this bug, because there is a SINGLE lock for every type of cron (hourly, daily, weekly, monthly).

This problem is generating _many_ different and hard-to-debug problems because scripts in cron are NOT being run, while there is nothing wrong with them.

No one is attributing those problems to this bug, because it's hard to debug.

Comment 7 Dan Wallis 2011-07-18 23:22:26 UTC
Created attachment 280333 [details, diff]
First draft of patch for per-base lockfiles

Would something like this be suitable? Am open to suggestions/comments. :)
Comment 8 Leho Kraav (:macmaN @lkraav) 2014-06-07 05:55:40 UTC
Hmm, indeed, I had a cron job unexpectedly hang on me, which resulted in no cron jobs being run for two weeks, until I noticed something was up because the normal e-mails were not coming in anymore.

I liked the silence :) but obviously some sort of longer term dysfunctionality notification mechanism would be nice at the very least.
Comment 9 SpanKY gentoo-dev 2015-07-22 08:13:30 UTC
doing per-dir locks really changes the problem from "jobs can starve all dirs indefinitely" to "jobs can starve their own class indefinitely".  so if you screw up cron.daily, instead of blocking all of hourly/daily/weekly/monthly, you block cron.daily forever.

checking Ubuntu, they have the same issue: they have a separate job line for each cron.xxx dir (it just runs `run-parts` on it), so if one cron.hourly script gets hung up, then no more cron.hourly jobs will run.  but cron.daily and such get to keep running independently.

if anacron is in use for launching the cron.xxx jobs, it has the same issue.  no timeout is applied and they just let each category run forever.

Fedora appears to behave the same here -- they use anacron w/out timeouts.

adding per-category locks should be cheap and at least let us have parity with other distros.  we can then look at adding a timeout option where every script is run through `timeout` and with a value befitting its category.
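
The two steps described above (a lock per category, then each script run through `timeout`) could look roughly like the sketch below. It is only an illustration under stated assumptions: `run_class`, `LOCKDIR`, the `mkdir`-based lock, and the example limit are hypothetical and are not the code that went into cronbase. It assumes the GNU coreutils `timeout`, which exits with status 124 when the limit is reached.

```shell
#!/bin/sh
# Hypothetical sketch: run_class, LOCKDIR and the mkdir-based lock are
# illustrative assumptions, not the committed run-crons code.
LOCKDIR=${LOCKDIR:-$(mktemp -d)}

# run_class DIR CLASS LIMIT_SECONDS
# Runs every executable file in DIR under a lock that belongs to CLASS only,
# so a stuck class can block itself but not the other classes.
run_class() {
    dir=$1 class=$2 limit=$3
    lock="$LOCKDIR/lock.$class"

    # Per-category lock: mkdir is atomic and fails if the lock exists.
    if ! mkdir "$lock" 2>/dev/null; then
        echo "$class: already running"
        return 1
    fi

    for script in "$dir"/*; do
        # Skip anything that is not an executable regular file.
        [ -f "$script" ] && [ -x "$script" ] || continue
        # GNU timeout exits with 124 when the time limit is hit.
        timeout "$limit" "$script"
        if [ $? -eq 124 ]; then
            echo "$class: $script timed out"
        fi
    done

    rmdir "$lock"
}

# Example call (the 3000-second limit is an arbitrary assumption):
#   run_class /etc/cron.hourly hourly 3000
```

The limit is passed in per call so that each category can get its own value; a hung script is then killed and the lock is released instead of being held forever.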
Comment 10 SpanKY gentoo-dev 2015-07-24 05:45:13 UTC
should be all set now in the tree; thanks for the report!

Commit message: Split global lock up into one lock per /etc/cron.xxx dir
http://sources.gentoo.org/sys-process/cronbase/cronbase-0.3.7.ebuild?rev=1.1
http://sources.gentoo.org/sys-process/cronbase/files/run-crons-0.3.7?rev=1.1