Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 138792 - dobin etc. should automatically die on failure
Summary: dobin etc. should automatically die on failure
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Core - Ebuild Support (show other bugs)
Hardware: All Linux
: High enhancement (vote)
Assignee: PMS/EAPI
URL:
Whiteboard: in-eapi-4
Keywords:
: 211000 224279 (view as bug list)
Depends on:
Blocks: future-eapi
  Show dependency tree
 
Reported: 2006-07-01 17:50 UTC by Paul Bredbury
Modified: 2010-12-30 19:53 UTC (History)
12 users (show)

See Also:
Package list:
Runtime testing required: ---


Attachments
ebuild.sh.diff (ebuild.sh.diff,3.02 KB, patch)
2006-07-01 17:51 UTC, Paul Bredbury
Details | Diff
ebuild.sh.doexe.eerror.patch (ebuild.sh.doexe.eerror.patch,3.34 KB, text/plain)
2006-09-01 05:46 UTC, Stefan Schweizer (RETIRED)
Details
traps.bash (traps.bash,541 bytes, text/plain)
2008-09-09 19:52 UTC, David Leverton
Details
trap example from bug 211000, with && and || demonstrations (trap_err.sh,193 bytes, text/plain)
2008-11-05 22:18 UTC, Zac Medico
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Paul Bredbury 2006-07-01 17:50:37 UTC
Hi, I propose that the following ebuild commands themselves *die* on failure, because the vast majority of the time the emerge might as well be considered a failure if such commands fail.

 dobin newbin  
dosbin newsbin
 doexe newexe

Other potentials for this hard-line treatment are dolib{,.a,.so} - maybe even doins.

Enclosed is a patch to add the functions to ebuild.sh, replacing the dobin etc. bash files. Unfortunately, it may cause some badly-written ebuilds to break :)

Ebuilds in the current Portage tree sometimes include "|| die", and sometimes don't, which is a bad lack of standardization.
Comment 1 Paul Bredbury 2006-07-01 17:51:21 UTC
Created attachment 90655 [details, diff]
ebuild.sh.diff
Comment 2 Alec Warner (RETIRED) archtester gentoo-dev Security 2006-07-06 18:54:23 UTC
Enforcing this in ebuild.sh is not appropriate.  If you want ebuild standards, talk to the dev community; I'm sure ebuild developers can list of dozens of examples where having these functions die automagically is a bad idea.  If it is a failure condition the ebuild should enforce it, not portage.
Comment 3 Daniel Black (RETIRED) gentoo-dev 2006-07-07 01:10:56 UTC
(In reply to comment #2)
> Enforcing this in ebuild.sh is not appropriate.  If you want ebuild standards,
> talk to the dev community;

Paul did seek advice from a number of devs in #gentoo-dev-help. This wasn't expressed here however it did occur to the extent that this bug was lodged with the support of some devs and the disapproval of none. If further input is needed I suggest referring the gaining of concensus to the gentoo-dev email list before resolving as a WONTFIX.

> I'm sure ebuild developers can list of dozens of
> examples where having these functions die automagically is a bad idea.

If I do a doexe ... in an ebuild I expect the file to exist and for it to be installed otherwise I wouldn't have put the statement in. If the file may not exist I'd do "[ -f xxx ] && doexe xxx".

I can't think of a single good reason to not die automatically. Maybe others can, but I can't.

I think a better way do disapprove of a bug is to actually state a good example where it isn't appropriate rather than loose statements starting with "I'm sure [other person(s)] [expresses opinion x]".

>  If it
> is a failure condition the ebuild should enforce it, not portage.

like unpack?

It seems like Marius has fixed most, not sure about all, the do* functions. Thanks Marius.
Comment 4 Zac Medico gentoo-dev 2006-07-07 02:14:49 UTC
I'm the type that likes plenty of feedback when something goes wrong and I've always thought that it is annoying to have lots of checks for return values in shell code.  A few months ago I was approached by betelgeuse with a similar idea.  I thought it was a nice idea then, and I still do now.
Comment 5 Jason Stubbs (RETIRED) gentoo-dev 2006-07-07 21:21:12 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > I'm sure ebuild developers can list of dozens of
> > examples where having these functions die automagically is a bad idea.
> 
> If I do a doexe ... in an ebuild I expect the file to exist and for it to be
> installed otherwise I wouldn't have put the statement in. If the file may not
> exist I'd do "[ -f xxx ] && doexe xxx".
>
> I can't think of a single good reason to not die automatically. Maybe others
> can, but I can't.

Consistency. While I can't think of a case where do*{bin,exe} should fail, there are many people against enforcing dying in the almost identical doman and dodoc. If you can convince them that "[ -f xxx ] && dodoc xxx" is a good thing (which I think it is) I'd be in support of this change in all do* functions.
Comment 6 Paul Bredbury 2006-07-08 02:32:36 UTC
As a compromise, dodoc could just ewarn if the file doesn't exist, rather than die.
Comment 7 Stefan Schweizer (RETIRED) gentoo-dev 2006-07-08 02:51:10 UTC
The difference is that documentation and manuals are not vital for the package to work. I suggest for dodoc and doman:
QA Notice: File 'haha' does not exist for dodoc

Dieing is only needed for dobin and doexe, to ensure that all executables are installed correctly and the package works as expected.
Comment 8 Jason Stubbs (RETIRED) gentoo-dev 2006-07-08 03:47:12 UTC
(In reply to comment #7)
> The difference is that documentation and manuals are not vital for the package
> to work.

That really depends on the application. A quick example would be most of the "applications" in app-doc/ which specifically use "do* || die".

My point is that whatever is done, consistency should be the main aim. Right now, econf will die if configure fails but it is encouraged that ebuilds specify the "|| die". emake on the other hand doesn't die. Depending on the package, the success of emake is just as important as that of dobin or dodoc. 

It seems that originally it was a case of "power to the people" in that there was no automatic dying(sp?) on failures in ebuild support commands. It's slowly moving to a case of what some might consider hand-holding. I personally view it as enforced explicitness. If every ebuild support command dies on failure, an ebuild author has to explicitly check and allow for error conditions that are okay.

Whether you view executables to be more important than documentation, symlinks, libraries or the host of other types of files of which there are helper functions/scripts is pretty much irrelevant here IMHO. Why was it that you forgot to include the "|| die"? Could it be that it was because you knew that econf and some other helper functions will automatically die for you? Iow, was it a lack of inconsistency that led you to the erroneous omission?

Changing the behaviour is going to break packages (in stable mind you) that were working before. If the behaviour is really going to be changed, do it all in one sweep so as to keep the number of occurences low and raise awareness but most importantly make it consistent so that it isn't so easy to become confused.
Comment 9 Jason Stubbs (RETIRED) gentoo-dev 2006-07-08 03:53:17 UTC
(In reply to comment #8)
> it a lack of inconsistency that led you to the erroneous omission?

s/inconsistency/consistency/

Thinking about this, the other camp would say that there is already perfect consistency. While portage does die on some error conditions in helper scripts, some would say that there is no guarantee that it will die for you. Following that line of thought, it is already completely consistent in that a developer must specifically state every error condition that should cause failure - remember that, even though econf will die for you, policy still states econf should be followed by "|| die".

Either way, make/keep it consistent.
Comment 10 SpanKY gentoo-dev 2006-07-08 06:54:24 UTC
i'm really with Jason on this one

the thing that irks me about the econf die is that there's no way for me to customize the die message ... most of the time i dont want to, but sometimes i use econf more than once in an ebuild so i want a different die message for each one, but i cant since portage calls die for me

the do* funcs however do allow me to go ahead and change the die messages since my version is the one being called, not portage's
Comment 11 Paul Bredbury 2006-07-09 04:27:24 UTC
This is not "hand-holding", it is normal error-handling as in *other* procedural languages which weren't designed such a long time ago. Bash is stuck in the dark ages with the "An error occurred? Who cares?" approach that is evident in bash code. Safeguards can and should be written into the code, with if..then blocks.

Consistency is obviously desirable, but *sensible behaviour* is much more important. There is already inconsistency, since "unpack" dies and "dodoc" doesn't.

It's impractical to make all the do* commands die on failure, and thus break a load of sloppily-written ebuilds immediately. What *is* practical is to phase in this approach, starting with the executables. dodoc should "ewarn" or "eerror" rather than "echo" (and have its error message scroll up the screen, out of sight, in the blink of an eye).

What's missing from this discussion is a single example of a doexe/dobin which could reasonably fail, without it being either the result of a sloppily-written ebuild, or a situation in which the emerge should fail anyway. Until someone can provide such an example, then it's pretty clear that the default should be to die, because currently a lot of ebuilds *don't* die themselves - now that *is* sloppy.


A custom message on dying can easily be implemented, by adding these lines to the top of diefunc() in ebuild.sh:

	if [[ -n "${DIE_MSG}" ]] ; then
		echo "!!! ${DIE_MSG}"
	fi

Then, in an ebuild, you could use:

DIE_MSG="your custom die message here" doexe foo
Comment 12 Simon Stelling (RETIRED) gentoo-dev 2006-07-09 10:14:17 UTC
(In reply to comment #11)

Paul, I wholeheartedly agree with you. "Consistency" is cool and such, but it doesn't make sense at all to assume that a failing doexe is not a very good reason to fail.
Comment 13 Zac Medico gentoo-dev 2006-07-13 15:24:54 UTC
In order to make this migration smooth, we need to update the patch to use eerror instead of die until we're confident that most ebuilds will behave.  The elog framework will make it easy for users to collect the necessary data and file bugs.
Comment 14 Zac Medico gentoo-dev 2006-07-13 15:28:15 UTC
I forgot to mention that the eerror call should use BASH_SOURCE and BASH_LINENO to provide the exact source file and line number of the error (similar to the dump_trace function in ebuild.sh).
Comment 15 Stefan Schweizer (RETIRED) gentoo-dev 2006-09-01 05:46:09 UTC
Created attachment 95641 [details]
ebuild.sh.doexe.eerror.patch

errors instead of dying - with call trace.
Comment 16 Peter Volkov (RETIRED) gentoo-dev 2006-09-22 03:31:18 UTC
Is there any resolution here?

I've reread thread on -dev and this bug but still could not find what shall I do.
If I understood sources correctly currenly dobin, dodoc, doman, dosbin and dolib print error message if files does not exist. But some developers still think that do* implementation should do nothing for nonexistent files.

Personally I like all this functions to die in future, but Is it what we all decided on? When?
Comment 17 Zac Medico gentoo-dev 2006-09-22 10:57:33 UTC
Like Jason, I think that consistency should be a major goal of a migration such as this.  This type of change has such a large impact on the tree that I think it should be made into a GLEP.  It's possible that we could make the strict behavior optional via a magic environment variable.  If the strict behavior isn't optional, then it should probably be part of an EAPI bump.  Other than the the strict behavior, we also need to keep in mind that helper functions (as opposed to helper scripts) are incompatible with xargs.
Comment 18 Zac Medico gentoo-dev 2008-02-21 17:21:18 UTC
*** Bug 211000 has been marked as a duplicate of this bug. ***
Comment 19 Ranjit Singh 2008-06-13 01:23:08 UTC
(In reply to comment #17)
> Like Jason, I think that consistency should be a major goal of a migration such
> as this.  This type of change has such a large impact on the tree that I think
> it should be made into a GLEP.  It's possible that we could make the strict
> behavior optional via a magic environment variable.  If the strict behavior
> isn't optional, then it should probably be part of an EAPI bump.

How about adding an --or-die optional parameter (has to go first)? (Logically, could use -x or anything else.)

>  Other than
> the the strict behavior, we also need to keep in mind that helper functions (as
> opposed to helper scripts) are incompatible with xargs.
> 
That's a bit more of a pain (same for find .. -exec.) I do think there's a good case for making these helper "scripts" into internal functions, since most of them are simple one-liners. Can we not simply provide both? That way the internal function would get called unless it was from a find (or some other xargs) invocation, with consequent speed-up.

wrt skipping non-existent files, that makes sense if it's coming from an eclass; the relevant var could be overwritten by the ebuild dev, if it's not suitable, but it sounds like more maintenance headache to me.
Comment 20 Peter Volkov (RETIRED) gentoo-dev 2008-09-09 14:23:19 UTC
*** Bug 224279 has been marked as a duplicate of this bug. ***
Comment 21 Zac Medico gentoo-dev 2008-09-09 16:15:19 UTC
It seems to me that the most consistent way to implement this sort of behavior would be to use a trap as discussed in bug 211000.
Comment 22 David Leverton 2008-09-09 19:52:46 UTC
Created attachment 165052 [details]
traps.bash

(In reply to comment #21)
> It seems to me that the most consistent way to implement this sort of behavior
> would be to use a trap as discussed in bug 211000.
> 

As much as I'd like to do that, it doesn't seem to be possible in current bash.  Running the attached script (a silly extension of one you posted on another bug) demonstrates that testing the return value of a function suppresses the trap throughout execution of that function, even in commands that don't contribute to the return value (I also seem to recall the trap interacting badly with loops, but I can't reproduce that at the moment).

I added set -E as per
    `-E'
          If set, any trap on `ERR' is inherited by shell functions,
          command substitutions, and commands executed in a subshell
          environment.  The `ERR' trap is normally not inherited in
          such cases.
from the bash manually, but although it causes the trap to work when the return value is /not/ being tested, it doesn't make any difference when it is.

Last time we mentioned this you suggested contacting bash upstream and seeing if they'll fix it.  If you think you we persuade them to do so, that would be great, but for now, using trap isn't a viable option.
Comment 23 Zac Medico gentoo-dev 2008-11-05 20:16:02 UTC
(In reply to comment #22)
> (In reply to comment #21)
> > It seems to me that the most consistent way to implement this sort of behavior
> > would be to use a trap as discussed in bug 211000.
> > 
> 
> As much as I'd like to do that, it doesn't seem to be possible in current bash.

Well, I think we should see about getting any necessary changes to bash merged upstream.
Comment 24 Ciaran McCreesh 2008-11-05 20:35:34 UTC
It's not a question of bash functionality. The issue is, there's no way to tell whether or not something of the form 'foo && bar' failing is an error.
Comment 25 Thomas Sachau gentoo-dev 2008-11-05 21:10:45 UTC
(In reply to comment #24)
> It's not a question of bash functionality. The issue is, there's no way to tell
> whether or not something of the form 'foo && bar' failing is an error.
> 

Any suggestion to get around this?

(In reply to comment #22)
> Last time we mentioned this you suggested contacting bash upstream and seeing
> if they'll fix it.  If you think you we persuade them to do so, that would be
> great, but for now, using trap isn't a viable option.
> 

Could you talk to upstream about it? zmedico told me on IRC that he is quite busy and probably wont get to it.
Comment 26 Ciaran McCreesh 2008-11-05 21:17:36 UTC
(In reply to comment #25)
> (In reply to comment #24)
> > It's not a question of bash functionality. The issue is, there's no way to tell
> > whether or not something of the form 'foo && bar' failing is an error.
> 
> Any suggestion to get around this?

Error traps are fundamentally incompatible with any serious coding style, and just aren't designed to do what we're after. The solution is to provide a way for external utilities to make the ebuild process die, and then convert every utility to use that in a future EAPI.

There's a working implementation in Paludis for exheres-0 handling (complete with fancy 'nonfatal' function that lets you do 'nonfatal dobin mightnotwork'). In practice it's rather nice.
Comment 27 Zac Medico gentoo-dev 2008-11-05 22:18:33 UTC
Created attachment 170853 [details]
trap example from bug 211000, with && and || demonstrations

(In reply to comment #24)
> It's not a question of bash functionality. The issue is, there's no way to tell
> whether or not something of the form 'foo && bar' failing is an error.

Well it seems to work fine with && and || in the attached sample code.
Comment 28 Ciaran McCreesh 2008-11-05 22:29:35 UTC
false || ( false )
false || { false ; }
false | xargs true
true | xargs false

and so on.
Comment 29 Zac Medico gentoo-dev 2008-11-05 22:49:29 UTC
(In reply to comment #28)
> false || ( false )

This case works if die() has subshell support.

> false || { false ; }

This case just works.

> false | xargs true
> true | xargs false

This case would require die support built into the commands called by xargs, which is fine if we choose to implement it.
Comment 30 Zac Medico gentoo-dev 2008-11-05 22:58:54 UTC
(In reply to comment #29)
> > false | xargs true
> > true | xargs false
> 
> This case would require die support built into the commands called by xargs,
> which is fine if we choose to implement it.

Err, nevermind. Those cases seem to use the return value from the last command in the pipeline. Isn't that okay?
Comment 31 Ciaran McCreesh 2008-11-05 23:06:25 UTC
(In reply to comment #29)
> (In reply to comment #28)
> > false || ( false )
> 
> This case works if die() has subshell support.

It doesn't, though. It doesn't trap.

> > false || { false ; }
> 
> This case just works.

This is the contrast to ( false ). One traps, one doesn't. Developers don't know this, and they have even less chance of knowing what happens when the subshell isn't so obvious.

> > false | xargs true
> > true | xargs false
> 
> This case would require die support built into the commands called by xargs,
> which is fine if we choose to implement it.

So we're back to requiring commands to die on failure anyway...
Comment 32 Zac Medico gentoo-dev 2008-11-05 23:24:55 UTC
(In reply to comment #31)
> This is the contrast to ( false ). One traps, one doesn't. Developers don't
> know this, and they have even less chance of knowing what happens when the
> subshell isn't so obvious.

Hmm, that's true. Perhaps we should use a combination of both approaches, so we can handle cases like the missing epatch call in bug #211000.
Comment 33 Ciaran McCreesh 2008-11-05 23:40:37 UTC
A combination is excessively confusing.

There are cases where failure has to be handled with something other than a die. With exheres-0, this is done using nonfatal:

    nonfatal emake blah

Sometimes we want to use this in a conditional, so nonfatal still returns non-zero for a failure:

    if ! nonfatal emake check ; then

But if we also die on ERR, we'd have to write:

    nonfatal emake blah || true

to indicate both to emake and to bash that we don't want an error to be fatal.

Even then, using fatal utilities to work around weirdness in how traps work only solves the problem for utilities, not other programs. So the choice is between using both and still getting weird behaviour, or *just* using fatal utilities and having an easy set of rules for developers to remember.
Comment 34 Ulrich Müller gentoo-dev 2010-12-30 19:53:07 UTC
EAPI 4 has been approved by the council.