Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!

Bug 463822

Summary: [Future EAPI] Set 'failglob' by default for global (metadata) scope (was: 'nullglob')
Product: Gentoo Hosted Projects Reporter: Michał Górny <mgorny>
Component: PMS/EAPIAssignee: PMS/EAPI <pms>
Status: RESOLVED FIXED    
Severity: enhancement CC: esigra, kentnl
Priority: Normal    
Version: unspecified   
Hardware: All   
OS: All   
Whiteboard: in-eapi-6
Package list:
Runtime testing required: ---
Bug Depends on: 431340    
Bug Blocks: 174380    

Description Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2013-03-30 10:16:20 UTC
There were a few cases when people forgot to quote wildcard characters in pattern parameters, like:

  dev-python/foo[$(python_gen_usedep python2*)]

Things like that mostly work, unless user runs emerge from a directory where the glob matches some actual files and then hell breaks loose. And that becomes awfully hard to debug.

Those could be partially avoided if 'nullglob' was enabled by default. Then, any mistake will turn out right away, through an empty string which would be a clear syntax error rather than semi-random filenames...

The alternative is to disable globbing completely in global scope and re-enable it in phase scope.
Comment 1 Ulrich Müller gentoo-dev 2013-03-30 13:09:08 UTC
(In reply to comment #0)
> Those could be partially avoided if 'nullglob' was enabled by default. Then,
> any mistake will turn out right away, through an empty string which would be
> a clear syntax error rather than semi-random filenames...

"failglob" (in global scope only?) would fail more explicitly even.
Comment 2 Steve L 2013-03-30 20:07:34 UTC
Setting nullglob across the board in bash is a bad idea.
It breaks: unset foo[1] for one, and bash arrays are central. Try:
shopt -s nullglob; echo foo[1] # in terminal to see the issue. 
$ foo[1]=bar; unset foo[1]; echo "${foo[1]}"
bar

The workaround is: unset 'foo[1]' which is counter-intuitive, and basically makes scripting a total PITA. For this reason, I've always used setNullglob and resetNullglob functions in script (for the last 6 years or so) for the _very_ few cases where I actually want nullglob. Bear in mind any unquoted argument with a meta-character in it will be similarly hobbled, not just to unset (eg the first echo above.)

When you come down to it "a few cases when people forgot to quote wildcard characters" is basically newbs learning to script. As with any language, people should learn the basics and the idioms before assuming competence. TBH I'm surprised learning to shell-script and use BASH isn't a requirement for being a developer, given that it's the language you're supposed to be developing *in*.

failglob in global scope sounds like a better option, but it will also fall over on the above (again set it in terminal, and try: unset foo[1] - without it, there's no issue.)

So, personally I'd strongly recommend against either of these changes, unless you can guarantee eclasses aren't going to use arrays and want to cripple the scripting environment, and advise better code review of beginning scripters.

A good place to start is to require them to hang out in #bash for 6 months, as not quoting is one of the first things you get beaten out of you (eg: /msg greybot quotes) and you soon learn to double-check your script. It's equivalent to checking C return values for errors.
Comment 3 Michał Górny archtester Gentoo Infrastructure gentoo-dev Security 2013-03-30 21:40:23 UTC
(In reply to comment #2)
> Setting nullglob across the board in bash is a bad idea.
> It breaks: unset foo[1] for one, and bash arrays are central. Try:
> shopt -s nullglob; echo foo[1] # in terminal to see the issue. 
> $ foo[1]=bar; unset foo[1]; echo "${foo[1]}"
> bar

$ touch foo1
$ unset foo[1]
$ declare -p foo
declare -a foo='([0]="1" [1]="dupa")'

> When you come down to it "a few cases when people forgot to quote wildcard
> characters" is basically newbs learning to script. As with any language,
> people should learn the basics and the idioms before assuming competence.

I think you should think twice before writing something like this since you can offend someone. And then someone may point that you're actually failing hard at bash, and your scripts stops to work properly when a file with semi-random name exists...
Comment 4 Ulrich Müller gentoo-dev 2013-03-30 22:51:18 UTC
Right, unset is only a builtin command, not a language keyword. So its parameters must be quoted to prevent expansion.

I'd consider "unset foo[1]" an error. However, even the Advanced Bash Scripting Guide gets it wrong: http://tldp.org/LDP/abs/html/arrays.html#EX67
Comment 5 SpanKY gentoo-dev 2013-03-31 00:54:22 UTC
there are some cases where ebuilds glob paths that might not exist (and that's ok).  but we should be able to easily turn this behavior off, so that's ok:
  eshopts_push -u failglob
  ...
  eshopts_pop
Comment 6 SpanKY gentoo-dev 2013-04-02 03:55:59 UTC
using just for global scope shouldn't run into any problems.  but i don't see a reason to not make it the default for ebuilds too with newer EAPIs.  it's much more uncommon for people to accidentally glob than for them to want to support a failed glob.
Comment 7 Steve L 2013-04-28 06:41:03 UTC
(In reply to comment #3)
> I think you should think twice before writing something like this since you
> can offend someone. And then someone may point that you're actually failing
> hard at bash, and your scripts stops to work properly when a file with semi-
> random name exists...

Sorry you were so evidently offended. As for "files with semi-random names", I don't use those, and in fact my scripts run from defined directories.
The pragmatic convenience of using unset as a normal part of the language, far outweighs the inability to use files with the exact same names as my array variables, with a number tacked on to the end.
Somehow that's never been a problem, whereas losing unset most definitely was. (And believe me I did use unset 'foo[i]' slavishly for a while in functions; it's a total PITA if you do anything with arrays, and no, temporary on-off settings don't cut it: they end up annoying you even more. You think you don't like ${D%/}? You are /not/ going to like writing eclasses with the setup you're proposing.)

But the same logic then applies to people writing scripts who don't know about quoting, which is far more basic, don't you agree? After all, messing up your quoting *will* lead to problems for file and directory names, as well as every other parameter expansion passed to a command. And just like the case you outlined, it's more liable to blow-up at the user end, not in your limited testing: which is why it gets beaten into you in #bash.

So if your developers really are messing up quoting, by _your_ own measure that's failing *really* hard.

Don't you think it would be better to educate them, and test them more thoroughly, before they get called a "developer", rather than cripple the scripting environment for everyone, to shield a few from learning from their mistakes? Call me old-fashioned, but I'd expect more pride in the craft than this.

Good luck with it anyhow.

PS: ABS is utter rubbish. Here's some better docs, wooledge.org in particular:
man bash | http://wiki.bash-hackers.org/doku.php?id=scripting:basics |
http://mywiki.wooledge.org/BashGuide | http://www.grymoire.com/Unix/index.html |
http://mywiki.wooledge.org/BashFAQ | http://mywiki.wooledge.org/BashPitfalls |
http://www.shelldorado.com/
Comment 8 Kent Fredric (IRC: kent\n) (RETIRED) gentoo-dev 2014-01-13 17:45:24 UTC
Maybe instead of it being an EAPI feature, it could be a repoman QA check?

Though the problem is probably applicable for eclasses too,and I don't believe we have any QA checking tools for those, but we should have.

Though I guess it could be annoying and provide extra noise sources for repoman users unless it was combined with a comment-controlled system to selectively disable selective repoman QA checks for limited scopes

Similar to Perl::Critic , which has magic comments like:

> # SomePolicyName in effect here
> sub foo {
>   # SomePolicyName in effect here
>   # Following comment with double-hash disables application of SomePolicyName
>
>   ## no critic (SomePolicyName)
>
>   # SomePolicyName not in effect here
> }
> # SomePolicyName in efffect here due to the lexical scope of foo { } ending

This makes it clear to intent with disabling application of certain policies, and makes the scope for disabling policies limited to reduce damage at users discretion, and also later adds markers for extended QA if QA decides policy avoidance is to be forbidden.

And SomePolicyName here expands to a module name, of which many are provided stock, and users can install and enable custom ones.

Would /love/ to have a comparable system for repoman, where policy additions are presently both hard-coded, and global.
Comment 10 Ulrich Müller gentoo-dev 2015-11-08 21:39:51 UTC
(In reply to Ulrich Müller from comment #9)

Sorry, that was on the wrong branch. Correct URI is:
https://gitweb.gentoo.org/proj/pms.git/commit/?id=5e5b02033ff3f2a8fa0783668c11960277aa68bd