Bug 415689 - virtual/awk: Virtual for awk implementation
Summary: virtual/awk: Virtual for awk implementation
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: [OLD] Core system
Hardware: All Linux
Importance: Normal normal
Assignee: Gentoo's Team for Core System packages
URL:
Whiteboard:
Keywords: InVCS
Depends on:
Blocks: 418473
Reported: 2012-05-13 04:10 UTC by Christoph Junghans (RETIRED)
Modified: 2012-06-04 03:33 UTC
CC List: 2 users

See Also:
Package list:
Runtime testing required: ---


Description Christoph Junghans (RETIRED) gentoo-dev 2012-05-13 04:10:35 UTC
We have three awk implementations in gx86:
gawk, mawk and busybox

Some scientific packages would profit from being able to choose between them.

We would need:
-virtual/awk
-app-admin/eselect-awk

I have added everything for testing in cj-overlay.
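
A virtual of the kind proposed here could look roughly like the ebuild fragment below. It is only a sketch and not the contents of cj-overlay: the EAPI, the KEYWORDS, and the category names for mawk and busybox are assumptions.

```shell
# Hypothetical virtual/awk ebuild fragment (illustrative only).
EAPI=4

DESCRIPTION="Virtual for awk implementation"
SLOT="0"
KEYWORDS="~amd64 ~x86"   # assumed keywords

# Any one of the three implementations satisfies the virtual.
# gawk is listed first so it stays the default choice.
RDEPEND="|| (
	sys-apps/gawk
	sys-apps/mawk
	sys-apps/busybox
)"
```

Listing gawk first means that a system which already has gawk installed pulls in nothing new.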
Comment 1 SpanKY gentoo-dev 2012-05-13 17:47:53 UTC
busybox doesn't provide an awk symlink, and our system requires gawk.  so what exactly would virtual/awk accomplish when gawk is always going to be installed ?
Comment 2 Christoph Junghans (RETIRED) gentoo-dev 2012-05-13 19:34:19 UTC
(In reply to comment #1)
> busybox doesn't provide an awk symlink, and our system requires gawk.  
symlink or not, "busybox awk" is an awk implementation. I have tested it with some of my science scripts.

> so what exactly would virtual/awk accomplish when gawk is always going
> to be installed ?
mawk is significantly faster than gawk. busybox would save you the install of gawk on small systems. virtual/awk would ensure that at least one of them is installed.

The 3 calls in misc-functions.sh of portage should work with any awk interpreter.
In the eclasses, mostly '{print $X}' is used, which works with any awk interpreter.
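
For illustration, field extraction of the '{print $X}' kind needs nothing beyond POSIX awk. The helper name nth_field is made up for this sketch and does not come from any eclass.

```shell
# Hypothetical helper: print the N-th whitespace-separated field of each
# input line. Uses only POSIX awk features (-v assignment and $n), so it
# should behave the same under gawk, mawk, and busybox awk.
nth_field() {
    awk -v n="$1" '{ print $n }'
}

# Usage: take the second field of a line.
printf 'alpha beta gamma\n' | nth_field 2
```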
Comment 3 SpanKY gentoo-dev 2012-05-13 21:29:44 UTC
awk is used in a lot of places, not just portage.  just grep for it in /usr/bin.

further, changing `awk` from gawk without consulting gentoo-dev is a no go.  it is used heavily in the tree, and is what everyone develops with.  so changing it implies a new support burden that everyone has to buy into.

the fact that it's faster really doesn't matter for most of our things as it rarely gets called.
Comment 4 Christoph Junghans (RETIRED) gentoo-dev 2012-05-15 00:33:04 UTC
(In reply to comment #3)
> awk is used in a lot of places, not just portage.  just grep for it in
> /usr/bin.
Yes, but I haven't had any problems with mawk as the interpreter so far. And as long as no secret GNU extensions are used, mawk will understand most of the GNU things as well. For nawk, this is a different question.

> further, changing `awk` from gawk without consulting gentoo-dev is a no go. 
> it is used heavily in the tree, and is what everyone develops with.  so
> changing it implies a new support burden that everyone has to buy into.
True, this bug is preliminary work and thought collection prior to an RFC on gentoo-dev.

>
> the fact that it's faster really doesn't matter for most of our things as it
> rarely gets called.
We could also add a mawk USE flag to the sci packages that make extensive use of awk and would profit from its speedup.  But that solution doesn't seem very nice to me.

Actually I came across this idea as Ubuntu ships mawk instead of gawk and I had to fix some awk code in one of my packages.
Comment 5 SpanKY gentoo-dev 2012-05-15 03:09:09 UTC
(In reply to comment #4)

mawk supports POSIX, not GNU extensions.  thus it isn't hard to locate the stuff that mawk doesn't support but is used.  but that isn't the only lurking issue that causes support problems ... there are known edge cases where mawk works one way and gawk works another.  like substr() and an index of 0.
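
a script can sidestep the substr() edge case by never passing a start index below 1.  the sketch below does the clamping itself, so the result does not depend on the interpreter.  the helper name safe_substr is made up for illustration, and the clamping rule (shorten the length by the amount the start falls below 1) is an assumption about the desired behavior, not something stated in this bug.

```shell
# Hypothetical helper: substring with an explicit, interpreter-independent
# treatment of start indexes below 1.
safe_substr() {
    awk -v s="$1" -v start="$2" -v len="$3" 'BEGIN {
        # Clamp the start to 1 and shorten the length to match, so the
        # awk implementation never sees a start index of 0 or less.
        if (start < 1) { len += start - 1; start = 1 }
        if (len < 0) len = 0
        print substr(s, start, len)
    }'
}

safe_substr hello 0 2   # start is clamped to 1, length becomes 1
safe_substr hello 2 3   # ordinary case, unaffected by the clamping
```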

it can probably be made to work by requiring gawk and encouraging people to use `gawk` in the ebuilds, and to only use `awk` if they actually care about it.  in the default install, only gawk would be installed (since nothing would pull in mawk).  this would allow mawk to be installed and manage `awk` w/out affecting the ebuild environment.
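
a rough sketch of that convention in plain shell: ask for `gawk` by name when GNU extensions are needed, and use `awk` only when any implementation will do.  the helper name awk_for and its "gnu" argument are hypothetical, not an existing eclass or portage function.

```shell
# Hypothetical helper: print the interpreter name to call.
#   awk_for gnu  -> "gawk" (fails if gawk is not installed)
#   awk_for any  -> "awk"  (whatever implementation currently provides awk)
awk_for() {
    case "$1" in
        gnu)
            if command -v gawk >/dev/null 2>&1; then
                echo gawk
            else
                echo "gawk is required but was not found" >&2
                return 1
            fi
            ;;
        *)
            echo awk
            ;;
    esac
}

# Usage: portable code goes through plain awk.
"$(awk_for any)" 'BEGIN { print "ok" }'
```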
Comment 6 Christoph Junghans (RETIRED) gentoo-dev 2012-05-16 23:14:21 UTC
(In reply to comment #5)
> mawk supports POSIX, not GNU extensions.  thus it isn't hard to locate the
> stuff that mawk doesn't support but is used.  but that isn't the only
> lurking issue that causes support problems ... there are known edge cases
> where mawk works one way and gawk works another.  like substr() and an index
> of 0.
mawk knows slightly more than POSIX, but this is all in the undocumented gray zone.

> it can probably be made to work by requiring gawk and encouraging people to
> use `gawk` in the ebuilds, and to only use `awk` if they actually care about
> it.  in the default install, only gawk would be installed (since nothing
> would pull in mawk).  this would allow mawk to be installed and manage `awk`
> w/out affecting the ebuild environment.
Sounds good, CCing sci to hear some more opinions.
Comment 7 Jonathan Callen (RETIRED) gentoo-dev 2012-05-30 21:33:01 UTC
Last time I checked, sys-libs/glibc fails to build unless `awk` points to `gawk`.  Glibc explicitly runs `awk` during its build and requires that certain GNU extensions not provided by busybox or mawk be present in awk.  Also (at least at that time), glibc upstream refused patches to call `gawk` instead.
Comment 8 SpanKY gentoo-dev 2012-05-31 16:22:57 UTC
(In reply to comment #7)

i'll sort glibc out if that's still true
Comment 9 Christoph Junghans (RETIRED) gentoo-dev 2012-06-01 04:13:46 UTC
To avoid flooding this bug with comments about incompatibilities of !gawk, I created bug #418473 to keep track of packages broken by the usage of virtual/awk instead of sys-apps/gawk.