Gentoo Websites Logo
Go to: Gentoo Home Documentation Forums Lists Bugs Planet Store Wiki Get Gentoo!
Bug 280598 - app-admin/eselect-1.0.12: basename function is not POSIX compliant
Summary: app-admin/eselect-1.0.12: basename function is not POSIX compliant
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Hosted Projects
Classification: Unclassified
Component: eselect (show other bugs)
Hardware: All Linux
: High normal (vote)
Assignee: Gentoo eselect Team
URL:
Whiteboard:
Keywords: InVCS
Depends on:
Blocks:
 
Reported: 2009-08-06 20:08 UTC by Noah Sheppard
Modified: 2009-08-19 22:28 UTC (History)
0 users

See Also:
Package list:
Runtime testing required: ---


Attachments
New implementation of the basename() function which handles trailing slashes (path-manipulation-fix_basename.diff,519 bytes, patch)
2009-08-07 13:36 UTC, Noah Sheppard
Details | Diff
POSIX-compliant basename and dirname implementations in bash (path-manipulation-fix_basename_dirname.bash,2.13 KB, patch)
2009-08-11 19:59 UTC, Noah Sheppard
Details | Diff
POSIX compliant basename and dirname (without loops) (basename.bash,861 bytes, text/plain)
2009-08-12 10:56 UTC, Ulrich Müller
Details
basename and dirname, including unit tests (basename.bash,1.75 KB, text/plain)
2009-08-13 19:01 UTC, Ulrich Müller
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Noah Sheppard 2009-08-06 20:08:21 UTC
The basename function provided in the path-manipulation.bash eselect library does not behave properly as compared to the coreutils basename.  When the argument contains trailing slashes, it returns a blank string rather than the last path component after trailing slashes are stripped, as the coreutils basename does.

Steps to reproduce:
1-At a bash prompt, source the path-manipulation.bash eselect library (source /usr/share/eselect/path-manipulation.bash).  'basename' now refers to the eselect basename function.
2-Run 'basename foobar/'.  For comparison, run '/usr/bin/basename foobar/' to see what coreutils' basename does.

Reproducible: 100%

What happens:
eselect's basename outputs nothing except a blank line.

What should happen:
eselect's basename should output foobar, just as coreutil's basename does.
Comment 1 Noah Sheppard 2009-08-07 01:03:57 UTC
Sorry, I neglected to give some other important info.

This is app-admin/eselect-1.0.12.

emerge --info:
Portage 2.1.6.13 (default/linux/amd64/2008.0/desktop, gcc-4.3.2, glibc-2.9_p20081
201-r2, 2.6.29-gentoo-r5 x86_64)
=================================================================
System uname: Linux-2.6.29-gentoo-r5-x86_64-Intel-R-_Core-TM-2_Quad_CPU_Q6600_@_2
.40GHz-with-glibc2.2.5
Timestamp of tree: Fri, 24 Jul 2009 16:45:01 +0000
ccache version 2.4 [enabled]
app-shells/bash:     3.2_p39
dev-java/java-config: 1.3.7-r1, 2.1.8-r1
dev-lang/python:     2.5.4-r3
dev-util/ccache:     2.4-r7
dev-util/cmake:      2.6.4
sys-apps/baselayout: 1.12.11.1
sys-apps/sandbox:    1.6-r2
sys-devel/autoconf:  2.13, 2.63
sys-devel/automake:  1.5, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.2
sys-devel/binutils:  2.18-r3
sys-devel/gcc-config: 1.4.1
sys-devel/libtool:   1.5.26
virtual/os-headers:  2.6.27-r2
ACCEPT_KEYWORDS="amd64"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-O2 -pipe -fomit-frame-pointer"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shut
down /usr/share/config /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/f
onts/fonts.conf /etc/gconf /etc/php/apache2-php5/ext-active/ /etc/php/cgi-php5/ex
t-active/ /etc/php/cli-php5/ext-active/ /etc/revdep-rebuild /etc/sandbox.d /etc/t
erminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /
etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-O2 -pipe -fomit-frame-pointer"
DISTDIR="/exclude/distfiles"
FEATURES="ccache distlocks fixpackages parallel-fetch prelink protect-owned sandb
ox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="ftp://gentoo.mirrors.tds.net/gentoo http://gentoo.mirrors.tds.net
/gentoo/ ftp://ftp.ussg.iu.edu/pub/linux/gentoo"
LDFLAGS="-Wl,-O1"
MAKEOPTS="-j5"
PKGDIR="/exclude/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress -
-force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude
=/local --exclude=/packages"
PORTAGE_TMPDIR="/exclude/port-tmp"
PORTDIR="/exclude/portage"
PORTDIR_OVERLAY="/exclude/overlays/layman/java-overlay /exclude/overlays/tucse"
SYNC="rsync://rsync.namerica.gentoo.org/gentoo-portage"
USE="X acl acpi alsa amd64 arts audiofile bash-completion berkdb bluetooth bzip2 
cairo cdparanoia cdr cli cracklib crypt cups curl dbus dri dvd dvdr dvdread eds e
macs emboss encode esd evo exif fam ffmpeg firefox flac fortran ftp gcj gd gdbm g
if gnome gphoto2 gpm gstreamer gtk hal iconv ieee1394 isdnlog java javascript jpe
g kde lame latex ldap libnotify mad midi mikmod mmx mp3 mpeg mplayer mudflap mult
ilib mysql ncurses nls nptl nptlonly nsplugin ogg opengl openmp pam pcre pdf perl
 png ppds pppd python qt3 qt3support qt4 quicktime readline reflection ruby samba
 scanner sdl session slang sndfile spell spl sqlite sse sse2 ssl startup-notifica
tion svg symlink sysfs tcpd tiff tk truetype unicode usb vim-syntax vorbis wxwind
ows xine xinerama xml xorg xscreensaver xulrunner xv zlib" ALSA_CARDS="ali5451 al
s4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1
968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx
-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty ex
tplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi n
ull plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic 
authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default
 authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav
_lock deflate dir disk_cache env expires ext_filter file_cache filter headers inc
lude info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif
 speling status unique_id userdir usertrack vhost_alias" ELIBC="glibc" INPUT_DEVI
CES="keyboard mouse" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44
780 lb216 lcdm001 mtxorb ncurses text" USERLAND="GNU" VIDEO_CARDS="nvidia nv vesa
 fbdev"
Unset:  CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, LANG, LC_AL
L, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
Comment 2 Noah Sheppard 2009-08-07 13:36:19 UTC
Created attachment 200496 [details, diff]
New implementation of the basename() function which handles trailing slashes
Comment 3 Noah Sheppard 2009-08-07 13:38:22 UTC
eselect-1.1.1 does the same thing.  The patch I just added is against eselect-1.1.1's path-manipulation.bash.in.
Comment 4 Ulrich Müller gentoo-dev 2009-08-07 16:18:19 UTC
(In reply to comment #2)
> Created an attachment (id=200496) [edit]
> New implementation of the basename() function which handles trailing slashes

This is also not compliant with the POSIX specification: <http://www.opengroup.org/onlinepubs/009695399/utilities/basename.html>
For example, it fails if the argument is a single slash.

But I guess that eselect's basename function was never intended to be POSIX compliant. It's just a cheap way to have ${1##*/} available for command substitution.
Comment 5 Noah Sheppard 2009-08-11 19:58:38 UTC
(In reply to comment #4)
> (In reply to comment #2)
> > Created an attachment (id=200496) [edit]
> > New implementation of the basename() function which handles trailing slashes
> 
> This is also not compliant with the POSIX specification:
> <http://www.opengroup.org/onlinepubs/009695399/utilities/basename.html>
> For example, it fails if the argument is a single slash.

Quite right.  I'll attach a patch with a posix-compliant basename (and dirname as well, since the existing dirname is also not posix compliant).  I don't claim to be a wizard at bash coding, so there may be a less verbose way than what I've written.  If so, I'd love to learn.

> But I guess that eselect's basename function was never intended to be POSIX
> compliant. It's just a cheap way to have ${1##*/} available for command
> substitution.

What is the point of a basename which is not really basename?  Why call it basename if it is not going to return the actual basename of a path?
Comment 6 Noah Sheppard 2009-08-11 19:59:34 UTC
Created attachment 200963 [details, diff]
POSIX-compliant basename and dirname implementations in bash
Comment 7 Ulrich Müller gentoo-dev 2009-08-12 09:12:11 UTC
(In reply to comment #5)
> I'll attach a patch with a posix-compliant basename (and dirname as well,
> since the existing dirname is also not posix compliant).

Thanks.

> I don't claim to be a wizard at bash coding, so there may be a less verbose
> way than what I've written.  If so, I'd love to learn.

I'll just pick a few examples that caught my eye:

Substring expansion already enables arithmetic evaluation, so you don't need the $(( )) in this expression:
  [[ ${pstr:$(( ${#pstr} - 1 ))} == '/' && ${#pstr} > 1 ]]
But it can be written even shorter by using a negative offset, and the second condition can also be shortened:
  [[ ${pstr: -1} = / && ${pstr} != / ]]

In dirname:
  [[ ${#pstr} == 0 ]]
Don't use "==" or ">" for arithmetic comparison, use "-eq" or "-gt". And why not write the condition as:
  [[ -z ${pstr} ]]

The following can be replaced by regular expresion matching:
  str_has_dirsep ${pstr}
  if [[ $? == 0 ]]; then
namely:
  if [[ ${pstr} == */* ]]; then

> What is the point of a basename which is not really basename?  Why call it
> basename if it is not going to return the actual basename of a path?

Actually, that's not so uncommon. Even bash itself comes with examples that don't comply with the POSIX spec (see the examples/functions/ directory in the bash distribution). And the GNU version of basename(3) behaves like what we have in eselect.

Obviously you have a point if you say that the functions should behave like their command equivalents. However, I fear that's it's too late to change, since basename and dirname in their current implementation are used throughout the eselect code. I'm not so much concerned about eselect proper, but about the ~30 external eselect modules. If we change the way how the functions work, breakage may result.

I have to think about it. AFAICS, possible options are:
  1. Implement the POSIX compliant versions of the functions.
  2. Remove the functions completely, so that the real commands are used.
     May be less efficient.
  3. Keep things as they are, but document it in developer-guide.txt.
Comment 8 Ulrich Müller gentoo-dev 2009-08-12 10:56:54 UTC
Created attachment 201015 [details]
POSIX compliant basename and dirname (without loops)

I think for efficiency reasons any loops should be avoided. Can you try if attached implementation catches all cases?
Comment 9 Noah Sheppard 2009-08-13 18:14:32 UTC
(In reply to comment #7)
> I'll just pick a few examples that caught my eye:

Thanks.

> Actually, that's not so uncommon. Even bash itself comes with examples that
> don't comply with the POSIX spec (see the examples/functions/ directory in the
> bash distribution). And the GNU version of basename(3) behaves like what we
> have in eselect.

I see.

> However, I fear that's it's too late to change,
> since basename and dirname in their current implementation are used throughout
> the eselect code. I'm not so much concerned about eselect proper, but about 
> the ~30 external eselect modules. If we change the way how the functions work,
> breakage may result.

Yes, I was afraid of something like that.

> I have to think about it. AFAICS, possible options are:
>   1. Implement the POSIX compliant versions of the functions.
>   2. Remove the functions completely, so that the real commands are used.
>      May be less efficient.
Aren't 1 & 2 basically the same thing, with potentially a bit less efficiency in 2?

(In reply to comment #8)
> Created an attachment (id=201015) [edit]
> POSIX compliant basename and dirname (without loops)
> 
> I think for efficiency reasons any loops should be avoided. Can you try if
> attached implementation catches all cases?

I tested it with all the different contrived paths I could think of, and it fits the posix spec for all of them.

Comment 10 Ulrich Müller gentoo-dev 2009-08-13 19:01:30 UTC
Created attachment 201168 [details]
basename and dirname, including unit tests

(In reply to comment #9)
> > If we change the way how the functions work, breakage may result.

> Yes, I was afraid of something like that.

OTOH, the documention in developer-guide.txt always stated that the functions are intended as "transparent replacements for the external applications". And I don't like declaring bugs to be features by documenting them. So let's go for POSIX compliance in eselect-1.2.x and fix any broken modules. New basename and dirname committed to SVN trunk (r603).

> >   1. Implement the POSIX compliant versions of the functions.
> >   2. Remove the functions completely, so that the real commands are used.
> >      May be less efficient.
> Aren't 1 & 2 basically the same thing, with potentially a bit less efficiency
> in 2?

One would hope so, but eselect has to run on a variety of systems (see bug 264734 for examples), so it's probably better if it includes its own implementation.

> > Can you try if attached implementation catches all cases?
> 
> I tested it with all the different contrived paths I could think of, and it
> fits the posix spec for all of them.

Thanks again.

Meanwhile I've also hacked up some unit tests with I'm attaching for reference. This includes all test cases from GNU coreutils.
Comment 11 Ulrich Müller gentoo-dev 2009-08-19 22:28:08 UTC
Fixed in eselect-1.2_rc1 (still package.masked though).