#include <libgen.h> #include <stdio.h> #include <stdlib.h> #include <string.h> int main(){ #define testpath "/usr/bin/dirname" char *path=malloc(sizeof(testpath)); strcpy(path,testpath); char *pathdirname=dirname(path); puts(pathdirname); free(path); puts(pathdirname); return EXIT_SUCCESS; } The above program should print two identical strings in the screen. But it prints only one. freeing the string path should not modify the string pathdirname. There is nothing on the corresponding manual page (accessible via man 3 dirname) that mentions this. Either the function dirname() must be modified, or its documentation. Running under gdb, I can see that path and pathdirname point to the same string. This explains the problem. It seems that the dirname function modifies its argument and returns a pointer to its argument. Let me reiterate that this is not mentioned in the documentation. Reproducible: Always Steps to Reproduce: 1.Compile the program above with gcc -Wall -Wextra -o dirname dirname.c 2. run the produced executable 3. Notice that it prints the dirname /usr/bin only once, not twice Actual Results: The program prints the dirname /usr/bin only once, not twice Expected Results: The program should print /usr/bin twice. Portage 2.1.2.2 (default-linux/x86/2006.1/desktop, gcc-4.1.1, glibc-2.5-r0, 2.6.19-gentoo-r5 i686) ================================================================= System uname: 2.6.19-gentoo-r5 i686 AMD Athlon(tm) XP 2600+ Gentoo Base System release 1.12.9 Timestamp of tree: Sat, 10 Mar 2007 21:30:08 +0000 dev-java/java-config: 1.3.7, 2.0.31 dev-lang/python: 2.4.3-r4 dev-python/pycrypto: 2.0.1-r5 sys-apps/sandbox: 1.2.17 sys-devel/autoconf: 2.13, 2.61 sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10 sys-devel/binutils: 2.16.1-r3 sys-devel/gcc-config: 1.3.14 sys-devel/libtool: 1.5.22 virtual/os-headers: 2.6.17-r2 ACCEPT_KEYWORDS="x86" AUTOCLEAN="yes" CBUILD="i686-pc-linux-gnu" CFLAGS="-O2 -march=athlon-xp -pipe -fomit-frame-pointer -fgcse-after-reload -fweb " CHOST="i686-pc-linux-gnu" CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/X11/xkb /usr/share/config" CONFIG_PROTECT_MASK="/etc/env.d /etc/env.d/java/ /etc/gconf /etc/java-config/vms/ /etc/revdep-rebuild /etc/terminfo" CXXFLAGS="-O2 -march=athlon-xp -pipe -fomit-frame-pointer -fgcse-after-reload -fweb " DISTDIR="/usr/portage/distfiles" EMERGE_DEFAULT_OPTS="--nospinner" FEATURES="autoconfig distlocks metadata-transfer parallel-fetch sandbox sfperms strict" GENTOO_MIRRORS="http://gentoo.localhost.net.ar/ http://www.las.ic.unicamp.br/pub/gentoo/ http://gentoo.osuosl.org/ http://mirrors.usu.edu/mirrors/gentoo/ http://distro.ibiblio.org/pub/linux/distributions/gentoo/" LANG="en_US" LC_ALL="en_US" MAKEOPTS="-j2" PKGDIR="/usr/portage/packages" PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --delete-after --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages --filter=H_**/files/digest-*" PORTAGE_TMPDIR="/var/tmp" PORTDIR="/usr/portage" PORTDIR_OVERLAY="/usr/portage/local/layman/initng /usr/portage/local/layman/enlightenment /usr/local/jorge-portage" SYNC="rsync://rsync.gentoo.org/gentoo-portage" USE="3dnow 3dnowext 7zip X aac aalib alsa aotuv bash-completion bzip2 cairo cdr cracklib dbus divx dri emacs emacs-w3 encode fam firefox fortran gcj gcl gif glibc-omitfp gmedia gnuplot gpm gstreamer gtk2 gzip-el iconv ipv6 jpeg kdeenablefinal kdehiddenvisibility lesstif libcaca lirc mad matroska midi mikmod mmx mmxext mng moznopango mp3 mpeg ncurses nptl nptlonly nsplugin nvidia offensive ogg opengl pam pcre pdf png ppds pppd psyco python quicktime readline realmedia rtc sdl session sox spell sse ssl svg symlink tcltk tcpd theora tk toolkit-scroll-bars truetype truetype-fonts type1-fonts unicode usb v4l2 vorbis wmp x86 xml xorg xv zlib" ALSA_CARDS="cmipci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mulaw multi null plug rate route share shm softvol" ELIBC="glibc" INITNG_PLUGINS="also bash_launcher chdir chroot conflict cpout critical cron daemon dev dllaunch envparser find fstat history idleprobe initctl interactive iparser last limit logfile netprobe ngc4 pause pidfile reload renice rlparser simple_launcher stcmd stdout suid syncron syslog unneeded provide" INPUT_DEVICES="keyboard mouse evdev" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LIRC_DEVICES="inputlirc" USERLAND="GNU" VIDEO_CARDS="nv" Unset: CTARGET, INSTALL_MASK, LDFLAGS, LINGUAS, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
man 3 dirname contains the paragraph: Both dirname() and basename() may modify the contents of path, so copies should be passed to these functions. Furthermore, dirname() and basename() may return pointers to statically allocated memory which may be overwritten by subsequent calls. Which seems clear enough. It means you cannot assume what you are assuming about dirname's behaviour. The manual explicitly says you should make copies of strings passed to dirname(); you must do this if you wish to retain the original string. Here's the code example from 'man 3 dirname' showing how to do this: EXAMPLE char *dirc, *basec, *bname, *dname; char *path = "/etc/passwd"; dirc = strdup(path); basec = strdup(path); dname = dirname(dirc); bname = basename(basec); printf("dirname=%s, basename=%s\n", dname, bname);
The manual says that the function may modify its argument so that I should pass a copy to it. Which I did. The manual does not mention that the function returns a pointer to the very argument, so that a modification of the argument results in a modification of the result. I'm reopening the bug. I don't know if this is the appropriate thing to do. If it is not, tell me. Of course, this is the last time I'll reopen the bug, so if you want it to stay closed, just close it once more.
Let me explain why do I think the documentation is bad. After reading the documentation, it seemed to me that a good way to work with dirname() would be char *pathcopy=strdup(path); if (pathcopy==NULL){ /*handle_error*/ }else{ char * pathdirname=dirname(pathcopy); free(pathcopy); /*do something with pathdirname*/ } I tried something similar to that, and it did not work, because once pathcopy is freed, pathdirname becomes bogus. This is unpredictable and surprising for someone that has read the documentation. I provided to dirname() a copy of path, just as the documentation asked, but the documentation never told me that I could not free the copy and go on working with the returned result.
You did not provide dirname() with a copy of the path. All you did was copy the pointer to it. Copying a pointer does not copy the data - that is elementary C (which is itself a very elementary programming language, you have to do almost everything explicitly in C). If you look at the example in the man page, it illustrates clearly how to copy the argument using strdup(). Believe me, there's no bug here; everything is behaving as specified, and the manual page clearly describes the behaviour of dirname(), and how to use it - even so far as to provide example code to make things perfectly clear.
In my first example, I allocated new memory using malloc() and copied the string using strcpy(). This copies the data, not the pointer. In the second example, I used strdup. strdup copies the data, not the pointer.
Well, you copied testpath to path - that's fine, but you never used testpath afterwards anyway. You used pathdirname, which you set to a copy of the pointer 'path' (via the result of dirname()). You seem to be expecting pathdirname to get a copy of the path (including contents), which you won't get. dirname() certainly isn't going to allocate memory, duplicate the path contents, and return a pointer to the copy. It's left to you as the caller to manage whether a copy is needed or not. Instead of: char *pathdirname=dirname(pathcopy); you could do: char *pathdirname=strdup(dirname(path))); after which path is modified (by dirname), pathdirname contains a copy of the result (not just a pointer to the modified path).
>Well, you copied testpath to path - that's fine, but you never used testpath >afterwards anyway. I copied testpath to path because if I called dirname directly on testpath I would probably get a segfault. dirname() should not be called on a string literal, because dirname() may atempt to modify its argument. >You used pathdirname, which you set to a copy of the >pointer 'path' (via the result of dirname()). I'm aware of this. I found this out after running with gdb. It is not mentioned in the man page. >You seem to be expecting pathdirname to get a copy of the path (including >contents) Yes. >, which you won't get. dirname() certainly isn't going to allocate >memory, duplicate the path contents, and return a pointer to the copy. It seemed to me that dirname() could do one of three things: 1)allocate memory with malloc, put the result into it, and return a pointer to it. 2)put the result into a static buffer, such as a local array declared with the static keyword, or a global array and return a pointer to this buffer. 3)return a pointer to its argument. The man page said that the function may return a pointer to statically allocated memory which may be overwritten by subsequent calls. This made me think that possibility 2) was happening. To me, possibility 3) seems quite surprising, and should be mentioned on the man page. In fact, any special behavior of any function should be mentioned in its man page. The programmer should not be required to make tests and/or read the function source code to understand its behavior. And in my opinion, the fact that the returned result is modified if the argument is modified is special behavior. > char *pathdirname=strdup(dirname(path))); Yes, this is what I am doing now in my code. Anyway, if you don't see it like me, I'll just ignore this issue. Also, I took the liberty of emailing my suggestion to the email I found on the page http://www.win.tue.nl/~aeb/linux/man/ (which I found on the man-pages ebuild). Thank you for your attention.
tbh, it doesnt matter if you dislike the behavior of dirname() ... the things you dislike are allowed by spec which means your code needs to account for these things, not the specific implementation: http://www.opengroup.org/onlinepubs/009695399/functions/dirname.html while yes, i too agree that the spec sucks for this function, Gentoo is not the forum for such things
(In reply to comment #7) > It seemed to me that dirname() could do one of three things: > 1)allocate memory with malloc, put the result into it, and return a pointer > to it. This is not something the spec allows for - otherwise it would tell you that you need to free the returned pointer once you are finished with it (or would explicitly say it dynamically allocates the returned memory). Since it doesn't tell you that, and even says the pointer may reference a static buffer, you cannot free() it. The logical result is that dirname() therefore cannot malloc memory for the result. > 2)put the result into a static buffer, such as a local array declared with > the static keyword, or a global array and return a pointer to this buffer. This is indeed a possibility. It's unlikely in a modern implementation, as it implies there is a low length limit to the paths that dirname() can handle (since the static buffer must have a pre-defined size and must be reserved solely for the use of dirname, so is unlikely to be large). > 3)return a pointer to its argument. Apart from the static buffer method, this is the only way it can be implemented to conform to the spec (because dynamically allocating memory is not permitted as described above). My guess is that the authors assumed (correctly or not) that a reader would realise this is possible and has to be catered for by the caller. Other functions that exhibit the same behaviour off the top of my head include strtok(), strsep(), strfry(). The severity of the warnings in each case varies - but like dirname() they all explicitly say the parameter may be modified (and hence can segfault if passed a string literal). > Anyway, if you don't see it like me, I'll just ignore this issue. > > Also, I took the liberty of emailing my suggestion to the email I found on the > page http://www.win.tue.nl/~aeb/linux/man/ (which I found on the man-pages > ebuild). That's probably best in the end. We don't maintain the man pages, just integrate them into the distribution (bugging here in the first instance is always best, in case an issue is a result of something we've modified or something else we do). It's worth noting that the specification for many of these functions is simply an explicit documentation of existing behaviour, written to allow for the existing implementations on many Unix systems, some of which date back to the 1960's BTW your bug report is appreciated, even if we decide to do nothing as a result.