A simple test case shows an example of unexpand(1) going in the wrong direction. Its documented job is to replace spaces with equivalent tabs. In this case, it replaces a tab with a space(!).

Test case:

unexpand -t2 -a <<EOF
a^Tb c
EOF

Output:

a b c

where ^T is a tab and the blank bits are a single space. Unexpand should be replacing spaces with tabs, but this example goes in the other direction. Quite undesirable, I think.

Reproducible: Always

Actual Results:
tab -> space

Expected Results:
space -> tab (or possibly no change depending on how one interprets the man page).

Happens on other Linux distros and architectures.
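Because tabs and spaces look alike on a terminal, a byte dump makes the test case above easier to check. The following is a minimal sketch, assuming a POSIX shell with od(1) available; it is not the script attached later in this bug, whose contents are not shown here. The variable name `input` is made up for illustration.

```shell
#!/bin/sh
# Byte-level reproduction of the report: feed "a<TAB>b c" to unexpand
# and dump the bytes before and after, so that a tab (\t) turning into
# a space is visible instead of being lost on the terminal.

# The test line from the report; printf expands \t to a real tab.
input=$(printf 'a\tb c')

echo "input:"
printf '%s\n' "$input" | od -c

echo "after unexpand -t2 -a:"
printf '%s\n' "$input" | unexpand -t2 -a | od -c
```

If the second dump shows a space where the first shows \t, the build is affected. What the second dump looks like depends on the coreutils build and the active locale, so no expected output is given here.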
I couldn't reproduce this; as in bug 255343, we need more system info.
From the info manual:

9.3 `unexpand': Convert spaces to tabs
======================================

`unexpand' writes the contents of each given FILE, or standard input if
none are given or for a FILE of `-', to standard output, converting
blanks at the beginning of each line into as many tab characters as
needed. In the default POSIX locale, a "blank" is a space or a tab;
other locales may specify additional blank characters. [...]

Looks like the info/man needs to be updated - unexpand does more than replacing characters (use tr for that instead). Did you report this bug at bug-coreutils@gnu.org (the man page suggests you do that).
(In reply to comment #2)
> From the info manual:
>
> 9.3 `unexpand': Convert spaces to tabs
> ======================================
>
> `unexpand' writes the contents of each given FILE, or standard input if
> none are given or for a FILE of `-', to standard output, converting
> blanks at the beginning of each line into as many tab characters as
> needed. In the default POSIX locale, a "blank" is a space or a tab;
> other locales may specify additional blank characters. [...]
>
> Looks like the info/man needs to be updated - unexpand does more than replacing
> characters (use tr for that instead). Did you report this bug at
> bug-coreutils@gnu.org (the man page suggests you do that).
>

I did not report it to gnu.org. I've been confused for some time about bug reporting since there seem to be gentoo ebuilds that say don't report directly to upstream. I thought I would let wiser heads decide what should be done.
Created attachment 179361 [details]
bash script to illustrate the bug

When I run it I get

kevin@treat Test $ bash tabtest.sh
0000000 61 09 62 20 63 0a
         a \t  b     c \n
0000006
0000000 61 20 62 20 63 0a
         a     b     c \n
0000006
kevin@treat Test $

Here's my emerge info:

kevin@treat Test $ emerge --info
Portage 2.1.6.4 (default/linux/x86/2008.0/desktop, gcc-4.1.2, glibc-2.6.1-r0, 2.6.27-gentoo-r7-kosmanor i686)
=================================================================
System uname: Linux-2.6.27-gentoo-r7-kosmanor-i686-Intel-R-_XEON-TM-_CPU_1.80GHz-with-glibc2.0
Timestamp of tree: Thu, 22 Jan 2009 08:45:01 +0000
app-shells/bash: 3.2_p39
dev-java/java-config: 1.3.7-r1, 2.1.6-r1
dev-lang/python: 2.4.4-r14, 2.5.2-r7
dev-python/pycrypto: 2.0.1-r6
dev-util/cmake: 2.4.6-r1
sys-apps/baselayout: 1.12.11.1
sys-apps/sandbox: 1.2.18.1-r2
sys-devel/autoconf: 2.13, 2.63
sys-devel/automake: 1.5, 1.7.9-r1, 1.8.5-r3, 1.9.6-r2, 1.10.2
sys-devel/binutils: 2.18-r3
sys-devel/gcc-config: 1.4.0-r4
sys-devel/libtool: 1.5.26
virtual/os-headers: 2.6.27-r2
ACCEPT_KEYWORDS="x86"
CBUILD="i686-pc-linux-gnu"
CFLAGS="-Wall -g"
CHOST="i686-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/3.5/env /usr/kde/3.5/share/config /usr/kde/3.5/shutdown /usr/share/config /var/bind /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/revdep-rebuild /etc/terminfo /etc/texmf/web2c /etc/udev/rules.d"
CXXFLAGS="-O2 -march=pentium4 -fomit-frame-pointer -pipe -mfpmath=sse -msse2 -mmmx"
DISTDIR="/usr/portage/distfiles"
FEATURES="buildpkg distlocks fixpackages parallel-fetch protect-owned sandbox sfperms strict unmerge-orphans userfetch"
GENTOO_MIRRORS="http://gentoo.osuosl.org/ ftp://ftp.gtlib.gatech.edu/pub/gentoo ftp://ftp.ucsb.edu/pub/mirrors/linux/gentoo/ http://ftp.ucsb.edu/pub/mirrors/linux/gentoo/ http://cudlug.cudenver.edu/gentoo/ http://gentoo.netnitco.net ftp://gentoo.netnitco.net/pub/mirrors/gentoo/source/ http://mirror.datapipe.net/gentoo "
LANG="en_US.utf8"
LC_ALL="en_US.utf8"
LDFLAGS="-Wl,-O1"
LINGUAS="en fr de es pl"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage/overlay"
SYNC="rsync://rsync.namerica.gentoo.org/gentoo-portage"
USE="X Xaw3d acl acpi aim alsa apache2 apm bash-completion bcmath berkdb branding bzip2 cairo calendar caps cdr cli cracklib crypt cscope ctype cups dbm dbus dri dvd dvdr dvdread eds emboss encode esd evo exif fam fastcgi foomaticdb fortran gdbm gif gnome gphoto2 gpm gstreamer gtk guile hal iconv icq imap imlib ipv6 isdnlog java jbig joystick jpeg kde ldap libnotify libwww mad mailwrapper mbox mcal midi mikmod mime mmap mmx motif mp3 mpeg mpi mudflap mysql ncurses nis nls nptl nptlonly nsplugin odbc offensive ogg openal opengl openmp oscar pam pcre pdf perl pic png posix postgres ppds pppd python qt3 qt3support qt4 quicktime readline reflection ruby samba sdl session snmp sockets spell spl sse ssl startup-notification svg svga symlink sysfs sysvipc tcpd tetex tiff tk truetype unicode usb vorbis win32codecs x86 xml xorg xpm xulrunner xv yahoo zlib"
ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1 emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci"
ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol"
APACHE2_MODULES="actions alias auth_basic auth_digest authn_anon authn_dbd authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache dav dav_fs dav_lock dbd deflate dir disk_cache env expires ext_filter file_cache filter headers ident imagemap include info log_config logio mem_cache mime mime_magic negotiation proxy proxy_ajp proxy_balancer proxy_connect proxy_http rewrite setenvif so speling status unique_id userdir usertrack vhost_alias"
ELIBC="glibc"
INPUT_DEVICES="keyboard mouse joystick"
KERNEL="linux"
LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text"
LINGUAS="en fr de es pl"
USERLAND="GNU"
VIDEO_CARDS="vga fbdev fglrx vesa"
Unset: CPPFLAGS, CTARGET, EMERGE_DEFAULT_OPTS, FFLAGS, INSTALL_MASK, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS
kevin@treat Test $
I reported this to gnu.org as well. The reply said they don't see the problem in the original code, but they'll add tests based on what I found. I downloaded a fresh copy of coreutils 6.12 and can confirm that this bug does not appear in the original of version 6.12. The GNU guy thought it might have been caused during the addition of i18n, but setting locales (LANG=C) had no effect for me, so I dunno.
This may not be an unexpand(1) bug exactly. It turns out that I was running with LANG and LC_ALL both set to "en_US.utf8". If I set them both to "C", the bugs go away. Unfortunately, that does not work well for me, and I use other locales as well (but haven't tested unexpand in them).
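The locale dependence described above can be checked side by side. This is a rough sketch, assuming a POSIX shell, od(1), and that the en_US.utf8 locale is installed (if it is not, the C library normally falls back to the C locale and both dumps will match). The helper name `run_in_locale` is hypothetical. Note that LC_ALL takes precedence over LANG, so changing LANG alone has no effect while LC_ALL is still set to a UTF-8 locale.

```shell
#!/bin/sh
# Run the test case from this bug under a given locale and dump the
# resulting bytes. run_in_locale is a hypothetical helper name.
# LC_ALL is set only for the unexpand process, via env(1).
run_in_locale() {
    printf 'a\tb c\n' | env LC_ALL="$1" unexpand -t2 -a | od -c
}

# C is the locale where the problem reportedly disappears;
# en_US.utf8 is the locale from the emerge --info output.
for loc in C en_US.utf8; do
    echo "LC_ALL=$loc"
    run_in_locale "$loc"
done
```

On an affected build the two dumps should differ at the second byte (\t under C, a space under the UTF-8 locale); on a build without the multibyte patch they should be identical.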
fixed in newer versions as we've dropped the utf8 patchset