255349 – sys-apps/coreutils-6.10-r2 unexpand(1) has inconsistent treatment of single spaces.


Status:	RESOLVED FIXED

Alias:	None

Product:	Gentoo Linux
Classification:	Unclassified
Component:	[OLD] Core system
Hardware:	All Linux

Importance:	High normal
Assignee:	Gentoo's Team for Core System packages

Reported:	2009-01-18 05:19 UTC by Kevin O'Gorman
Modified:	2009-02-22 00:15 UTC
CC List:	0 users

Description Kevin O'Gorman 2009-01-18 05:19:02 UTC

Single spaces leading up to a tab stop are treated inconsistently.  Sometimes they
are replaced by a tab and sometimes not.  The info page is vague enough to
allow either interpretation, but the variations seem undesirable.

If there's a good reason for the behavior, it should be documented.

I would note that the POSIX.1 man page is explicit: it allows only for changing
an initial sequence of blanks (not at issue here) or two or more blanks leading
up to a tab stop (also not at issue).  A POSIX-compliant implementation would
therefore do no conversion at all in the case of single blanks.  This seems
consistent with the motivation of making the file smaller, and avoiding changes
that do not further that end.

Test case: blanks are represented as periods (.) to avoid email mangling.

unexpand -t4 -a <<EOF | cat -t
abc.def..g
EOF
abc.def^I.g

The blank between "c" and "d" is not converted, but
the blank after "f" is converted to a tab (shown as ^I by cat -t).
It is not at all clear why, since they both lead up to a tab stop.
One surmises that the following blank is making a difference, but it's
hard to see a motivation for the distinction.

I submit that it is just as well not to convert in both cases, as that
is most consistent with POSIX.

In any event, the documentation should be more clear about what cases are
handled and how.


Reproducible: Always

Comment 1 Kevin O'Gorman 2009-01-24 21:00:55 UTC

I reported this to gnu.org as well.  The reply said they don't see the problem
in the original code, but they'll add tests based on what I found.  I downloaded
a fresh copy of coreutils 6.12 and can confirm that this bug does not appear
in the original of version 6.12.

The GNU guy thought it might have been caused during the addition of i18n,
but setting locales (LANG=C) had no effect for me, so I dunno.

Comment 2 Kevin O'Gorman 2009-01-30 04:45:55 UTC

This may not be an unexpand(1) bug exactly.  It turns out that I was running
with LANG and LC_ALL both set to "en_US.utf8".  If I set them both to "C", the
bugs go away.  Unfortunately, that does not work well for me, and I use other
locales as well (but haven't tested unexpand in them).
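
To compare locales side by side, something like the following sketch may help.
It rebuilds the test line with real blanks and runs the same unexpand command
under each locale.  The locale names are just the two mentioned above, and the
variable names are only illustrative.  No expected output is shown, since the
result depends on the coreutils build.

```shell
#!/bin/sh
# Rebuild the test line from the report, turning the period
# placeholders back into real blanks.
input=$(printf 'abc.def..g' | tr '.' ' ')

# Run the same command under each locale; cat -t makes tabs visible as ^I.
for loc in C en_US.utf8; do
    printf '%-12s' "$loc:"
    printf '%s\n' "$input" | LC_ALL="$loc" unexpand -t4 -a | cat -t
done
```

If the two output lines differ, the locale is what triggers the conversion.
If en_US.utf8 is not installed, the second run will most likely fall back to
the C behaviour, so check "locale -a" first.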

Comment 3 SpanKY gentoo-dev 2009-02-22 00:15:56 UTC

fixed in newer versions as we've dropped the utf8 patchset