208884 – sys-apps/grep is locale-sensitive

Status:	RESOLVED INVALID

Alias:	None

Product:	Gentoo Linux
Classification:	Unclassified
Component:	[OLD] Core system
Hardware:	All Linux

Importance:	High minor
Assignee:	Gentoo Linux bug wranglers

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2008-02-04 20:22 UTC by Mark
Modified:	2008-02-05 20:13 UTC
CC List:	0 users

See Also:
Package list:
Runtime testing required:	---

Description Mark 2008-02-04 20:22:23 UTC

Under LANG=en_GB.UTF-8, grep's bracket ranges match regardless of case.

$ export LANG=en_GB.UTF-8
$ echo ABC | grep [a-z]
ABC
$ echo abc | grep [A-Z]
abc

If I unset the LANG, then it returns to 'normal' behaviour.

$ unset LANG
$ echo ABC | grep [a-z]
$ echo abc | grep [A-Z]

Needless to say, this is very counterintuitive.

It seems sed is also affected; see bug 208051.

The grep man page says:
"Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value C."
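The override described in that man page excerpt can be sketched for a single command as follows. This is a minimal illustration assuming GNU grep and a POSIX shell; the variable names upper_hit and lower_hit are made up for the example, and the pattern is quoted so the shell cannot glob-expand the brackets.

```shell
# LC_ALL=C applies to these grep invocations only; the session locale is untouched.
# grep -c prints the number of matching lines; "|| true" keeps a zero-match
# exit status from aborting scripts that run under "set -e".
upper_hit=$(echo ABC | LC_ALL=C grep -c '[a-z]' || true)
lower_hit=$(echo abc | LC_ALL=C grep -c '[a-z]' || true)

echo "ABC matched [a-z]: $upper_hit line(s)"
echo "abc matched [a-z]: $lower_hit line(s)"
```

Prefixing the variable to the command keeps the change local, which may be preferable to exporting it when only one pipeline needs the traditional range semantics.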

So OK, en_GB.UTF-8 is evidently one of these locales. But do we really want grep to behave this way by default? Surely it would be more sensible for it to use the traditional (C) collation order.

I am filing this as a request to set LC_COLLATE="C" in the profile to remove this odd default.
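What such a setting would amount to can be approximated in a subshell, as a rough sketch: LANG stays as in the report and only the collation category falls back to C. This assumes GNU grep and that the en_GB.UTF-8 locale is installed; collate_hit is a hypothetical variable name, and where a distribution profile would actually set this is not shown here.

```shell
# Run in a subshell so the reader's own session is left untouched.
collate_hit=$(
    unset LC_ALL                 # LC_ALL, if set, would override LC_COLLATE
    export LANG=en_GB.UTF-8      # locale from the report; assumed to be installed
    export LC_COLLATE=C          # only collation (range ordering) falls back to C
    echo ABC | grep -c '[a-z]' || true
)

echo "ABC matched [a-z] with LC_COLLATE=C: $collate_hit line(s)"
```

Overriding LC_COLLATE alone leaves the other locale categories, such as messages and character encoding, under the control of LANG.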

PS: I see that comment #4 on bug 208051 suggests that reading the documentation is enough. I agree that the documentation describes the issue, but the question is what we want as a default. Thanks!

Comment 1 Jakub Moc (RETIRED) gentoo-dev 2008-02-04 21:15:42 UTC

[A-Z] and [a-z] are locale-specific; you need [[:upper:]] or [[:lower:]] or whatnot, and please stop filing support requests in bugzilla.
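The character classes suggested above can be tried out along these lines. This is a minimal sketch assuming a POSIX grep; the variable names are invented for the example. The classes are resolved through the locale's character classification rather than its collation order, so for these ASCII inputs the result should not depend on LANG.

```shell
# Quote the pattern so the shell does not glob-expand the brackets.
# grep -c prints the number of matching lines; "|| true" tolerates zero matches.
upper_in_lower=$(echo ABC | grep -c '[[:lower:]]' || true)
lower_in_lower=$(echo abc | grep -c '[[:lower:]]' || true)
upper_in_upper=$(echo ABC | grep -c '[[:upper:]]' || true)

echo "ABC matched [[:lower:]]: $upper_in_lower line(s)"
echo "abc matched [[:lower:]]: $lower_in_lower line(s)"
echo "ABC matched [[:upper:]]: $upper_in_upper line(s)"
```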

Comment 2 Mark 2008-02-05 19:55:12 UTC

I don't actually see anywhere that I am asking for "support" with grep? Instead I see a question regarding the default behaviour of grep and whether or not someone might want to alter the profile to stop this confusing behaviour.

I can assure you that I would file my support requests in a forum where they can be discussed, not in bugzilla.

This is not a bug in grep; it is a bug in the default environment of Gentoo.

Comment 3 Jakub Moc (RETIRED) gentoo-dev 2008-02-05 20:13:08 UTC

No, there's no bug with the default environment of Gentoo. Locales are *completely* something for the user to configure.