Summary: | Uppercase/lowercase don't work correctly with locale it_IT.utf8 | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Daniele Varrazzo <daniele.varrazzo> |
Component: | [OLD] Core system | Assignee: | Gentoo Toolchain Maintainers <toolchain> |
Status: | RESOLVED NEEDINFO | ||
Severity: | normal | ||
Priority: | High | ||
Version: | unspecified | ||
Hardware: | x86 | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Attachments: | A UTF-8 encoded file containing a query that should return "t" |
Description
Daniele Varrazzo
2007-03-14 16:56:24 UTC
Created attachment 113265
A UTF-8 encoded file containing a query that should return "t"
Sorry, the Python example is bogus. The postgres one should not be, anyway: it seems to be an lc_ctype problem.

I attached a file showing the bug. If the database is created with --encoding=utf8 --locale=it_IT.utf8, the test fails:

$ psql postgres < test.utf8
 ?column?
----------
 f
(1 row)

The test passes if the database is created with --encoding=latin1 --locale=it_IT:

$ iconv -f utf8 -t latin1 < test | psql postgres
 ?column?
----------
 t
(1 row)

i would guess that your terminal is causing this inconsistency ...

your terminal needs to be set up for both UTF8 input/output in order for this test to be valid ... by changing in a shell on the fly via `export LC_ALL`, you would get weird behavior in pretty much all terminals

(In reply to comment #3)
> i would guess that your terminal is causing this inconsistency ...
>
> your terminal needs to be set up for both UTF8 input/output in order for this
> test to be valid ... by changing in a shell on the fly via `export LC_ALL`, you
> would get weird behavior in pretty much all terminals

I've been careful not to be fooled by the console encoding. Anyway, I performed other tests and it seems to be a problem limited to PostgreSQL and not to the C libraries. I verified the problem with the en_US.utf8 locale too.

I reported the bug to the PostgreSQL team with the following test, which is entirely ASCII (the query is supposed to return 't').

$ initdb --encoding=utf8 --locale=en_US.utf8 en_utf8
$ pg_ctl -D en_utf8 start
$ psql postgres
postgres=# SELECT upper('\xc3\xa8') ILIKE '\xc3\xa8';
 ?column?
----------
 f
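For anyone reading the ASCII-only test above: `\xc3\xa8` is the UTF-8 byte sequence for è (U+00E8), and its uppercase form È (U+00C8) encodes as `\xc3\x88`. The Python sketch below is not part of the original report and says nothing about how PostgreSQL implements upper() or ILIKE. It only checks those byte values and contrasts character-aware uppercasing with byte-wise uppercasing, which handles ASCII letters only. The helper name `describe` is hypothetical.

```python
# Illustrative only: checks the byte values used in the ASCII-only test case.
LOWER = "\u00e8"  # è, LATIN SMALL LETTER E WITH GRAVE
UPPER = "\u00c8"  # È, LATIN CAPITAL LETTER E WITH GRAVE


def describe(ch):
    """Return the UTF-8 and Latin-1 encodings of a single character."""
    return {"utf8": ch.encode("utf-8"), "latin1": ch.encode("latin-1")}


lower_bytes = describe(LOWER)
upper_bytes = describe(UPPER)

# Character-aware uppercasing maps è to È.
char_upper = LOWER.upper()

# Byte-wise uppercasing only touches ASCII letters, so the two-byte
# UTF-8 sequence for è comes back unchanged.
bytewise_upper = lower_bytes["utf8"].upper()

if __name__ == "__main__":
    print(lower_bytes)
    print(upper_bytes)
    print(char_upper == UPPER)
    print(bytewise_upper == lower_bytes["utf8"])
```

In Latin-1 the same letters are single bytes (`\xe8` and `\xc8`), which is at least consistent with the test behaving differently between the latin1 and utf8 databases. Whether that is the actual cause here is an assumption, not something the report confirms.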