Summary: | dev-lang/python breaks Turkish capitalization rules | ||
---|---|---|---|
Product: | Gentoo Linux | Reporter: | Gokdeniz Karadag <gokdenizk> |
Component: | Current packages | Assignee: | Python Gentoo Team <python> |
Status: | RESOLVED UPSTREAM | ||
Severity: | normal | CC: | serkan |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | | Runtime testing required: | --- |
Bug Depends on: | 250075 | ||
Bug Blocks: |
Description
Gokdeniz Karadag
2008-01-03 23:12:38 UTC
I really fail to see what you are expecting from us when this bug was already marked as invalid upstream; see http://bugs.python.org/msg55478.

Hi,
The bug was marked invalid upstream because, after setlocale is called, conversions are made correctly. On Gentoo, even though I explicitly set the locale, it does not convert the characters correctly.

Does the patch in bug #250075, which is now in Portage, fix the issues?

(In reply to comment #3)
> Does the patch in bug #250075, which is now in Portage, fix the issues?
>
I have installed python-2.5.2-r8, which incorporates the patch in bug #250075, and the problem persists. The 'i' is still capitalised as "dotless capital I" and not the correct "capital I with a dot above". The mentioned patch seems to fix problems with identifier names; the problem here is with plain unicode strings. Can this be a problem in the python<->glibc interface?

Hello,
Is this problem still present with the new stable dev-lang/python version?
Best regards,

Yes, with python-2.6.2-r1 the bug is still there:

>>> import locale
>>> locale.setlocale(locale.LC_ALL,"tr_TR.utf8")
'tr_TR.utf8'
>>> repr(u"i".upper())
"u'I'"

The result must be the "capital I with dot above" unicode character:

>>> repr(u"İ")
"u'\\u0130'"

The C version still works correctly and displays "capital I with dot above". From what I understand from python bug *, the python version should do the same.

I tried this on an Ubuntu machine and it had the same error. I also tried on Pardus **, a distribution from Turkey, and the python 2.6.2 interpreter there works correctly with the test above: repr(u"i".upper()) returns ---> u'\\u0130'

*: http://bugs.python.org/issue1528802
**: http://pardus.org.tr/eng/

Pardus specific patches at *** seem to contain fixes for the i-I problem in _identifier names only_, as the unicode string operations should work well within the correct locale. (That is what the C library does, as shown in the C version, and python is said to call the underlying C library functions.)

***: http://packages.pardus.org.tr/info/2009/devel/source/python.html

My wild guess is that this is a bug in the C library - python interface, but I don't have a practical way to test or debug this guess.

Resolving as UPSTREAM.
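
For anyone landing on this report with a current interpreter, the behaviour discussed above can be worked around in application code. The following is only a minimal sketch, assuming Python 3, where str.upper() and str.lower() follow the Unicode default case mapping and do not consult the locale. The helper names turkish_upper and turkish_lower are hypothetical; they are not part of Python, glibc, or any of the patches mentioned in this bug.

```python
# Hypothetical helpers for Turkish-aware case conversion (Python 3).
# They pre-map the four dotted/dotless i characters and then fall back
# to the locale-independent str.upper() / str.lower() for everything else.

# i (U+0069) -> İ (U+0130), ı (U+0131) -> I (U+0049)
_TR_UPPER = str.maketrans({"i": "\u0130", "\u0131": "I"})
# İ (U+0130) -> i (U+0069), I (U+0049) -> ı (U+0131)
_TR_LOWER = str.maketrans({"\u0130": "i", "I": "\u0131"})


def turkish_upper(text):
    """Uppercase text using the Turkish rules for i and ı."""
    return text.translate(_TR_UPPER).upper()


def turkish_lower(text):
    """Lowercase text using the Turkish rules for İ and I."""
    return text.translate(_TR_LOWER).lower()


if __name__ == "__main__":
    print(repr("i".upper()))          # default Unicode mapping
    print(repr(turkish_upper("i")))   # Turkish mapping
```

The translation step runs before the generic conversion, so only the four affected characters are treated specially and all other characters keep Python's normal behaviour. This does not address the locale question raised in the report; it only sidesteps it for callers that know they are handling Turkish text.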