Summary: | setting LANG variable | ||
---|---|---|---|
Product: | [OLD] Docs-user | Reporter: | Seemant Kulleen (RETIRED) <seemant> |
Component: | Gentoo Linux x86 Installation Guide | Assignee: | Portage team <dev-portage> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | aalmenar, aether, avenj, azarah, carlo, cbradney, cjk, danarmak, david.morel, davidgrant, dberkholz, eiren, folken, foser, frederic.deghetto, grandmasterlinux, h3y, johnjohn-gentoo, jrmalaq, kyle, liquidx, liuspider, m.debruijne, mckenna, mr_bones_, radek, releng, rfujimoto, roman.majer, sascha-gentoo-bugzilla, satai, sindian, spider, svyatogor, tom.gl, utf8 |
Priority: | High | ||
Version: | unspecified | ||
Hardware: | x86 | ||
OS: | Linux | ||
Whiteboard: | |||
Package list: | Runtime testing required: | --- | |
Bug Depends on: | 57973 | ||
Bug Blocks: | 48449, 52013 | ||
Attachments: |
patch for portage.py
patch for portage.py
patch for ebuild.sh
patch for ebuild.sh
Alternate patch for ebuild.sh
Alternate for patch for ebuild.sh
Revised Alternate
Alternate patch for ebuild.sh
Revised alternate for ebuild.sh
Revised alternate patch for ebuild.sh
Test case for the alternate patch.
ebuild-linguas.patch |
Description
Seemant Kulleen (RETIRED)
2002-10-30 17:47:39 UTC
`ls /usr/share/locale/` ?

What doc should this be added to? //ZhEN

none yet, hang on. I'm sending a message to -core about this.

Ok, I can do that ;-) //ZhEN

Colin, is it possible to make an automatically generated list of LANGS like we did for USE from use.desc? //ZhEN

What about adding some feature to portage, so that it will run localepurge after 'install to ${D}' and remove everything not belonging to EN (??) and the selected locales in LANG ?? Not sure about binary packages though ...

Hmm, okay. Is this still a doc issue then? //ZhEN

Az, that kind of feature would completely rock. Things like the sylpheed-claws documentation translations (which get installed by default in 3 languages), and gtk2 stuff which installs every single language known to man, known to woman, and unknown to humans entirely, etc etc.

just to show that is needed
*** This bug has been marked as a duplicate of 9901 ***

what is happening with this bug? seemant? //ZhEN

there are three problems:
1) how to say in an ebuild that you want "your" nls option. E.g. if I have USE="nls" the ebuild will download and install all language definitions known for this package (very frustrating for glibc), so how to say: I want "default (english) and cs (Czech)" locales
2) how to tell the ebuild to prepare messages (glibc) in a specific codepage
3) apps: some years ago apps used 7-bit encoding (like ASCII). For English that is enough, but not for European languages. Now they commonly use 8-bit encodings (ISO-8859-X); the future is X-bit encoding (UTF-like).
Each of these problems must be solved in Gentoo in the near future...
1) can specify what locale I want
2) can specify codepage for locales
3) apps to be able to use my codepage (mostly UTF problems)

some research on this problem:
how to set up locales for apps: LANG, LC_ALL, LC_XXX with format language[_COUNTRY[.codepage]]
environment variables: man 7 locale
definition of language: ISO 639
definition of country: ISO 3166
definition of codepage: some hint in /usr/src/linux/fs/nls/
some howtos:
http://developpeur.journaldunet.com/ressource/howtos/Unicode-HOWTO-3.html in french
http://new.linuxnow.com/docs/content/Unicode-HOWTO-html/Unicode-HOWTO.html in english
nice howto from gentoo people: http://gentoo-deutsch.berlios.de/htmlfromxsl/guide-localization-en.html
how to compile just one locale in glibc? http://www.gentoofr.org/gen.php/section/Documentation/8,0,1,0,0.html

Hmm, okay - any progress on this? //zhen

KDE take on this: The main kde + koffice have separate i18n packages in app-i18n (ebuild per language) which the user must emerge manually atm. If you give me something to use in PDEPEND, we can automatically pull in the right language pack(s) based on the locale after emerging kde/koffice. These packages are specified by language code (i.e. en_GB, he, fr, de). Many of the smaller kde apps have several translation packs distributed with them (for anything between 1 and 5 languages most of the time, nothing like the 50+ for kde proper). Usually all of them are installed (I think); if you give me a list of languages in src_compile/_install I can control that. And, of course, unicode support is always present in qt/kde 3. We're only talking about translations of docs/help/interface/messages here.

*** Bug 6758 has been marked as a duplicate of this bug. ***

Gnome solves it by always installing nls, and falling back to always using the same codepath for texts, whether you specify a lang or not. I think we should leave instructions for setting LANG, but I can also comment on some packages that don't work well with LANG exported...
typical is xmms, which won't play songs in directories that contain üäöøæ when you have exported LANG=en_GB. so we have a dilemma. openmotif has problems compiling in some locales with specific LANG settings...

I have bug #10915 open which may give another perspective of a program with language/locale problems.

We need to accept the fact that programs may not compile in a non "C" locale. I don't think we as Gentoo developers should be held responsible for making sure a package is fully internationalized, but if there are willing and able people who can help upstream maintainers fix that, that's cool. :) I think it would definitely be nice in rc.conf (or make.conf, whichever makes most sense) to give the user the option to set LANG settings there but with a big PHAT warning... even BIGGER and PHATTER than warnings about over-aggressive CFLAGS...

That's WHY I suggest NOT declaring locale variables for the root account. The root account should use the default "C" locale, and the problems with emerging (compiling) packages will be solved. Locale variables should be declared ONLY for user accounts! BUT the locale settings in make.conf can be used as the default skeleton for creating user directories.

Don't use LANG, but rather ELANG ? Portage then uses this to filter catalogs/whatever?

What is the progress on this? I am not sure that I am the one to handle this, following the comments... //zhen

i will put this in my pile of stuff to keep an eye on..

so, what happened finally with this bug ?

*** Bug 12923 has been marked as a duplicate of this bug.
***

basically my idea is to have ACCEPTED_LANGS and MAIN_LANG variables in make.conf or rc.conf (perhaps rc.conf?). i'm not sure what this bug is focusing on: whether to set up a default (reasonable) LANG variable or to now allow the setting of multiple languages. if it is still focusing on setting a global system-wide LANG variable, then it is related to bug #7596 and also there is a docs bug #20954 to clarify the situation with LANG.

i've just spent a good chunk of time tracking down an evolution bug which ended up being caused by the lack of the env var LANG. this is not the first app to be bitten by this problem, as i know xfree has problems with deadkeys and some other apps had problems (iirc, the now defunct rhythmbox.) therefore i support setting LANG as a system wide variable, or at least a sensible default for new installations, and making it clear in the install docs that the users SHOULD set their own LANG environment (in rc.conf?)

rc.conf is probably as good a place as any, because the KEYMAP, CONSOLEFONT and CONSOLETRANSLATION variables are set there and they are related. However, it's not a trivial thing to organise and it's hell for users. Mandrake has a /etc/sysconfig/i18n file: mine was set for United Kingdom but it made file sorting crazy until I altered it so:
SYSFONTACM=iso15
LC_CTYPE=en_GB
LANGUAGE=en_GB:en
LC_MONETARY=en_GB
LC_COLLATE=POSIX <------ to correct crazy sorting
LC_NUMERIC=en_GB
SYSFONT=lat0-16
LC_TIME=en_GB
LANG=en_GB
LC_MESSAGES=en_GB

Just to add a cross-reference, a posting of mine, some time ago: http://forums.gentoo.org/viewtopic.php?p=386839#386839 Everybody here talks about LANG and LANGS and so on, but nobody mentions LINGUAS, the "standard" GNU gettext way of specifying things. Kalin.

kalin, do you have a link to GNU's explanation of it?

Hi all, I implemented two features to resolve this bug. By defining USE_LANG settings, we can configure ebuilds. First feature: addition of lang/primary_lang functions to ebuild. Example: 1.
export USE_LANG='ja ko' (or in make.conf)
2. in ebuild
lang ja && epatch ${FILES}/japanese.patch
lang ko && epatch ${FILES}/korean.patch

If the Japanese patch and the Korean patch conflict, we can use the primary_lang function. This function evaluates only the first entry of USE_LANG. In this case, only the Japanese patch is applied.
primary_lang ja && epatch ${FILES}/japanese.patch
primary_lang ko && epatch ${FILES}/korean.patch

Second, to remove unnecessary language files (man/locale files). All man/locale files are installed at the moment. However I need only Japanese and English files :) Example:
1. export USE_LANG='ja' (or in make.conf)
2. Then I have only Japanese and default (English) man/locale files after emerge.

Any idea?

P.S. Is the name 'USE_LANG' not a good idea?

Created attachment 14532 [details, diff]
patch for portage.py
Created attachment 14533 [details, diff]
patch for portage.py
Sorry, I attached portage.py not patch. This is the patch.
Created attachment 14534 [details, diff]
patch for ebuild.sh
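For readers without the attachment at hand, the lang and primary_lang helpers described above could look roughly like the following. This is a minimal sketch written from the description in this thread, not the contents of the attached patch: the function bodies are assumptions, and USE_LANG is the variable name proposed here, not an existing Portage setting.

```shell
# Hypothetical helpers; USE_LANG is the name proposed in this bug.
USE_LANG="ja ko"

# lang CODE: succeed if CODE appears anywhere in USE_LANG.
lang() {
    case " ${USE_LANG} " in
        *" $1 "*) return 0 ;;
    esac
    return 1
}

# primary_lang CODE: succeed only if CODE is the first entry of USE_LANG.
# The body runs in a subshell so that "set" does not leak to the caller.
primary_lang() (
    set -f                    # no pathname expansion while splitting
    set -- "$1" ${USE_LANG}   # $1 = CODE, $2 = first USE_LANG entry
    [ "$1" = "${2:-}" ]
)

# Usage, mirroring the ebuild example in this comment:
lang ja && echo "would apply japanese.patch"
primary_lang ko || echo "ko is listed but is not the primary language"
```

Padding USE_LANG with spaces before matching keeps "ja" from matching a longer entry such as "ja_JP".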
Looks good, except one thing. In remove_unnecessary_lang(), you go through ${D}/usr/share/locale and match the languages that are not in USE_LANG. But according to your example, if you have: USE_LANG="ja ko" then it will still remove those that are named ja_JP, ja_JP.SJIS ... Maybe it is better if you keep the directory if the substring matches the beginning of the language code, so "ja" would keep ja, ja_JP, ja_JP.SJIS ?

Thanks, liquidx. I've fixed it. And I added FEATURES='noallman noalllocale'. If you set these flags, emerge removes man/locale files except those for your languages. Please test it.

Created attachment 14579 [details, diff]
patch for ebuild.sh
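The prefix matching liquidx asks for above can be sketched as follows. This is an illustration only, not the attached patch: keep_locale and purge_locales are hypothetical names, and requiring a `_`, `.` or `@` after the language code is an assumption made here so that a short code cannot match an unrelated longer name.

```shell
# Hypothetical sketch; USE_LANG is the variable proposed in this bug.
USE_LANG="ja ko"

# keep_locale NAME: succeed if NAME is a USE_LANG entry or extends one
# with a country, codeset or modifier suffix (ja -> ja_JP, ja_JP.SJIS).
keep_locale() (
    set -f
    for l in ${USE_LANG}; do
        case "$1" in
            "$l"|"$l"_*|"$l".*|"$l"@*) exit 0 ;;
        esac
    done
    exit 1
)

# purge_locales DIR: remove every subdirectory of DIR that keep_locale
# rejects. In an ebuild context DIR would be "${D}/usr/share/locale".
purge_locales() (
    for d in "$1"/*/; do
        [ -d "$d" ] || continue
        name=$(basename "$d")
        keep_locale "$name" || rm -rf "$d"
    done
)
```

Both functions run in subshells, so their loop variables do not leak into the calling script.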
Nakano-san konnichiwa (^^; I think the PRIMARY_LANG idea might be a bit limited. Say for example there are 2 patches, one for ko and one for cn, that conflict. Say I set USE_LANG="ja ko cn". In this case, it would seem to me that as long as ko didn't conflict with ja, one would still want ko to be applied. The question is how one would query the preference order. I was thinking something like
lang_prefer ko cn && epatch ko.patch
lang_prefer cn ko && epatch cn.patch
where lang_prefer means: true if the first argument is preferred over all other arguments. In the case where only two patches conflict with one another this makes sense. However if you get into say 3 or more you might end up with
lang_prefer ko cn ja en uk && epatch ko.patch
lang_prefer cn ko ja en uk && epatch ...
lang_prefer ja ... && ...
lang_prefer en ... && ...
lang_prefer uk ... && ...
Seems like you just want to be able to say "these patches conflict, please apply the largest subset without causing conflict according to my preferences" (and not specific to i18n patches either). just my 2 cents

*** Bug 22669 has been marked as a duplicate of this bug. ***

Created attachment 15111 [details, diff]
Alternate patch for ebuild.sh
This has not been fully tested yet, posted purely for consideration of the
design, I will post again once I have tested it more.
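The lang_prefer idea floated earlier in this thread ("true if the first argument is preferred over all other arguments") could be sketched like this. It is not part of any attached patch; the body is an assumption that reads preference order from the position of each code in USE_LANG.

```shell
# Hypothetical sketch; USE_LANG order expresses the user's preference.
USE_LANG="ja ko cn"

# lang_prefer FIRST OTHER...: succeed if FIRST is listed in USE_LANG and
# appears before every OTHER that is also listed.
lang_prefer() (
    set -f
    first=$1
    shift
    for l in ${USE_LANG}; do
        [ "$l" = "$first" ] && exit 0      # reached FIRST before any rival
        for other in "$@"; do
            [ "$l" = "$other" ] && exit 1  # a rival ranks higher
        done
    done
    exit 1                                 # FIRST is not in USE_LANG at all
)

# Usage with two conflicting patches:
lang_prefer ko cn && echo "would apply ko.patch"
lang_prefer cn ko && echo "would apply cn.patch"
```

With USE_LANG="ja ko cn" only the first usage line prints, because ko comes before cn in the list.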
Created attachment 15112 [details, diff]
Alternate for patch for ebuild.sh
Already noticed a typo, all apologies.
Note the alternate patch for ebuild.sh also retabifies ebuild.sh to all tabs !! --instead of some places where it's 4 spaces for a tab--

I have tested this patch against man-1.5l-r6.ebuild, but it just cannot identify any man pages which should be removed. I found out that the variable ${MANDIR} in function remove_man_files is empty :(

Created attachment 15136 [details, diff]
Revised Alternate
Jason, I saw your patch. Thank you for adding comments etc. But I think some people need all man files/locales even if they set USE_LANG. So I've proposed adding a new variable on #25296. Please see it.

Created attachment 15143 [details, diff]
Alternate patch for ebuild.sh
Ok, tested this a bit more, seems to be in working order now.
Ok, so a FILTER variable, or as in the original patch, FEATURES could include noallman and/or noalllocale. The alternate patch was added at the request of Seemant Kulleen, in which he specified how removing of the man/locale files should behave. It seems a bit less customizable but perhaps a simpler way to remove all unnecessary stuff.

Created attachment 15148 [details, diff]
Revised alternate for ebuild.sh
This patch works fine with man & net-tools. Hope this can be merged into the next version, thanks.

But I do not know whether this patch can deal with locales correctly. No package comes to my mind other than glibc which has locales and can be used to test this patch. But glibc is too large to compile ;), so I do not know whether it can remove unnecessary locales.

I believe the man ebuild does have locale files, as I thought I saw some being removed when I installed it. Adding a "v" to the rm command options in the remove_* commands is helpful, as it'll show what's being removed. Currently I'm using this patch to build a chroot environment, so if anything arises I'll be sure to post it.

Created attachment 15168 [details, diff]
Revised alternate patch for ebuild.sh
Created attachment 15169 [details]
Test case for the alternate patch.
Comment on attachment 15169 [details]
Test case for the alternate patch.
This test has rm -vrf in the ebuild.sh to illustrate what's being removed, the
latest patch does not.
Oh, yes, the man package does have locale files, thanks. And I think currently the new ebuild will keep all zh* if I set my USE_LANG="zh", but what I really need is only zh_CN.GBK, zh_CN.GB18030 and zh_CN.UTF-8, not including zh_TW.* and other zh_CN.*. So maybe it's better to use an exact match in the functions remove_man_files and remove_locale_files, not just compare the first 2 characters?

The first 2 characters are the language code, then comes the country/region code. USE_LANG is set to a language code, so it compares the first 2 characters at the moment. If we want to specify a country/region code, we need to make a new variable USE_LOCALE instead of USE_LANG. examples: USE_LOCALE="zh" or USE_LOCALE="zh*" or USE_LOCALE="zh_CN". just an idea.

Does that need a special variable? In my opinion using wildcards in USE_LANG is sufficient to achieve this goal. Just: USE_LANG="zh_CN.GBK zh_CN.UTF-8" or USE_LANG="zh_CN.*" or USE_LANG="zh*"

I am inclined to agree with liu on this

I just want to rename USE_LANG to USE_LOCALE.

Well, Comment #28 asked for a reference to GNU gettext, here is one: http://www.gnu.org/software/gettext/gettext.html For the moment, I still have this in my /etc/make.conf
# packages that don't work with LINGUAS:
# break: sharutils atk gtk+-2.2.1
# ignore: wget man net-tools cups
LINGUAS="en ja ru bg de"
and comment out the LINGUAS line when building "break" packages

*** Bug 1815 has been marked as a duplicate of this bug. ***
*** Bug 2121 has been marked as a duplicate of this bug. ***
*** Bug 7569 has been marked as a duplicate of this bug. ***
*** Bug 12065 has been marked as a duplicate of this bug. ***
*** Bug 14096 has been marked as a duplicate of this bug. ***
*** Bug 32217 has been marked as a duplicate of this bug. ***

This issue has recently come up with Gentoo users of Scribus. From what I have been told, Qt calls with regard to languages are used to set the language of a Scribus document. When LANG is not set, there are issues with certain characters in the documents.
As soon as you export LANG (I've done it in .bashrc), the Scribus issue goes away. So, any progress on this one?

This might be of use re Qt: http://doc.trolltech.com/3.2/linguist-manual-4.html It shows how the translate calls are made for different languages etc.

i think this is done, isn't it?

Oh, really? If so, it's good to know that. We all are waiting for this feature to be implemented. I cannot find any USE_LOCALE nor USE_LANG under /usr/lib/portage though (portage-2.0.50-r1). I would like to add a Japanese version of openoffice-bin once this issue has been solved.

is this also documented and default then? is a skeleton file installed somewhere? (+nls flag in baselayout strikes me as a good place) and also setting LANG="en_US" if nothing else? please note that "POSIX" or "C" are considered bad to set as defaults.

kde-i18n, koffice-i18n and k3b already use the linguas use variable

Yup, but where do these linguas_?? IUSE flags come from? Also, do portage developers agree to use the LINGUAS variable to define the system language environment? (LANG is for the user language env)

And Qt's language functions only read from LANG... no idea why but it is so. I guess for cross platform portability.

I'm not at all in favour of official support; that means if it breaks some builds (and i know it does) we get to fix it, and I'm certainly not ready for that. Users can do this fine on their local systems if they want to, but Gentoo wide support.. nah... I'm still waiting for ranged deps, that's way more important than this superficial feature.

¿Superficial feature? Maybe for you, who are a native English speaker. Sure, ranged dependencies are good, but i think this and the binary ebuilds are the biggest things Gentoo lacks.

This is one of the things that are currently soooo wrong in Gentoo. For example compiling all the locales is *one of the most time consuming periods* during the bootstrap - and for what? To waste my disk space and nothing more!!
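For reference, the wildcard matching proposed earlier in the thread (USE_LANG="zh_CN.GBK zh_CN.UTF-8", "zh_CN.*" or "zh*") amounts to treating each entry as a shell glob. A minimal sketch, assuming a hypothetical helper named locale_wanted and the USE_LANG variable from the earlier patches:

```shell
# Hypothetical sketch; each USE_LANG entry is treated as a glob pattern.
USE_LANG="zh_CN.GBK zh_CN.UTF-8"

# locale_wanted NAME: succeed if NAME matches any pattern in USE_LANG.
locale_wanted() (
    set -f                       # keep entries such as zh* unexpanded
    for pattern in ${USE_LANG}; do
        case "$1" in
            $pattern) exit 0 ;;  # unquoted on purpose: glob match
        esac
    done
    exit 1
)

# Usage:
locale_wanted zh_CN.GBK  && echo "keep zh_CN.GBK"
locale_wanted zh_TW.Big5 || echo "remove zh_TW.Big5"
```

Because the whole name is matched against the pattern, this also copes with language codes longer than two letters, which a fixed two-character comparison would not.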
At comment 54 someone mentioned checking only 2 characters for language codes? I know this might be a bit pedantic, but I think that for some rare languages there are only 3 letter language codes available, which might be a problem.

*** Bug 56586 has been marked as a duplicate of this bug. ***

Created attachment 42448 [details, diff]
ebuild-linguas.patch
A bit shorter, and uses the LINGUAS variable. It takes a slightly different approach that makes it work with xx as well as xx_XX linguas.

In reply to comment 76: This should also work with xxx or longer linguas. The directories in my /usr/share/locale this applies to are only chef, fudd, piglatin and valley, though; nothing to be taken seriously.

OK, to get things done, i'll try to summarize everything: This bug proposes some features related to non-English languages. The base of all those features is a variable called LINGUAS. This variable should contain all languages you want to install on your system, the ones with higher priority first, e.g. LINGUAS="de_DE.UTF8 cz en_GB@Euro", and will be defined in /etc/rc.conf. The LINGUAS variable will follow ISO 639 and ISO 3166. These are the features that will be introduced with the LINGUAS variable:
1.) For working with LINGUAS in ebuilds, portage will get two new functions:
1.1) lang (e.g. `lang en` will return true)
1.2) primary_lang (e.g. `primary_lang de` will return true)
2.) a function to remove unused man pages, activated by FEATURES="notallmanpages"
3.) a function to remove unused locales, activated by FEATURES="notalllocales"
4.) based on the first entry in LINGUAS, a system locale should be set in /etc/bash/bashrc.
The most important point in my opinion is 1. It's easy to implement (see attached patches for ebuild, they only need to be changed a bit) and will break nothing. For 2 and 3 we still have to decide how to match the locales, see comments #78, #54.
4 is a bit complicated because portage will break on some packages as long as it does not export LANG=C, and some ppl are not happy with doing that, see bug #57973.

I don't like 2) and 3) as FEATURES, but they can probably be implemented in a different way.

(2) and (3) are already handled by app-admin/localepurge imho ... no point in bloating portage further with more FEATURES

1 could also be implemented using an eclass, i don't know which way is better

i already implemented a 'strip-linguas' function in eutils.eclass a while ago because i kept finding packages that would fail to build when LINGUAS contained entries that the package didn't support ... so that could possibly be used as a start for an eclass

might i suggest the following for portage though ... sometimes packages can have SRC_URI/DEPEND/etc based upon LINGUAS/LANG ... can we have implied USE flags based upon the value of LANG ? or perhaps something like what we have right now with IUSE_VIDEO_CARDS ... for example: LANG="en de" then ebuilds could use lang_en? ( ) lang_de? ( )

that's already in portage, you can use "use linguas_xx"

well then, i'm an idiot

ok, in response to (4) then, why can't we just make users set it themselves ? why does portage have to get involved ?

> (2) and (3) are already handled by app-admin/localepurge imho ... no point in bloating portage further with more FEATURES
That makes qpkg -c / equery check practically unusable.
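To illustrate the build failures mentioned above when LINGUAS contains entries a package does not support, here is a minimal sketch of the filtering idea. It is not the 'strip-linguas' function from eutils.eclass; filter_linguas is a hypothetical stand-in that only shows the intersection step.

```shell
# Hypothetical sketch; not the eutils.eclass implementation.
LINGUAS="de cs en_GB"

# filter_linguas SUPPORTED...: print the LINGUAS entries that the package
# supports, preserving the user's order.
filter_linguas() (
    set -f
    out=""
    for l in ${LINGUAS}; do
        for s in "$@"; do
            if [ "$l" = "$s" ]; then
                out="${out:+$out }$l"
                break
            fi
        done
    done
    printf '%s\n' "$out"
)

# Usage: a package that ships only de, fr and en_GB translations.
LINGUAS=$(filter_linguas de fr en_GB)
echo "LINGUAS is now: ${LINGUAS}"
```

Keeping the user's order matters if the first entry is later treated as the primary language, as in the summary above.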
So, anything left here? Looks to be a nonissue now... |