Bug 57973 - Portage should overriding LANG vars
Summary: Portage should overriding LANG vars
Status: RESOLVED WONTFIX
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Conceptual/Abstract Ideas
Hardware: All Linux
Importance: High major
Assignee: Portage team
URL:
Whiteboard:
Keywords:
Duplicates: 9901 41947 133758
Depends on:
Blocks: 9988
Reported: 2004-07-22 09:42 UTC by Martin Probst
Modified: 2006-05-19 11:36 UTC (History)

See Also:
Package list:
Runtime testing required: ---


Description Martin Probst 2004-07-22 09:42:54 UTC
I switched my whole system to Unicode a while ago. Everything works fine except for this bug, which I have run into several times:
When emerging a program, e.g. subversion, the build fails if the source files contain non-ASCII characters that are not encoded as UTF-8 (e.g. ISO-8859-1 text with special characters such as German umlauts).
I thought the quick workaround would be:
LANG=de_DE emerge foo
but this doesn't work. Portage seems to source /etc/profile or something similar. At the moment this means either switching the locale (edit /etc/env.d/03locale, run env-update, emerge foo, then switch back) or manually converting the source files. Not good.
I also experienced a similar problem when setting the LANGUAGE variable in there: it breaks the OpenOffice ebuild, even if I set LANGUAGE to something like GER in /etc/make.conf.

Just to demonstrate:
perseus subversion-1.0.6 # locale
LANG=de_DE.UTF-8
LC_CTYPE="de_DE.UTF-8"
LC_NUMERIC="de_DE.UTF-8"
LC_TIME="de_DE.UTF-8"
LC_COLLATE=C
LC_MONETARY="de_DE.UTF-8"
LC_MESSAGES="de_DE.UTF-8"
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=

perseus subversion-1.0.6 # LANG=de_DE emerge subversion
Calculating dependencies ...done!
>>> emerge (1 of 1) dev-util/subversion-1.0.6 to /
>>> md5 src_uri ;-) subversion-1.0.6.tar.bz2

make[1]: Entering directory `/var/tmp/portage/subversion-1.0.6/work/subversion-1.0.6/subversion/bindings/java/javahl/src'
CLASSPATH=../cls:./../cls:$CLASSPATH /opt/sun-jdk-1.5.0_beta2/bin/javac -d ../cls -g  org/tigris/subversion/javahl/BlameCallback.java org/tigris/subversion/javahl/ClientException.java org/tigris/subversion/javahl/DirEntry.java org/tigris/subversion/javahl/JNIError.java org/tigris/subversion/javahl/LogMessage.java org/tigris/subversion/javahl/NodeKind.java org/tigris/subversion/javahl/Notify.java org/tigris/subversion/javahl/PromptUserPassword.java org/tigris/subversion/javahl/PromptUserPassword2.java org/tigris/subversion/javahl/PromptUserPassword3.java org/tigris/subversion/javahl/PropertyData.java org/tigris/subversion/javahl/Revision.java org/tigris/subversion/javahl/SVNClient.java org/tigris/subversion/javahl/SVNClientInterface.java org/tigris/subversion/javahl/SVNClientSynchronized.java org/tigris/subversion/javahl/Status.java
org/tigris/subversion/javahl/DirEntry.java:28: unmappable character for encoding UTF8
 * @author C�dric Chabanois
            ^
org/tigris/subversion/javahl/Status.java:25: unmappable character for encoding UTF8
 * @author C�dric Chabanois
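
The javac error above can be reproduced outside the build. The following is a minimal sketch in Python, not part of the original report: the helper name `decodes_as` and the sample line are illustrative assumptions. It shows that a byte sequence which is valid ISO-8859-1 is not necessarily valid UTF-8.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is valid text in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample line: 0xE9 is "e acute" in ISO-8859-1.
line = b" * @author C\xe9dric Chabanois"

# ISO-8859-1 assigns a character to every byte value, so this succeeds.
print(decodes_as(line, "iso-8859-1"))

# In UTF-8, 0xE9 starts a multi-byte sequence, but the next byte ("d")
# is not a continuation byte, so strict decoding rejects the input.
print(decodes_as(line, "utf-8"))
```

A compiler that reads its input as UTF-8 because of the locale is in the same position as the second call.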


Reproducible: Always
Steps to Reproduce:
1. Set up your system to use Unicode (LANG=de_DE.UTF-8)
2. emerge a package containing sources with ISO-8859-1 contents
3. watch the errors...

Actual Results:  
Build is failing

Expected Results:  
Build should work, at least when doing LANG=de_DE emerge foo

Portage 2.0.50-r9 (default-x86-1.4, gcc-3.3.3, glibc-2.3.3.20040420-r0, 2.6.7-gentoo-r9)
=================================================================
System uname: 2.6.7-gentoo-r9 i686 Intel(R) Pentium(R) 4 CPU 2.60GHz
Gentoo Base System version 1.4.16
distcc 2.13 i686-pc-linux-gnu (protocols 1 and 2) (default port 3632) [enabled]
Autoconf: sys-devel/autoconf-2.59-r3
Automake: sys-devel/automake-1.8.3
ACCEPT_KEYWORDS="x86"
AUTOCLEAN="yes"
CFLAGS="-march=pentium4 -O3 -funroll-loops -fomit-frame-pointer -pipe"
CHOST="i686-pc-linux-gnu"
COMPILER="gcc3"
CONFIG_PROTECT="/etc /usr/X11R6/lib/X11/xkb /usr/kde/2/share/config /usr/kde/3.1/share/config /usr/kde/3.2/share/config /usr/kde/3/share/config /usr/lib/mozilla/defaults/pref /usr/share/config /usr/share/texmf/dvipdfm/config/ /usr/share/texmf/dvips/config/ /usr/share/texmf/tex/generic/config/ /usr/share/texmf/tex/platex/config/ /usr/share/texmf/xdvi/ /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=pentium4 -O3 -funroll-loops -fomit-frame-pointer -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoaddcvs ccache distcc"
GENTOO_MIRRORS="ftp://sunsite.informatik.rwth-aachen.de/pub/Linux/gentoo ftp://ftp.tu-clausthal.de/pub/linux/gentoo/ http://linux.rz.ruhr-uni-bochum.de/download/gentoo-mirror/ ftp://linux.rz.ruhr-uni-bochum.de/gentoo-mirror/ http://mirrors.sec.informatik.tu-darmstadt.de/gentoo http://ftp.uni-erlangen.de/pub/mirrors/gentoo ftp://ftp.uni-erlangen.de/pub/mirrors/gentoo"
MAKEOPTS="-j3"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/usr/local/portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="X acpi alsa apache2 apm arts avi berkdb cdr crypt cups doc dvd encode esd foomaticdb gdbm gif gnome gpm gstreamer gtk gtk2 gtkhtml guile imlib java jpeg libg++ libwww linguas_de lufsusermount mad mikmod mmx motif mozilla mozilla-firebird mpeg ncurses nls nptl oggvorbis opengl oss pam pdf pdflib perl php png python qt quicktime readline samba sdl slang spell sse ssl svga tcltk tcpd tetex threads tiff truetype unicode usb vim vim-with-x x86 xml2 xmms xv zlib"
Comment 1 Heinrich Wendel (RETIRED) gentoo-dev 2004-09-21 06:34:07 UTC
I think Portage should set LANG=C by default.
Comment 2 Heinrich Wendel (RETIRED) gentoo-dev 2004-09-21 06:37:17 UTC
*** Bug 41947 has been marked as a duplicate of this bug. ***
Comment 3 Heinrich Wendel (RETIRED) gentoo-dev 2004-09-21 07:07:52 UTC
*** Bug 9901 has been marked as a duplicate of this bug. ***
Comment 4 Mamoru KOMACHI (RETIRED) gentoo-dev 2004-09-21 08:10:03 UTC
If so, there should be a way to turn on NLS support. I set LANG=C personally,
but some may want messages in ja_JP.UTF-8. I think packages that break depending
on LANG should be fixed on the package/ebuild side (not on the Portage side).
Comment 5 Martin Probst 2004-09-21 09:00:14 UTC
The problem is with the source code. Lots of developers (especially French ones) write their whole source in plain ASCII except for the "@author" tags (or the equivalent).

Because most of them seem to use ISO-8859-15, a UTF-8 system (and UTF-8 Java) bails out because of an illegal character in the input (which is generally the correct behaviour). This makes it practically impossible to maintain a system with a UTF-8 locale, as _lots_ of packages break, especially the Java ones.

What you're proposing would mean lots of patches for all these Java packages, either to convert them to UTF-8 or just to remove the offending lines. That would be a lot of work compared to a simple LANG=C.
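
For reference, the "convert to UTF-8" option mentioned above amounts to a decode and re-encode per file. This is a rough sketch only: the function name `to_utf8` and the sample bytes are hypothetical, and the source encoding is assumed to be known in advance (ISO-8859-15 here, as guessed in this comment).

```python
def to_utf8(data: bytes, source_encoding: str = "iso-8859-15") -> bytes:
    """Re-encode bytes from source_encoding to UTF-8.

    Assumption: the caller knows the real source encoding. Running this
    on data that is already UTF-8 would silently double-encode it.
    """
    return data.decode(source_encoding).encode("utf-8")


# Hypothetical sample line with a single non-ASCII byte (0xE9).
original = b" * @author C\xe9dric Chabanois\n"
converted = to_utf8(original)

# The converted bytes are valid UTF-8; pure ASCII input is unchanged.
converted.decode("utf-8")
print(len(original), len(converted))
```

Even with such a helper, every affected package would still need a patch or a conversion step in its ebuild, which is the maintenance cost this comment is pointing at.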

A compromise might be to set these variables just (and only) before calling make/ant, or around all steps that actually touch the source files.
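
The compromise above could look roughly like the following sketch. It is not Portage code: `run_with_locale` is a hypothetical helper, and a small Python child process stands in for make/ant. The idea is that the override is applied to the child's environment only, so the caller's own locale and messages are untouched.

```python
import os
import subprocess
import sys


def run_with_locale(cmd, locale="C"):
    """Run cmd with LANG and LC_ALL overridden for the child process only."""
    env = dict(os.environ)      # copy, so the parent's environment is untouched
    env["LANG"] = locale
    env["LC_ALL"] = locale      # LC_ALL takes precedence over LANG and LC_*
    return subprocess.run(cmd, env=env, check=True,
                          capture_output=True, text=True)


# Stand-in for the build step: a child that reports the LANG it sees.
child = [sys.executable, "-c", "import os; print(os.environ['LANG'])"]
parent_lang_before = os.environ.get("LANG")
result = run_with_locale(child)
print(result.stdout.strip())
```

Setting LC_ALL as well as LANG is a design choice of this sketch; with only LANG changed, any LC_* variables already exported in the environment would still win.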
Comment 6 Mamoru KOMACHI (RETIRED) gentoo-dev 2004-09-21 09:28:57 UTC
If it's a problem with the source code, please take it upstream (that's our policy).
What I was saying is: if setting LANG to C is enough, it should be done not in Portage
but in an ebuild. Most of our ebuilds are not broken wrt LANG, so why should we force
users to see English messages instead of localised ones? If Portage were just a black box
that installs packages, that would be fine, but I think Portage is also there to help people build packages.
Some users would benefit from localised error messages and could figure out
what the problem is, so I object if Portage forces LANG=C and takes away the
opportunity to set it to one's native language.
Comment 7 Andres Järv 2004-12-22 06:44:52 UTC
So basically ISO character sets and program developers are to blame?
Comment 8 SpanKY gentoo-dev 2004-12-22 06:59:26 UTC
developers who assume the regex [a-zA-Z] matches all alpha characters are to blame
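
A small illustration of the point about [a-zA-Z], as a sketch only. Python's `re` module does not apply locale collation to character ranges, so this shows just the simpler half of the problem: the range covers the 52 ASCII letters and nothing else. Locale-aware tools can additionally interpret such ranges differently depending on the active locale, which is not demonstrated here. The sample words are illustrative.

```python
import re

# Matches strings made only of the 52 ASCII letters.
ascii_letters = re.compile(r"[a-zA-Z]+")

# Escapes are used so the example does not depend on the file's encoding:
# \u00e9 is "e acute", \u00e4 is "a umlaut".
words = ["Umlaute", "C\u00e9dric", "J\u00e4rv"]

for word in words:
    # str.isalpha() knows about non-ASCII letters; the regex range does not.
    print(word.isalpha(), bool(ascii_letters.fullmatch(word)))
```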
Comment 9 SpanKY gentoo-dev 2005-02-04 06:35:20 UTC
packages should build sanely regardless of LANG
Comment 10 Zac Medico gentoo-dev 2006-05-19 11:36:56 UTC
*** Bug 133758 has been marked as a duplicate of this bug. ***