Emerging the same package under an ISO-8859 locale and under a UTF-8 locale does not produce identical files. For some packages, the installed files contain hardcoded words and letters that depend on the environment. And, stupidly, emerge complains about it!

Example:

LANG=en_GB.ISO-8859-15 LC_ALL=en_GB.ISO-8859-15 emerge -v1 sci-electronics/geda

will produce this message:

/usr/share/mime/packages/libgeda.xml:6: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xE9 0x6D 0x61 0x20
<comment xml:lang="fr">Schéma de circuit gEDA</comment>
                          ^
Failed to parse '//usr/share/mime/packages/libgeda.xml'

See bug 326205 for specific details of this case.

Generally speaking, I think portage is just too sensitive to the environment. There have been many examples in the past; bug 253467 was one of them (running "A=foo emerge whatever" led to an unpack failure). Another one is still open: bug 95259 (use of \t in einfo and ewarn produces inappropriate output in many cases).

The way portage manipulates variables needs a deep review. This is at least the fifth problem of this kind I have run into, and many people have joined my bugs. I no longer think the problem is just that portage should sanitize one or two environment variables; I tend to think that portage should purge the environment completely before starting work, in order to take full control of *all* variables.
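As an illustration of why the parser rejects the file, here is a minimal Python sketch that decodes the four bytes quoted in the error message under both encodings. The helper name is_valid_utf8 is made up for this example; it is not part of portage or of the MIME database tooling.

```python
# The four bytes quoted in the parser error.
raw = bytes([0xE9, 0x6D, 0x61, 0x20])


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# In ISO-8859-15 every byte maps to a character, so this is the tail of
# "Schéma " as written under the en_GB.ISO-8859-15 locale.
print(repr(raw.decode("iso-8859-15")))

# In UTF-8, 0xE9 opens a three-byte sequence, but 0x6D is not a
# continuation byte, so the same bytes are rejected.
print(is_valid_utf8(raw))

# The same word encoded as UTF-8 is accepted.
print(is_valid_utf8("Schéma".encode("utf-8")))
```

The file content itself is legal ISO-8859-15 text; the failure comes from a consumer that assumes UTF-8 because the XML declares no other encoding.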
I reopened bug 326205; that's a bug in geda. Ideally, all packages should build with any valid locale setting. As for this bug, completely clearing the environment is generally not the correct thing to do, but I'm re-assigning anyway in case this is easy to implement as a configurable option.
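To make the idea of a configurable environment purge concrete, here is a rough Python sketch of an allowlist-based scrub. This is not how portage handles its environment; the names KEEP_EXACT, KEEP_PREFIXES, scrubbed_env and run_clean, and the choice of variables to keep, are assumptions made purely for illustration.

```python
import os
import subprocess
import sys

# Hypothetical allowlist: a few basics plus the locale variables.
KEEP_EXACT = {"PATH", "HOME", "TERM", "LANG", "LANGUAGE"}
KEEP_PREFIXES = ("LC_",)


def scrubbed_env(env):
    """Return a copy of env that contains only allowlisted variables."""
    return {
        name: value
        for name, value in env.items()
        if name in KEEP_EXACT or name.startswith(KEEP_PREFIXES)
    }


def run_clean(cmd, env=None):
    """Run cmd with a scrubbed copy of env (default: the current environment)."""
    source = os.environ if env is None else env
    return subprocess.run(cmd, env=scrubbed_env(source), check=True)


if __name__ == "__main__":
    # Simulate the "A=foo emerge whatever" case: the stray variable A is
    # present in the caller's environment but never reaches the child.
    dirty = dict(os.environ, A="foo")
    run_clean(
        [sys.executable, "-c", "import os; print(os.environ.get('A'))"],
        env=dirty,
    )
```

An allowlist, rather than a blocklist of known-bad names, means that a variable nobody anticipated is dropped by default. Whether the locale variables belong on the allowlist is the policy question, and this sketch keeps them only as one possible choice.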
Same issue here:

* Updating shared mime info database ...
/usr/share/mime/packages/libgeda.xml:6: parser error : Input is not proper UTF-8, indicate encoding !
Bytes: 0xE9 0x6D 0x61 0x20
<comment xml:lang="fr">Schéma de circuit gEDA</comment>
                          ^
Failed to parse '//usr/share/mime/packages/libgeda.xml'

grep LINGUAS /etc/make.conf
LINGUAS="en fr en_GB fr_FR"

BTW, the accented character shows as a back-to-front N in the xterm window.
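For anyone who wants to check other installed files for the same problem, a small Python sketch along these lines reports which lines of a byte string are not valid UTF-8. The function name non_utf8_lines and the two-line sample are invented for the example; the sample is not the real content of libgeda.xml.

```python
def non_utf8_lines(data: bytes):
    """Yield (line_number, raw_line) for each line that is not valid UTF-8."""
    for number, line in enumerate(data.splitlines(), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            yield number, line


# Made-up sample: line 2 contains the ISO-8859-15 byte 0xE9 for "é".
sample = (
    b'<?xml version="1.0"?>\n'
    b'<comment xml:lang="fr">Sch\xe9ma de circuit gEDA</comment>\n'
)

for number, line in non_utf8_lines(sample):
    print(number, line)
```

To scan a real file, the bytes could be read with open(path, "rb").read() and passed to the same function.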
(In reply to comment #2)
> * Updating shared mime info database ...
> /usr/share/mime/packages/libgeda.xml:6: parser error : Input is not proper
> UTF-8, indicate encoding !

You'd probably be more interested in bug 326205 (which refers to sci-electronics/geda), since the bug we're posting on here (bug 326887) is a more general complaint about sys-apps/portage.
I think we've committed to not clearing the user's locale preferences. See also the discussions on the gentoo-dev mailing list regarding LC_MESSAGES.