recode-3.6 can't handle UTF-8 files. If you try to recode a file (for example an XML document from gentoo.org) to latin1, it fails on the first non-ASCII UTF-8 character with the error:
recode: Invalid input in step `UTF-8..ISO-8859-1'
Steps to reproduce:
recode utf-8..latin1 < gentoo-x86-install.xml (from the German doc tree)
DarkSpecter also has this problem. On x86 this works without problems.
I looked into the recode sources (especially utf8.c) and suspect that the code which copies the UTF-8 characters assumes little-endian byte order. Can somebody with good C knowledge look into that file?
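To illustrate the suspicion above, here is a minimal Python sketch of why byte order could matter. It is only an illustration of the hypothesis, not a description of what utf8.c actually does; the helper name decode_two_byte_utf8 is made up for this example. Decoding byte by byte gives the same result everywhere, while reading the same two bytes as one 16-bit word gives different values on little-endian and big-endian machines.

```python
import struct


def decode_two_byte_utf8(b0: int, b1: int) -> int:
    """Decode a two-byte UTF-8 sequence one byte at a time.

    Hypothetical helper for illustration only. Because each byte is
    handled individually, the result does not depend on the byte order
    of the machine.
    """
    is_lead = (b0 & 0xE0) == 0xC0 and b0 >= 0xC2  # 0xC0/0xC1 would be overlong
    is_continuation = (b1 & 0xC0) == 0x80
    if not (is_lead and is_continuation):
        raise ValueError("not a valid two-byte UTF-8 sequence")
    return ((b0 & 0x1F) << 6) | (b1 & 0x3F)


# U+00FC (u with diaeresis) is encoded in UTF-8 as the bytes C3 BC.
raw = "\u00fc".encode("utf-8")

# Byte-wise decoding: same code point on every architecture.
codepoint = decode_two_byte_utf8(raw[0], raw[1])

# Assumed failure mode: the same two bytes read as a single 16-bit word.
as_little_endian = struct.unpack("<H", raw)[0]  # value a little-endian CPU sees
as_big_endian = struct.unpack(">H", raw)[0]     # value a big-endian CPU sees

print(hex(codepoint), hex(as_little_endian), hex(as_big_endian))
```

If code anywhere in the conversion path relies on the word-sized view, it would behave differently on x86 and on a big-endian machine, which would fit the symptoms. Whether that is really the cause has to be checked in the C sources.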
*** Bug 20139 has been marked as a duplicate of this bug. ***
Bug 20027 seems to have the solution to this problem. Attachment 11212 is a patch pulled from Debian (http://packages.debian.org/stable/text/recode.html), and attachment 11211 is the ebuild which makes use of it. With this patch, recode no longer fails on the example which Lars originally wrote about. However, I don't know if the output is _correct_. :) Anyone?
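One way to check whether the output is correct would be to compare it against an independent conversion. The following is a rough Python sketch under that assumption; the function names are made up for this example, and it treats Python's own UTF-8 and Latin-1 codecs as the reference.

```python
def expected_latin1(utf8_bytes: bytes) -> bytes:
    """Convert UTF-8 bytes to Latin-1 using Python's codecs as the reference.

    Raises UnicodeEncodeError if the text contains a character that
    does not exist in Latin-1.
    """
    return utf8_bytes.decode("utf-8").encode("latin-1")


def recode_output_is_correct(utf8_bytes: bytes, recoded_bytes: bytes) -> bool:
    """Return True if recode's output matches the reference conversion."""
    return expected_latin1(utf8_bytes) == recoded_bytes


# Small self-check with a German sample string.
sample = "Gr\u00fc\u00dfe".encode("utf-8")
print(recode_output_is_correct(sample, b"Gr\xfc\xdfe"))
```

To use it on the real case, read gentoo-x86-install.xml and the file produced by recode in binary mode (open(path, "rb")) and pass both byte strings to recode_output_is_correct.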
Thanks for pointing to this patch. It really does resolve the UTF-8 problem :-) So I committed a new recode-3.6-r1 to portage and masked it for ppc.