Bug 410825 - repositories.xml - wrongly encoded accented characters in maintainer names
Status: RESOLVED FIXED
Alias: None
Product: Portage Development
Classification: Unclassified
Component: Tools
Hardware: All Linux
Importance: Normal trivial
Assignee: Brian Dolbec
Reported: 2012-04-04 22:20 UTC by Reinis Danne
Modified: 2012-04-16 23:05 UTC
CC List: 4 users



Description Reinis Danne 2012-04-04 22:20:24 UTC
It looks like the maintainer's name is not in UTF-8, since it is displayed like this:
Tomáš Chvátal


Reproducible: Always

Steps to Reproduce:
1. layman -i gamerlay
2. or vim /var/lib/layman/installed.xml

Actual Results:  
 * Contact : Tomáš Chvátal <scarabeus@gentoo.org>


Expected Results:  
Proper accented characters in the name.


Portage 2.1.10.55 (default/linux/amd64/10.0/desktop/gnome, gcc-4.6.2, glibc-2.14.1-r2, 3.2.12-gentoo x86_64)
=================================================================
                        System Settings
=================================================================
System uname: Linux-3.2.12-gentoo-x86_64-Intel-R-_Core-TM-_i7-2630QM_CPU_@_2.00GHz-with-gentoo-2.1
Timestamp of tree: Wed, 04 Apr 2012 20:45:01 +0000
app-shells/bash:          4.2_p24
dev-java/java-config:     2.1.11-r3
dev-lang/python:          2.6.7-r2, 2.7.2-r3, 3.1.4-r4, 3.2.2-r1
dev-util/cmake:           2.8.7-r5
dev-util/pkgconfig:       0.26
sys-apps/baselayout:      2.1
sys-apps/openrc:          0.9.9.3
sys-apps/sandbox:         2.5
sys-devel/autoconf:       2.13, 2.68
sys-devel/automake:       1.9.6-r3, 1.10.3, 1.11.4
sys-devel/binutils:       2.22-r1
sys-devel/gcc:            4.4.7, 4.5.3-r2, 4.6.2
sys-devel/gcc-config:     1.6
sys-devel/libtool:        2.4.2
sys-devel/make:           3.82-r3
sys-kernel/linux-headers: 3.3 (virtual/os-headers)
sys-libs/glibc:           2.14.1-r2
Repositories: gentoo x11 science gamerlay-stable bumblebee local
ACCEPT_KEYWORDS="amd64 ~amd64"
ACCEPT_LICENSE="* -@EULA"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=native -mtune=native -O3 -pipe -ggdb"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/share/gnupg/qualified.txt /var/lib/hsqldb"
CONFIG_PROTECT_MASK="/etc/ca-certificates.conf /etc/dconf /etc/env.d /etc/env.d/java/ /etc/fonts/fonts.conf /etc/gconf /etc/gentoo-release /etc/revdep-rebuild /etc/sandbox.d /etc/splash /etc/terminfo /etc/texmf/language.dat.d /etc/texmf/language.def.d /etc/texmf/updmap.d /etc/texmf/web2c"
CXXFLAGS="-march=native -mtune=native -O3 -pipe -ggdb"
DISTDIR="/usr/portage/distfiles"
EMERGE_DEFAULT_OPTS=""
FEATURES="assume-digests binpkg-logs compress-build-logs distlocks ebuild-locks fixlafiles news parallel-fetch parallel-install protect-owned sandbox sfperms splitdebug strict unknown-features-warn unmerge-logs unmerge-orphans userfetch userpriv usersandbox usersync"
FFLAGS="-march=native -mtune=native -O3 -pipe -ggdb"
GENTOO_MIRRORS="ftp://trumpetti.atm.tut.fi/gentoo/ http://trumpetti.atm.tut.fi/gentoo/ http://gentoo.tups.lv/source/ "
LANG="lv_LV.UTF-8"
LC_ALL="lv_LV.UTF-8"
LDFLAGS="-Wl,-O1 -Wl,--as-needed"
LINGUAS="lv en"
MAKEOPTS="-j9"
PKGDIR="/usr/portage/packages"
PORTAGE_CONFIGROOT="/"
PORTAGE_RSYNC_OPTS="--recursive --links --safe-links --perms --times --compress --force --whole-file --delete --stats --human-readable --timeout=180 --exclude=/distfiles --exclude=/local --exclude=/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
PORTDIR_OVERLAY="/var/lib/layman/x11 /var/lib/layman/science /var/lib/layman/gamerlay /var/lib/layman/bumblebee /usr/local/portage"
SYNC="rsync://rsync.europe.gentoo.org/gentoo-portage"
USE="X a52 aac acl acpi alsa amd64 avx bash-completion berkdb bluetooth branding bzip2 cairo cdda cdio cdr cjk cleartype cli colord consolekit cracklib crypt cups cxx dbus dirac djvu dri dts dvd dvdr eds emboss encode evo exif fam ffmpeg fftw firefox flac fontconfig fortran gdbm gdu gif gnome gnome-keyring gnome-online-accounts gphoto2 gpm gsm gstreamer gtk gtk3 iconv idn ipv6 jpeg kate lcms ldap libcaca libnotify live mad matroska mmx mng modules mp3 mp4 mpeg mtp mudflap multilib musepack nautilus ncurses networkmanager nls nptl nptlonly ogg openexr opengl openmp pam pango pcre pdf png policykit ppds pppd pulseaudio qt3support qt4 raw readline schroedinger sdl session smp socialweb speex spell sse sse2 sse4_1 ssl ssse3 startup-notification svg sysfs system-sqlite tcpd theora tiff truetype udev unicode usb v4l v4l2 vaapi vorbis vpx wmf x264 xcb xetex xml xmp xorg xpm xulrunner xv xvid xvmc zlib" ALSA_CARDS="ali5451 als4000 atiixp atiixp-modem bt87x ca0106 cmipci emu10k1x ens1370 ens1371 es1938 es1968 fm801 hda-intel intel8x0 intel8x0m maestro3 trident usb-audio via82xx via82xx-modem ymfpci" ALSA_PCM_PLUGINS="adpcm alaw asym copy dmix dshare dsnoop empty extplug file hooks iec958 ioplug ladspa lfloat linear meter mmap_emul mulaw multi null plug rate route share shm softvol" APACHE2_MODULES="actions alias auth_basic authn_alias authn_anon authn_dbm authn_default authn_file authz_dbm authz_default authz_groupfile authz_host authz_owner authz_user autoindex cache cgi cgid dav dav_fs dav_lock deflate dir disk_cache env expires ext_filter file_cache filter headers include info log_config logio mem_cache mime mime_magic negotiation rewrite setenvif speling status unique_id userdir usertrack vhost_alias" CALLIGRA_FEATURES="kexi words flow plan sheets stage tables krita karbon braindump" CAMERAS="ptp2" COLLECTD_PLUGINS="df interface irq load memory rrdtool swap syslog" ELIBC="glibc" FOO2ZJS_DEVICES="hp1018" GPSD_PROTOCOLS="ashtech aivdm earthmate evermore fv18 garmin 
garmintxt gpsclock itrax mtk3301 nmea ntrip navcom oceanserver oldstyle oncore rtcm104v2 rtcm104v3 sirf superstar2 timing tsip tripmate tnt ubx" INPUT_DEVICES="evdev synaptics" KERNEL="linux" LCD_DEVICES="bayrad cfontz cfontz633 glk hd44780 lb216 lcdm001 mtxorb ncurses text" LINGUAS="lv en" PHP_TARGETS="php5-3" RUBY_TARGETS="ruby18" USERLAND="GNU" VIDEO_CARDS="dummy fbdev nvidia i965 intel vesa" XTABLES_ADDONS="quota2 psd pknock lscan length2 ipv4options ipset ipp2p iface geoip fuzzy condition tee tarpit sysrq steal rawnat logmark ipmark dhcpmac delude chaos account"
Unset:  CPPFLAGS, CTARGET, INSTALL_MASK, PORTAGE_BUNZIP2_COMMAND, PORTAGE_COMPRESS, PORTAGE_COMPRESS_FLAGS, PORTAGE_RSYNC_EXTRA_OPTS, USE_PYTHON

=================================================================
                        Package Settings
=================================================================

app-portage/layman-2.0.0_rc3 was built with the following:
USE="git (multilib) -bazaar -cvs -darcs -mercurial -subversion -test"
CFLAGS="-march=native -O3 -pipe -ggdb"
CXXFLAGS="-march=native -O3 -pipe -ggdb"
Comment 1 Jeroen Roovers (RETIRED) gentoo-dev 2012-04-04 23:59:56 UTC
That's a problem in layman, I guess, not in an overlay or its description.
Comment 2 Jeroen Roovers (RETIRED) gentoo-dev 2012-04-05 00:05:02 UTC
In fact, in repositories.xml the name is given as perfectly valid HTML entities.
Comment 3 Reinis Danne 2012-04-05 00:48:31 UTC
I'm not sure what you mean by valid HTML entities, but when I open http://www.gentoo.org/proj/en/overlays/repositories.xml in Firefox, or fetch it with wget and look at the file in vim, I see this:
<name>Tomáš Chvátal</name>
and this
<name><![CDATA[Tomáš Chvátal]]></name>

At the same time, for other repositories I see proper accented characters (I haven't checked many of them), e.g. for aidecoe:
<name>Amadeusz Żołnowski</name>
and this
<name><![CDATA[Amadeusz Żołnowski]]></name>

It might be valid HTML, but the first case doesn't look like a properly encoded UTF-8 string to me. So I think the text was simply provided in some non-UTF-8 encoding for that particular overlay.

For reference, AFAIK the correct string would be:
Tomáš Chvátal
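
Editorial illustration of the symptom described in this comment: a minimal Python sketch, assuming the garbling came from UTF-8 bytes being read as Latin-1 (the thread does not confirm which wrong encoding was involved). The variable names are made up for the example.

```python
# The name as it should read, written with escapes so the example does not
# depend on the encoding of this page: "Tom\u00e1\u0161 Chv\u00e1tal".
correct = "Tom\u00e1\u0161 Chv\u00e1tal"

# Assumption: the UTF-8 bytes were misread as Latin-1, which turns every
# byte into one character.
garbled = correct.encode("utf-8").decode("latin-1")
print(garbled)

# The reverse round trip recovers the name, provided every character of the
# garbled string fits in Latin-1.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)
```

If the assumption holds, the garbled form has one character per UTF-8 byte (16 instead of 13 here), which is a quick way to spot it.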
Comment 4 Brian Dolbec (RETIRED) gentoo-dev 2012-04-16 01:44:29 UTC
I've corrected the gamerlay overlay, but searching for "&" I've found several more with improperly encoded strings.

fixed: gamerlay, flameeyes-overlay, fordfrog

Still require fixing:
dMaggot, wdzierzan, nektoo
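
Editorial illustration: the search for "&" described above could be narrowed roughly as follows. This is only a sketch, assuming the bad names were stored as runs of decimal character references whose values are really UTF-8 bytes; the exact form in repositories.xml is not shown in this thread, and `suspect_names` and the sample text are hypothetical.

```python
import re

# A run of two or more decimal character references, e.g. "&#195;&#161;".
REF_RUN = re.compile(r"(?:&#\d+;){2,}")
# Work on the raw text: an XML parser would resolve the references first.
NAME = re.compile(r"<name>(.*?)</name>", re.DOTALL)


def suspect_names(xml_text):
    """Return <name> contents with reference runs that decode as UTF-8 bytes."""
    hits = []
    for name in NAME.findall(xml_text):
        for run in REF_RUN.findall(name):
            codes = [int(n) for n in re.findall(r"&#(\d+);", run)]
            # Only values that fit in a byte, with at least one non-ASCII one.
            if max(codes) > 255 or max(codes) < 0x80:
                continue
            try:
                bytes(codes).decode("utf-8")
            except UnicodeDecodeError:
                continue
            hits.append(name)
            break
    return hits


# Hypothetical sample: the first name is suspect, the second uses ordinary
# single references for real code points.
sample = (
    "<name>Tom&#195;&#161;&#197;&#161; Chv&#195;&#161;tal</name>"
    "<name>Amadeusz &#379;o&#322;nowski</name>"
)
print(suspect_names(sample))
```

A single reference such as `&#379;` is left alone, since one reference on its own is a normal way to write a non-ASCII character.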
Comment 5 Brian Dolbec (RETIRED) gentoo-dev 2012-04-16 02:17:13 UTC
cc'ing overlay maintainers that have incorrectly accented names entered in repositories.xml.

David, Wojciech  can you please supply us with your properly accented names encoded in utf-8 so we can correct the entries in layman's overlay listings.
Comment 6 David E. Narváez 2012-04-16 05:03:28 UTC
(In reply to comment #5)
> David, Wojciech  can you please supply us with your properly accented names
> encoded in utf-8 so we can correct the entries in layman's overlay listings.

There you go, that's U+00E1, UTF-8 c3a1.

David E. Narváez
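
Editorial illustration: the code point and byte sequence quoted in this comment can be checked with a couple of lines of Python. The variable names are made up for the example.

```python
# U+00E1 is LATIN SMALL LETTER A WITH ACUTE.
a_acute = "\u00e1"
utf8_hex = a_acute.encode("utf-8").hex()
print(utf8_hex)  # c3a1

# The same two bytes appear inside the UTF-8 encoding of the full name.
name_bytes = "David E. Narv\u00e1ez".encode("utf-8")
print(b"\xc3\xa1" in name_bytes)  # True
```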
Comment 7 Wojciech Dzierżanowski 2012-04-16 18:22:06 UTC
(In reply to comment #5)
> cc'ing overlay maintainers that have incorrectly accented names entered in
> repositories.xml.
> 
> David, Wojciech  can you please supply us with your properly accented names
> encoded in utf-8 so we can correct the entries in layman's overlay listings.

That's Wojciech Dzierżanowski.

I presume it's just about encoding problems with repositories.xml and no further action is required of me.  This http://git.overlays.gentoo.org/gitweb/?p=user/wdzierzan.git;a=summary looks fine, and my e-mail client also claims it sent the overlay request message to overlays at gentoo dot org in UTF-8 encoding.
Comment 8 Brian Dolbec (RETIRED) gentoo-dev 2012-04-16 23:05:43 UTC
Thank you.

That is correct, no further action is needed from you. I've committed the last two corrected names to repositories.xml, so your names will now appear correctly in layman -i output once you have the updated list.

It was most likely caused by someone committing changes from an editor that was not correctly set to UTF-8; as a result, several names were incorrectly encoded.

Closing, as I found no other apparently incorrect names with "&#...;" embedded in them.
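
Editorial illustration of the likely cause named in this comment: a sketch of how a tool working in the wrong encoding could turn a UTF-8 name into "&#...;" references. Both the Latin-1 misreading and the ASCII-only serialization are assumptions made for the example; the thread does not say which editor or settings were involved.

```python
# Correct name, written with escapes: "Tom\u00e1\u0161 Chv\u00e1tal".
name = "Tom\u00e1\u0161 Chv\u00e1tal"

# Assumption 1: the UTF-8 bytes are read as Latin-1, one character per byte.
misread = name.encode("utf-8").decode("latin-1")

# Assumption 2: the file is written as ASCII, with every non-ASCII character
# replaced by a decimal character reference.
stored = misread.encode("ascii", "xmlcharrefreplace").decode("ascii")
print(stored)  # Tom&#195;&#161;&#197;&#161; Chv&#195;&#161;tal
```

Each reference is valid markup on its own, which fits the observation in comment 2, yet the references name the individual UTF-8 bytes rather than the intended characters.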