Bug 53912 - rss feed is not utf-8, but claims to be
Summary: rss feed is not utf-8, but claims to be
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Infrastructure
Classification: Unclassified
Component: [OLD] gpackages
Hardware: All All
Importance: High minor
Assignee: Albert Hopkins (RETIRED)
URL: http://packages.gentoo.org/archs/x86/...
Reported: 2004-06-14 12:23 UTC by Marien Zwart (RETIRED)
Modified: 2011-11-11 15:59 UTC
Description Marien Zwart (RETIRED) gentoo-dev 2004-06-14 12:23:53 UTC
The RSS reader Liferea complains about the x86 fresh ebuilds RSS feed. Some names in the changelog entries, like "Bryan <D8>stergaard" (sorry, I'm having difficulties pasting this character), are not valid UTF-8 but ISO-8859-1. The feed starts with "<?xml version="1.0" encoding="utf-8"?>", so it should contain only UTF-8.

My news reader can still show the feed, so I don't consider this an urgent problem. But since it means the RSS file is technically not well-formed XML, I think it should be solved if possible.

Reproducible: Sometimes
Steps to Reproduce:
1. wget http://packages.gentoo.org/archs/x86/gentoo.rss
2. look at it manually, select a piece containing non-ascii characters and feed it to "file"
3. alternatively, feed the entire thing to an xml parser, like the one in the liferea rss reader
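
The check in step 2 can also be scripted. The following is a minimal sketch, not part of the original report: the helper name find_invalid_utf8 is hypothetical, and it assumes the feed has already been downloaded and read into a bytes object. It reports the offset and raw bytes of every sequence that fails to decode as UTF-8.

```python
from typing import List, Tuple


def find_invalid_utf8(data: bytes) -> List[Tuple[int, bytes]]:
    """Return (offset, raw_bytes) for each span that is not valid UTF-8.

    Hypothetical helper for illustration only.
    """
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Strict decoding raises at the first invalid sequence.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice we decoded.
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end]))
            # Resume scanning right after the offending bytes.
            pos = end
    return problems


if __name__ == "__main__":
    # Assumed sample: an ISO-8859-1 byte (0xD8) inside a feed that
    # declares itself as UTF-8, as described in the report.
    sample = (
        b'<?xml version="1.0" encoding="utf-8"?>'
        b"<author>Bryan \xd8stergaard</author>"
    )
    for offset, raw in find_invalid_utf8(sample):
        print(offset, raw)
```

An empty result means the whole input decodes cleanly as UTF-8. Any entry points at bytes that contradict the encoding declared in the XML header.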

Actual Results:  
The tools reported the file is not valid utf-8, but iso-8859-1 instead.

Expected Results:  
The file should be valid utf-8.

Since this is a website bug, emerge info doesn't seem relevant.
Comment 1 Kurt Lieber (RETIRED) gentoo-dev 2004-06-14 13:14:24 UTC
marduk?
Comment 2 Albert Hopkins (RETIRED) gentoo-dev 2004-06-19 08:58:58 UTC
Changed charset type to iso-8859-1.  This hopefully fixes things.
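
For readers comparing the possible fixes, here is a rough sketch of the two usual options: change the declared encoding to match the bytes (which is what the comment above appears to describe), or transcode the bytes to match the UTF-8 declaration. This is not the actual gpackages code. The function names redeclare_encoding and transcode_to_utf8 are hypothetical, and the sketch assumes the whole feed is consistently ISO-8859-1.

```python
import re
import xml.etree.ElementTree as ET

# Matches the encoding pseudo-attribute in an XML declaration.
_ENCODING_DECL = re.compile(
    rb"""(<\?xml[^>]*\bencoding=)(["'])[A-Za-z0-9._-]+\2"""
)


def redeclare_encoding(feed: bytes, encoding: str = "iso-8859-1") -> bytes:
    """Option 1: keep the bytes, change the declared encoding."""
    new_name = encoding.encode("ascii")
    return _ENCODING_DECL.sub(
        lambda m: m.group(1) + m.group(2) + new_name + m.group(2),
        feed,
        count=1,
    )


def transcode_to_utf8(feed: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Option 2: keep the UTF-8 declaration, re-encode the bytes.

    Assumes every byte of the input really is in source_encoding; input
    that is already partly UTF-8 would end up double-encoded.
    """
    return feed.decode(source_encoding).encode("utf-8")


if __name__ == "__main__":
    # Assumed sample mirroring the mismatch from the report.
    broken = (
        b'<?xml version="1.0" encoding="utf-8"?>'
        b"<author>Bryan \xd8stergaard</author>"
    )
    for fixed in (redeclare_encoding(broken), transcode_to_utf8(broken)):
        # Both variants should now parse and yield the same author name.
        print(ET.fromstring(fixed).text)
```

Either approach makes the declaration and the bytes agree. Redeclaring is the smaller change, while transcoding keeps the feed in UTF-8 for readers that expect it.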