Flipping through the glep: global vars (i.e., metadata) must be ascii 0-127, so forcing flat_hash to encode everything to disk as utf8 is seriously invalid, and the attempted encoding should just plain be removed. The reasons are pretty straightforward: encoding isn't something you set just for writes, you need to maintain it end to end, both writing *and* reading. That means the initial load of the metadata from a regen needs to be read in as unicode (it currently isn't), else you'll get chars with the high bit set (the bytes of multi-byte unicode glyphs). If you read it in as ascii, you cannot easily convert it to unicode at the marshalling stage as you're trying to do. Beyond that, the change doesn't trap encoding failures (the (R) glyph in games-sports/miniracer is an easy one to trigger). Doing unicode is a good thing, but it's not a simple change, i.e., not something for 2.1 imo. Regardless, glep31 already lays out that the metadata keys *must* be ascii 0-127 (thus ruling out utf8), so the flat_hash change in rev 3328 has to be backed out.
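To make the read/write mismatch concrete, here's a rough Python sketch. It is not the flat_hash code; the helper names and the sample string are made up for illustration (the string is not the real miniracer description). It only shows that once a non-ascii glyph is written as utf8, the bytes on disk are no longer ascii, and only a utf8 read gets the original text back.

```python
def high_bytes(raw):
    """Return the byte values in raw that fall outside ascii 0-127."""
    return [b for b in raw if b > 0x7F]


def decodes_as_ascii(raw):
    """True if raw can be read back as plain ascii, False otherwise."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical metadata value containing the (R) glyph, U+00AE.
sample = "MiniRacer \u00ae"

# What a utf8-only write path would put on disk.
on_disk = sample.encode("utf-8")

# The glyph became two bytes, both with the high bit set.
print(high_bytes(on_disk))

# An ascii reader chokes on them; only a utf8 reader round-trips.
print(decodes_as_ascii(on_disk))
print(on_disk.decode("utf-8") == sample)
```

The point of the sketch is just that the decode side has to match the encode side; setting the encoding on write alone leaves every reader guessing.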
I've reverted it in r3349. We'll have to add a check to repoman to make sure that all metadata is plain ascii...
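For the repoman side, something along these lines is what I have in mind. This is only a sketch under assumptions: `find_non_ascii` is a hypothetical helper, not an existing repoman function, and it assumes the metadata is already available as a dict of key to string value.

```python
def find_non_ascii(metadata):
    """Map each offending metadata key to the chars outside ascii 0-127.

    metadata is assumed to be a dict of str keys to str values;
    keys whose values are clean are left out of the result.
    """
    bad = {}
    for key, value in metadata.items():
        offenders = [ch for ch in value if ord(ch) > 127]
        if offenders:
            bad[key] = offenders
    return bad


# Made-up sample entry, only DESCRIPTION carries a non-ascii glyph.
sample_metadata = {
    "DESCRIPTION": "A racing game \u00ae",
    "SLOT": "0",
}

for key, chars in find_non_ascii(sample_metadata).items():
    print("non-ascii in %s: %r" % (key, chars))
```

A real check would still need to decide how it gets at the raw values and how it reports the failure; that part is left open here.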
released in 2.1_rc1-r1