What I'm about to report is not a bug in unrar, but a design bug in rar itself. I'm not reporting this to get it fixed, but to inform people about it (so please don't close it as upstream/invalid/etc.). It seems that when you create a directory structure on Windows (XP family) with filenames outside your ANSI codepage, rar stores those filenames correctly only as non-Unicode names; the Unicode names get corrupted. When unrar unpacks such an archive, it uses the Unicode values, so the extracted filenames are incorrect and irreversibly broken (at least for the moment I think it's irreversible). I observed this while trying to unpack an archive with Japanese names. To fix it, an option should be added to unrar that makes it ignore the Unicode names, so that the non-Unicode ones are used and convmv can then be used to fix the problem.
Bug reports are not meant to inform people indefinitely, so before I close this bug report as UPSTREAM: does app-arch/unrar-gpl exhibit the same issue?
Yes, it does. It's even worse, because instead of extracting with invalid filenames, it simply fails to extract those files at all. But as I said, this is a design flaw of rar itself, not a bug in unrar. rar (on Windows at least) stores Unicode filenames using some algorithm based on the system ANSI codepage, and that algorithm produces incorrect results if the filenames are outside that ANSI codepage. What I'm talking about here is a hack around that problem. As rar probably won't be fixed and the archive already exists, a flag to ignore the stored Unicode names would at least make it possible to fix the problem using convmv. I'd like this bug to stay open, so that somebody with more pull could stumble upon it and try to talk to the rar developers about it (I mailed them about it a few months ago, but I don't think they understood what the problem is, or they simply decided it's too rare to be fixed; after all, rar is a commercial product).
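For illustration, the convmv step mentioned above amounts to re-encoding the names byte for byte once the non-Unicode names have been extracted. Here is a minimal Python sketch of that idea; it is not convmv's actual implementation. It assumes the names were stored in CP932 (the usual Japanese ANSI codepage on Windows), which is my guess and not something confirmed for this archive. `convert_name` and `fix_tree` are hypothetical helper names.

```python
import os


def convert_name(raw: bytes, source_encoding: str = "cp932"):
    """Return the UTF-8 form of a legacy-encoded name, or None if no rename is needed.

    Assumption: names that already decode as UTF-8 (including plain ASCII)
    are left alone. A name that is not valid in source_encoding raises
    UnicodeDecodeError, so a wrong guess fails loudly instead of mangling it.
    """
    try:
        raw.decode("utf-8")
        return None
    except UnicodeDecodeError:
        pass
    return raw.decode(source_encoding).encode("utf-8")


def fix_tree(root: bytes, source_encoding: str = "cp932", dry_run: bool = True):
    """Collect (and optionally perform) renames below root; returns (old, new) pairs."""
    renamed = []
    # Walk bottom-up so that entries are renamed before their parent directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            new_name = convert_name(name, source_encoding)
            if new_name is None:
                continue
            src = os.path.join(dirpath, name)
            dst = os.path.join(dirpath, new_name)
            renamed.append((src, dst))
            if not dry_run:
                os.rename(src, dst)
    return renamed
```

Passing the root as bytes (e.g. `fix_tree(b"extracted", dry_run=True)`) makes `os.walk` return raw byte names, so nothing gets decoded with the wrong locale along the way. Running with `dry_run=True` first lets you check the planned renames before touching anything.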
*** This bug has been marked as a duplicate of bug 172430 ***