234187 – app-arch/unrar: problem with non-ascii names in the archive

Status:	RESOLVED DUPLICATE of bug 172430

Alias:	None

Product:	Gentoo Linux
Classification:	Unclassified
Component:	Current packages
Hardware:	All Linux

Importance:	High normal
Assignee:	Gentoo Linux bug wranglers

Reported:	2008-08-07 15:47 UTC by Rafał Mużyło
Modified:	2008-09-23 12:09 UTC
CC List:	0 users

Description Rafał Mużyło 2008-08-07 15:47:25 UTC

What I'm about to report is not a bug in unrar, 
but a design bug in rar itself.
I'm not reporting this so it gets fixed, but to inform people about it
(so please, don't close it as upstream/invalid/etc.).

It seems that when, on Windows (XP family), you create a directory structure
with filenames outside your ANSI codepage, rar stores those filenames
correctly only as non-Unicode names; the Unicode names get corrupted.
When unrar unpacks such an archive, it uses the Unicode values,
so the filenames come out incorrect and irreversibly broken (at least
for the moment I think it's irreversible).

This was observed while trying to unpack an archive with Japanese names.

To fix it, an option should be added to unrar that makes it ignore
the Unicode names, so that the non-Unicode ones are used and convmv can
then be used to fix the problem.
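
For illustration, here is a minimal Python sketch of the kind of repair convmv would then do, roughly what its -f/-t conversion amounts to for a single name. It assumes the non-Unicode name was extracted as raw bytes in the archive creator's ANSI codepage; cp932 (Japanese) is only an assumed example, and convert_name is a hypothetical helper, not part of unrar or convmv.

```python
# Sketch only: re-encode a filename from an assumed source codepage to UTF-8.
# "cp932" is an assumption; the real codepage depends on the system that
# created the archive.
def convert_name(raw: bytes, src: str = "cp932", dst: str = "utf-8") -> bytes:
    """Decode the raw name bytes from src and re-encode them as dst."""
    return raw.decode(src).encode(dst)


# A Japanese name as it would appear on disk if stored in cp932.
on_disk = "日本語.txt".encode("cp932")
fixed = convert_name(on_disk)
print(fixed.decode("utf-8"))
```

This only works if the bytes on disk are the untouched non-Unicode name, which is why an option to ignore the stored Unicode names is needed first.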

Comment 1 Jeroen Roovers (RETIRED) gentoo-dev

2008-08-07 18:06:50 UTC

Bug reports are not supposed to eternally inform people, so before I close this bug report as UPSTREAM: Does app-arch/unrar-gpl exhibit the same issue?

Comment 2 Rafał Mużyło 2008-08-07 19:46:49 UTC

Yes, it does. It's even worse, because instead of extracting with
invalid filenames, it simply fails to extract those files at all.

But as I said, this is a design flaw of rar itself,
not a bug in unrar. rar (on Windows at least) stores Unicode filenames
using some algorithm based on the system ANSI codepage, and that algorithm gives
incorrect results if filenames are outside that ANSI codepage.
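
As a rough illustration of why this kind of damage cannot be undone, the Python sketch below pushes a Japanese name through a Western codepage. This is not rar's actual algorithm, which is not known here; cp1252 and the helper to_ansi are assumptions chosen only to show that characters outside a codepage do not survive a conversion through it.

```python
# Sketch only: a conversion routed through a codepage loses every character
# that the codepage cannot represent. cp1252 is an assumed example.
def to_ansi(name: str, codepage: str = "cp1252") -> bytes:
    """Encode a name into a codepage, replacing unrepresentable characters."""
    return name.encode(codepage, errors="replace")


name = "日本語.txt"
stored = to_ansi(name)
print(stored)                    # b'???.txt'
print(stored.decode("cp1252"))   # the original characters are gone
```

Once the characters have been replaced, no later conversion can bring them back.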

What I'm talking about here is a hack around that problem.
As rar probably won't be fixed and the archive already exists,
a flag to ignore the stored Unicode names would at least allow
fixing the problem using convmv.

I'd like this bug to stay open, so that somebody with more
push could stumble upon it and try to talk to the rar developers
about it (I mailed them about it a few months ago,
but I don't think they understood what the problem is,
or they simply decided it's too rare to be fixed; after all,
rar is a commercial product).

Comment 3 Jeroen Roovers (RETIRED) gentoo-dev

2008-09-23 12:08:58 UTC

Comment 4 Jeroen Roovers (RETIRED) gentoo-dev

2008-09-23 12:09:05 UTC


*** This bug has been marked as a duplicate of bug 172430 ***