Bug 69945 - unzip extract broken filename.
Status: RESOLVED UPSTREAM
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: Current packages
Hardware: All Linux
Importance: High normal
Assignee: Gentoo's Team for Core System packages
Duplicates: 204257
Reported: 2004-11-03 08:01 UTC by Young-Ho Cha
Modified: 2008-01-04 10:12 UTC

Attachments
modified for x86_64 (unzip-5.52-locale.patch,6.69 KB, patch)
2007-08-11 07:11 UTC, Young-deuk Hong
add epatch for local patch (unzip-5.52-r1.ebuild,1.68 KB, text/plain)
2007-08-11 07:12 UTC, Young-deuk Hong

Comment 1 Young-Ho Cha 2004-11-03 08:01:40 UTC
When unzip extracts files, it handles filenames as 7-bit text,
so filenames with non-Latin-1 characters are broken.

Zip archives from MS Windows, or from other OSes with a non-UTF-8 locale, also extract with broken filenames.

So I made a small patch that detects the filename charset from the locale.

After applying this patch, file-roller (the GNOME archiving tool) can show Korean filenames.

I think it works in other locales too.


Reproducible: Always
Steps to Reproduce:
1. Extract a zip archive with unzip.

Actual Results:  
ganadist@ganadist /tmp $ unzip -x /home/ganadist/Documents/hhs_img.zip
Archive:  /home/ganadist/Documents/hhs_img.zip
  inflating: 23++�-+�-�_�+��-����-ø��+�-�-�.jpg
  inflating: pop_++�-+�-�-�.png
  inflating: pop_-���-�+-.png


Expected Results:  
ganadist@ganadist /tmp $ unzip -x /home/ganadist/Documents/hhs_img.zip
Archive:  /home/ganadist/Documents/hhs_img.zip
  inflating: 23체력측정_심폐지구력혈압측정중.jpg
  inflating: pop_체력측정중.png
  inflating: pop_카드접촉.png

screenshot:
http://ftp.mizi.com/~ganadist/file-roller-broken.png
The left picture is from the patched unzip, and the right picture is from the unpatched unzip.
patch:
http://ftp.mizi.com/~ganadist/unzip-locale.diff
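
For readers who want to see what "charset from the locale" can mean in code, here is a minimal sketch. It is not the linked patch, whose contents are not reproduced in this report. It only shows one standard POSIX way to ask the C library which charset the current locale uses, via `setlocale()` and `nl_langinfo(CODESET)`. The helper name `native_charset` is made up for this illustration.

```c
#include <langinfo.h>
#include <locale.h>

/* Hypothetical helper: return the charset name of the user's locale,
 * for example "UTF-8" or "EUC-KR". The returned string belongs to the
 * C library and may be overwritten by later locale calls. */
const char *native_charset(void)
{
    /* Adopt LC_ALL / LC_CTYPE / LANG from the environment. If this
     * fails, the process simply stays in the default "C" locale. */
    setlocale(LC_CTYPE, "");
    return nl_langinfo(CODESET);
}
```

This only answers which charset the extracted names should be written in. It says nothing about which charset the names inside the archive were stored in.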
Comment 2 Young-Ho Cha 2004-11-03 08:06:05 UTC
Ah, the "Results" sections in my report came out broken :(

I took screenshots instead.

before patch:
http://ftp.mizi.com/~ganadist/unzip-unpatched.png

after patch:
http://ftp.mizi.com/~ganadist/unzip-patched.png
Comment 3 Gregorio Guidi (RETIRED) gentoo-dev 2004-11-03 08:20:16 UTC
nice, have you submitted it to unzip authors?

ftp://ftp.info-zip.org/pub/infozip/FAQ.html#zip-bugs
http://www.info-zip.org/zip-bug.html
Comment 4 Young-Ho Cha 2004-11-04 22:26:00 UTC
I reported it through the zip-bug form and received this answer:

----
Thank you!  We currently don't have a full-time UnZip maintainer, but
I have saved your patch and screenshots in my 6.0-patch-collection
directory, so at least they won't be lost.  (No clue when 6.0 might
be released, but probably not before the middle of next year.)

Comment 5 Alexander Simonov 2004-11-29 13:07:43 UTC
+if(!strncmp(lang, "ru", 2)) return "KOI8-R";
+if(!strncmp(lang, "uk", 2)) return "KOI8-U";
These lines are broken.
What if the Russian locale is ru_RU.UTF8?
I am looking at your patch and will correct it for Cyrillic Unicode locales.
Comment 6 Alexander Simonov 2004-11-29 13:25:25 UTC
Oh!!! 
Sorry!!!
But what if the Russian codepage is CP1251?
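
The two questions above touch different parts of a locale name: the language prefix (`ru`) and the codeset suffix (`UTF8`). A minimal sketch of treating them separately follows. It is not taken from the patch. The function names and the table entries are assumptions for illustration only, and the right legacy charset per language is exactly what is being debated here.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Hypothetical language-to-charset table; every entry is an assumption. */
static const struct { const char *lang; const char *charset; } legacy_map[] = {
    { "ko", "CP949"  },
    { "ru", "CP1251" },
    { "uk", "KOI8-U" },
};

/* Guess the legacy charset of archive filenames from the two-letter
 * language prefix of a locale name. Returns NULL when nothing matches. */
const char *guess_archive_charset(const char *locale)
{
    if (locale == NULL)
        return NULL;
    for (size_t i = 0; i < sizeof legacy_map / sizeof legacy_map[0]; i++) {
        if (strncmp(locale, legacy_map[i].lang, 2) == 0)
            return legacy_map[i].charset;
    }
    return NULL;
}

/* Return 1 if the codeset suffix of a locale name such as "ru_RU.UTF8"
 * or "ko_KR.utf-8" names UTF-8, else 0. */
int locale_is_utf8(const char *locale)
{
    const char *dot = locale ? strchr(locale, '.') : NULL;
    if (dot == NULL)
        return 0;
    const char *cs = dot + 1;
    size_t n = strcspn(cs, "@");   /* ignore a modifier such as "@euro" */
    return (n == 5 && strncasecmp(cs, "UTF-8", 5) == 0) ||
           (n == 4 && strncasecmp(cs, "UTF8", 4) == 0);
}
```

With this split, `ru_RU.UTF8` still yields a legacy guess for the names stored in the archive, while `locale_is_utf8()` reports that the names should be written out as UTF-8.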
Comment 7 Heinrich Wendel (RETIRED) gentoo-dev 2005-01-11 06:36:03 UTC
any update on the patch so i can include it?
Comment 8 Young-Ho Cha 2005-01-13 06:43:49 UTC
Updated the Russian charset from KOI8-R to CP1251.

You can get it from the same URL :)
Comment 9 SpanKY gentoo-dev 2005-08-15 21:12:57 UTC
could you please update it for 5.52 ?
Comment 10 Young-Ho Cha 2006-10-07 22:18:26 UTC
(In reply to comment #8)
> could you please update it for 5.52 ?
> 

An updated patch for the 5.52-r1 ebuild is at the same URL.
Comment 11 SpanKY gentoo-dev 2006-11-10 23:59:50 UTC
I don't think that's quite how you want to do it ... I'm pretty sure you want to change the Ext_ASCII_TO_Native() macro instead of that "#if 0" stuff.

Also, unless I read the patch wrong, basing the zipfile input on $LANG doesn't make any sense ...
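
To illustrate the point about keeping the archive's charset separate from the user's environment, here is a minimal sketch of a conversion helper built on POSIX `iconv`. It is not unzip code and does not touch the Ext_ASCII_TO_Native() macro. The name `convert_name` and its signature are assumptions for this example. The caller supplies both charsets explicitly, so the source side need not come from $LANG.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated name `in` from charset
 * `from` to charset `to`, writing a NUL-terminated result into `out`
 * (capacity `outlen` bytes). Returns 0 on success, -1 on any failure. */
int convert_name(const char *from, const char *to,
                 const char *in, char *out, size_t outlen)
{
    if (outlen == 0)
        return -1;

    iconv_t cd = iconv_open(to, from);   /* the target charset comes first */
    if (cd == (iconv_t)-1)
        return -1;                       /* unknown charset name */

    char *inp = (char *)in;              /* iconv() takes a non-const pointer */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outlen - 1;         /* reserve one byte for the NUL */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;                       /* invalid input or buffer too small */

    *outp = '\0';
    return 0;
}
```

A call such as `convert_name("CP949", "UTF-8", stored_name, buf, sizeof buf)` would then depend only on what the caller believes the archive contains. The sketch ignores stateful encodings, which would need an extra flush call.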
Comment 12 David Chang 2007-06-13 17:30:54 UTC
here's a working patch

https://bugzilla.altlinux.org/attachment.cgi?id=1402
Comment 13 Young-deuk Hong 2007-08-11 07:10:36 UTC
This patch will not work on x86_64 machines.

On x86_64 (CHOST="x86_64-pc-linux-gnu"), the ebuild sets TARGET to linux_noasm.

I have posted an updated patch.
Comment 14 Young-deuk Hong 2007-08-11 07:11:56 UTC
Created attachment 127717
modified for x86_64
Comment 15 Young-deuk Hong 2007-08-11 07:12:55 UTC
Created attachment 127718
add epatch for local patch
Comment 16 Jakub Moc (RETIRED) gentoo-dev 2008-01-04 10:12:23 UTC
*** Bug 204257 has been marked as a duplicate of this bug. ***