Bug 69945

Summary: unzip extracts broken filenames
Product: Gentoo Linux
Reporter: Young-Ho Cha <ganadist>
Component: Current packages
Assignee: Gentoo's Team for Core System packages <base-system>
Status: RESOLVED UPSTREAM
Severity: normal
CC: flash3001, greg_g, utf8, wiono
Priority: High
Version: unspecified
Hardware: All
OS: Linux
Whiteboard:
Package list:
Runtime testing required: ---
Attachments:
  modified for x86_64
  add epatch for local patch

Comment 1 Young-Ho Cha 2004-11-03 08:01:40 UTC
When unzip extracts files, it handles filenames as 7-bit,
so filenames with non-Latin-1 characters come out broken.

Zip archives created on MS Windows, or on other OSes with a non-UTF-8 locale, also extract with broken filenames.

So I made a small patch that detects the filename charset from the locale.

After applying this patch, file-roller (the GNOME archiving tool) can show Korean filenames.

I think it works in other locales too.
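
The patch is only linked here, not quoted, so the following is not its code. As a minimal sketch of the general idea (re-encoding a filename from the archive's charset into the charset of the current locale), here is a small C helper built on POSIX iconv. The name convert_filename and its signature are assumptions made for illustration.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/*
 * Convert the NUL-terminated string `in` from charset `from` to charset `to`,
 * writing a NUL-terminated result into `out` (capacity `outsz` bytes).
 * Returns 0 on success, -1 on failure (unknown charset, invalid input,
 * or output buffer too small).
 * convert_filename is a hypothetical helper, not a function from unzip.
 */
static int convert_filename(const char *from, const char *to,
                            const char *in, char *out, size_t outsz)
{
    if (outsz == 0)
        return -1;

    /* Note the argument order: target charset first, source charset second. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;        /* iconv() takes a non-const pointer */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsz - 1;    /* reserve room for the terminator */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);  /* flush shift state */
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;

    *outp = '\0';
    return 0;
}
```

For a Korean archive made on Windows, a caller might pass "CP949" as `from` (an assumption about the archive) and the result of nl_langinfo(CODESET) from <langinfo.h> as `to`.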


Reproducible: Always
Steps to Reproduce:
1. Extract a zip archive with unzip

Actual Results:  
ganadist@ganadist /tmp $ unzip -x /home/ganadist/Documents/hhs_img.zip
Archive:  /home/ganadist/Documents/hhs_img.zip
  inflating: 23++&#65533;-+&#65533;-&#65533;_&#65533;+&#65533;&#65533;-&#65533;&#65533;&#65533;&#65533;-ΓΈ&#65533;&#65533;+&#65533;-&#65533;-&#65533;.jpg
  inflating: pop_++&#65533;-+&#65533;-&#65533;-&#65533;.png
  inflating: pop_-&#65533;&#65533;&#65533;-&#65533;+-.png


Expected Results:  
ganadist@ganadist /tmp $ unzip -x /home/ganadist/Documents/hhs_img.zip
Archive:  /home/ganadist/Documents/hhs_img.zip
  inflating: 23체력측정_심폐지구력혈압측정중.jpg
  inflating: pop_체력측정중.png
  inflating: pop_카드접촉.png

Screenshot:
http://ftp.mizi.com/~ganadist/file-roller-broken.png
The left picture was taken with the patched unzip, the right one with the unpatched unzip.
Patch:
http://ftp.mizi.com/~ganadist/unzip-locale.diff
Comment 2 Young-Ho Cha 2004-11-03 08:06:05 UTC
Ah, the "Results" sections of the report came out broken :(

I took screenshots instead.

before patch:
http://ftp.mizi.com/~ganadist/unzip-unpatched.png

after patch:
http://ftp.mizi.com/~ganadist/unzip-patched.png
Comment 3 Gregorio Guidi (RETIRED) gentoo-dev 2004-11-03 08:20:16 UTC
Nice, have you submitted it to the unzip authors?

ftp://ftp.info-zip.org/pub/infozip/FAQ.html#zip-bugs
http://www.info-zip.org/zip-bug.html
Comment 4 Young-Ho Cha 2004-11-04 22:26:00 UTC
I reported it through the zip-bug form and received an answer.

----
Thank you!  We currently don't have a full-time UnZip maintainer, but
I have saved your patch and screenshots in my 6.0-patch-collection
directory, so at least they won't be lost.  (No clue when 6.0 might
be released, but probably not before the middle of next year.)

Comment 5 Alexander Simonov 2004-11-29 13:07:43 UTC
+if(!strncmp(lang, "ru", 2)) return "KOI8-R";
+if(!strncmp(lang, "uk", 2)) return "KOI8-U";
These lines are broken.
What if the Russian locale is ru_RU.UTF8?
I am looking at your patch and will correct it for Cyrillic Unicode locales.
Comment 6 Alexander Simonov 2004-11-29 13:25:25 UTC
Oh!!!
Sorry!!!
But what if the Russian codepage is cp1251?
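
The two quoted lines only look at the language prefix, so a locale name such as ru_RU.UTF8 carries two separate pieces of information: the language, which can suggest a legacy charset for the archive, and the codeset, which says what the local system expects. A minimal sketch of keeping the two apart follows. The table, guess_archive_charset and locale_codeset are hypothetical names for illustration and are not taken from the patch; only the "ru" and "uk" rows mirror the lines quoted in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping from a two-letter language code to a guess for the
 * charset used inside the archive. This table is an illustration only. */
struct lang_charset {
    const char *lang;
    const char *charset;
};

static const struct lang_charset archive_guess[] = {
    { "ru", "KOI8-R" },
    { "uk", "KOI8-U" },
};

/* Guess the archive charset from a locale name such as "ru_RU.UTF8".
 * Only the language prefix is examined. Returns NULL when nothing matches. */
static const char *guess_archive_charset(const char *locale)
{
    if (locale == NULL)
        return NULL;
    for (size_t i = 0; i < sizeof archive_guess / sizeof archive_guess[0]; i++) {
        if (strncmp(locale, archive_guess[i].lang, 2) == 0)
            return archive_guess[i].charset;
    }
    return NULL;
}

/* Copy the codeset part of a locale name ("UTF8" in "ru_RU.UTF8") into out,
 * dropping any "@modifier" suffix. Returns 0 on success, -1 when the name
 * has no codeset part or the buffer is too small. */
static int locale_codeset(const char *locale, char *out, size_t outsz)
{
    const char *dot = locale ? strchr(locale, '.') : NULL;
    if (dot == NULL)
        return -1;
    size_t len = strcspn(dot + 1, "@");
    if (len == 0 || len >= outsz)
        return -1;
    memcpy(out, dot + 1, len);
    out[len] = '\0';
    return 0;
}
```

With this split, ru_RU.UTF8 gives KOI8-R as the guess for the archive and UTF8 as the local codeset, so a UTF-8 locale does not by itself make the prefix check wrong. Whether KOI8-R or cp1251 is the better guess for Russian archives is a separate question about the table row.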
Comment 7 Heinrich Wendel (RETIRED) gentoo-dev 2005-01-11 06:36:03 UTC
any update on the patch so i can include it?
Comment 8 Young-Ho Cha 2005-01-13 06:43:49 UTC
Updated the Russian charset from KOI8-R to CP1251.

You can get it from the same URL :)
Comment 9 SpanKY gentoo-dev 2005-08-15 21:12:57 UTC
could you please update it for 5.52 ?
Comment 10 Young-Ho Cha 2006-10-07 22:18:26 UTC
(In reply to comment #8)
> could you please update it for 5.52 ?
> 

An updated patch for the 5.52-r1 ebuild is at the same URL.
Comment 11 SpanKY gentoo-dev 2006-11-10 23:59:50 UTC
i don't think that's quite how you want to do it ... i'm pretty sure you want to change the Ext_ASCII_TO_Native() macro instead of that "#if 0" stuff

also, unless i read the patch wrong, basing the zipfile input on $LANG doesn't make any sense ...
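
To illustrate the second point: some information about filename encoding is stored in the archive itself, independent of the extracting user's $LANG. The ZIP format defines general purpose bit 11 as "names are UTF-8", and the high byte of the "version made by" field identifies the host system that created the entry. The sketch below only classifies an entry from those two fields. classify_entry_name and the enum are hypothetical names, and this is neither the Ext_ASCII_TO_Native() macro nor a proposed replacement for it.

```c
#include <stdint.h>

#define ZIP_GPBF_UTF8  0x0800u  /* general purpose bit 11: names are UTF-8 */
#define ZIP_HOST_FAT   0u       /* "version made by" host: MS-DOS / FAT    */

enum name_source {
    NAME_UTF8,        /* archive says the name is UTF-8                  */
    NAME_LEGACY_DOS,  /* made on a FAT host, legacy code page, not named */
    NAME_UNKNOWN      /* archive gives no usable hint                    */
};

/* Classify an entry's filename encoding from fields stored in the archive.
 * gp_flags is the general purpose bit flag, version_made_by the 16-bit
 * "version made by" field whose high byte is the host system. */
static enum name_source classify_entry_name(uint16_t gp_flags,
                                            uint16_t version_made_by)
{
    if (gp_flags & ZIP_GPBF_UTF8)
        return NAME_UTF8;
    if ((version_made_by >> 8) == ZIP_HOST_FAT)
        return NAME_LEGACY_DOS;
    return NAME_UNKNOWN;
}
```

In the NAME_LEGACY_DOS case the archive does not record which code page was used, so a guess is still needed there. The locale of the extracting machine is only one possible input to that guess.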
Comment 12 David Chang 2007-06-13 17:30:54 UTC
here's a working patch

https://bugzilla.altlinux.org/attachment.cgi?id=1402
Comment 13 Young-deuk Hong 2007-08-11 07:10:36 UTC
On an x86_64 machine this patch will not work.

On x86_64 (CHOST="x86_64-pc-linux-gnu") the ebuild sets TARGET to linux_noasm.

I have posted an updated patch.
Comment 14 Young-deuk Hong 2007-08-11 07:11:56 UTC
Created attachment 127717
modified for x86_64
Comment 15 Young-deuk Hong 2007-08-11 07:12:55 UTC
Created attachment 127718
add epatch for local patch
Comment 16 Jakub Moc (RETIRED) gentoo-dev 2008-01-04 10:12:23 UTC
*** Bug 204257 has been marked as a duplicate of this bug. ***