When unzip extracts file names, it handles them as 7-bit file names, so file names with non-Latin-1 characters come out broken. A zip archive created on MS Windows, or on another OS with a non-UTF-8 locale, extracts with broken file names too.

So I made a small patch that detects the file name charset from the locale. With this patch applied, file-roller (the GNOME archiving tool) can show Korean file names. I think it works in other locales too.

Reproducible: Always

Steps to Reproduce:
1. Extract a zip archive with unzip.

Actual Results:
ganadist@ganadist /tmp $ unzip -x /home/ganadist/Documents/hhs_img.zip
Archive: /home/ganadist/Documents/hhs_img.zip
  inflating: 23++�-+�-�_�+��-����-ΓΈ��+�-�-�.jpg
  inflating: pop_++�-+�-�-�.png
  inflating: pop_-���-�+-.png

Expected Results:
ganadist@ganadist /tmp $ unzip -x /home/ganadist/Documents/hhs_img.zip
Archive: /home/ganadist/Documents/hhs_img.zip
  inflating: 23체력측정_심폐지구력혈압측정중.jpg
  inflating: pop_체력측정중.png
  inflating: pop_카드접촉.png

Screenshot: http://ftp.mizi.com/~ganadist/file-roller-broken.png
The left picture was taken with the patched unzip, the right picture with the unpatched unzip.

Patch: http://ftp.mizi.com/~ganadist/unzip-locale.diff
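To make the idea in the report concrete, here is a minimal sketch of converting a stored file name into the charset of the current locale. It is not the code from unzip-locale.diff; the helper names `locale_codeset` and `convert_filename` are hypothetical, and the sketch assumes a POSIX system that provides `iconv` and `nl_langinfo`.

```c
#include <assert.h>
#include <iconv.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: codeset of the user's locale, e.g. "UTF-8". */
const char *locale_codeset(void)
{
    setlocale(LC_CTYPE, "");        /* pick up LANG / LC_* from the environment */
    return nl_langinfo(CODESET);
}

/*
 * Hypothetical helper: convert the NUL-terminated name `in` from charset
 * `from_cs` to charset `to_cs`, writing at most `outsz` bytes to `out`.
 * Returns 0 on success, -1 on failure (unknown charset, invalid input,
 * or output buffer too small). On failure `out` is an empty string.
 */
int convert_filename(const char *from_cs, const char *to_cs,
                     const char *in, char *out, size_t outsz)
{
    assert(in != NULL && out != NULL && outsz > 0);
    out[0] = '\0';

    iconv_t cd = iconv_open(to_cs, from_cs);
    if (cd == (iconv_t)-1)
        return -1;

    char *inp = (char *)in;         /* iconv() wants char **, input is not modified */
    size_t inleft = strlen(in);
    char *outp = out;
    size_t outleft = outsz - 1;     /* keep one byte for the terminating NUL */

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft);  /* flush any shift state */
    iconv_close(cd);

    if (rc == (size_t)-1) {
        out[0] = '\0';
        return -1;
    }
    *outp = '\0';
    return 0;
}
```

A caller would pass the assumed archive charset as `from_cs` and `locale_codeset()` as `to_cs`. Which charset the archive really uses is not recorded in this thread, so that argument is always a guess.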
Ah, the "Results" sections of the report came out broken :( so I took screenshots.
Before patch: http://ftp.mizi.com/~ganadist/unzip-unpatched.png
After patch: http://ftp.mizi.com/~ganadist/unzip-patched.png
Nice, have you submitted it to the unzip authors?
ftp://ftp.info-zip.org/pub/infozip/FAQ.html#zip-bugs
http://www.info-zip.org/zip-bug.html
I reported it through the zip-bug form and received this answer:
----
Thank you! We currently don't have a full-time UnZip maintainer, but I have saved your patch and screenshots in my 6.0-patch-collection directory, so at least they won't be lost. (No clue when 6.0 might be released, but probably not before the middle of next year.)
+if(!strncmp(lang, "ru", 2)) return "KOI8-R";
+if(!strncmp(lang, "uk", 2)) return "KOI8-U";

These lines are broken. What if the Russian locale is ru_RU.UTF8? I am looking at your patch and correcting it for Cyrillic Unicode locales.
Oh!!! Sorry!!! But what if the Russian codepage is cp1251?
any update on the patch so i can include it?
I updated the Russian charset from KOI8-R to CP1251. The patch is available at the same URL :)
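The language-to-charset mapping discussed in the comments above can be sketched as a small lookup table. This is an illustration only, not the patch itself: the function name `guess_archive_charset` and the table layout are hypothetical, and only the "ru" and "uk" entries come from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct lang_charset {
    const char *lang;      /* two-letter language prefix of the locale name */
    const char *charset;   /* legacy charset assumed for archives in that language */
};

static const struct lang_charset legacy_table[] = {
    { "ru", "CP1251" },    /* KOI8-R in the first patch, CP1251 after the update */
    { "uk", "KOI8-U" },    /* as quoted above; the thread does not say if it changed */
};

/*
 * Hypothetical helper: guess the charset of file names stored in an archive
 * from a locale name such as "ru_RU.UTF-8". Returns NULL if nothing matches.
 */
const char *guess_archive_charset(const char *lang)
{
    assert(lang != NULL);
    for (size_t i = 0; i < sizeof legacy_table / sizeof legacy_table[0]; i++) {
        /* Only the language prefix is compared, so any codeset suffix is ignored. */
        if (strncmp(lang, legacy_table[i].lang, 2) == 0)
            return legacy_table[i].charset;
    }
    return NULL;
}
```

Because only the two-letter prefix is compared, `ru_RU.UTF8` and `ru_RU.KOI8-R` give the same guess. The guess describes the archive, not the machine doing the extraction, and it can still be wrong for any particular archive.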
could you please update it for 5.52 ?
(In reply to comment #8)
> could you please update it for 5.52 ?

An updated patch for the 5.52-r1 ebuild is at the same URL.
I don't think that's quite how you want to do it ... I'm pretty sure you want to change the Ext_ASCII_TO_Native() macro instead of that "#if 0" stuff.

Also, unless I read the patch wrong, basing the zipfile input on $LANG doesn't make any sense ...
here's a working patch https://bugzilla.altlinux.org/attachment.cgi?id=1402
This patch will not work if the machine is x86_64. On x86_64 (CHOST="x86_64-pc-linux-gnu"), the ebuild sets TARGET to linux_noasm. I have posted an updated patch.
Created attachment 127717: modified for x86_64
Created attachment 127718: add epatch for local patch
*** Bug 204257 has been marked as a duplicate of this bug. ***