Bug 200995 - app-arch/zip missing flags for >2GB files
Status: RESOLVED FIXED
Alias: None
Product: Gentoo Linux
Classification: Unclassified
Component: New packages
Hardware: All Linux
Importance: High normal
Assignee: Gentoo's Team for Core System packages
URL: http://www.info-zip.org/FAQ.html#limits
 
Reported: 2007-12-02 13:55 UTC by Tim Weber
Modified: 2007-12-04 07:52 UTC



Description Tim Weber 2007-12-02 13:55:56 UTC
This is basically the same as bug #104315, which was for app-arch/unzip, so I'll keep it short.

Trying to add a file that's larger than 2GB to an archive fails with "zip warning: name not matched". If you compile zip with "-D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE" added to your CFLAGS, it works.

I've tried adding "append-lfs-flags" to the top of src_compile, but it seems to be ignored. Shouldn't be too hard to fix for someone who knows Info-Zip's build system, I guess.
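
For reference, here is a minimal C sketch (my own illustration, not taken from zip's source) showing what those defines actually change: on a 32-bit glibc system, off_t is 32 bits unless _FILE_OFFSET_BITS=64 is defined, so no file position past 2^31-1 bytes can even be represented.

/* lfs-check.c: print the width of off_t to see whether LFS flags
 * took effect in a given build. Illustration only. */
#include <stdio.h>
#include <sys/types.h>

int main(void) {
    printf("off_t is %u bits\n", (unsigned)(sizeof(off_t) * 8));
    return 0;
}

Compiled plainly on a 32-bit system this prints 32; compiled with -D_FILE_OFFSET_BITS=64 it prints 64. That makes for a quick check of whether append-lfs-flags actually reached the compiler.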
Comment 1 Tim Weber 2007-12-02 14:32:08 UTC
Sorry, seems like it's not _that_ easy:

$ zip -9 foo.zip foo.mov
  adding: foo.mov
zip I/O error: Invalid argument
zip error: Input file read failure (was zipping foo.mov)

Looks like I've hit a limitation of some underlying library. But since catting the file to /dev/null works fine and scp'ing it worked as well, I suppose it's something Info-ZIP-related.

Their FAQ (linked in the URL field) mentions that you also need a glibc that can handle such files. Recompiling glibc-2.7 with the aforementioned CFLAGS resulted in an error, though ("ftello.c:66: error: '__EI_ftello' aliased to undefined symbol '__GI_ftello'"), but that's most likely a completely different bug.

On second thought, since scp and cat worked fine, I don't think my glibc is at fault. So I'm afraid I can't provide a solution here and leave it up to the mighty Gentoo developers to enlighten us about what needs to be done to fix this, or maybe about what other tool there is to create ZIP files which are readable under, you know, that "other" operating system.
Comment 2 SpanKY gentoo-dev 2007-12-02 19:51:29 UTC
fixed with zip-2.32-r1, thanks

http://sources.gentoo.org/app-arch/zip/files/zip-2.32-build.patch?rev=1.1
Comment 3 Tim Weber 2007-12-03 08:29:28 UTC
Thanks for the fast response. However, it's not really fixed. There seem to be two different issues involved here:

1) "zip warning: name not matched"
This error occurs instantly, as soon as you hit return after a command like "zip -9 foo.zip largerthan2gb.mov". It is fixed by adding the three "-D" flags mentioned above or by using the patched zip-2.32-r1.
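
For what it's worth, here is my guess at the mechanism behind the misleading message, sketched in C (an assumption, not zip's actual code): in a non-LFS build, stat() on a file larger than 2GB fails with EOVERFLOW because st_size does not fit in a 32-bit off_t, and a caller that merely sees the failure concludes the name matched no file.

/* stat-check.c: sketch of the suspected failure mode. Illustration
 * only; zip's real code path may differ. */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/stat.h>

int main(int argc, char **argv) {
    struct stat st;
    if (argc != 2) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 2;
    }
    if (stat(argv[1], &st) != 0) {
        /* Without -D_FILE_OFFSET_BITS=64, expect EOVERFLOW here
         * for a file larger than 2GB. */
        fprintf(stderr, "stat(%s): %s\n", argv[1], strerror(errno));
        return 1;
    }
    printf("%s: %lld bytes\n", argv[1], (long long)st.st_size);
    return 0;
}

Run against a >2GB file, a non-LFS build of this should report EOVERFLOW ("Value too large for defined data type"), while an LFS build prints the size.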

2) "zip I/O error: Invalid argument"
This error occurs at some point during compression and is not yet fixed, at least not on two of my systems. At first I guessed that it was the magical 2GB barrier being hit while reading the input file. But then I captured an strace:

File descriptor 3 is the output file, fd 4 is the input. I have cut the strings even further than strace already had.

read(4, "r\265\3574r\265\361Ur\265\363"..., 32768) = 32768
write(3, "\25\214\237\372\nV\253\276\202"..., 16384) = 16384
write(3, "\332\260\0\3367\26 |\313"..., 16384) = 16384
read(4, "\2406c<\2406f=\2406h\362"..., 32768) = 17624
read(4, "", 15144)                      = 0
write(3, "\375\243\256\232\376_\355\232\376"..., 16384) = 16384
close(4)                                = 0
write(3, "\203]\216\276\301\341\200\276a"..., 3346) = 3346
_llseek(3, 0, [0], SEEK_SET)            = 0
write(3, "PK\3\4\24\0\2\0\10\0\274b\202"..., 79) = 79
_llseek(3, 18446744072207846753, 0xbfe12174, SEEK_SET) = -1 EINVAL (Invalid argument)
write(1, "\n", 1)                       = 1
dup(2)                                  = 4
fcntl64(4, F_GETFL)                     = 0x2 (flags O_RDWR)
fstat64(4, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 7), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f0a000
_llseek(4, 0, 0xbfe12188, SEEK_CUR)     = -1 ESPIPE (Illegal seek)
write(4, "zip I/O error: Invalid argument\n", 32) = 32
close(4)                                = 0
munmap(0xb7f0a000, 4096)                = 0
write(2, "\nzip error: Input file read fail"..., 79) = 79
close(3)                                = 0
unlink("zi19Amex")                      = 0
exit_group(11)                          = ?

I'm no C programmer, but here's my interpretation: as you can see, the input file is read to the end (the second "read" returns less than the buffer size). Zip notices that the file is finished and apparently jumps to the beginning of the output file to update some header information, but then it does a second _llseek with a huge offset on the output file (which in this example was about 2.8GB in size before the error occurred). That seek returns EINVAL and causes everything to fail, even though the file had been compressed just fine.

For me this looks like it could be a Zip bug. The part where it clones stderr to fd 4 and then tries to seek on it (wtf?) seems very strange as well.
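
To sanity-check that reading of the trace, here is a tiny C program (my own arithmetic, assuming zip computed the offset in a signed 32-bit type): the huge _llseek offset is exactly what you get when a file position just past 2GB wraps negative in 32 bits and is then sign-extended to 64 bits.

/* offset-check.c: reinterpret the offset from the failing _llseek. */
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void) {
    /* The offset from the failing _llseek above. */
    uint64_t traced = 18446744072207846753ULL;
    /* As a signed 64-bit value it is a small negative number... */
    int64_t sext = (int64_t)traced;   /* -1501704863 */
    /* ...whose low 32 bits, read as unsigned, are a plausible
     * position in a ~2.8GB output file. */
    uint32_t pos = (uint32_t)traced;  /* 2793262433 */
    printf("signed: %" PRId64 "; low 32 bits: %" PRIu32
           " bytes (~%.2f GB)\n", sext, pos, pos / 1e9);
    return 0;
}

2,793,262,433 bytes is about 2.79GB, which matches the size the output file had reached, so an offset computed in a signed 32-bit type and then sign-extended fits the evidence nicely.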

I've created a little test case to check whether the zip on your system shows the same strange behavior. The test also shows that it's not the input size that's causing the problems, but the output. It creates slightly over 2GB of data to zip (2,097,153 blocks × 1024 bytes = 2^31 + 1024 bytes, just past the 2GB mark), the first time from /dev/zero (resulting in optimal compression down to 2MB), the second time from /dev/urandom (resulting in pretty much no compression at all and therefore an error when writing). You'll need about 4GB of free disk space for that test. If you don't get an error with the /dev/urandom test, please try increasing the "count" parameter; it might be that you just got more compressible random data. :)

$ dd if=/dev/zero of=test.dat bs=1024 count=2097153
2097153+0 records in
2097153+0 records out
2147484672 bytes (2.1 GB) copied, 73.5148 s, 29.2 MB/s
$ zip -9 test.zip test.dat
  adding: test.dat (deflated 100%)
$ stat -c %s test.zip
2084235
$ dd if=/dev/urandom of=test.dat bs=1024 count=2097153
2097153+0 records in
2097153+0 records out
2147484672 bytes (2.1 GB) copied, 553.267 s, 3.9 MB/s
$ rm test.zip
$ zip -9 test.zip test.dat
  adding: test.dat
zip I/O error: Invalid argument
zip error: Input file read failure (was zipping test.dat)

I'm skipping emerge --info output until someone says "works for me".
Comment 4 SpanKY gentoo-dev 2007-12-04 07:23:55 UTC
append-lfs-flags adds the correct CPPFLAGS to the build

this bug is about 2gb support which is fixed ... please clone the bug with only relevant information about your other problem
Comment 5 Tim Weber 2007-12-04 07:52:02 UTC
You're right, sorry.

However, while researching the other bug, I found out that it's a known problem upstream and that it will be fixed in Zip 3.0. If you read the FAQ page carefully, it even says so explicitly:

"recompiling with the -DLARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 options [...] will allow the utilities to handle uncompressed data files greater than 2 GB in size, as long as the total size of the archive containing them is less than 2 GB."

See also: http://www.info-zip.org/board/board.pl?b-zipbugs/m-1192622583/

I won't open another bug, because the only thing that could happen to it is being closed RESOLVED WONTFIX or something like that. The fact that Zip can't handle >2GB archives is now documented well enough in our Bugzilla; I think anybody stumbling over this problem will find an answer here, even if not a solution.