rzip silently fails to compress files bigger than 4GB. There seems to be a 32-bit overflow (even on 64-bit architectures; also seen on AMD64). As soon as the file is bigger than 4GB, the compressed file size is roughly the file size modulo 4GB. Tested with a 6GB DVD image and a 9GB DVD image. The rzip author didn't react to this bug report. What's worse, the ebuild description states "compression program for large (sic!) files" -> *boom*
Reproducible: Always
Steps to Reproduce:
1. Get an ISO image bigger than 4GB; get its MD5 sum
2. Compress the ISO image with rzip -6
3. Decompress the compressed file, get its MD5 sum
Actual Results:
The compressed file is broken
Expected Results:
Being able to recover the compressed data losslessly
Either fix, or document
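The suspected truncation can be illustrated with plain shell arithmetic. This is only a sketch of the hypothesis in the report above (a size kept in a 32-bit field), not rzip's actual code. The function name truncated_size and the example sizes are made up for illustration, and the shell is assumed to do 64-bit arithmetic.

```shell
# Hypothetical helper: what is left of a byte count when only the low
# 32 bits are kept, i.e. the size modulo 4 GiB (4294967296 bytes).
truncated_size() {
    echo $(( $1 % 4294967296 ))
}

# Illustrative sizes, not the exact sizes of the images tested above.
six_gib=$(( 6 * 1024 * 1024 * 1024 ))
nine_gib=$(( 9 * 1024 * 1024 * 1024 ))

echo "6 GiB -> $(truncated_size "$six_gib") bytes"    # 2 GiB survive
echo "9 GiB -> $(truncated_size "$nine_gib") bytes"   # 1 GiB survives
```

Under this assumption a file only slightly larger than 4GB would appear to be very small, which would match the "file size modulo 4GB" observation.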
Thanks for reporting, quite a valuable find for someone, I'm sure. I will add a notice to the ebuild. Could you please provide a reference to the upstream bug report that you filed?
(In reply to comment #0)
> Either fix, or document
pkg_postinst() {
	ewarn "It has been reported that this tool will fail on files >4GB"
	ewarn "Please see https://bugs.gentoo.org/show_bug.cgi?id=217552 for more"
	ewarn "information."
}
Added to ebuild. Thanks for reporting. Closing bug, but please still comment with the $UPSTREAM bug report that you filed so later users can investigate.
I ran into this a long time ago. What happened in my case is that the generated .rz file did contain all of the data; it's just that the header stored the expected file size as an int instead of a long or long long. All I had to do in my case was modify the source so that it ignored the stored file size and instead read until it hit EOF... I think that's what I did, but this was a while ago. p7zip FTW! :-)
@Comment 3: so the rz files are basically OK, but cannot be extracted correctly. Would you be so kind as to provide us with your little patch? ;) That would be nice, so one can at least extract those "broken" files. Thanks!
The bug is introduced by patching rzip with rzip-2.0-darwin.patch. The patch screws up rzip's large file handling.
Created attachment 161489
Fixed the patch so that it doesn't screw up large file support.
I haven't checked all the changes that the original rzip-2.0-darwin.patch makes, but I removed the part that messed with the large file support. Please update the patch in portage with this version and remove the warning from the ebuild.
Created attachment 161756
Patch for rzip to enable uncompressing of archives created with the broken rzip.
Please reopen this bug - this issue is definitely not fixed. If someone used gentoo's broken rzip to compress a large file, {,s}he may look for open bug reports. The attached patch implements an option '-l' that can be used to override the upper 32 bits of the expected file size, thereby allowing broken rzip files to be uncompressed. It should be okay to use a higher value than the correct one. Example:
rzip -d -l 200 broken_archive.rz
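To make the role of the '-l' value more concrete, the arithmetic can be sketched in shell. This follows only the description in the comment above (the option supplies the upper 32 bits of the expected size) and is not taken from the patch itself. The helper names expected_size and upper_bits and the example size are hypothetical.

```shell
# Hypothetical helper: combine an upper-32-bit value (as one would pass
# to -l) with the low 32 bits that survived in the archive header.
expected_size() {
    echo $(( $1 * 4294967296 + $2 ))
}

# Hypothetical helper: the upper-32-bit value of a known original size.
upper_bits() {
    echo $(( $1 / 4294967296 ))
}

orig=6442450944                   # illustrative 6 GiB file
low=$(( orig % 4294967296 ))      # what a 32-bit field would keep
high=$(upper_bits "$orig")        # the value that restores the size
echo "low=$low high=$high restored=$(expected_size "$high" "$low")"
```

If the original size is not known, the comment above suggests that a generously high value should also work, which is why the example there uses 200.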
(In reply to comment #7)
> Please reopen this bug - this issue is definitely not fixed. If someone used
> gentoo's broken rzip to compress a large file, {,s}he may look for open bug
> reports.
AFAIK, anyone can reopen bugs. I can't remember, though. Thanks for the patch. I'm hesitant to remove the darwin patch because I don't have time at the moment to see what it does and what removing it will possibly break.
Un-assigning from myself for the above-mentioned reasons. (I would also like to test this and don't have any large files available at the moment.)
I only had the option to "Leave as RESOLVED FIXED" on the closed bug report. If there is another way to re-open reports, please tell me. But why are you hesitant to remove the patch because it might break something (well, actually I didn't even ask for that, only to replace it)?! It _already_ breaks rzip silently.
I can confirm that the original patch definitely breaks rzip, whereas the updated patch works just fine. I honestly don't see any reason why a *known broken* version of rzip should stay in the portage tree (even marked as stable!) while a version that is known not to contain this bug (but *maybe* some others) is not even added to the tree. I was just lucky to see the log messages this time. From my point of view, the current rzip version in portage should at least be marked as unstable, or even better, masked completely, and a version with this reworked patch should be added to the tree.
@Stupid Bugzilla: thank you for both of your patches! With them I could at least recover everything from the broken archives.
@Jeremy Olexa: for big files, how about something like
dd if=/dev/zero of=/some/place/with/space/testfile bs=1024k count=5024
to get a file of roughly 5GiB?
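A test file like this could be combined with the MD5 comparison from the original report into a small round-trip check. The following is only a sketch: the helper names md5_of and roundtrip_ok are hypothetical, the compressor is passed in as a parameter, and it assumes the tool replaces the input file with "FILE<suffix>" on compression and restores it on decompression.

```shell
# Print only the MD5 hex digest of a file.
md5_of() {
    md5sum "$1" | cut -d' ' -f1
}

# Hypothetical helper: compress FILE in place, decompress it again and
# compare the checksums taken before and after.
# Arguments: FILE COMPRESS_CMD DECOMPRESS_CMD SUFFIX
# Returns 0 if the data survived, 1 on a mismatch, 2 if a command failed.
roundtrip_ok() {
    file=$1 compress=$2 decompress=$3 suffix=$4
    before=$(md5_of "$file") || return 2
    $compress "$file" || return 2
    $decompress "$file$suffix" || return 2
    [ "$before" = "$(md5_of "$file")" ]
}

# Usage sketch (needs several GB of free space, so it is left commented out):
#   dd if=/dev/zero of=testfile bs=1024k count=5024
#   roundtrip_ok testfile "rzip -6" "rzip -d" .rz && echo OK || echo BROKEN
```

Because the failure reported here is silent, the check compares checksums and does not rely on the exit status of the decompression step alone. An all-zero file mainly exercises the size handling, so a real large file is probably the better test where one is available.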
what's the status here? is the problem solved, for current in-tree versions of rzip?
Gentoo's rzip creates broken archives; I've just verified this with my own testing. Compressing with gentoo's version of rzip results in files that silently fail to decompress, in my case creating a 900 MB file when it should have created a 69 GB file. rzip is marked as stable, on amd64 at least. rzip should be masked as soon as possible.
i have masked app-arch/rzip. it is my understanding that one can use app-arch/lrzip on rzip compressed archives - is that correct? if so, we probably should just get rid of rzip and migrate people to lrzip... feedback welcome. thanks.
Why get rid of rzip? The upstream version works just fine. Please remove the gentoo-specific patch rzip-2.1-darwin.patch (or replace it with the one posted here) and set it stable for x86/amd64.
i have committed rzip-2.1-r1 with the updated darwin.patch and rzip-2.1-r2 with the rzip-handle-broken.patch on top of that. please test, test, test.
it would be awesome, if somebody could post some of their experiences with the unstable ebuilds here (-r1 and -r2). specifically i'd like to know if they work on existing files.
I've tested app-arch/rzip-2.1-r2 to unzip a broken archive created by app-arch/rzip-2.1. With app-arch/rzip-2.1-r2, runzip without any command line options produced a corrupted file, but "runzip -l 64" correctly uncompressed the file, with the following warning:
Warning: The uncompressed size does not equal the expected file size. However if you used the -l option, this may be okay.
So I believe that app-arch/rzip-2.1-r2 is behaving as expected on broken archives.
My only comments are that:
- the description of the -l option is perhaps not as clear as it should be. It is described as "-l nr set higher bits of the expected file length to nr". I don't think this makes it clear that the sole purpose of this option is to uncompress archives created by gentoo's older, broken rzip.
- Is the -l option something that gentoo wants to maintain indefinitely? Upstream seems dead, but if there's another major release of rzip it may be difficult to port the -l option to the newer version.
I've also tested app-arch/rzip-2.1-r2 to compress and decompress the 69 GB file that app-arch/rzip-2.1 failed on. app-arch/rzip-2.1-r2 was able to compress and decompress the file with no special command line options. So it looks like app-arch/rzip-2.1-r2 is working properly.
(In reply to comment #18)
> My only comments are that:
> - the description of the -l option is perhaps not as clear as it should be. It
> is described as "-l nr set higher bits of the expected file length to nr". I
> don't think this makes it clear that the sole purpose of this option is to
> uncompress archives created by gentoo's older, broken rzip.
> - Is the -l option something that gentoo wants to maintain indefinitely?
> Upstream seems dead, but if there's another major release of rzip it may be
> difficult to port the -l option to the newer version.
You're right. The explanation is probably too technical, and it wasn't meant to be in the official ebuild. But how about adding a warning to gentoo's rzip that is printed when decompressing? Something like
"Warning: Gentoo shipped a broken rzip for quite some time. During compression it didn't set the right file size, so if you have any reason to believe that your archive was compressed with an old Gentoo rzip, please refer to http://bugs.gentoo.org/show_bug.cgi?id=217552 for a patch to rescue your data. We apologize for the inconvenience."
Fixed version was already stabilized