As of December 6th, 2012, there is a new version of PDFtk. Updating the package to this version would help the projects we do with PDFtk, because it includes the stream parsing fix.
Reproducible: Didn't try
Created attachment 336344 [details]
Consider adding the attached patch to solve the bug outlined here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=687669
> The actual bug is in the method PRTokeniser#nextValidToken. This method has
> the feature that it treats an indirect object reference (such as 24 0 R) as
> a single token. Therefore when it sees a number, it has to look ahead to
> see if the number is actually the start of an indirect object reference.
> If, however, the object stream ends during this lookahead then it would
> wrongly fail. So it will go wrong whenever the last object in an object
> stream is a number. It is perhaps surprising that this causes so apparently
> few problems!
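To make the quoted description concrete, here is a minimal sketch of the lookahead problem. It is not the iText code: the function `tokenize`, its `lookahead_fails_at_end` flag, and the token tuples are all hypothetical names invented for illustration, and the "stream" is simplified to whitespace-separated words.

```python
def tokenize(data, lookahead_fails_at_end=False):
    """Split data into tokens, folding 'N G R' into a single reference token.

    lookahead_fails_at_end=True imitates the behaviour described in the
    Debian report: running out of input during the lookahead is an error.
    """
    words = data.split()
    tokens = []
    i = 0
    while i < len(words):
        word = words[i]
        if word.isdigit():
            # A number may start an indirect reference, so peek at two more words.
            ahead = words[i + 1:i + 3]
            if len(ahead) < 2 and lookahead_fails_at_end:
                raise EOFError("stream ended during reference lookahead")
            if len(ahead) == 2 and ahead[0].isdigit() and ahead[1] == "R":
                tokens.append(("ref", int(word), int(ahead[0])))
                i += 3
                continue
            # Not a reference (or not enough input left): it is a plain number.
            tokens.append(("num", int(word)))
        else:
            tokens.append(("other", word))
        i += 1
    return tokens
```

With the default flag, `tokenize("/Count 3")` returns `[("other", "/Count"), ("num", 3)]`. With `lookahead_fails_at_end=True` the same input raises `EOFError`, because the last item is a number and the lookahead runs off the end of the stream.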
Assembled a quick ebuild (new version + patch) and it works flawlessly.
I strongly suspect it violates some rules, as the $S variable has been set to a different directory (the new patch has to be applied in the parent directory). Compilation works via a horrible hack: cd pdftk; compile; cd ..
Created attachment 336346 [details]
Created attachment 340858 [details, diff]
Patch to create ebuild with all dependent patches.
Lots of activity upstream; now version 2.01.
2.01 - June 5, 2013
Fixed an uncompress bug introduced in 2.00 that corrupted some image streams.
Updated the Windows pdftk.exe compiler settings to remedy an elusive NullPointerException reported in the field. This problem first appeared in version 2.00.
2.00 - May 22, 2013
Added AES decryption of input PDFs. The 'owner' password is still required when decrypting any PDF.
Added merging of bookmarks/outlines when merging full PDFs.
Added new rotate operation, which is a convenient way of rotating select pages of a single PDF.
Added new dump_data_annots operation. Currently it reports only link annotation information.
Added new need_appearances output option. Use this when filling a form with non-ASCII text to ensure the best presentation in Adobe Reader/Acrobat. It won't work when combined with the flatten option.
Improved the compress option so that output PDFs are more compact and efficient.
Added page media information to dump_data output: page rotation, page media bounds and page crop bounds.
Improved the performance of dump_data so it works better with very large PDFs.
Improved the memory management in the Windows binary. This fixes the rare "Too many heap sections" error.
Fixed a bug where form fields with multiple values were not being properly reported by dump_data_fields.
Fixed a burst bug that was corrupting the output PDF pages.
Fixed an input bug to allow interactive prompting of both the user and owner passwords.
Fixed a burst bug so that doc_data.txt is now output to the same directory as the PDF's pages when an output directory is given.
Fixed a bug where indirect references to the PDF ID in the trailer would cause a crash.
Added a test to fill_form so it checks that an input PDF is a form before trying to fill it with data.
Added a return value of 3 for warnings 'PDF information not added' or 'PDF form not filled.'
Improved the error message for cat page range errors.
Fixed the error report when an input page number is out of range.
Fixed a burst bug where document metadata wasn't being copied properly to the output PDFs.
Updated the Bouncy Castle library to 1.48.
When using the cat operation, the output PDF version number is now set to the maximum PDF version of all of the input PDFs. If any of the input PDFs have PDF extension levels, then the greatest extension level is also copied to the output PDF.
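The version rule in the last changelog item can be sketched as follows. This only illustrates the rule as stated, not pdftk's implementation; `output_pdf_version` and `output_extension_level` are hypothetical helper names, versions are assumed to be strings like "1.5", and a missing extension level is assumed to be represented as `None`.

```python
def output_pdf_version(input_versions):
    """Return the greatest version among the inputs, e.g. '1.5'."""
    # Compare numerically, part by part, rather than as strings.
    parsed = [tuple(int(part) for part in v.split(".")) for v in input_versions]
    return ".".join(str(part) for part in max(parsed))


def output_extension_level(input_levels):
    """Return the greatest extension level, or None if no input has one."""
    present = [level for level in input_levels if level is not None]
    return max(present) if present else None
```

For example, `output_pdf_version(["1.3", "1.5", "1.4"])` gives `"1.5"`, and `output_extension_level([None, 3, 5])` gives `5`.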
Upgrading to 2.01 could also solve issues with parsing PDF v1.5 documents (currently worked around with \pdfminorversion=4). This also affects app-office/impressive.
This bug has never been clearly identified, but it stems from an outdated iText library in pdftk: http://forums.gentoo.org/viewtopic-t-956766-start-0.html
I wasn't able to determine which version of the iText library pdftk-2.01 is using, but it's a different one, so hopefully newer.
Created attachment 351518 [details, diff]
updated *FLAGS patch for 2.01
I have successfully compiled pdftk-2.01 using these steps:
- renamed pdftk-1.45.ebuild -> pdftk-2.01.ebuild
- updated *FLAGS patch (attached)
As for the ebuild and patches submitted by Mao PU:
- the InputError patch has been applied by upstream and is included in 2.01
- I'm not sure about the LDFLAGS and Makefile patch; I used the one made by Gentoo devs instead. Perhaps a Gentoo dev should look at both patches and consider a compromise.
- the nodrm patch is a nice find, but I'm not sure whether it will introduce a bug: if the PDF is encrypted, can pdftk actually read it, or will it just fail?
*** Bug 479976 has been marked as a duplicate of this bug. ***
+ 10 Aug 2013; Tom Wijsman <TomWij@gentoo.org> +files/pdftk-2.02-flags.patch,
+ Version bump to 2.02. Fixes bug #450080, reported by Adam Randall,
+ 7v5w7go9ub0o and Jeroen Roovers; applied patch by Ondrej Grover with small
+ fixes to make it work with 2.02 instead of 2.01 as well as to not turn
+ warnings into errors to avoid breakage without purpose.