pdfkit generates PDFs with technical errors and incorrect PDF version
petervwyatt opened this issue · 0 comments
Bug Report
Generated PDFs often declare an incorrect PDF version and contain various other errors. This includes guide.pdf, out.pdf and kitchen-sink.pdf from the official documentation.
This causes PDFs to fail validation with various tools.
Description of the problem
pdfkit assumes PDF 1.3 which is a very old assumption given that PDF 1.3 is 24 years old and didn't even include basic transparency! Nowadays PDF 1.7 is a far more appropriate and safe assumption as it includes all image formats, transparency, cross-reference streams, object streams and encryption as well as everything in PDF 1.3.
Code sample
Attempt to validate guide.pdf, out.pdf or kitchen-sink.pdf using something like veraPDF Arlington validator or pdfcpu validate --mode=strict. These PDFs contain transparency-related features introduced in PDF 1.4 (e.g. SMask, ca/CA constant alpha, transparency groups).
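As a rough illustration of the mismatch, the sketch below compares the version in the file header against the transparency keys listed above. It is a heuristic and not a validator: the function names are made up for this example (they are not pdfkit API), it only sees uncompressed objects, and it ignores any /Version entry in the document catalog.

```javascript
// Heuristic sketch only. All names here are hypothetical, not pdfkit API.

// Keys named in this issue that were introduced in PDF 1.4.
const TRANSPARENCY_KEYS = ['SMask', 'ca', 'CA'];

// Returns the header version as an integer (13 for "%PDF-1.3"), or null.
function headerVersion(pdfText) {
  const match = /^%PDF-(\d)\.(\d)/.exec(pdfText);
  return match ? Number(match[1]) * 10 + Number(match[2]) : null;
}

// Returns the transparency keys that appear as complete PDF names.
function transparencyKeysUsed(pdfText) {
  return TRANSPARENCY_KEYS.filter((key) =>
    // A name ends at whitespace or a delimiter, so /ca does not match /calRGB.
    new RegExp('/' + key + '(?=[\\s/\\[\\]<>(){}%])').test(pdfText)
  );
}

// Flags files that use PDF 1.4 transparency keys under an older header.
function checkVersion(pdfText) {
  const version = headerVersion(pdfText);
  const keys = transparencyKeysUsed(pdfText);
  const consistent = keys.length === 0 || (version !== null && version >= 14);
  return { version, keys, consistent };
}

// Made-up sample: a 1.3 header followed by a graphics state using ca/CA.
const sample = '%PDF-1.3\n1 0 obj\n<< /Type /ExtGState /ca 0.5 /CA 1 >>\nendobj\n';
console.log(checkVersion(sample));
```

To try it on a real file, the bytes could be read with fs.readFileSync(path, 'latin1'). Expect false positives (for example, /CA also occurs in non-transparency dictionaries), so a proper validator remains the authority.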
PS. kitchen-sink-accessible.pdf is OK because it is PDF 1.7.
Also from the examples folder: attachment.pdf has incorrect /CheckSum (ee29c0b3de890dd1dc89f37471de2810). The MD5 string should be 16 bytes (as per spec) - if you output it as a hex-string rather than a literal string (so /CheckSum <ee29c0b3de890dd1dc89f37471de2810>) then it will be correct.
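To illustrate the length difference, here is a minimal Node.js sketch. The helper names are hypothetical (not part of pdfkit) and the input bytes are made up. It shows that the same MD5 digest decodes to 16 bytes when written as a hex string, but to 32 bytes when the hex text is wrapped in a literal string.

```javascript
const { createHash } = require('node:crypto');

// MD5 always yields a 16-byte digest (input here is arbitrary example data).
const digest = createHash('md5').update('example attachment bytes').digest();

// Hex string object: each pair of hex digits decodes to one byte.
function asHexString(bytes) {
  return '<' + bytes.toString('hex') + '>';
}

// Literal string object wrapping the hex text: every character is one byte.
function asLiteralOfHexText(bytes) {
  return '(' + bytes.toString('hex') + ')';
}

// Decoded length of a PDF string object (simplified: assumes no escape
// sequences in literal strings and an even number of hex digits).
function decodedLength(pdfString) {
  const body = pdfString.slice(1, -1);
  return pdfString.startsWith('<')
    ? Buffer.from(body, 'hex').length
    : Buffer.byteLength(body, 'latin1');
}

console.log(decodedLength(asHexString(digest)));        // 16
console.log(decodedLength(asLiteralOfHexText(digest))); // 32
```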
Solution
One option would be to add lots of scattered code to auto-check/upgrade options.pdfVersion any time a key from an incompatible feature is used. A far easier and more performant solution is to fix the default version to PDF 1.7 since it is 100% backward compatible (i.e. every PDF <1.7 is also valid as 1.7).