PDF::Reader::MalformedPDFError - after update to v2.10.0
kserhiyus opened this issue · 1 comments
Hello,
The gem is great and I'm quite a power user of it.
However, after updating from v2.9.2 to v2.10.0, some of my PDFs fail to be processed.
I did check those failing PDFs in several online validators and there were no issues found.
The error I get comes from here: https://github.com/yob/pdf-reader/blob/main/lib/pdf/reader/cid_widths.rb#L55
Full error trace:
/opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/cid_widths.rb:55:in `parse_second_form': CidWidths: 3 must be less than 3 (PDF::Reader::MalformedPDFError)
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/cid_widths.rb:37:in `parse_array'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/cid_widths.rb:22:in `initialize'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/width_calculator/composite.rb:17:in `new'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/width_calculator/composite.rb:17:in `initialize'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:146:in `new'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:146:in `build_width_calculator'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:49:in `initialize'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:214:in `new'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:214:in `block in extract_descendants'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:213:in `map'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:213:in `extract_descendants'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/font.rb:48:in `initialize'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:393:in `new'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:393:in `block in build_fonts'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:392:in `each'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:392:in `map'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:392:in `build_fonts'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page_state.rb:30:in `initialize'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/item_receiver.rb:21:in `new'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/item_receiver.rb:21:in `page='
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/validating_receiver.rb:258:in `call_wrapped'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/validating_receiver.rb:24:in `page='
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page.rb:268:in `block in callback'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page.rb:267:in `each'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page.rb:267:in `callback'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdf-reader-2.10.0/lib/pdf/reader/page.rb:158:in `walk'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/processor.rb:37:in `block in extract_analyze_merge'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/processor.rb:34:in `collect'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/processor.rb:34:in `extract_analyze_merge'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/lib/pdf/processor.rb:27:in `block in <class:Processor>'
from (eval):34:in `instance_exec'
from (eval):34:in `__dry_initializer_initialize__'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/dry-initializer-3.0.4/lib/dry/initializer/mixin/root.rb:7:in `initialize'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/bin/pdfcb:81:in `new'
from /opt/bitnami/ruby/lib/ruby/gems/2.6.0/gems/pdfcb-0.5.1/bin/pdfcb:81:in `<top (required)>'
from /opt/bitnami/ruby/bin/pdfcb:25:in `load'
from /opt/bitnami/ruby/bin/pdfcb:25:in `<main>'
In fact, in order to keep up with the update, I tweaked this line
raise MalformedPDFError, "CidWidths: #{first} must be less than #{final}" unless first < final
to be
raise MalformedPDFError, "CidWidths: #{first} must be less than #{final}" unless first <= final
and for me everything works as it did before the update.
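To illustrate why `first == final` looks legitimate, here is a minimal standalone sketch. As far as I understand the `first last width` form of a CID width array, a range whose two ends are equal simply covers a single CID (for example `3 3 500`). The method name `expand_cid_range` and the local `MalformedPDFError` class are hypothetical stand-ins for illustration, not the gem's actual code.

```ruby
# Hypothetical stand-in for PDF::Reader::MalformedPDFError,
# defined here only so the sketch runs without the gem.
class MalformedPDFError < StandardError; end

# Expand one "first last width" entry into a { cid => width } hash.
# Assumption: a range where first == final describes exactly one CID,
# so it is accepted; only first > final is treated as malformed.
def expand_cid_range(first, final, width)
  unless first <= final
    raise MalformedPDFError, "CidWidths: #{first} must not be greater than #{final}"
  end

  (first..final).each_with_object({}) { |cid, widths| widths[cid] = width }
end

single_cid = expand_cid_range(3, 3, 500) # the case from the error message above
three_cids = expand_cid_range(1, 3, 250) # an ordinary multi-CID range
```

With the strict `first < final` check, the `3 3 500` case is the one that raises "3 must be less than 3"; with `<=` it maps CID 3 to a single width.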
Here are the failing PDFs:
https://assets.publishing.service.gov.uk/media/5c640a8ded915d04148c31b0/Mr_J_Szymaniak_v_Jason_Hunt_and_Mardi_Hunt_trading_as_Crazy_Bear_Farm_and_Farm_Shop_-_3304471-2018.pdf
https://assets.publishing.service.gov.uk/media/5de917a2e5274a06d71f0413/Mr_J_Szymaniak_v_Jason_Hunt___Mardi_Hunt_TA_Crazy_Bear_Farm_and_Farm_Shop_-_3304471-2018_Judgment.pdf
Regards,
Serhii
Thanks for the clear bug report.
That particular raise was added between v2.9.2 and v2.10.0, so this sounds like a bug and I suspect your fix is what we need. Are you up for opening a PR? I'll get it merged.