get PDF original metadata
MathieuDerelle opened this issue · 1 comment
MathieuDerelle commented
Is there a way to get the original metadata of a PDF we are opening with your gem?
This line directly overwrites Producer:
combine_pdf/lib/combine_pdf/pdf_public.rb
Line 112 in 3226cf1
Could you expose the original metadata as original_info, or maybe expose the parser?
boazsegev commented
Hi @MathieuDerelle,
You could always use the CombinePDF::PDFParser class manually before creating a new CombinePDF::PDF object, allowing you to extract the information before it's overwritten.
As a quick sketch (untested):
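require 'combine_pdf'
# Read the file as binary and hand the raw bytes to the parser.
parser = CombinePDF::PDFParser.new(IO.read(file_name, mode: 'rb').force_encoding(Encoding::ASCII_8BIT))
parser.parse # assumption: info_object is only populated once the data has been parsed
info = parser.info_object.dup # copy taken before PDF.new overwrites :Producer
pdf = CombinePDF::PDF.new(parser)
puts info[:Producer] # => should contain the original producer value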
However, the producer should be updated if you save the PDF using CombinePDF. This makes it easier to track issues with PDF formatting.
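To illustrate why the copy has to be taken before the overwrite happens, here is a minimal sketch in plain Ruby. It makes no CombinePDF calls: the hash and its values are hypothetical stand-ins for the parser's info object, and the reassignment only simulates the later overwrite of Producer.

```ruby
# Hypothetical stand-in for the parser's info hash (not real CombinePDF data).
original_info = { Producer: 'Example Producer 1.0', Title: 'Sample' }

# Take a shallow copy before anything gets a chance to overwrite the hash.
snapshot = original_info.dup

# Simulate a later overwrite of :Producer on the original hash.
original_info[:Producer] = 'Some Other Library'

puts snapshot[:Producer]      # the copy still holds the original value
puts original_info[:Producer] # the original hash now holds the new value
```

Note that dup is a shallow copy: it protects against keys being reassigned or deleted on the original hash, but the values themselves are still shared, so a string modified in place would change in both.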
Good luck!
Bo.