Dynamic XFA flatten?
leviwilson opened this issue · 2 comments
Is there a way to "flatten" a PDF that has dynamic XFA form data? For our use case, we split the PDF into multiple pages so we can do some downstream processing. At this point, we do not need to manipulate any form data, so we just want to "flatten" the PDF with its values and save it as a normal PDF (not a dynamic XFA document). This way, our application can render the PDF in pdfjs.
Is there any way we can do this with combine_pdf? Apologies if the question is unclear, as I'm not 💯 familiar with the PDF formats.
In addition to this, is there a way to detect whether a file is a dynamic XFA form? It looked like the catalogs had an :AcroForm key, but the catalog doesn't look like it's publicly exposed, and I wasn't sure how I could reliably determine whether a file was one of these types.
Re: the 2nd question I had about detecting if it was an xfa_form?, I was curious whether something like this is sufficient:
pdf = CombinePDF.load(path)
!pdf.send(:get_existing_catalogs).dig(0, :AcroForm, :referenced_object, :XFA).nil? # => true if :XFA has a value
Hi @leviwilson,
Thank you for your question. I am sorry to say I don't have good news about form flattening / baking.
CombinePDF attempts to minimize the data it actually needs to parse. For this reason, PDF streams (the contents of the pages) are rarely - if ever - parsed. The closest CombinePDF comes to touching pre-existing content is by renaming the metadata to avoid name collisions when combining PDF files.
For this reason, at the moment, it is impossible to "flatten" PDF forms and make them properly immutable.
As for detecting a form, rebuilding the catalog object is okay but suboptimal, given that the catalog is built using the private @forms_data variable, which you could probably test for using the read-only attribute forms_data (see documentation here).
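A minimal sketch of that check, assuming forms_data returns the AcroForm dictionary as a Ruby Hash (possibly wrapped under :referenced_object, as in the snippet above). The helper name xfa_form? is hypothetical and not part of CombinePDF, and the check only tells you that an :XFA entry exists, not whether the form is dynamic or static:

```ruby
# Hypothetical helper (not part of CombinePDF): returns true when the
# given AcroForm dictionary carries an :XFA entry.
# Assumption: the dictionary is a plain Hash, optionally wrapped in an
# indirect reference stored under :referenced_object.
def xfa_form?(forms_data)
  return false unless forms_data.is_a?(Hash)

  # Unwrap the indirect reference when present; otherwise use the Hash itself.
  inner = forms_data[:referenced_object]
  dict = inner.is_a?(Hash) ? inner : forms_data

  !dict[:XFA].nil?
end

# Intended usage with the gem installed (left as a comment so the
# sketch runs without it):
#   pdf = CombinePDF.load(path)
#   xfa_form?(pdf.forms_data)
```

Going through the public reader avoids calling the private get_existing_catalogs method via send.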
Good luck!
Bo.