Semi-automatic pdftk command generator
Assemble pages from the input PDFs to create a new PDF. pdftk can merge pages from several PDFs or split the pages of a PDF into separate documents. The order of the pages in the new PDF is specified by the order of the given page ranges. The page ranges are created according to the regular expression you supply.
Suppose you have a 6-page scanned document with OCR text and your regular expression searches for dates. If a date is found on page 1 and the next different date is found on page 4, the list of commands generated by the program would look like this:
pdftk my_original.pdf cat 1-3 output a_date1_my_original.pdf
pdftk my_original.pdf cat 4-6 output a_date2_my_original.pdf
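The grouping described above can be sketched in Ruby roughly as follows. This is a minimal illustration, not the code in pcg.rb: the method name `pdftk_commands`, the output file naming, and the choice to skip pages before the first match are all assumptions. It works on an array of page texts; in practice those could come from the pdf-reader gem, for example `PDF::Reader.new(path).pages.map(&:text)`.

```ruby
# Hypothetical helper, not taken from pcg.rb.
# page_texts: one string per page, in page order.
# regex:      pattern whose first match on a page marks the start of a range.
# input_name: file name used in the generated pdftk commands.
def pdftk_commands(page_texts, regex, input_name)
  ranges = [] # each entry is [matched_text, first_page, last_page]

  page_texts.each_with_index do |text, index|
    page = index + 1
    match = text[regex] # matched string, or nil when the page has no match

    if match && (ranges.empty? || ranges.last[0] != match)
      ranges << [match, page, page] # a new value opens a new range
    elsif !ranges.empty?
      ranges.last[2] = page         # otherwise extend the current range
    end
    # Assumption: pages before the first match are skipped.
  end

  ranges.map do |match, first, last|
    safe = match.gsub(/[^\w.-]/, "_") # keep the output file name shell-friendly
    "pdftk #{input_name} cat #{first}-#{last} output #{safe}_#{input_name}"
  end
end

pages = [
  "Invoice 2020-01-15", "continued", "continued",
  "Invoice 2020-02-03", "continued", "end"
]
puts pdftk_commands(pages, /\d{4}-\d{2}-\d{2}/, "my_original.pdf")
```

With these six sample pages the sketch yields one command for pages 1-3 and one for pages 4-6, matching the example above except for the illustrative output names.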
Tested on macOS Catalina and Linux Ubuntu 20.04.
Required gems:
pastel, pdf-reader
If you are using macOS, you can install the gems one at a time like this:
sudo gem install pastel
sudo gem install pdf-reader
Open the Terminal app or another console and run:
ruby pcg.rb /your/input_folder /your/output_folder 'YOUR REGEX'
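The regular expression is passed as a single quoted argument so the shell does not interpret characters such as `\`, `*` or `|`. As a rough sketch of how three arguments like these could be read in Ruby (the method name `parse_args` and the returned hash keys are hypothetical; pcg.rb may handle its arguments differently):

```ruby
# Hypothetical argument handling; pcg.rb may do this differently.
def parse_args(argv)
  unless argv.length == 3
    raise ArgumentError, "usage: ruby pcg.rb INPUT_FOLDER OUTPUT_FOLDER 'REGEX'"
  end

  input_folder, output_folder, pattern = argv
  # Regexp.new compiles the pattern string; it raises RegexpError if invalid.
  { input: input_folder, output: output_folder, regex: Regexp.new(pattern) }
end

opts = parse_args(["/your/input_folder", "/your/output_folder", '\d{2}/\d{2}/\d{4}'])
puts opts[:regex].match?("Issued on 12/03/2020")
```

In a script, `ARGV` would be passed in place of the literal array.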
- ruby 2.6.3p62 (2019-04-16 revision 67580) [universal.x86_64-darwin19]
- Jonathan Burgos Saldivia - on GitHub - jonathanburgossaldivia
This project is licensed under the Eclipse Public License 2.0. See the LICENSE.md file for details.