This utility can "explode" large PDFs into numerous smaller fragments
(files) based on a CSV-formatted index file which defines the names
and starting/ending pages of each section fragment. The section
fragments are each extracted by invoking pdfjam, and the resulting
files are named in a way which makes their source document and
originating page numbers clear.
Additionally, LaTeX is used to
generate an index PDF which lists all the section fragments as
hyperlinks, for ease of navigation. This index is typically built
from the index.latex.erb template, but you can use any LaTeX template
written in eRuby (.erb) file syntax.
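
To illustrate how an eRuby LaTeX template can turn a list of fragments into a hyperlinked index, here is a minimal sketch using Ruby's standard ERB library. The Fragment struct, the inline TEMPLATE and the render_index helper are hypothetical stand-ins for illustration; they are not taken from index.latex.erb or build.rb, which may be organised quite differently.

```ruby
require 'erb'

# Hypothetical record for one extracted fragment (not the tool's real data model).
Fragment = Struct.new(:title, :file)

# Hypothetical inline template standing in for index.latex.erb.
# The "run:" link scheme is provided by the hyperref package.
TEMPLATE = <<~'LATEX'
  \documentclass{article}
  \usepackage{hyperref}
  \begin{document}
  \begin{itemize}
  <% fragments.each do |f| -%>
    \item \href{run:<%= f.file %>}{<%= f.title %>}
  <% end -%>
  \end{itemize}
  \end{document}
LATEX

# Render the template with the given fragments and return the LaTeX source.
def render_index(fragments)
  ERB.new(TEMPLATE, trim_mode: '-').result_with_hash(fragments: fragments)
end

puts render_index([Fragment.new('Autumn Leaves', 'realbook-p039-040.pdf')])
```

The resulting LaTeX source would still need to be compiled (for example with pdflatex) to produce the index PDF, and titles containing LaTeX special characters would need escaping first; neither step is handled in this sketch.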
./bin/build.rb CSV LATEX-TEMPLATE PDF-DIR INDEX-DIR OUTPUT-DIR
See https://github.com/aspiers/book-indices for an example of how to write the CSV index files.
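
As a rough sketch of the overall idea, the following Ruby snippet reads index rows and builds one pdfjam command per fragment. The column layout (title, first page, last page), the output naming scheme and the pdfjam_command helper are all assumptions made for illustration; consult the repository above for the real CSV format, and build.rb for the actual invocation.

```ruby
require 'csv'

# Hypothetical CSV layout: title,first_page,last_page.
# The real index files may use different columns.
SAMPLE_CSV = <<~CSV
  Autumn Leaves,39,40
  Blue Bossa,51,51
CSV

# Build (but do not run) a pdfjam command that extracts one page range.
# The output name encodes the source document and page numbers; this
# particular naming scheme is an assumption, not the tool's real one.
def pdfjam_command(source_pdf, first_page, last_page, output_dir)
  base = File.basename(source_pdf, '.pdf')
  name = format('%s-p%03d-%03d.pdf', base, first_page, last_page)
  outfile = File.join(output_dir, name)
  ['pdfjam', '--outfile', outfile, source_pdf, "#{first_page}-#{last_page}"]
end

commands = CSV.parse(SAMPLE_CSV).map do |_title, first, last|
  pdfjam_command('realbook.pdf', Integer(first), Integer(last), 'out')
end

commands.each { |cmd| puts cmd.join(' ') }
```

Each command is kept as an array so that it could later be passed to system(*cmd) without shell quoting problems.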
- https://git.zx2c4.com/realbook-splitter/tree/ (Python script)
- http://www.pdfsam.org/pdfsam-basic/ (cross-platform)
- https://github.com/trevorprinn/RealBookExtractor/wiki (Windows only)
Please edit this file and then submit a pull request if you know of any other similar software - thanks!