A small script that indexes PDF files. I was forced to write it after realizing that Adobe charges $300 per year for the same service.
To run:
python indexerRetro.py --parent [Abs path to parent dir of pdf] --file [pdf name] --output [file name]
The script then produces a text file listing every unique word in the PDF document, together with the pages on which each word appears.
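To make the output concrete, here is a minimal sketch of how such a word-to-page index could be built. It is not the code from indexerRetro.py: the function names (`parse_args`, `build_index`, `format_index`), the lowercase normalization, and the `word: pages` line format are all assumptions made for illustration. It also assumes the text of each page has already been extracted from the PDF by whatever library the script uses, so it only needs the standard library.

```python
import argparse
import re
from collections import defaultdict

# Assumption: a "word" is a run of letters, optionally joined by ' or -.
WORD_RE = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def parse_args(argv=None):
    """Parse the three flags shown in the run command above."""
    parser = argparse.ArgumentParser(description="Index the words of a PDF by page.")
    parser.add_argument("--parent", required=True, help="absolute path to the PDF's parent directory")
    parser.add_argument("--file", required=True, help="name of the PDF file")
    parser.add_argument("--output", required=True, help="name of the text file to write")
    return parser.parse_args(argv)


def build_index(pages):
    """Map each unique lowercase word to the sorted page numbers it appears on.

    `pages` is a list of page texts, first page first; pages are numbered from 1.
    """
    index = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        for word in WORD_RE.findall(text.lower()):
            index[word].add(page_number)
    # Sort words alphabetically and page numbers ascending for stable output.
    return {word: sorted(numbers) for word, numbers in sorted(index.items())}


def format_index(index):
    """Render the index as one 'word: page, page' line per word (hypothetical format)."""
    lines = [f"{word}: {', '.join(map(str, numbers))}" for word, numbers in index.items()]
    return "\n".join(lines) + "\n"


# Two made-up pages standing in for text extracted from a PDF.
sample_pages = ["The quick brown fox", "The lazy dog and the fox"]
print(format_index(build_index(sample_pages)), end="")
```

With the two sample pages, the sketch prints `fox: 1, 2` and `the: 1, 2` for the words shared by both pages, and a single page number for the rest. Using sets while collecting means a word repeated on the same page is recorded only once.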
- Adapted to Windows
- Adapted to other operating systems