A simple tool to redact information from a batch of pdfs 📄
Git is the tool you will need to download the software from this website. To check that git is installed, open the terminal and run
git --version
If not installed, Xcode will prompt the user to install git through the command line tools. Press Install and agree to license.
This download is ~2GB in size. It can always be uninstalled at the end by running:
sudo rm -rf /Library/Developer/CommandLineTools
Next let's clone this repo using git! When you run the below command, it will create a folder called pdf-redactor
with the code inside it. Run the command somewhere such as a folder called projects
.
git clone https://github.com/ryan-saffer/pdf-redactor.git
If the authenticity of github.com is challenged, and asked 'Are you sure you want to continue connecting' type 'yes'
Node is a javascript runtime, and is needed to execute the code. You can download node from its website. If you already have node installed, please ensure it is version 8 or above. To verify node is installed, run the following command:
node --version
First, we need to move into the project directory.
cd pdf-redactor
Then we install all the dependencies:
npm install
Inside the folder pdf-redactor
, create a file called .env
:
touch .env
Ensure you can see hidden files on a mac by pressing cmd + shift + .
Open this file, and add the following line, and save the document:
PDF_TRON_KEY=REPLACE_WITH_LICENSE_KEY
❗ License key purposely hidden. You can obtain a license key from PDFTron
Now we are ready to run the script. To run it, run:
npm start
When you run the script, it will go through every file in the data/pdfs
folder, and redact any term found in the data/data.csv
file.
Be sure not to remove the first line
term
in thedata.csv
file
Every row in the data.csv
file will be searched for and redacted individually. This means if you want to redact the following address:
75 Bourke St, Melbourne VIC 3000
It is better to put the following in the data.csv
file:
75 Bourke St
75 Bourke Street
75 Bourke
Melbourne
VIC
3000
If you have a large amount of Microsoft Word documents you need to first convert to PDF, you can use this automator task