- Install Python >= 3.4:
- Download the installer from https://www.python.org/getit/ and double-click it to run.
- Select "Add Python to PATH", then click "Install Now".
- Click "Next" or "OK" to finish the installation.
- Firefox Driver:
- Download the Firefox browser from https://www.mozilla.org/en-US/firefox/new/ and install it.
- Unzip the geckodriver folder.
- Now add GeckoDriver to the Windows PATH:
- Press the Windows key, type "Edit the system environment variables" and hit Enter. In the Advanced tab, choose Environment Variables.
- Under System Variables, find Path and double-click it to edit. If you are using Windows XP, type ";" (don't forget the semicolon) before the new path. For example, my directory is "E:\SECGOV", so I need to add ";E:\SECGOV".
- In the Edit environment variable window, press Browse... and choose the folder where you unzipped GeckoDriver.
- Hit "Enter" to finish the procedure.
- Install wkhtmltopdf:
- Run wkhtmltox-0.12.5-1.msvc2015-win64.exe
- Note the install path of the program, usually C:/Program Files/wkhtmltopdf/bin
- Add the wkhtmltopdf path to the System Variables Path, as in the second step (Firefox Driver).
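
To confirm that the PATH steps above took effect, a quick check like the one below can help. This is only a sketch: the executable names `python`, `geckodriver` and `wkhtmltopdf` are assumed from the install steps, and the helper `find_tools` is a hypothetical name, not part of this project. Open a new Command Prompt after editing PATH, otherwise the change will not be visible.

```python
import shutil

# Executable names assumed from the install steps above (not taken from the project code).
REQUIRED_TOOLS = ["python", "geckodriver", "wkhtmltopdf"]


def find_tools(names=REQUIRED_TOOLS):
    """Return a dict mapping each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If a tool is reported as not found, re-check the folder you added to Path in the steps above.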
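- Columns with "exact": count only whole-word matches of the words in "listofword.txt". Upper and lower case are treated the same. "retirement" does not match "postretirement".
- Columns without "exact": a word is also counted when it appears inside a longer word, so "postretirement" and "retirement" each count as 1 match for "retirement".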
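
The difference between the two kinds of columns can be illustrated with a small sketch. This is not the tool's actual code: the function names `count_exact` and `count_substring` are hypothetical, and using a regex word boundary for the "exact" case is an assumption about how whole words are detected.

```python
import re


def count_exact(text, word):
    """Count case-insensitive whole-word matches (assumed behaviour of the "exact" columns)."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def count_substring(text, word):
    """Count case-insensitive matches anywhere, including inside longer words."""
    return len(re.findall(re.escape(word), text, flags=re.IGNORECASE))


sample = "Retirement plans and postretirement benefits."
print(count_exact(sample, "retirement"))      # 1: only the standalone word
print(count_substring(sample, "retirement"))  # 2: also inside "postretirement"
```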
- install.bat: installs the needed libraries. If you see "Windows protected your PC", choose "More info" then "Run anyway".
- listofword.txt: define your search criteria
- Compustat.csv: please convert the Excel file to CSV.
- RUN.bat: Double-click to run this file.
- download: folder containing the downloaded PDF files
- log: log file. If there is a bug, please send the log file and a screenshot to me.
Note: If something interrupts the process, press "Ctrl + C" several times to terminate it.