Code to help convert the WA State Library ILS Reentry (https://wiki.sos.wa.gov/ILSRe-entry/) wiki resources to run offline
The Washington State Library has put together a great wiki full or resources that can be used for reentry by inmates. Due to a lack of internet access by inmates, this is an attempt to automate the conversion of the wiki to run fully offline in a prison environment.
Tested with MediaWiki 1.24. MediaWiki 1.3+ has known changes to the database structure that this script is not currently set up to handle.
- Downloads PDF files that are linked externally and makes a copy on the local server.
- Pulls external HTML pages and renders them as a PDF file on the local server.
- Updates links in the local database to point to offline files.
- Saves changed files only, saving processing time
- Multithreaded (default 10 threads) - 1200 links downloading/converted in 12 minutes vs 3 hours
- Windows - running under XAMPP with MediaWiki 1.4. Good for putting on WSL laptops.
- Planned (not available yet) - Run as a docker container to become part of the OPE project.
To run the python script which will download/fix links, you need python 3.6.? installed (python.org). Look at modules.txt in this folder for the modules you need to install. In addition, you will need a local copy of your own MediaWiki installation.
- Install python modules (e.g. pip install pdfkit)
- Rename
config-dist.py
toconfig.py
and modify with local configuration - Using PowerShell (may require as Administrator), type the command
python find_links.py