newscrawl

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

Usage

Manually edit the config file, adding/removing sources and dates.

Install the required modules with pip:

pip install -r requirements.txt

Create the output and temporary-file directories specified in the config.json file, e.g.:

mkdir out tmp
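
If you would rather not keep the directory names in sync by hand, the same step can be scripted. The following is only a minimal sketch: the key names "output_dir" and "tmp_dir" and the helper ensure_dirs are hypothetical, since the actual layout of the config file is not documented here. Replace the keys with the ones your config file really uses.

```python
import json
from pathlib import Path


def ensure_dirs(config_path, keys=("output_dir", "tmp_dir")):
    """Create the directories named under `keys` in a JSON config file.

    The default key names are assumptions made for illustration; they are
    not taken from the newscrawl config format.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    created = []
    for key in keys:
        if key not in config:
            continue  # ignore keys the config does not define
        # Relative paths are interpreted against the current working
        # directory, the same way `mkdir out tmp` would treat them.
        directory = Path(config[key])
        # parents=True builds nested paths; exist_ok=True makes reruns safe.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created
```

Calling ensure_dirs("config.json") from the repository directory would then create whichever of those directories are listed, and it does nothing harmful if they already exist.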

Run the script:

./newscrawl.py