ARCHIVED

This scraper has been merged into https://github.com/okfde/dokukratie/ and is maintained there.

memorious-sehrgutachten

Despite what the name suggests, this scraper is not technically based on https://sehrgutachten.de; it scrapes the website of the Bundestag directly.

It downloads the files and metadata into a local folder.

The STARTDATE and ENDDATE parameters must be set via environment variables:

STARTDATE=2021-05-01 ENDDATE=`date '+%Y-%m-%d'` memorious run sehrgutachten

If running locally, make sure the memorious config path environment variable is set as well:

MEMORIOUS_CONFIG_PATH=src
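
The date range and config path above can also be set from a small Python wrapper, which is handy for scheduled runs. This is only a sketch: the helper names `build_env` and `run_scraper` are hypothetical and not part of this repository, and it assumes the `memorious` command is on the PATH.

```python
import os
import subprocess
from datetime import date


def build_env(start, end, config_path="src", base_env=None):
    """Return a copy of the environment with the scraper's variables set."""
    if start > end:
        raise ValueError("STARTDATE must not be after ENDDATE")
    # Copy so the calling process's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["STARTDATE"] = start.isoformat()  # formatted as YYYY-MM-DD
    env["ENDDATE"] = end.isoformat()
    env["MEMORIOUS_CONFIG_PATH"] = config_path
    return env


def run_scraper(start, end, config_path="src"):
    """Invoke `memorious run sehrgutachten` for the given date range."""
    return subprocess.run(
        ["memorious", "run", "sehrgutachten"],
        env=build_env(start, end, config_path),
        check=True,  # raise CalledProcessError if memorious exits non-zero
    )


# Example call (not executed here):
# run_scraper(date(2021, 5, 1), date.today())
```

Keeping `build_env` separate from `run_scraper` makes it possible to check the variables without starting a crawl.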

git clone https://github.com/simonwoerpel/memorious-sehrgutachten.git
cd memorious-sehrgutachten
pip install -e .

All of the scraper logic lives in src/sehrgutachten.py and src/sehrgutachten.yml.

To run the scraper in production, use proper Redis and PostgreSQL instances.

Please refer to the official memorious documentation for details.