simonwoerpel
investigative data journalist | leak librarian | computer assisted investigative reporting | @investigativedata | ex @correctiv
@investigativedata Europe
Pinned Repositories
ftm-geocode
Batch parse and geocode addresses from followthemoney entities. Geocoding plain address strings works as well.
investigraph
ETL pipeline, graphical explorer and general toolbox for investigations with followthemoney data
dokukratie
Scraper for German democracy documents
farmsubsidy-store
Cleaning scripts and backend api for farmsubsidy website
about
information about me and the projects I am involved in
annotate-and-chill
collaboratively annotate documents without hassle and share annotations without servers and clouds
archieml-md-pagebuilder-py
eijc18
mmmeta
runpandarun
A simple Python interface for reproducible I/O workflows on tabular data, using pandas DataFrames and configured through YAML "playbooks".
simonwoerpel's Repositories
simonwoerpel/runpandarun
A simple Python interface for reproducible I/O workflows on tabular data, using pandas DataFrames and configured through YAML "playbooks".
simonwoerpel/annotate-and-chill
collaboratively annotate documents without hassle and share annotations without servers and clouds
simonwoerpel/mmmeta
simonwoerpel/farmsubsidy.org-next
The aim of farmsubsidy.org is to obtain detailed data relating to payments and recipients of farm subsidies in every EU member state and make this data available in a way that is useful to European citizens.
simonwoerpel/memorious-sehrgutachten
Scrape public documents of "Wissenschaftliche Dienste des Deutschen Bundestags" via memorious into aleph.
simonwoerpel/aleph
Search and browse documents and data; find the people and companies you look for.
simonwoerpel/bundestag-dip-memorious
simonwoerpel/db-speed-shell-scraper
simonwoerpel/kleineanfragen
Collecting kleine Anfragen (minor interpellations) from parliamentary documentation systems for easy searching and linking
simonwoerpel/opensanctions-aleph-import
simonwoerpel/parteien.medienrevolte.de
some nltk stuff, but not seriously...
simonwoerpel/parteien.medienrevolte.de-static
static output for https://parteien.medienrevolte.de
simonwoerpel/simonwoerpel
simonwoerpel/simonwoerpel.github.io
homepage / blog
simonwoerpel/about
information about me and the projects I am involved in
simonwoerpel/aleph-archive-rclone
scripts for incremental backup of aleph archive via rclone
simonwoerpel/alephclient
API client for Aleph, supports bulk entity and document upload.
simonwoerpel/bruecken.medienrevolte.de
http://bruecken.medienrevolte.de
simonwoerpel/clickhouse-sqlalchemy
ClickHouse dialect for SQLAlchemy
simonwoerpel/convert-document
A docker container for LibreOffice and unoconv, used to generate PDF files from office-type documents.
simonwoerpel/demo-week-2020
Demo website for the 21 projects of the 7th funding round of the Prototype Fund.
simonwoerpel/django-geogermany
Django App that provides models for German states, districts, municipalities and zipcodes
simonwoerpel/dotfiles
my dotfiles. vim, zsh, mutt & other stuff
simonwoerpel/farmsubsidy-sql-gui
simonwoerpel/ingest-file
Ingestors extract the contents of mixed unstructured documents into structured (followthemoney) data.
simonwoerpel/medienrevolte.de
jekyll-powered homepage
simonwoerpel/memorious
Distributed crawling framework for documents and structured data.
simonwoerpel/memorious-extended
simonwoerpel/pantomime
Python library for MIME type parsing, normalisation and grouping.
simonwoerpel/pubmed_parser
:clipboard: A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset