opensanctions data pipeline
The codebase for OpenSanctions, an open-source database of sanctions data, politically exposed persons, and other entities of interest. This repository contains the code used to parse, clean, and deduplicate source data and build the combined database.
OpenSanctions uses Follow the Money (FtM), a JSON-based anti-corruption data model, as a common target for all crawlers. Additional exports into CSV and JSON formats are planned.
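As a rough illustration of what a JSON-based entity in this data model can look like, the sketch below builds one record as a plain Python dictionary and round-trips it through JSON. It assumes the usual Follow the Money layout of an `id`, a `schema` name, and multi-valued `properties`. The ID, the values, and the `first_value` helper are hypothetical and are not taken from this repository.

```python
import json

# Hypothetical example entity: the ID and all values are made up.
# Assumed layout: an "id", a "schema" name, and "properties" where
# every property maps to a list of string values.
entity = {
    "id": "example-person-1",
    "schema": "Person",
    "properties": {
        "name": ["Jane Example"],
        "nationality": ["de"],
        "birthDate": ["1970-01-01"],
    },
}


def first_value(entity, prop):
    """Return the first value of a property, or None if it is absent."""
    values = entity.get("properties", {}).get(prop, [])
    return values[0] if values else None


# Serialize to JSON and read it back, as an export and import would.
serialized = json.dumps(entity, sort_keys=True)
restored = json.loads(serialized)

print(first_value(restored, "name"))
```

Keeping every property as a list means a crawler can record several names or nationalities for the same entity without changing the record's shape.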
- opensanctions.org
- Technical documentation (readthedocs.org)
- Data sources roadmap
- Data licensing
- Contact us
Technical overview
Repository layout:
- opensanctions/: Python project with data extraction and cleaning components
- docs/: Sphinx technical documentation
Related repositories:
- opensanctions/site: website for the OpenSanctions project, including TypeScript React components for rendering FtM data.
- opensanctions/yente: API matching and entity search service.
Daily data extraction and processing runs on GitHub Actions.
Licensing: code is MIT-licensed; content and data are licensed under Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0).