/dolthub-etl-jobs

ETL jobs, formerly maintained by DoltHub, that loaded public data into DoltHub.

Primary language: Python. License: Apache License 2.0.

DoltHub ETL Jobs

This package contains legacy code that ran in an Airflow pipeline to update public databases under the dolthub organization on DoltHub. It also stores some ad hoc scripts that were never meant to run continuously but were used to import data from its source. If you find an interesting database on DoltHub under the dolthub organization, you may find the code that created it here. If you are not sure where a database's code lives, just ask us on our Discord.

DoltHub, the company, has since focused on making Dolt a full-fledged version-controlled database, and we deprecated our Airflow instance. These scripts are kept here for posterity's sake.