Experimental project on managing dbt models on Apache Airflow.
This project depends on Adobe Analytics data being loaded into BigQuery, as described at: https://github.com/konosp/adobe-clickstream-dbt
There are two sample files in the misc/ folder: profile-demo_sample.yml and service_account_key_sample.json. These files are needed for the dbt/BigQuery configuration and for the BigQuery service account that runs within the Docker container.
The files need to be renamed so that "_sample" is removed. For example:
- profile-demo_sample.yml -> profile-demo.yml
- service_account_key_sample.json -> service_account_key.json
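The renaming above can be scripted. The following is a minimal sketch, assuming it is run from the repository root; strip_sample_suffix is a hypothetical helper name, not part of the project. It copies rather than moves, so the original sample files stay in place.

```shell
# Hypothetical helper: copy every *_sample.* file in a directory
# to the same name with "_sample" removed.
strip_sample_suffix() {
  dir="${1:-misc}"
  for src in "$dir"/*_sample.*; do
    # Skip the unexpanded pattern when nothing matches.
    [ -e "$src" ] || continue
    base="$(basename "$src")"
    # profile-demo_sample.yml -> profile-demo.yml
    dest="$dir/${base%_sample.*}.${base##*.}"
    cp "$src" "$dest"
    echo "$src -> $dest"
  done
}

strip_sample_suffix misc
```

After running it, fill in profile-demo.yml and service_account_key.json with your own values before building the image.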
The Dockerfile then copies these files into the image with the following command:
COPY misc/ /project/misc/
For the Google Cloud service account to be able to run the jobs, it needs the following roles:
- BigQuery Data Editor
- BigQuery Job User
- BigQuery User
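One way to grant these is with the gcloud CLI. The sketch below assumes gcloud is installed and authenticated; grant_bigquery_roles is a hypothetical helper, and the project ID and service account email in the example call are placeholders. The role IDs are the standard identifiers for the three roles listed above.

```shell
# Hypothetical helper: bind the three BigQuery roles to a service account.
# GCLOUD can be overridden (e.g. GCLOUD=echo) to preview the commands.
grant_bigquery_roles() {
  project_id="$1"
  sa_email="$2"
  for role in roles/bigquery.dataEditor roles/bigquery.jobUser roles/bigquery.user; do
    ${GCLOUD:-gcloud} projects add-iam-policy-binding "$project_id" \
      --member="serviceAccount:$sa_email" \
      --role="$role"
  done
}

# Example call (placeholder values, replace with your own):
# grant_bigquery_roles my-project dbt-runner@my-project.iam.gserviceaccount.com
```

The roles can equally be assigned through the IAM page of the Google Cloud console.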
Build the Docker image, using 'dbt-airflow' as the image name:
docker build -t dbt-airflow .
Run the Docker container, exposing port 8080 and passing the dbt project URL as an argument:
docker run -it --rm -p 8080:8080 dbt-airflow https://github.com/konosp/adobe-clickstream-dbt.git