Validate Data Quality in PostgreSQL Using SODA 🥤
- Create and activate Python venv
python3 -m venv venv_name
source venv_name/bin/activate
- clone the repo and navigate into it
pip install -r requirements.txt
- soda-core
- soda-bigquery
pip install -i https://pypi.cloud.soda.io soda-postgres
- create an account in Soda Cloud and create an API key from the profile section
- save the API key
- configuration
- rename the sample_configuration.yml file to configuration.yml
- configuration.yml
- update the postgres config
- update the soda_cloud section with your Soda API key
- run this command to check the config and the DB connection
soda test-connection -d my_postgres_source -c configuration.yml -V
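As a rough sketch of what the edited configuration.yml might look like: the data source name my_postgres_source matches the -d argument in the command above, while the host, port, database, schema, and the environment variable names are placeholders assumed for illustration. The repo's sample_configuration.yml may use different keys or values, so treat it as the reference.

```yaml
# Hypothetical example; replace every value with your own settings.
data_source my_postgres_source:
  type: postgres
  host: localhost                    # assumed: a local Postgres instance
  port: 5432
  username: ${POSTGRES_USERNAME}     # assumed env var names
  password: ${POSTGRES_PASSWORD}
  database: postgres
  schema: public

soda_cloud:
  host: cloud.soda.io                # assumed region; yours may differ
  api_key_id: ${SODA_API_KEY_ID}     # from the API key you saved earlier
  api_key_secret: ${SODA_API_KEY_SECRET}
```

Reading credentials from environment variables keeps them out of the file itself, which helps if configuration.yml ever ends up in version control.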
- checks.yml
- update the dataset_name that follows "checks for" to match a dataset (table) in your postgres schema
- run the following command (rerun it every time you update the checks.yml file)
soda scan -d my_postgres_source -c configuration.yml checks.yml
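For orientation, a minimal checks.yml could look something like the sketch below. Here dataset_name and column_name are placeholders, not names from this repo; swap in a real table and column from your schema. The three checks are only examples of common SodaCL metrics and are not necessarily the ones shipped in the repo's checks.yml.

```yaml
# Hypothetical example; dataset_name and column_name are placeholders.
checks for dataset_name:
  - row_count > 0                        # the table is not empty
  - missing_count(column_name) = 0       # no NULLs in the column
  - duplicate_count(column_name) = 0     # values in the column are unique
```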
- Go to your Soda Cloud account and check the dashboard
- DONE 🎯