Welcome to your new dbt project!

There is now an Exploration article on Medium, and a more in-depth Lateral joins article, also on Medium.

schema-ingestor-article (In Progress)

Demo code for an article. Download the files of the IMDB NONCOMMERCIAL DATASET into an imdb_files folder in the parent directory of the repository.
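A quick way to confirm the download landed in the right place is a small check script. This is a minimal sketch, not part of the project: the function name `missing_imdb_files` is hypothetical, and the file names are an assumption based on what the IMDb dataset page publishes. Adjust the list (for example to unpacked `.tsv` names) to whatever the models in this repository actually read.

```python
from pathlib import Path

# Assumption: archive names as published on the IMDb dataset page.
# Edit this list if the project expects unpacked or differently named files.
EXPECTED_FILES = [
    "name.basics.tsv.gz",
    "title.akas.tsv.gz",
    "title.basics.tsv.gz",
    "title.crew.tsv.gz",
    "title.episode.tsv.gz",
    "title.principals.tsv.gz",
    "title.ratings.tsv.gz",
]


def missing_imdb_files(folder, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    # Assumes the script is started from the repository root, so the
    # imdb_files folder sits one level up.
    target = Path.cwd().parent / "imdb_files"
    missing = missing_imdb_files(target)
    if missing:
        print(f"Missing in {target}: {', '.join(missing)}")
    else:
        print(f"All expected files found in {target}")
```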

Also install the dependencies from requirements.txt in a virtual environment. In my case (on Windows):

python.exe -m venv venv

venv\Scripts\activate.bat

pip3 install -r requirements.txt

Because storage is limited, the cleansed (or original) datasets are materialized as views.

Using the starter project

Try running the following commands:

  • rm -rf database_files/dev.duckdb
  • dbt clean
  • dbt deps
  • dbt seed
  • dbt run
  • dbt test
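The setup commands earlier in this README are for Windows, where `rm -rf` is not available in the default shell. The same sequence can be sketched as a small Python helper. This is an illustration under assumptions, not a script shipped with the project: the name `rebuild` is hypothetical, and it assumes `dbt` is on the PATH of the active virtual environment and that it is called from the repository root.

```python
import subprocess
from pathlib import Path

# Paths and steps are taken from the command list above.
DB_FILE = Path("database_files/dev.duckdb")
DBT_STEPS = ["clean", "deps", "seed", "run", "test"]


def rebuild(run=subprocess.run, db_file=DB_FILE):
    """Delete the DuckDB file, then run the dbt steps in order."""
    # Portable stand-in for `rm -rf database_files/dev.duckdb`
    # (missing_ok requires Python 3.8 or newer).
    Path(db_file).unlink(missing_ok=True)
    for step in DBT_STEPS:
        # check=True raises CalledProcessError on a non-zero exit code,
        # so the sequence stops at the first failing command.
        run(["dbt", step], check=True)
```

Calling `rebuild()` from the repository root runs the full sequence. The `run` parameter exists only so the helper can be exercised without invoking dbt.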

Profiling

There is a Jupyter notebook for profiling with JupySQL.

To profile with soda-core, run the following scan if the existing JSON results do not meet your needs:

  • soda scan -d imdb_dataset -c configuration.yaml checks.yaml -V -srf soda_scan.json
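Once the scan has written soda_scan.json, the results can be summarized with a few lines of Python. This is a hedged sketch: the function name `summarize_scan` is hypothetical, and the layout it relies on (a top-level `checks` list whose entries carry an `outcome` field) is an assumption about the file, so inspect your own soda_scan.json before relying on it.

```python
import json
from collections import Counter


def summarize_scan(path="soda_scan.json"):
    """Count check outcomes in a soda scan results file."""
    with open(path, encoding="utf-8") as fh:
        results = json.load(fh)
    # Assumption: checks are stored under a top-level "checks" list and
    # each entry has an "outcome" field. Missing keys fall back safely.
    checks = results.get("checks", [])
    return Counter(str(check.get("outcome", "unknown")) for check in checks)


if __name__ == "__main__":
    try:
        for outcome, count in summarize_scan().items():
            print(f"{outcome}: {count}")
    except FileNotFoundError:
        print("soda_scan.json not found; run the soda scan first")
```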


Resources:

  • Learn more about dbt in the docs
  • Check out Discourse for commonly asked questions and answers
  • Join the chat on Slack for live discussions and support
  • Find dbt events near you
  • Check out the blog for the latest news on dbt's development and best practices