/snowflake-starter

A _simple_ starter template for Snowflake Cloud Data Platform

Primary LanguagePython

snowflake-starter

A starter template for Snowflake Cloud Database

  • Snowflake offers a 1 month free trial, and with this repo you should be able to get a sense for the basics of Snowflake within an hour.
  • This template will create the necessary ROLE, USER, DATABASE, SCHEMA & WAREHOUSE(s) necessary to get up and running with Snowflake:

snowflake.png

Strongly recommend taking a looking at managing Snowflake with Terraform once you're familiar with the basics.

Usage

Requirements

1. Deploy

Copy first_run.sql into a worksheet as in the screenshot below and Run All.

Or use the CLI

snowsql -c <your_connection_name> -f first_run.sql

snowflake.png

Infrastructure Details

The following is created, as described in first_run.sql

├── DATABASES
│   ├── RAW                     # This is the landing pad for everything extracted and loaded
│   └── ANALYTICS               # This database contains tables and views accessible to analysts and reporting
├── WAREHOUSES
│   ├── WAREHOUSE_INGEST        # Tools like Stitch will use this warehouse to perform loads of new data
│   ├── WAREHOUSE_TRANSFORM     # This is the warehouse that dataform/dbt will use to perform all data transformations
│   ├── WAREHOUSE_REPORT        # BI tools will connect to this warehouse to run analytical queries
├── ROLES
│   ├── ROLE_INGEST             # Give this role to your Extract/Load tools/scripts to load data
│   ├── ROLE_TRANSFORM          # Give this role to Dataform/dbt to transform data, and Data Engineers
│   ├── ROLE_REPORT             # Give this role to BI tools / Analysts to query analytics data
├── USERS
│   ├── USER_INGEST             # eg: Stitch User
│   ├── USER_TRANSFORM          # eg: Dataform User
│   ├── USER_REPORT             # eg: Looker user

2. Test

Use the test_permissions.sql SQL to:

  1. create a base table in the RAW database, load a test row using the ROLE_INGEST role
  2. create a new table and view in ANALYTICS using the ROLE_TRANSFORM role
  3. query that view using the ROLE_REPORT role

NB replace <USERNAME> in the file with your login name

Or use the CLI:

snowsql -c <your_connection_name> -f test_permissions.sql

3. JSON

JSON is very well handled in Snowflake, and worth a look. The test_json.sql file runs through the flattening of raw JSON into a table.

Or use the CLI:

snowsql -c <your_connection_name> -f test_json.sql
  • Key to note is the RECURSIVE=>TRUE flag

4. User-Defined Functions

UDF allow you to create functions in SQL or JavaScript. The test_udf.sql file runs through the creation and testing of a SQL and JavaScript UDF. See the docs for more

Or use the CLI:

snowsql -c <your_connection_name> -f test_udf.sql

5. Tear Down

The first_run_drop.sql file will drop all objects created by first_run.sql

Or use the CLI:

snowsql -c <your_connection_name> -f first_run_drop.sql

6. SnowSQL-CLI

If you want to do this more than once, the SnowSQL CLI is great.

git clone https://github.com/mattarderne/snowflake-starter.git
cd snowflake-starter
snowsql -c <your_connection_name> -f first_run.sql

7. End to End Test

If the following script runs without error, then that is an end to end test... it should take about a minute. (change the <placeholders> in the file)

sh tests/run.sh

Other things

SnowAlert

SnowAlert is a project maintained by Snowflake that provides some useful system monitoring features. I like to use some of the queries they have created to monitor cost spikes.

The snowAlert.sql creates the views and runs the queries necessary to get alerts. Running it daily in Dataform/dbt is a nice way to get custom alerts to unusual spikes

snowsql -c <your_connection_name> -f utils/snowAlert.sql -o friendly=false -o quiet=true

Snowflake Inspector

If you'd like to keep track of the evolution of your Snowflake Data Warehouse, snowflakeinspector is a great tool to do just that. Query your metadata and paste the results into their tool and you'll get a nice explorable visualisation as below:

snowflakeinspector.png

Thanks

TODO

TODO: Snowflake Inspector

  • Schema:
    • explore "analytics" database for primary keys, analyse for similarity, unnamed primary keys, variables etc
    • make suggestions
  • Costs
    • Explore query history and build a recommendation for query optimisation
    • Visualise