Compiles the Dataform graph and executes SQLFluff on the compiled SQL files, with the option to get a diff-quality score.
This was a 2am thought and a Sunday-afternoon project while unbelievably hungover so there's some work to do (it's one big file)
py fluffer.py --mode=[fluff,diff] --format=[human,github-annotations etc]
Or, for the lols: make fluffer
Requires:
- Node (
npm i
) - Dataform CLI (
npm install -g @dataform/cli
) - Python 3.11
Yep it's primarily a Node project with a Python script. SQLFluff doesn't support Dataform and the Dataform built in-linter kind of sucks.
It uses sql-formatter
, which is great, but the rules are hardcoded into dataform/cli
and I don't want to do Google's work. It's also not as good as sqlfluff.
Well...
We want to commit /queries in, because the benefits of doing so are:
- Comparing branch sql to main and diffing
- Being able to use the compilation elsewhere (Sonarcloud, other static analysers)
The security downside is the PROJECT_ID, but that is in dataform.json
anyway. When we move to DF 3.0 and release configs via Terraform, we can set dataform.json
as some base values and override via TF..
100%.
Yeah its great. It will struggle on JS-inserted code by Dataform, because Dataform doesn't do any indentions. Easy to fix - we can just put a method into Storm that does lpad and sql-formatter in one, which isn't the worst idea in the world.
So we could try it on:
- BDP Utils (most programmatic SQL)
- BDP Core (most complex JS)
- Sales Online (most complex SQL)
Kind of. We will be assessing the queries directory rather than /definitions, which is why I've put a relative link at the top of each file, so there's a bit of gymnastics to do.
TODO:
- Tests!
- Type checking etc
- Probably more than 1 file lol
- Speak to AB about private package store
- Requirements.txt probably isn't fully up to date