Feature: ability to customize output via templates
moltar opened this issue · 7 comments
It'd be nice to be able to customize the output template to set the defaults.
Right now the output looks like:
version: 2
models:
  - name: users
    description: ''
    columns:
      - name: user_id
        description: ''
      - name: created_at
        description: ''
But I'd like it to be (to match our org style):
version: 2
models:
  - name: users
    description: |
      Description
    columns:
      - name: user_id
        description: |
          Description
        tests:
          - not_null
      - name: created_at
        description: |
          Description
        tests:
          - not_null
Maybe a project-level Jinja macro could be used?
I hear you on this. Customizable yaml formatting was something that was briefly mentioned during our initial development work.
The dumping to YAML happens here with PyYAML's yaml.safe_dump function.
There are some options that can be fed to that function. For example, we currently use the sort_keys=False option to list the columns in data warehouse order rather than alphabetical order. I'm not sure the options are customizable enough to match your org style (although I'd be happy to be proven wrong here, because I haven't looked very deeply into it myself).
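For reference, here is a minimal sketch of what PyYAML itself allows. It is not dbt-invoke's code: the BlockStyleDumper class, the represent_str function, and the sample data are hypothetical names made up for illustration. It keeps sort_keys=False and registers a custom string representer so that multi-line strings are written in the literal "|" style.

```python
import yaml


class BlockStyleDumper(yaml.SafeDumper):
    """Hypothetical SafeDumper subclass, so the custom representer stays local."""


def represent_str(dumper, value):
    # Use the literal block style ("|") only for strings containing a newline.
    style = "|" if "\n" in value else None
    return dumper.represent_scalar("tag:yaml.org,2002:str", value, style=style)


BlockStyleDumper.add_representer(str, represent_str)

# Sample data made up for illustration; a trailing newline marks a block-style value.
properties = {
    "version": 2,
    "models": [
        {
            "name": "users",
            "description": "Description\n",
            "columns": [
                {"name": "user_id", "description": "Description\n", "tests": ["not_null"]},
            ],
        }
    ],
}

# sort_keys=False keeps insertion order instead of sorting keys alphabetically.
rendered = yaml.dump(properties, Dumper=BlockStyleDumper, sort_keys=False)
print(rendered)
```

This covers the description pipe, but the placeholder text and the tests still have to be present in the data before dumping, so the dump options alone do not set defaults.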
As for the project-level macro idea, how do you envision the implementation?
> As for the project-level macro idea, how do you envision the implementation?
Tbh, no idea. I am not familiar with dbt internals, nor with Python. But I did see that dbt-invoke installs its own macro, which gave me the idea :)
I imagined it being a Jinja template that produces a file (YAML).
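A rough sketch of that idea, written as standalone Python with the jinja2 library rather than as a dbt macro. Nothing here reflects how dbt-invoke works internally: the TEMPLATE text, the render_properties function, and its model and columns parameters are all assumptions for illustration.

```python
from jinja2 import Template

# Hypothetical org-style template; tags sit directly next to the text so that
# no stray blank lines end up in the rendered YAML.
TEMPLATE = Template(
    "version: 2\n"
    "models:\n"
    "  - name: {{ model }}\n"
    "    description: |\n"
    "      Description\n"
    "    columns:\n"
    "{% for column in columns %}"
    "      - name: {{ column }}\n"
    "        description: |\n"
    "          Description\n"
    "        tests:\n"
    "          - not_null\n"
    "{% endfor %}"
)


def render_properties(model, columns):
    """Render a properties file for one model and its column names."""
    return TEMPLATE.render(model=model, columns=columns)


print(render_properties("users", ["user_id", "created_at"]))
```

With a template like this, both the formatting and the defaults (placeholder text, tests) would live in one user-editable file.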
Earlier I read this issue and thought about it for a while. My current opinion is that formatting YAML is a task worthy of a specialised package, and not necessarily the responsibility of this package (or its developers). Imho, formatters exist to serve this purpose (the VSCode formatter is superficial, though).
@Gregory108 but it's not just about formatting, it's about setting the defaults too.
@moltar What things are defaults but not formatting?
> @moltar What things are defaults but not formatting?
Please take a look at my example.
I have added:
- Description pipe and placeholder text
- Tests
@moltar
A default pipe and placeholder (other than '' or "") seem to me like extremely rare needs and not the task of this package (but I am not an author). Test presets are definitely not in the scope of the package, as they convey variable-specific logic.
Moreover, blanket testing for not_null is an extremely rare requirement. In 99.99% of the domains I have encountered, NULL is an appropriate value in at least one and, usually, most variables.
Imo, the job of this package is to fill documentation with what the program can know from data+existing code.
What you ask for is either formatting or injecting customisations that are domain- and task-specific human knowledge. You can achieve this with simpler dedicated tools, such as "replace all" after file generation. I'd wrap it into a VSCode macro if you have 30+ models, or write a script if there are hundreds of model docs to refactor.
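Such a script could be as small as the following sketch, which uses only the Python standard library. The fill_empty_descriptions function and its placeholder parameter are hypothetical names, and it assumes the generated files write empty descriptions exactly as description: '' on their own line.

```python
import re

# Matches a line that consists only of an indented "description: ''".
EMPTY_DESCRIPTION = re.compile(r"^(?P<indent> *)description: ''[ \t]*$", re.MULTILINE)


def fill_empty_descriptions(text, placeholder="Description"):
    """Replace empty descriptions with a block scalar holding a placeholder."""

    def replace(match):
        indent = match.group("indent")
        # The placeholder is indented two spaces deeper than the key.
        return f"{indent}description: |\n{indent}  {placeholder}"

    return EMPTY_DESCRIPTION.sub(replace, text)


generated = (
    "version: 2\n"
    "models:\n"
    "  - name: users\n"
    "    description: ''\n"
    "    columns:\n"
    "      - name: user_id\n"
    "        description: ''\n"
)
print(fill_empty_descriptions(generated))
```

Descriptions that are already filled in are left untouched, so the function can be run repeatedly over the same files.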