Dashlane/dbt-invoke

Feature: ability to customize output via templates

moltar opened this issue · 7 comments

It'd be nice to be able to customize the output template to set the defaults.

Right now the output looks like:

version: 2
models:
- name: users
  description: ''
  columns:
  - name: user_id
    description: ''
  - name: created_at
    description: ''

But I'd like it to be (to match our org style):

version: 2
models:
  - name: users

    description: |
      Description

    columns:

    - name: user_id
      description: |
        Description
      tests:
        - not_null

    - name: created_at
      description: |
        Description
      tests:
        - not_null

Maybe a project-level Jinja macro could be used?

I hear you on this. Customizable yaml formatting was something that was briefly mentioned during our initial development work.

The dumping to yaml happens here with PyYAML's yaml.safe_dump function.

There are some options that can be fed to that function. For example, we currently use the sort_keys=False option to list the columns in the data warehouse order rather than alphabetical order. I'm not sure the options are customizable enough to match your org style (although I'd be happy to be proven wrong here, because I haven't looked very deeply into it myself).
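As a rough illustration of how far the dump options go, here is a minimal sketch. It assumes PyYAML 5.1 or later (where sort_keys is available). The IndentedDumper subclass is a commonly used community workaround and not something dbt-invoke ships, and the sample data is made up to mirror the output shown earlier in this issue.

```python
import yaml

# Made-up sample data shaped like the properties file shown above.
data = {
    "version": 2,
    "models": [
        {
            "name": "users",
            "description": "",
            "columns": [
                {"name": "user_id", "description": ""},
                {"name": "created_at", "description": ""},
            ],
        }
    ],
}

# sort_keys=False keeps insertion order instead of sorting keys alphabetically.
default_output = yaml.safe_dump(data, sort_keys=False, default_flow_style=False)


class IndentedDumper(yaml.SafeDumper):
    """Hypothetical helper: indent list items under their parent key."""

    def increase_indent(self, flow=False, indentless=False):
        # Ignoring the indentless flag forces "  - name: ..." style lists.
        return super().increase_indent(flow, False)


indented_output = yaml.dump(
    data, Dumper=IndentedDumper, sort_keys=False, default_flow_style=False
)

print(default_output)
print(indented_output)
```

This gets the list indentation closer to the requested style, but I'm not aware of any dump option that inserts blank lines between keys or adds placeholder text, so those parts would still need something else.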

As for the project-level macro idea, how do you envision the implementation?

> As for the project-level macro idea, how do you envision the implementation?

Tbh, no idea. I am not familiar with dbt internals, nor Python. But I did see that dbt-invoke installs its own macro, which gave me the idea :)

I imagined it being a Jinja template that produces a file (YAML).
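To make the idea concrete, here is a minimal sketch of what such a template could look like, rendered with the jinja2 package directly. This is purely hypothetical: dbt-invoke has no such template hook, and the variable names model_name and columns are invented for the example.

```python
from jinja2 import Template

# Hypothetical template reproducing the org style from the issue description.
PROPERTIES_TEMPLATE = Template(
    "version: 2\n"
    "models:\n"
    "  - name: {{ model_name }}\n"
    "\n"
    "    description: |\n"
    "      Description\n"
    "\n"
    "    columns:\n"
    "{% for column in columns %}"
    "\n"
    "    - name: {{ column }}\n"
    "      description: |\n"
    "        Description\n"
    "      tests:\n"
    "        - not_null\n"
    "{% endfor %}"
)

# model_name and columns are assumed inputs; a real tool would read them
# from the data warehouse.
rendered = PROPERTIES_TEMPLATE.render(
    model_name="users", columns=["user_id", "created_at"]
)
print(rendered)
```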

Earlier I read this issue and thought about it for a while. My current opinion is that formatting yaml is a task worthy of a specialised package, and not necessarily the responsibility of this package or its developers. Imho, formatters exist to serve exactly this purpose (the VSCode formatter is superficial, though).

@Gregory108 but it's not just about formatting, it's about setting the defaults too.

@moltar What things are defaults but not formatting?

> @moltar What things are defaults but not formatting?

Please take a look at my example.

I have added:

  • Description pipe and placeholder text
  • Tests

@moltar
A default pipe and placeholder (other than '' or "") seem to me like extremely rare needs and not the task of this package (but I am not an author). Test presets are definitely not in the scope of the package, as they convey variable-specific logic.
Moreover, blanket testing for not_null is an extremely rare requirement. 99.99% of the domains I have met have NULL as an appropriate value in at least one, and usually most, variables.

Imo, the job of this package is to fill documentation with what the program can know from data+existing code.
What you ask for is either formatting or injecting customisations that are domain- and task-specific human knowledge. You can achieve this with simpler dedicated tools: a "replace all" after file generation. I'd wrap it into a VSCode macro if you have 30+ models, or write a script if there are hundreds of model docs to refactor.
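A minimal sketch of the "replace all" script approach, using only the Python standard library. The function name fill_descriptions and the placeholder text are assumptions made for this example. It only rewrites lines that are exactly an empty description: '' entry, as in the generated output shown at the top of this issue.

```python
import re

# Matches a line that is exactly "<indent>description: ''".
EMPTY_DESCRIPTION = re.compile(
    r"^(?P<indent>[ \t]*)description: ''[ \t]*$", re.MULTILINE
)


def fill_descriptions(yaml_text: str, placeholder: str = "Description") -> str:
    """Replace empty descriptions with a block scalar holding a placeholder."""

    def replace(match: re.Match) -> str:
        indent = match.group("indent")
        # The placeholder sits two spaces deeper than the description key.
        return f"{indent}description: |\n{indent}  {placeholder}"

    return EMPTY_DESCRIPTION.sub(replace, yaml_text)


generated = (
    "version: 2\n"
    "models:\n"
    "- name: users\n"
    "  description: ''\n"
    "  columns:\n"
    "  - name: user_id\n"
    "    description: ''\n"
)

print(fill_descriptions(generated))
```

Running something like this over each generated file after dbt-invoke finishes keeps the customisation outside the package itself.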