Teradata/dbt-teradata

Adding support for Database in Manifest Json


Describe the feature

Users sometimes need a reference to the database in manifest.json so that they can feed the manifest into third-party tools.
Teradata does not use the database category, since its equivalent is the schema, so the database is populated as null in the manifest.
The request is that when a database is defined in the configuration, it is populated in the manifest. This is for documentation purposes only.
For generating database artifacts, only the schema would remain relevant, as it is today.

Describe alternatives you've considered

Not applicable.

Additional context

This relates only to documentation and the manifest file.

Who will this benefit?

Users who feed the manifest into third-party tools.

Are you interested in contributing this feature?

I could.

I had this issue, posted about it in the chat in Slack, and was never able to get it working.

https://getdbt.slack.com/archives/C027B6BHMT3/p1680535202992639

Here was the response from the Teradata person in the chat:

Hi there, since Teradata doesn’t implement the database -> schema hierarchy and has only databases instead, we had to decide how to map the database -> schema hierarchy to the Teradata world. The decision was to use the schema (as it’s part of the fully qualified relation name in dbt). When you configure dbt-teradata, you have two options:
1. Leave the database field blank (translates to null).
2. Set the database field to the schema name.
It sounds like you use #1, and the thing you are trying to do doesn’t like it. Try approach #2 and see if that works better with the export step.

@Austin1 Thanks for the feedback. I actually opened this issue to give the team visibility into the topic you raised in Slack.
A change to the codebase is needed for the steps above to achieve the desired result.

I made modifications to the dbt-teradata codebase to assign the schema value to the database name by adjusting the dbt/include/teradata/macros/adapters.sql file as follows:

{% macro teradata__generate_database_name(custom_database_name=none, node=none) -%}
  {%- set schema_name = node.schema -%}
  {{ schema_name }}
{%- endmacro %}

This change enables the schema name to be returned whenever dbt-core attempts to retrieve the database name, successfully adding the database name to the manifest.json file instead of returning None.
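
To check what a third-party tool would see, the following minimal sketch reads the `database` field of every node in a manifest. It assumes the usual manifest layout (a top-level `nodes` mapping whose entries carry `database` and `schema` keys). The helper names `database_by_node` and `load_manifest`, and the sample data, are hypothetical and not part of dbt-teradata.

```python
import json
from pathlib import Path


def database_by_node(manifest: dict) -> dict:
    """Map each node's unique_id to the database recorded in the manifest."""
    return {
        unique_id: node.get("database")
        for unique_id, node in manifest.get("nodes", {}).items()
    }


def load_manifest(path: str = "target/manifest.json") -> dict:
    """Load a manifest from disk (assumes dbt's default target directory)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Hypothetical sample: the first node reflects the default behavior
# (database is null), the second reflects the macro override above.
sample_manifest = {
    "nodes": {
        "model.demo.orders": {"database": None, "schema": "analytics"},
        "model.demo.customers": {"database": "analytics", "schema": "analytics"},
    }
}

print(database_by_node(sample_manifest))
```

Against a real project, `database_by_node(load_manifest())` would give the same kind of mapping after any dbt command that writes the manifest.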

Issue with catalog.json
However, I encountered an issue with the generation of the catalog.json file when running the dbt docs generate command. This problem is linked to the TeradataIncludePolicy in the dbt/adapters/teradata/relation.py file:

class TeradataIncludePolicy(Policy):
    database: bool = False
    schema: bool = True
    identifier: bool = True

According to this policy, dbt-core renders the list of nodes without including the database name in path_part, resulting in an empty list of nodes in the catalog.json file.


Consideration of Teradata's Schema Implementation
As Teradata does not implement a database -> schema hierarchy and instead relies on databases directly, we have chosen to use the schema as part of the fully qualified relation name in dbt. Altering the policy to set the database flag to True would cause all SQL references to adopt the hierarchy database_name.schema_name.relation_name, which is not valid in Teradata.
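
The effect of the include flag on rendered names can be sketched with a simplified model. This is not dbt-core's rendering code: the `IncludePolicy` class and `render_relation` function are hypothetical stand-ins, and quoting is reduced to wrapping every part in double quotes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncludePolicy:
    """Hypothetical stand-in for an adapter include policy."""
    database: bool = False
    schema: bool = True
    identifier: bool = True


def render_relation(database, schema, identifier, policy):
    """Join only the parts that the policy includes and that are non-empty."""
    parts = [
        (database, policy.database),
        (schema, policy.schema),
        (identifier, policy.identifier),
    ]
    return ".".join(f'"{value}"' for value, included in parts if included and value)


# Current policy: the database part is dropped, giving a two-part name.
current = render_relation("analytics", "analytics", "orders", IncludePolicy())

# Altered policy: the database part is kept, giving a three-part name,
# which the comment above describes as invalid in Teradata.
altered = render_relation(
    "analytics", "analytics", "orders", IncludePolicy(database=True)
)

print(current)  # "analytics"."orders"
print(altered)  # "analytics"."analytics"."orders"
```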

Impact on Integration with Other Tools
It's also important to consider the potential impact on dbt-teradata integrations with other tools. These tools rely on the manifest.json and catalog.json files to understand the dbt project structure (for example, dagster-dbt). If we add the database name to these files, other integration tools might mistakenly interpret the hierarchy as database_name.schema_name.relation_name, leading to further complications.
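
As an illustration of that risk, here is a sketch of how a downstream consumer might build a dotted name from manifest entries. The `qualified_name` function is hypothetical and does not reproduce the logic of dagster-dbt or any other specific tool. It only shows how a populated database field could turn a two-part name into a three-part one.

```python
def qualified_name(node: dict) -> str:
    """Build a dotted name the way a naive consumer might: skip null parts."""
    parts = (node.get("database"), node.get("schema"), node.get("name"))
    return ".".join(part for part in parts if part)


# Hypothetical manifest entries for the same model.
node_today = {"database": None, "schema": "analytics", "name": "orders"}
node_with_db = {"database": "analytics", "schema": "analytics", "name": "orders"}

print(qualified_name(node_today))    # analytics.orders
print(qualified_name(node_with_db))  # analytics.analytics.orders
```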

Conclusion
Given these observations, it seems prudent to adhere to the existing TeradataIncludePolicy and allow the database name to remain None in both the manifest.json and catalog.json files.