Column Type None Error - Field's Not Converting To Dimensions
sisu-callum opened this issue ยท 19 comments
When running dbt2looker --tag marts
on my mart models, I receive dozens of errors around none type conversions.
20:54:28 WARNING Column type None not supported for conversion from snowflake to looker. No dimension will be created.
Here is the example of the schema.yml
file.
The interesting thing is that it correctly recognizes the doc that corresponds to the model. The explore within the model
file is correct and has the correct documentation.
Not sure if I can be of any more help but let me know if there is anything!
Also tested a clean installation of 0.19.1 but that didn't resolve it!
@sisu-callum I can replicate this if a model has a schema.yml
but doesn't exist yet in the data warehouse. The type information relies on the table having been materialized. I need to add that to the docs.
Could that be the issue? Are you working on a branch where those tables don't exist?
I'll add some debugging statements just now and push and update, which should help find the problem.
@sisu-callum could you try again with version 0.6.2
: pip install --upgrade dbt2looker
Should give some more helpful error messages if the problem is missing tables in snowflake
Note: I'm using +persist_docs
on these tables to ensure the documentation lives in Snowflake as well. Could that be impacting things?
I've upgraded dbt2looker and those errors seems to be gone. The strange thing is that I'm still not getting any dimensions materialized within the view file, which implies the error isn't missing tables in Snowflake. The steps I took were:
dbt run
dbt compile
dbt docs generate
dbt2looker --tag marts
Still the blank view files. Anything else I can try to help?
I added another model to my list of tagged files just to test. It gets picked up within the model
file (along with the correct description) but the columns aren't getting picked up. I wonder if there's something with my manifest that is causing the issue.
@sisu-callum what's the exact output you're getting from dbt2looker
?
Is it silent or are you seeing something like:
WARNING Model model.hubble_analytics.notexist not found in catalog. No looker view will be generated. Check if model has materialized in bigquery at hubble-analytics-prod.dbt.notexist
I guess you can't share the manifest.json
and catalog.json
. Are you able to share some extracts for the models that aren't working?
e.g. with the jq tool:
cat manifest.json | jq '.nodes["model.hubble_analytics.pages"]'
cat catalog.json | jq '.nodes["model.hubble_analytics.pages"]'
I'm also using +persist_docs
but I haven't tested with snowflake so it could be!
Here is the output that I'm getting:
You're right in that I can't share those files but I can share a little bit here. Does this match yours? Just from this little part I drew out, I wonder if it's because data_types are showing as null.
"columns": {"objective_id": {"name": "objective_id", "description": "Lorem ipsum - removing description", "meta": {}, "data_type": null, "quote": null, "tags": []}, "objective_type": {"name": "objective_type", "description": "Lorem ipsum - why are the data_types set to null", "meta": {}, "data_type": null, "quote": null, "tags": []}
Your manifest.json looks similar to mine. The data_type
is also null
for me but it exists in the catalog.json
. How does that look in the catalog.json
?
Since you didn't get a warning I guess the table does exist in catalog.json
Here are the relevant columns from the same model in the catalog.json
"columns": {"OBJECTIVE_ID": {"type": "NUMBER", "index": 1, "name": "OBJECTIVE_ID", "comment": null}, "OBJECTIVE_TYPE": {"type": "TEXT", "index": 2, "name": "OBJECTIVE_TYPE", "comment": null}
I've got to step out for a bit but one thing I did just notice is that when I try to run the command you provided (just installed jq to test) it returns an error. This could be due to my novice skills with jq or perhaps this is where the issue is coming from.
The filepath for my catalog file is target
> catalog.json
That looks fine! I'll have another look and add some more debugging statements.
Sorry you're right, from the root of your dbt repo it should be
cat target/catalog.json | jq '.nodes["model.stg_prod_objective"]'
Interesting! That actually returned null
but when I prefixed the dbt project name to the model name it worked and returned results. Ex: cat target/catalog.json | jq '.nodes["model.my_dbt_project_name.stg_prod_objective"]'
. Would this issue potentially be caused by the fact that we don't use the default dbt project name in the dbt_project.yml
?
Here is the output of those two columns.
This is also a model where I am not persisting docs (we don't persist them in the stg layer) so that is why the comments are null. I tested on one of the actual models that persists docs and the results have the correct descriptions in the comment field. Does dbt2looker
pull the descriptions of columns from the catalog.json
or the manifest.json
?
@sisu-callum nice! Thanks so much for helping with this by the way.
dbt2looker
pulls comments from the manifest.json
not catalog.json
. That way I'd expect it to work without persist_docs
.
There's definitely something wrong in dbt2looker
with snowflake because you have all the information needed in your files. I'll improve the logging to help pinpoint the problem.
@sisu-callum can you update to dbt2looker==0.6.3
, I've added some debugging statements. Could you run:
dbt2looker --log-level DEBUG
against the models you want to test and share the output if there's nothing too sensitive in there
First of all, that is some crazy fast speed on new versions! Ran this on the tagged models and got some interesting outputs! Here is an example:
14:30:34 DEBUG Model model.my project name.stg_prod_objective has no typed columns, no dimensions will be generated. unique_id='model.my project name.stg_prod_objective' resource_type='model' database='My DB' db_schema='My Schema' name='stg_prod_objective' description='To do' columns={'objective_id': DbtModelColumn(name='objective_id', description='Lorum ipsum I've changed this part.', data_type=None, meta=DbtModelColumnMeta(looker=Dbt2LookerMeta(measures=[]))),
Looks like this kind of response for all of the columns is pretty consistent across all the models included. This has me thinking its related to the fact that the data type is None. Every single column inside of my measures has data_type
= None
I also got these errors at the beginning though. As I was typing this out I realized that my project uses a few dbt packages, which have their own dbt_project.yml
files, which is probably causing that issue. But not sure if this is the source of the error as dbt2looker
seems to be capturing all the right information except for data_type
. Not sure on the multiple catalogs though - I can only see the single version so not sure what might be causing multiple to be recognized unless they are hidden.
14:30:33 WARNING Multiple manifest.json files found in path ./ this can lead to unexpected behaviour
14:30:34 DEBUG Detected valid manifest at target/manifest.json
14:30:34 WARNING Multiple catalog.json files found in path ./ this can lead to unexpected behaviour
14:30:34 DEBUG Detected valid catalog at target/catalog.json
14:30:34 WARNING Multiple dbt_project.yml files found in path ./ this can lead to unexpected behaviour
๐คฆโโ๏ธ I've found it. It's because snowflake converts column names to uppercase and dbt2looker
treats the column names as case sensitive.
I just noticed now that in your manifest it's objective_id
and in the catalog it's OBJECTIVE_ID
My favourite kind of bug after unicode bugs!
Thanks so much for taking the time to share the outputs by the way (I can see it takes some effort to redact the info).
I'm going to push an update now.
@sisu-callum 0.7.0
has been released!
dbt2looker
now treats all columns as lower case. This is a bit of a heavy-handed solution for now but it works because postgres/redshift/snowflake/bigquery all have case-insensitive column names.
If this is the only problem then your column types should appear and start generating dimensions in the lookml
Nevermind, I see the comment in the code. With that, I think this is a resolved issue!
TIMESTAMP_TZ not supported (see https://docs.looker.com/reference/field-params/dimension_group)