MeltanoLabs/tap-snowflake

bug: discovery unnecessarily goes through all schemas and tables on discovery

Opened this issue · 1 comments

We are running into an issue where tap-snowflake is taking about 90 seconds to execute meltano invoke custom-snowflake tap —discover. We are providing a select config, tables, and schema in the config so it should be a relatively quick execution.

A 90 sec execution is far too long. In the tap, it states that when we provide tables in the configuration, it should limit discovery to the schema and tables specified. However, based on logs, the tap is introspecting the entire database. The tap is going through every table & every view in every schema the user has access to. This seems like a tap-snowflake bug and we want to address it asap so our customers can have a better experience.

FYI, it seems like there has been a 6-month outstanding issue that would address this: #23

the configurations for your reference.

- config:
   account: <redacted>
   database: CUSTOMER_DB
   password: <redacted>
   schema: customer_schema
   user: <redacted>
   warehouse: CUSTOMER_WH
   tables:
    - customer_schema.table_name
  inherit_from: tap-snowflake
  metadata: {}
  name: custom-snowflake-tap
  schema: {}
  select:
  - customer_schema-table_name.*
  select_filter: []

Snippet of logs during discovery:

2023-11-02 18:33:56,016 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_google_analytics]
2023-11-02 18:33:56,085 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,085 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 0
2023-11-02 18:33:56,086 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,136 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,137 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,138 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_table_names */ TABLES IN test_insights_proudmoments]
2023-11-02 18:33:56,252 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,253 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 0
2023-11-02 18:33:56,253 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,307 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,307 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,308 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_insights_proudmoments]
2023-11-02 18:33:56,466 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,467 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 6
2023-11-02 18:33:56,467 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,517 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,517 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,518 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_table_names */ TABLES IN test_keyword_kabbage]
2023-11-02 18:33:56,615 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,615 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 2
2023-11-02 18:33:56,616 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,686 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,687 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,687 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_keyword_kabbage]
2023-11-02 18:33:56,765 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,765 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 0
2023-11-02 18:33:56,766 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:56,834 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,834 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:56,835 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_table_names */ TABLES IN test_mixpanel]
2023-11-02 18:33:56,935 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:56,936 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 3
2023-11-02 18:33:56,936 | INFO   | snowflake.connector.cursor | query: [ROLLBACK]
2023-11-02 18:33:57,016 | INFO   | snowflake.connector.cursor | query execution done
2023-11-02 18:33:57,016 | INFO   | snowflake.connector.cursor | Number of results in first chunk: 1
2023-11-02 18:33:57,017 | INFO   | snowflake.connector.cursor | query: [SHOW /* sqlalchemy:get_view_names */ VIEWS IN test_mixpanel]

I have the same issue with the exception that discovery phase never finishes for me.