MeltanoLabs/tap-snowflake

Noisey Log Messages

Opened this issue ยท 8 comments

Its possible that this is part of the SDK in which case I can transfer it over to the SDK repo but I'm getting a ton of logs that seem like theyre debug level. Context around queries isnt all that relevant for the tap user.

An example of part of my log:

2022-12-01 12:33:40,143 Skipping deselected stream 'b0027128a7536ff7fe527be749953dd8c8440644a_dbt_artifacts-fct_dbt__test_executions'.
2022-12-01 12:33:40,143 Skipping deselected stream 'b0027128a7536ff7fe527be749953dd8c8440644a_meltano_hub-fact_hub_metrics'.
2022-12-01 12:33:40,144 Skipping deselected stream 'b0027128a7536ff7fe527be749953dd8c8440644a_meltano_hub-fact_repo_metrics'.
2022-12-01 12:33:40,144 Beginning full_table sync of 'b0027128a7536ff7fe527be749953dd8c8440644a_meltano_hub-fact_variant_hub_metrics'...
2022-12-01 12:33:40,144 Tap has custom mapper. Using 1 provided map(s).
2022-12-01 12:33:40,164 Snowflake Connector for Python Version: 2.8.3, Python Version: 3.9.7, Platform: macOS-11.6.8-x86_64-i386-64bit
2022-12-01 12:33:40,164 This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2022-12-01 12:33:40,164 Setting use_openssl_only mode to False
2022-12-01 12:33:41,279 query: [select current_database(), current_schema();]
2022-12-01 12:33:41,446 query execution done
2022-12-01 12:33:41,446 Number of results in first chunk: 1
2022-12-01 12:33:41,447 query: [ROLLBACK]
2022-12-01 12:33:41,609 query execution done
2022-12-01 12:33:41,609 Number of results in first chunk: 1
2022-12-01 12:33:41,610 Snowflake Connector for Python Version: 2.8.3, Python Version: 3.9.7, Platform: macOS-11.6.8-x86_64-i386-64bit
2022-12-01 12:33:41,610 This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2022-12-01 12:33:42,481 query: [ROLLBACK]
2022-12-01 12:33:42,644 query execution done
2022-12-01 12:33:42,644 Number of results in first chunk: 1
2022-12-01 12:33:42,645 query: [select current_database(), current_schema();]
2022-12-01 12:33:42,834 query execution done
2022-12-01 12:33:42,834 Number of results in first chunk: 1
2022-12-01 12:33:42,835 query: [SHOW /* sqlalchemy:_get_schema_primary_keys */PRIMARY KEYS IN SCHEMA cicd_prod.b...]
2022-12-01 12:33:43,011 query execution done
2022-12-01 12:33:43,047 Number of results in first chunk: 0
2022-12-01 12:33:43,048 query: [SELECT /* sqlalchemy:_get_schema_columns */ ic.table_name, ic.column_name, ic.da...]
2022-12-01 12:33:43,726 query execution done
2022-12-01 12:33:43,726 Number of results in first chunk: 64
2022-12-01 12:33:43,728 query: [ROLLBACK]
2022-12-01 12:33:43,890 query execution done
2022-12-01 12:33:43,890 Number of results in first chunk: 1
2022-12-01 12:33:43,895 query: [SELECT b0027128a7536ff7fe527be749953dd8c8440644a_meltano_hub.fact_variant_hub_me...]
2022-12-01 12:33:44,112 query execution done
2022-12-01 12:33:44,112 Number of results in first chunk: 0

cc @kgpayne @edgarrmondragon

Biggest red flag from this output is that it's missing the logger name and log level that's creating that message: meltano/sdk#1232.

Those messages must be emitted at the INFO level, since that's what we configure by default for the root logger in https://github.com/meltano/sdk/blob/cf51acc6debf608da12ffbeef99aacc0a43e5a99/singer_sdk/default_logging.yml#L1-L15.

Once we figure out a better log format than {asctime} {message} which includes the logger name, we can look into setting the default level of the offending logger above to WARNING or above.

@pnadolny13 can I ask you a favor? Can you run the tap with SINGER_SDK_LOG_CONFIG set?

SINGER_SDK_LOG_CONFIG=sdk_logging.yaml tap-snowflake ...

where sdk_logging.yaml is

version: 1
disable_existing_loggers: no
formatters:
  default:
    format: "{asctime} {levelname} {name} {message}"
    style: "{"
handlers:
  console:
    class: logging.StreamHandler
    formatter: default
    stream: ext://sys.stderr
loggers:
  "":
    level: INFO
    propagate: no
    handlers: [console]

That should tell us which logger is emitting those messages, so we can make it quieter by default.

@pnadolny13 can I ask you a favor? Can you run the tap with SINGER_SDK_LOG_CONFIG set?

@edgarrmondragon Sure! Here it is:

2022-12-01 17:02:48,887 INFO tap-snowflake Skipping deselected stream 'ba719f6e9c249f7fcb64803d39d378d040cabbf9c_meltano_hub-fact_repo_metrics'.
2022-12-01 17:02:48,887 INFO tap-snowflake Skipping deselected stream 'ba719f6e9c249f7fcb64803d39d378d040cabbf9c_meltano_hub-fact_variant_hub_metrics'.
2022-12-01 17:02:48,887 INFO tap-snowflake Skipping deselected stream 'ba719f6e9c249f7fcb64803d39d378d040cabbf9c_meltano_hub-hub_bot_plugins'.
2022-12-01 17:02:48,889 INFO tap-snowflake Beginning full_table sync of 'ba719f6e9c249f7fcb64803d39d378d040cabbf9c_meltano_hub-hub_metrics_audit'...
2022-12-01 17:02:48,889 INFO tap-snowflake Tap has custom mapper. Using 1 provided map(s).
{"type": "SCHEMA", "stream": "ba719f6e9c249f7fcb64803d39d378d040cabbf9c_meltano_hub-hub_metrics_audit", "schema": {"properties": {"updated_date": {"format": "date-time", "type": ["string", "null"]}, "metric_type": {"type": ["string", "null"]}}, "type": "object"}, "key_properties": []}
2022-12-01 17:02:48,890 INFO snowflake.connector.connection Snowflake Connector for Python Version: 2.8.3, Python Version: 3.9.7, Platform: macOS-11.6.8-x86_64-i386-64bit
2022-12-01 17:02:48,890 INFO snowflake.connector.connection This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2022-12-01 17:02:49,495 INFO snowflake.connector.cursor query: [select current_database(), current_schema();]
2022-12-01 17:02:49,712 INFO snowflake.connector.cursor query execution done
2022-12-01 17:02:49,712 INFO snowflake.connector.cursor Number of results in first chunk: 1
2022-12-01 17:02:49,712 INFO snowflake.connector.cursor query: [ROLLBACK]
2022-12-01 17:02:49,858 INFO snowflake.connector.cursor query execution done
2022-12-01 17:02:49,859 INFO snowflake.connector.cursor Number of results in first chunk: 1
2022-12-01 17:02:49,859 INFO snowflake.connector.connection Snowflake Connector for Python Version: 2.8.3, Python Version: 3.9.7, Platform: macOS-11.6.8-x86_64-i386-64bit
2022-12-01 17:02:49,859 INFO snowflake.connector.connection This connection is in OCSP Fail Open Mode. TLS Certificates would be checked for validity and revocation status. Any other Certificate Revocation related exceptions or OCSP Responder failures would be disregarded in favor of connectivity.
2022-12-01 17:02:50,655 INFO snowflake.connector.cursor query: [ROLLBACK]
2022-12-01 17:02:50,814 INFO snowflake.connector.cursor query execution done
2022-12-01 17:02:50,814 INFO snowflake.connector.cursor Number of results in first chunk: 1
2022-12-01 17:02:50,815 INFO snowflake.connector.cursor query: [select current_database(), current_schema();]
2022-12-01 17:02:50,970 INFO snowflake.connector.cursor query execution done
2022-12-01 17:02:50,970 INFO snowflake.connector.cursor Number of results in first chunk: 1
2022-12-01 17:02:50,971 INFO snowflake.connector.cursor query: [SHOW /* sqlalchemy:_get_schema_primary_keys */PRIMARY KEYS IN SCHEMA cicd_prod.b...]
2022-12-01 17:02:51,189 INFO snowflake.connector.cursor query execution done
2022-12-01 17:02:51,189 INFO snowflake.connector.cursor Number of results in first chunk: 0
2022-12-01 17:02:51,190 INFO snowflake.connector.cursor query: [SELECT /* sqlalchemy:_get_schema_columns */ ic.table_name, ic.column_name, ic.da...]
2022-12-01 17:02:51,812 INFO snowflake.connector.cursor query execution done
2022-12-01 17:02:51,813 INFO snowflake.connector.cursor Number of results in first chunk: 64
2022-12-01 17:02:51,815 INFO snowflake.connector.cursor query: [ROLLBACK]
2022-12-01 17:02:52,011 INFO snowflake.connector.cursor query execution done
2022-12-01 17:02:52,011 INFO snowflake.connector.cursor Number of results in first chunk: 1
2022-12-01 17:02:52,013 INFO snowflake.connector.cursor query: [SELECT ba719f6e9c249f7fcb64803d39d378d040cabbf9c_meltano_hub.hub_metrics_audit.u...]
2022-12-01 17:02:52,252 INFO snowflake.connector.cursor query execution done
2022-12-01 17:02:52,252 INFO snowflake.connector.cursor Number of results in first chunk: 2

@pnadolny13 thanks! So, those are coming from https://github.com/snowflakedb/snowflake-connector-python.

You should be able to quiet them for the moment with

version: 1
disable_existing_loggers: no
handlers:
  console:
    class: logging.StreamHandler
    stream: ext://sys.stderr
    level: WARNING
loggers:
  snowflake.connector.cursor:
    handlers: [console]
    propagate: no

@edgarrmondragon do you have an idea of how to fix this? Is it a case of shipping a log config .yml with the tap, or should I be setting up the logger explicitly somewhere in the Tap ๐Ÿค” I am not very familiar with how logging is set up in the SDK yet ๐Ÿ˜…

Ref: SO question.

@kgpayne There's no easy way to do this at the moment so I created meltano/sdk#1373

I find this problem quite annoying and performance detrimental.

version: 1
disable_existing_loggers: false

formatters:
  simple:
    format: "{levelname} | {name} | {pathname} | {message}"
    style: "{"
  default:
    format: "%(message)s"
  json:
    (): meltano.core.logging.json_formatter
  key_value:
    (): meltano.core.logging.key_value_formatter
    sort_keys: False

handlers:
  console:
    class: logging.StreamHandler
    formatter: default
    stream: "ext://sys.stderr"
  null_handler:
    class: logging.NullHandler

root:
  level: WARNING
  handlers: [console]

loggers:
  snowflake.connector.cursor:
    level: CRITICAL
    handlers: [null_handler]
    propagate: no
  singer_sdk.metrics:
    level: INFO
    handlers: [console]
    propagate: no

I want the singer_sdk.metrics logs but not the rest during meltano el command, but its not taking effect and 1000s of lines are generated which are just throw away lines
@edgarrmondragon do you have any suggestion ?

Its all or nothing with this config. I can only control the root logger but other logger configuration is ineffective

@edgarrmondragon do you have any suggestion ?

Did you use the SINGER_SDK_LOG_CONFIG environment variable documented in https://sdk.meltano.com/en/v0.40.0/implementation/logging.html#custom-logging-configuration?

You can set that in meltano.yml and point it to the Singer logging YAML config:

# meltano.yml
env:
  SINGER_SDK_LOG_CONFIG: logging/singer_logging.yaml

https://github.com/edgarrmondragon/meltano-dogfood/blob/050f16a5b9c7c048e607a44705ec12335646e1e2/meltano.yml#L5-L6