datafold/data-diff

Snowflake connection timeout after successfull loading diff results into table

lokofoko opened this issue · 7 comments

Describe the bug
I am using windows 11 last version and have data-diff v0.10.1.
I am doing data-diff using a conf file between two tables in one schema in one db in Snowflake.
It works perfectly up to the point, where it doesn't stop working after the successful creation of a table with results in Snowflake and instead throws an error message and does multiple retries to restore connection. To return control of the terminal I have to use Ctrl + C and wait from 30 sec to 2 minutes.

I had a chat with Sung in chat from the docs.datafold.com site and he advised me to create an issue here and mention @dlawin.

Make sure to include the following (minus sensitive information):
22:53:40 DEBUG Applied run configuration: {'verbose': False, 'database1': {'driver': 'snowflake', 'user': '', 'password': '', 'account': '', 'database': 'DBT_DEVELOPMENT', 'schema': 'DATA_INTEGRATIONS', 'warehouse': 'TEST_WH_XL', 'role': ''}, 'table1': 'COMPARING_UNUSEDDATA_PROD_TABLE', main.py:310
'database2': {'driver': 'snowflake', 'user': '', 'password': '', 'account': '', 'database': 'DBT_DEVELOPMENT', 'schema': 'DATA_INTEGRATIONS', 'warehouse': 'TEST_WH_XL', 'role': ''}, 'table2': 'COMPARING_UNUSEDDATA_DEV_TABLE'}
22:53:44 DEBUG Running SQL (Snowflake): base.py:962
ALTER SESSION SET TIMEZONE = 'UTC'
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960
SELECT column_name, data_type, datetime_precision, numeric_precision, numeric_scale FROM information_schema.columns WHERE table_name = 'COMPARING_UNUSEDDATA_PROD_TABLE' AND table_schema = 'DATA_INTEGRATIONS'
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960
SELECT column_name, data_type, datetime_precision, numeric_precision, numeric_scale FROM information_schema.columns WHERE table_name = 'COMPARING_UNUSEDDATA_DEV_TABLE' AND table_schema = 'DATA_INTEGRATIONS'
Using algorithm 'joindiff'. main.py:519
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960
SELECT * FROM (SELECT TRIM("APN_UNFORMATTED"), TRIM("UNIQUE_ID"), TRIM("APN_SEQUENCE_NUMBER"), TRIM("FIRST_MTG_LENDER_NAME"), TRIM("FIRST_MTG_ARM_CHANGE_PERCENT_LIMIT"), TRIM("PROPERTY_CITY"), TRIM("JUNIOR_MTG_ASSIGNED_LENDER"), TRIM("FIRST_MTG_REFINANCE_LOAN_IND"), TRIM("FIRST_MTG_ARM_CHANGE_INTERVAL"),
TRIM("FIRST_MTG_INTEREST_RATE_TYPE"), TRIM("ESTIMATED_EQUITY"), TRIM("DATAPROVIDERID"), TRIM("FIRST_MTG_ARM_NEXT_RESET_DATE"), TRIM("FIRST_MTG_AMOUNT"), TRIM("MAILING_CITY"), TRIM("PROPERTY_ADDRESS_FULL"), TRIM("JUNIOR_MTG_ORIGINATION_DATE"), TRIM("FIRST_MTG_ORIGINATION_DATE"), TRIM("OWNER_1_LAST_NAME"),
TRIM("FIRST_MTG_ARM_MAXIMUM_INTEREST_RATE"), TRIM("FIRST_MTG_LOAN_PURPOSE"), TRIM("FIRST_MTG_ASSIGNED_LENDER"), TRIM("FIRST_MTG_FIXED_RATE_LOAN_IND"), TRIM("FIRST_MTG_ARM_FIRST_CHANGE_MAX_PCT"), TRIM("FIRST_MTG_LTV"), TRIM("DOWNPMTASSISTFGRANT_IND"), TRIM("FIRST_MTG_ARM_INITIAL_RESET_DATE"), TRIM("JUNIOR_MTG_LENDER_NAME"),
TRIM("FIRST_MTG_LOAN_TYPE"), TRIM("MAILING_ZIP_CODE"), TRIM("PROPERTY_ZIP_CODE"), TRIM("MAILING_ADDRESS_FULL"), TRIM("FIRST_MTG_ARM_CALCULATION_CHANGE"), TRIM("OWNER_2_FIRST_NAME"), TRIM("FIRST_MTG_MODIFIED_IND"), TRIM("FIRST_MTG_TERM"), TRIM("MAILING_STATE"), TRIM("REFERENCENR"), TRIM("FIRST_MTG_INTEREST_RATE"),
TRIM("FIRST_MTG_ARM_INDEX_TYPE"), TRIM("FIRST_MTG_ARM_CHANGE_FREQ"), TRIM("FIRST_MTG_VARIABLE_RATE_LOAN_IND"), TRIM("JUNIOR_MTG_SUBORDINATE_TYPE"), TRIM("LANDUSEDESCRIPTIONID"), TRIM("LASTUSED"), TRIM("COUNTYID"), TRIM("TAX_AMOUNT"), TRIM("PROPERTY_STATE"), TRIM("FIPS"), TRIM("OWNER_2_LAST_NAME"), TRIM("JUNIOR_MTG_EQUITY_LOAN_IND"),
TRIM("COUNTY_NAME"), TRIM("JUNIOR_MTG_AMOUNT"), TRIM("OWNER_OCCUPIED_INDICATOR"), TRIM("OWNER_1_FIRST_NAME") FROM "COMPARING_UNUSEDDATA_PROD_TABLE") AS LIMITED_SELECT LIMIT 64
22:53:47 WARNING Mixed UUID/Non-UUID values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.UNIQUE_ID, disabling UUID support. base.py:1102
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_LENDER_NAME. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_ARM_NEXT_RESET_DATE. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.PROPERTY_ADDRESS_FULL. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.JUNIOR_MTG_ORIGINATION_DATE. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_ASSIGNED_LENDER. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_ARM_INITIAL_RESET_DATE. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.JUNIOR_MTG_LENDER_NAME. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.MAILING_ADDRESS_FULL. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.OWNER_2_FIRST_NAME. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_ARM_INDEX_TYPE. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.FIRST_MTG_ARM_CHANGE_FREQ. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.JUNIOR_MTG_SUBORDINATE_TYPE. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.OWNER_2_LAST_NAME. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.COUNTY_NAME. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_PROD_TABLE.JUNIOR_MTG_AMOUNT. It cannot be used as a key. base.py:1114
INFO [Snowflake] Schema = {'APN_UNFORMATTED': String_VaryingAlphanum(_notes=[]), 'UNIQUE_ID': String_VaryingAlphanum(_notes=[]), 'APN_SEQUENCE_NUMBER': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_LENDER_NAME': Text(_notes=[]), 'FIRST_MTG_ARM_CHANGE_PERCENT_LIMIT': Text(_notes=[]), 'PROPERTY_CITY': String_VaryingAlphanum(_notes=[]), schema.py:12
'JUNIOR_MTG_ASSIGNED_LENDER': Text(_notes=[]), 'MLG_SALE_PRICE': Decimal(_notes=[], precision=0), 'FIRST_MTG_REFINANCE_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_CHANGE_INTERVAL': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_INTEREST_RATE_TYPE': String_VaryingAlphanum(_notes=[]), 'ESTIMATED_EQUITY':
String_VaryingAlphanum(_notes=[]), 'DATAPROVIDERID': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_NEXT_RESET_DATE': Text(_notes=[]), 'FIRST_MTG_AMOUNT': String_VaryingAlphanum(_notes=[]), 'MAILING_CITY': String_VaryingAlphanum(_notes=[]), 'PROPERTY_ADDRESS_FULL': Text(_notes=[]), 'JUNIOR_MTG_ORIGINATION_DATE': Text(_notes=[]),
'FIRST_MTG_ORIGINATION_DATE': String_VaryingAlphanum(_notes=[]), 'OWNER_1_LAST_NAME': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_MAXIMUM_INTEREST_RATE': Text(_notes=[]), 'FIRST_MTG_LOAN_PURPOSE': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ASSIGNED_LENDER': Text(_notes=[]), 'FIRST_MTG_FIXED_RATE_LOAN_IND':
String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_FIRST_CHANGE_MAX_PCT': Text(_notes=[]), 'FIRST_MTG_LTV': Text(_notes=[]), 'DOWNPMTASSISTFGRANT_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_INITIAL_RESET_DATE': Text(_notes=[]), 'JUNIOR_MTG_LENDER_NAME': Text(_notes=[]), 'FIRST_MTG_LOAN_TYPE':
String_VaryingAlphanum(_notes=[]), 'MAILING_ZIP_CODE': String_VaryingAlphanum(_notes=[]), 'PROPERTY_ZIP_CODE': String_VaryingAlphanum(_notes=[]), 'MAILING_ADDRESS_FULL': Text(_notes=[]), 'FIRST_MTG_ARM_CALCULATION_CHANGE': Text(_notes=[]), 'OWNER_2_FIRST_NAME': Text(_notes=[]), 'FIRST_MTG_MODIFIED_IND':
String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_TERM': String_VaryingAlphanum(_notes=[]), 'MAILING_STATE': String_VaryingAlphanum(_notes=[]), 'REFERENCENR': Text(_notes=[]), 'FIRST_MTG_INTEREST_RATE': Text(_notes=[]), 'FIRST_MTG_ARM_INDEX_TYPE': Text(_notes=[]), 'FIRST_MTG_ARM_CHANGE_FREQ': Text(_notes=[]),
'FIRST_MTG_VARIABLE_RATE_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'JUNIOR_MTG_SUBORDINATE_TYPE': Text(_notes=[]), 'LANDUSEDESCRIPTIONID': String_VaryingAlphanum(_notes=[]), 'LASTUSED': Text(_notes=[]), 'COUNTYID': Text(_notes=[]), 'TAX_AMOUNT': String_VaryingAlphanum(_notes=[]), 'PROPERTY_STATE': String_VaryingAlphanum(_notes=[]),
'FIPS': String_VaryingAlphanum(_notes=[]), 'OWNER_2_LAST_NAME': Text(_notes=[]), 'JUNIOR_MTG_EQUITY_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'COUNTY_NAME': Text(_notes=[]), 'JUNIOR_MTG_AMOUNT': Text(_notes=[]), 'OWNER_OCCUPIED_INDICATOR': String_VaryingAlphanum(_notes=[]), 'OWNER_1_FIRST_NAME': String_VaryingAlphanum(_notes=[])}
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960
SELECT * FROM (SELECT TRIM("JUNIOR_MTG_SUBORDINATE_TYPE"), TRIM("REFERENCENR"), TRIM("PROPERTY_CITY"), TRIM("OWNER_2_LAST_NAME"), TRIM("JUNIOR_MTG_ASSIGNED_LENDER"), TRIM("UNIQUE_ID"), TRIM("JUNIOR_MTG_LENDER_NAME"), TRIM("FIRST_MTG_FIXED_RATE_LOAN_IND"), TRIM("OWNER_OCCUPIED_INDICATOR"), TRIM("PROPERTY_ZIP_CODE"),
TRIM("FIRST_MTG_MODIFIED_IND"), TRIM("PROPERTY_ADDRESS_FULL"), TRIM("FIRST_MTG_ARM_CHANGE_FREQ"), TRIM("FIRST_MTG_LOAN_TYPE"), TRIM("ESTIMATED_EQUITY"), TRIM("DOWNPMTASSISTFGRANT_IND"), TRIM("MAILING_ZIP_CODE"), TRIM("MAILING_ADDRESS_FULL"), TRIM("MAILING_STATE"), TRIM("OWNER_2_FIRST_NAME"), TRIM("OWNER_1_LAST_NAME"), TRIM("LASTUSED"),
TRIM("MAILING_CITY"), TRIM("APN_UNFORMATTED"), TRIM("FIRST_MTG_LOAN_PURPOSE"), TRIM("FIRST_MTG_VARIABLE_RATE_LOAN_IND"), TRIM("FIRST_MTG_ARM_NEXT_RESET_DATE"), TRIM("FIRST_MTG_ASSIGNED_LENDER"), TRIM("FIPS"), TRIM("OWNER_1_FIRST_NAME"), TRIM("FIRST_MTG_ARM_INDEX_TYPE"), TRIM("JUNIOR_MTG_EQUITY_LOAN_IND"), TRIM("FIRST_MTG_LENDER_NAME"),
TRIM("COUNTY_NAME"), TRIM("APN_SEQUENCE_NUMBER"), TRIM("PROPERTY_STATE"), TRIM("FIRST_MTG_INTEREST_RATE_TYPE"), TRIM("FIRST_MTG_ARM_INITIAL_RESET_DATE"), TRIM("FIRST_MTG_REFINANCE_LOAN_IND") FROM "COMPARING_UNUSEDDATA_DEV_TABLE") AS LIMITED_SELECT LIMIT 64
22:53:49 DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.JUNIOR_MTG_SUBORDINATE_TYPE. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.OWNER_2_LAST_NAME. It cannot be used as a key. base.py:1114
WARNING Mixed UUID/Non-UUID values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.UNIQUE_ID, disabling UUID support. base.py:1102
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.JUNIOR_MTG_LENDER_NAME. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.PROPERTY_ADDRESS_FULL. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_ARM_CHANGE_FREQ. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.MAILING_ADDRESS_FULL. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.OWNER_2_FIRST_NAME. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_LOAN_PURPOSE. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_ARM_NEXT_RESET_DATE. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_ASSIGNED_LENDER. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_ARM_INDEX_TYPE. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_LENDER_NAME. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.COUNTY_NAME. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_INTEREST_RATE_TYPE. It cannot be used as a key. base.py:1114
DEBUG Mixed Alphanum/Non-Alphanum values detected in column COMPARING_UNUSEDDATA_DEV_TABLE.FIRST_MTG_ARM_INITIAL_RESET_DATE. It cannot be used as a key. base.py:1114
INFO [Snowflake] Schema = {'JUNIOR_MTG_SUBORDINATE_TYPE': Text(_notes=[]), 'JUNIOR_MTG_ORIGINATION_DATE': Date(_notes=[], precision=6, rounds=False), 'REFERENCENR': Text(_notes=[]), 'PROPERTY_CITY': String_VaryingAlphanum(_notes=[]), 'OWNER_2_LAST_NAME': Text(_notes=[]), 'FIRST_MTG_ARM_CALCULATION_CHANGE': Decimal(_notes=[], precision=5), schema.py:12
'JUNIOR_MTG_ASSIGNED_LENDER': Text(_notes=[]), 'FIRST_MTG_ARM_CHANGE_PERCENT_LIMIT': Decimal(_notes=[], precision=5), 'FIRST_MTG_TERM': Decimal(_notes=[], precision=5), 'UNIQUE_ID': String_VaryingAlphanum(_notes=[]), 'LANDUSEDESCRIPTIONID': Decimal(_notes=[], precision=0), 'FIRST_MTG_ORIGINATION_DATE': Date(_notes=[], precision=6,
rounds=False), 'JUNIOR_MTG_LENDER_NAME': Text(_notes=[]), 'FIRST_MTG_FIXED_RATE_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'OWNER_OCCUPIED_INDICATOR': String_VaryingAlphanum(_notes=[]), 'PROPERTY_ZIP_CODE': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_MODIFIED_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_INTEREST_RATE':
Decimal(_notes=[], precision=0), 'PROPERTY_ADDRESS_FULL': Text(_notes=[]), 'FIRST_MTG_ARM_CHANGE_FREQ': Text(_notes=[]), 'FIRST_MTG_LOAN_TYPE': String_VaryingAlphanum(_notes=[]), 'ESTIMATED_EQUITY': String_VaryingAlphanum(_notes=[]), 'DOWNPMTASSISTFGRANT_IND': String_VaryingAlphanum(_notes=[]), 'MAILING_ZIP_CODE':
String_VaryingAlphanum(_notes=[]), 'MAILING_ADDRESS_FULL': Text(_notes=[]), 'DATAPROVIDERID': Decimal(_notes=[], precision=0), 'MAILING_STATE': String_VaryingAlphanum(_notes=[]), 'COUNTYID': Decimal(_notes=[], precision=0), 'OWNER_2_FIRST_NAME': Text(_notes=[]), 'OWNER_1_LAST_NAME': String_VaryingAlphanum(_notes=[]), 'LASTUSED':
Text(_notes=[]), 'MAILING_CITY': String_VaryingAlphanum(_notes=[]), 'APN_UNFORMATTED': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_LOAN_PURPOSE': Text(_notes=[]), 'FIRST_MTG_VARIABLE_RATE_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_NEXT_RESET_DATE': Text(_notes=[]), 'FIRST_MTG_ARM_CHANGE_INTERVAL': Decimal(_notes=[],
precision=5), 'FIRST_MTG_LTV': Decimal(_notes=[], precision=5), 'FIRST_MTG_ASSIGNED_LENDER': Text(_notes=[]), 'TAX_AMOUNT': Float(_notes=[], precision=7), 'FIPS': String_VaryingAlphanum(_notes=[]), 'OWNER_1_FIRST_NAME': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_INDEX_TYPE': Text(_notes=[]), 'JUNIOR_MTG_EQUITY_LOAN_IND':
String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_LENDER_NAME': Text(_notes=[]), 'FIRST_MTG_AMOUNT': Decimal(_notes=[], precision=0), 'COUNTY_NAME': Text(_notes=[]), 'APN_SEQUENCE_NUMBER': String_VaryingAlphanum(_notes=[]), 'PROPERTY_STATE': String_VaryingAlphanum(_notes=[]), 'MLG_SALE_PRICE': Decimal(_notes=[], precision=0),
'JUNIOR_MTG_AMOUNT': Decimal(_notes=[], precision=0), 'FIRST_MTG_ARM_MAXIMUM_INTEREST_RATE': Decimal(_notes=[], precision=5), 'FIRST_MTG_INTEREST_RATE_TYPE': Text(_notes=[]), 'FIRST_MTG_ARM_INITIAL_RESET_DATE': Text(_notes=[]), 'FIRST_MTG_REFINANCE_LOAN_IND': String_VaryingAlphanum(_notes=[]), 'FIRST_MTG_ARM_FIRST_CHANGE_MAX_PCT':
Decimal(_notes=[], precision=5)}
DEBUG Running SQL (Snowflake-TL): base.py:179
DROP TABLE IF EXISTS "test_results_2024-01-31_22_53_44"
DEBUG Running SQL (Snowflake-TL): base.py:179
SKIP
DEBUG Testing for duplicate keys: ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:230
INFO Validating that the are no duplicate keys in columns: ['unique_id'] for ('COMPARING_UNUSEDDATA_PROD_TABLE',) joindiff_tables.py:243
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960
SELECT count() AS "total", count(distinct coalesce(cast("UNIQUE_ID" as string), '')) AS "total_distinct" FROM "COMPARING_UNUSEDDATA_PROD_TABLE"
DEBUG Collecting stats for table #1: ('COMPARING_UNUSEDDATA_PROD_TABLE',) joindiff_tables.py:270
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960
SELECT sum("MLG_SALE_PRICE") AS "sum_mlg_sale_price", count(
) AS "count" FROM "COMPARING_UNUSEDDATA_PROD_TABLE"
DEBUG Collecting stats for table #2: ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:270
DEBUG Querying for different rows: ('COMPARING_UNUSEDDATA_PROD_TABLE',) joindiff_tables.py:208
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960
SELECT sum("MLG_SALE_PRICE") AS "sum_mlg_sale_price", sum("DATAPROVIDERID") AS "sum_dataproviderid", sum("FIRST_MTG_ARM_FIRST_CHANGE_MAX_PCT") AS "sum_first_mtg_arm_first_change_max_pct", sum("FIRST_MTG_TERM") AS "sum_first_mtg_term", sum("COUNTYID") AS "sum_countyid", sum("FIRST_MTG_ARM_CALCULATION_CHANGE") AS
"sum_first_mtg_arm_calculation_change", sum("FIRST_MTG_ARM_MAXIMUM_INTEREST_RATE") AS "sum_first_mtg_arm_maximum_interest_rate", sum("LANDUSEDESCRIPTIONID") AS "sum_landusedescriptionid", sum("TAX_AMOUNT") AS "sum_tax_amount", sum("FIRST_MTG_INTEREST_RATE") AS "sum_first_mtg_interest_rate", sum("FIRST_MTG_ARM_CHANGE_INTERVAL") AS
"sum_first_mtg_arm_change_interval", sum("FIRST_MTG_LTV") AS "sum_first_mtg_ltv", sum("FIRST_MTG_ARM_CHANGE_PERCENT_LIMIT") AS "sum_first_mtg_arm_change_percent_limit", sum("JUNIOR_MTG_AMOUNT") AS "sum_junior_mtg_amount", sum("FIRST_MTG_AMOUNT") AS "sum_first_mtg_amount", count() AS "count" FROM "COMPARING_UNUSEDDATA_DEV_TABLE"
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960
SELECT * FROM ....
INFO Validating that the are no duplicate keys in columns: ['unique_id'] for ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:243
DEBUG Done collecting stats for table #1: ('COMPARING_UNUSEDDATA_PROD_TABLE',) joindiff_tables.py:306
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960
SELECT count(
) AS "total"....
"COMPARING_UNUSEDDATA_DEV_TABLE"
DEBUG Testing for null keys: ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:252
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) base.py:960
SELECT "UNIQUE_ID" FROM "COMPARING_UNUSEDDATA_PROD_TABLE" WHERE ("UNIQUE_ID" IS NULL)
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960
SELECT "UNIQUE_ID" FROM "COMPARING_UNUSEDDATA_DEV_TABLE" WHERE ("UNIQUE_ID" IS NULL)
22:53:50 DEBUG Counting exclusive rows: ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:372
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960
SELECT count(*) FROM ....
22:53:52 DEBUG Done collecting stats for table #2: ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:306
DEBUG Counting differences per column: ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) joindiff_tables.py:346
DEBUG Running SQL (Snowflake): ('COMPARING_UNUSEDDATA_PROD_TABLE',) <> ('COMPARING_UNUSEDDATA_DEV_TABLE',) base.py:960
SELECT ....
22:55:26 DEBUG Running SQL (Snowflake-TL): base.py:179
CREATE TABLE IF NOT EXISTS "test_results_2024-01-31_22_53_44" .....
DEBUG Running SQL (Snowflake-TL): base.py:179
SKIP
DEBUG Running SQL (Snowflake-TL): base.py:179
INSERT INTO "test_results_2024-01-31_22_53_44" .... AS LIMITED_SELECT LIMIT 50000000
22:56:46 DEBUG Running SQL (Snowflake-TL): base.py:179
SKIP
[1] 99 segmentation fault data-diff --conf /c/Users/Vitalii/WorkRepos/data_diff_configs/datadiff.toml
If possible, please paste these as text, and not a screenshot.

Describe the environment

Describe which OS you're using, which data-diff version, and any other information that might be relevant to this bug.

I provided a bit shortened version of logs, because it goes beyond character limit

I have to add that comparing tables are around 50mil rows each, it works as expected with smaller size

Hey There!

Thanks again for providing the context. I want to apologize upfront that I thought @dlawin had capacity to investigate this but he doesn't. I don't have enough context to help you further either. I'm happy to evolve this conversation with you if you're interested in trying out Datafold Cloud as we use a different algorithm and don't have this hanging timeout issue like you're experiencing.

OR we welcome community pull requests if you have capacity to dig deeper on your own.

I am now thinking that this is due to the fact, that it was trying to put all the diff data in my terminal too, I think with the stats flag it should be fixed but I have no opportunity to test it right now.

It works with stats flag just fine. Hope you can add it somewhere in documentation.

This issue has been marked as stale because it has been open for 60 days with no activity. If you would like the issue to remain open, please comment on the issue and it will be added to the triage queue. Otherwise, it will be closed in 7 days.

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment and it will be reopened for triage.