Warning during using hashdiff for VARCHAR type values in POSTGRES.
EnCeT opened this issue · 2 comments
Despite using a supported data type, while using hashdiff, I receive the warning '[PostgreSQL] Column 'email' of type 'Text(_notes=[])' has no compatibility handling.' I couldn't find information about this in the documentation, only that VARCHAR is supported for PostgreSQL.
documentation
data-diff "postgresql://postgres:Password1@localhost:5432/postgres" test2 "postgresql://postgres:Password1@localhost:5432/postgres" test -k id -c email -v -a hashdiff -d
01:20:09 INFO [PostgreSQL] Starting a threadpool, size=1. base.py:1196
DEBUG Running SQL (PostgreSQL): base.py:959
SET TIME ZONE 'UTC'
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT column_name, data_type, datetime_precision, numeric_precision, numeric_scale FROM
information_schema.columns WHERE table_name = 'test2' AND table_schema = 'public'
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT column_name, data_type, datetime_precision, numeric_precision, numeric_scale FROM
information_schema.columns WHERE table_name = 'test' AND table_schema = 'public'
INFO [PostgreSQL] Schema = {'id': ('id', 'integer', None, 32, 0), 'first_name': ('first_name', schema.py:12
'character varying', None, None, None), 'second_name': ('second_name', 'character varying',
None, None, None), 'last_name': ('last_name', 'character varying', None, None, None), 'emailx':
('emailx', 'character varying', None, None, None), 'email': ('email', 'character varying', None,
None, None), 'nulle1': ('nulle1', 'character varying', None, None, None), 'nulle2': ('nulle2',
'character varying', None, None, None)}
INFO [PostgreSQL] Schema = {'id': ('id', 'integer', None, 32, 0), 'first_name': ('first_name', schema.py:12
'character varying', None, None, None), 'second_name': ('second_name', 'character varying',
None, None, None), 'last_name': ('last_name', 'character varying', None, None, None), 'emailx':
('emailx', 'character varying', None, None, None), 'email': ('email', 'character varying', None,
None, None), 'nulle1': ('nulle1', 'character varying', None, None, None), 'nulle2': ('nulle2',
'character varying', None, None, None)}
DEBUG Available mutual columns: {'email', 'id', 'first_name', 'last_name', 'nulle1', 'emailx', __main__.py:489
'nulle2', 'second_name'}
INFO Diffing schemas... __main__.py:81
INFO Diffing using columns: key=('id',) update=None extra=('email',). __main__.py:518
INFO Using algorithm 'hashdiff'. __main__.py:519
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT * FROM (SELECT TRIM("email") FROM "test2") AS LIMITED_SELECT LIMIT 64
INFO [PostgreSQL] Schema = {'id': Integer(_notes=[], precision=0, python_type=<class 'int'>), schema.py:12
'email': Text(_notes=[])}
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT * FROM (SELECT TRIM("email") FROM "test") AS LIMITED_SELECT LIMIT 64
INFO [PostgreSQL] Schema = {'id': Integer(_notes=[], precision=0, python_type=<class 'int'>), schema.py:12
'email': Text(_notes=[])}
WARNING [PostgreSQL] Column 'email' of type 'Text(_notes=[])' has no compatibility handling. hashdiff_tables.py:133
If encoding/formatting differs between databases, it may result in false positives.
WARNING [PostgreSQL] Column 'email' of type 'Text(_notes=[])' has no compatibility handling. hashdiff_tables.py:133
If encoding/formatting differs between databases, it may result in false positives.
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT min("id")::varchar, max("id")::varchar FROM "test2"
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT min("id")::varchar, max("id")::varchar FROM "test"
INFO Diffing segments at key-range: (1)..(3). size: table1 <= 2, table2 <= 2 diff_tables.py:294
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT "id"::varchar, "email"::varchar FROM "test2" WHERE ("id" >= 1) AND ("id" < 3)
DEBUG Running SQL (PostgreSQL): base.py:959
SELECT "id"::varchar, "email"::varchar FROM "test" WHERE ("id" >= 1) AND ("id" < 3)
INFO Diff found 0 different rows. hashdiff_tables.py:217
01:20:10 INFO Duration: 1.38 seconds. __main__.py:553
Same result i receive for columns with empty strings and null values:
data-diff "postgresql://postgres:Password1@localhost:5432/postgres" test2 "postgresql://postgres:Password1@localhost:5432/postgres" test -k id -c nulle2 -v -a hashdiff -d
I'm running data-diff version 0.9.17.
My testing tabels in PostgresSQL:
create table test (
id serial primary key,
first_name VARCHAR(50) NOT NULL,
second_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50)NOT NULL,
emailx VARCHAR(50),
email VARCHAR(50),
Nulle1 VARCHAR(20),
Nulle2 VARCHAR(20)
);
insert into test (id, first_name, second_name, last_name, emailx, email, Nulle1, Nulle2)
values
(1, 'Tisha', 'Annalise', 'Casserly', 'acasserly0@i2i.jp*', 'acasserly0@i2i.jp', null, ''),
(2, 'Joelynn', 'Bette', 'Samuels', 'bsamuels1@networksolutions.com*', 'bsamuels1@networksolutions.com', ' ', null);
insert into test2 (id, first_name, second_name, last_name, emailx, email, Nulle1, Nulle2)
values
(1, 'Tisha', 'Annalise', 'Casserly', 'acasserly0@i2i.jp*', 'acasserly0@i2i.jp', '', ''),
(2, 'Joelynn', 'Bette', 'Samuels', 'bsamuels1@networksolutions.com*', 'bsamuels1@networksolutions.com', null, NULL)
This issue has been marked as stale because it has been open for 60 days with no activity. If you would like the issue to remain open, please comment on the issue and it will be added to the triage queue. Otherwise, it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment and it will be reopened for triage.