datafold/data-diff

Presto and Trino connection string. Trino connector missing the 'host' argument

hamer101 opened this issue · 2 comments

Running data-diff against the Trino service with the correct URI pattern results in
TypeError: Connection.__init__() missing 1 required positional argument: 'host'

The command:

data-diff \
  "trino://postgres:Password1@localhost:8082/postgres/postgres" ip_table \
  postgres.ip_table_2 \
  -k ID \
  -c IP \
  -d

The trace:

12:46:31 ERROR    Connection.__init__() missing 1 required positional argument: 'host'    __main__.py:348
Traceback (most recent call last):
  File "/home/szymon/Dokumenty/data-diff_venv/bin/data-diff", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/szymon/Dokumenty/data-diff_venv/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/szymon/Dokumenty/data-diff_venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/szymon/Dokumenty/data-diff_venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/szymon/Dokumenty/data-diff_venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/szymon/Dokumenty/data-diff_venv/lib/python3.11/site-packages/data_diff/__main__.py", line 344, in main
    return _data_diff(
           ^^^^^^^^^^^
  File "/home/szymon/Dokumenty/data-diff_venv/lib/python3.11/site-packages/data_diff/__main__.py", line 426, in _data_diff
    db1 = connect(database1, threads1 or threads)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/szymon/Dokumenty/data-diff_venv/lib/python3.11/site-packages/data_diff/databases/_connect.py", line 276, in __call__
    conn = self.connect_to_uri(db_conf, thread_count, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/szymon/Dokumenty/data-diff_venv/lib/python3.11/site-packages/data_diff/databases/_connect.py", line 204, in connect_to_uri
    db = cls(**kw, **kwargs)
         ^^^^^^^^^^^^^^^^^^^
  File "/home/szymon/Dokumenty/data-diff_venv/lib/python3.11/site-packages/data_diff/databases/trino.py", line 44, in __init__
    super().__init__()
  File "/home/szymon/Dokumenty/data-diff_venv/lib/python3.11/site-packages/data_diff/databases/presto.py", line 178, in __init__
    self._conn = prestodb.dbapi.connect(**kw)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/szymon/Dokumenty/data-diff_venv/lib/python3.11/site-packages/prestodb/dbapi.py", line 52, in connect
    return Connection(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Connection.__init__() missing 1 required positional argument: 'host'
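
My reading of the last three frames, as a guess rather than a confirmed diagnosis: trino.py calls super().__init__() with no arguments, and presto.py then passes its own empty kw to the driver's connect(), so the host parsed from the URI never reaches Connection. The sketch below reproduces that pattern. All class and function names in it are hypothetical stand-ins, not the real data-diff or prestodb code.

```python
# Stand-ins for illustration only. These are NOT the real data-diff or
# prestodb classes; they just mirror the call chain shown in the trace.

class Connection:
    """Mimics a driver connection whose constructor requires `host`."""

    def __init__(self, host, port=8080, user=None, catalog=None, schema=None):
        self.host = host
        self.port = port
        self.user = user
        self.catalog = catalog
        self.schema = schema


def connect(*args, **kwargs):
    """Mimics a dbapi-style connect() that forwards everything to Connection."""
    return Connection(*args, **kwargs)


class PrestoLike:
    def __init__(self, **kw):
        # Whatever arrives in kw is handed straight to the driver.
        self._conn = connect(**kw)


class TrinoLike(PrestoLike):
    def __init__(self, **kw):
        # Suspected pattern (assumption): the parent initializer is called
        # without the parsed connection arguments, so kw is empty there.
        super().__init__()


class TrinoLikeForwarding(PrestoLike):
    def __init__(self, **kw):
        # Forwarding kw lets `host` reach the driver.
        super().__init__(**kw)


def try_connect(cls, **kw):
    """Return 'ok' on success, or the TypeError message on failure."""
    try:
        cls(**kw)
        return "ok"
    except TypeError as exc:
        return str(exc)


result_broken = try_connect(TrinoLike, host="localhost", port=8082)
result_forwarding = try_connect(TrinoLikeForwarding, host="localhost", port=8082)
print(result_broken)
print(result_forwarding)
```

If that reading is right, the URI itself is fine and the arguments are being dropped between the two __init__ calls.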

I'm running data-diff version 0.9.17.

Additionally, the documentation does not seem to include the /<catalog>/<schema> structure (for both Presto and Trino), so following the documented pattern of trino://<username>:<password>@<hostname>:8080/<database> results in
ValueError: URI must specify 'schema'. Expected format: trino://<user>@<host>/<catalog>/<schema>
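
To make the difference between the two URI shapes concrete, here is a small sketch using only the Python standard library. It is not data-diff's own parser; split_trino_uri is a hypothetical helper, and the example hostname is made up. It only shows that the documented shape has one path segment while the error message expects two.

```python
from urllib.parse import urlsplit


def split_trino_uri(uri):
    """Split a trino:// URI into connection parts (hypothetical helper)."""
    parts = urlsplit(uri)
    # Drop empty strings produced by leading or trailing slashes.
    path_segments = [seg for seg in parts.path.split("/") if seg]
    if len(path_segments) != 2:
        raise ValueError("expected /<catalog>/<schema> in the URI path")
    catalog, schema = path_segments
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "catalog": catalog,
        "schema": schema,
    }


# Shape from the error message: two path segments.
parsed = split_trino_uri(
    "trino://postgres:Password1@localhost:8082/postgres/postgres"
)
print(parsed["host"], parsed["port"], parsed["catalog"], parsed["schema"])

# Shape from the documentation: a single path segment, which is rejected.
try:
    split_trino_uri("trino://user:secret@example-host:8080/database")
    documented_shape_error = None
except ValueError as exc:
    documented_shape_error = str(exc)
print(documented_shape_error)
```

With standard URI parsing, the host and port are present in the URI from my command, which is why the "missing 'host'" error looks like a connector problem rather than a malformed connection string.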

This issue has been marked as stale because it has been open for 60 days with no activity. If you would like the issue to remain open, please comment on the issue and it will be added to the triage queue. Otherwise, it will be closed in 7 days.

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment and it will be reopened for triage.