datafold/data-diff

Tightening of pydantic dependency following release of v0.11.1

FChmiel opened this issue · 7 comments

Describe the bug
The recent release changed the dependency of pydantic from "1.10.12" to ">=1.10.12".

This is breaking pipelines that use data-diff, because pydantic version 2 introduces breaking changes.

Would this version be more appropriate:

pydantic = ">=1.10.12, <2.0.0"

Appreciate the stance outlined here.

We don't use pydantic in our service and it feels peculiar for us to restrict a transitive dependency, but that may be my misunderstanding.

We don't use pydantic in our service

https://github.com/datafold/data-diff/blob/master/data_diff/dbt_config_validators.py#L3

Traceback (most recent call last):
  File "/tmp/myvenv/bin/data-diff", line 5, in <module>
    from data_diff.__main__ import main
  File "/tmp/myvenv/lib/python3.10/site-packages/data_diff/__main__.py", line 18, in <module>
    from data_diff.dbt import dbt_diff
  File "/tmp/myvenv/lib/python3.10/site-packages/data_diff/dbt.py", line 22, in <module>
    from data_diff.cloud import DatafoldAPI, TCloudApiDataDiff, TCloudApiOrgMeta
  File "/tmp/myvenv/lib/python3.10/site-packages/data_diff/cloud/__init__.py", line 2, in <module>
    from data_diff.cloud.data_source import get_or_create_data_source
  File "/tmp/myvenv/lib/python3.10/site-packages/data_diff/cloud/data_source.py", line 18, in <module>
    from data_diff.dbt_parser import DbtParser
  File "/tmp/myvenv/lib/python3.10/site-packages/data_diff/dbt_parser.py", line 13, in <module>
    from data_diff.dbt_config_validators import ManifestJsonConfig, RunResultsJsonConfig
  File "/tmp/myvenv/lib/python3.10/site-packages/data_diff/dbt_config_validators.py", line 6, in <module>
    class ManifestJsonConfig(BaseModel):
  File "/tmp/myvenv/lib/python3.10/site-packages/data_diff/dbt_config_validators.py", line 7, in ManifestJsonConfig
    class Metadata(BaseModel):
  File "/tmp/myvenv/lib/python3.10/site-packages/data_diff/dbt_config_validators.py", line 8, in Metadata
    dbt_version: str = Field(..., regex=r"^\d+\.\d+\.\d+([a-zA-Z0-9]+)?$")
  File "/tmp/myvenv/lib/python3.10/site-packages/pydantic/fields.py", line 751, in Field
    raise PydanticUserError('`regex` is removed. use `pattern` instead', code='removed-kwargs')
pydantic.errors.PydanticUserError: `regex` is removed. use `pattern` instead

For further information visit https://errors.pydantic.dev/2.6/u/removed-kwargs
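
For anyone reading the traceback above: the failure comes from the `regex=` keyword on `Field`, which pydantic 2 replaced with `pattern=`. Below is a minimal sketch of how the keyword could be chosen by major version. The helper names `constraint_kwargs` and `is_valid_dbt_version` are hypothetical and not part of data-diff, and this only covers the one keyword shown in the traceback; other pydantic 2 changes may still affect the module.

```python
import re

# Pattern copied from the Field(...) call shown in the traceback.
DBT_VERSION_PATTERN = r"^\d+\.\d+\.\d+([a-zA-Z0-9]+)?$"


def constraint_kwargs(pydantic_version: str) -> dict:
    """Return the Field keyword accepted by the given pydantic major version.

    Hypothetical helper: pydantic 1.x takes `regex`, pydantic 2.x takes `pattern`.
    """
    major = int(pydantic_version.split(".")[0])
    key = "pattern" if major >= 2 else "regex"
    return {key: DBT_VERSION_PATTERN}


def is_valid_dbt_version(value: str) -> bool:
    """Check a version string against the same pattern using the stdlib only."""
    return re.match(DBT_VERSION_PATTERN, value) is not None


# Possible usage inside a model definition (assumes pydantic is installed):
#   import pydantic
#   dbt_version: str = pydantic.Field(..., **constraint_kwargs(pydantic.VERSION))
```

This keeps a single pattern definition and defers the keyword choice to import time, so the same module could in principle load under either major version.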

Experiencing the exact same issue. Can confirm data-diff 0.11.1 with the redshift extra is broken.

Same issue here (now with data-diff 0.11.1 and the snowflake extra). Adding pydantic = ">=1.10.12, <2.0.0" explicitly indeed seems to resolve the issue.
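
In case it helps others confirm what their environment resolved to, here is a rough sketch of checking a version string against the `>=1.10.12, <2.0.0` range from this thread. The function names are hypothetical, and the parsing assumes plain numeric `X.Y.Z` versions (no pre-release suffixes); a real project would more likely rely on its package manager or the `packaging` library.

```python
def parse_version(version: str) -> tuple:
    """Turn '1.10.12' into (1, 10, 12). Assumes purely numeric segments."""
    return tuple(int(part) for part in version.split(".")[:3])


def satisfies_pin(version: str) -> bool:
    """True if the version falls within >=1.10.12, <2.0.0."""
    return (1, 10, 12) <= parse_version(version) < (2, 0, 0)


# Possible usage against the installed package (assumes pydantic is installed):
#   from importlib.metadata import version
#   print(satisfies_pin(version("pydantic")))
```

Comparing tuples of integers avoids the string-comparison trap where "1.9" would sort after "1.10".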

Yeah, I would argue it's annoying for me to have to restrict a transitive dependency, so I'm hoping we can restrict it within this package.

This issue has been marked as stale because it has been open for 60 days with no activity. If you would like the issue to remain open, please comment on the issue and it will be added to the triage queue. Otherwise, it will be closed in 7 days.

Hi @FChmiel et al,

I'm sorry for the delay in responding. Thank you for trying out data-diff and for opening this issue!

We made a hard decision to sunset the data-diff package and won't provide further development or support. Diffing functionality will continue to be available in Datafold Cloud. Feel free to take it for a trial or contact us at support@datafold.com if you have any questions.

-Gleb