capitalone/datacompy
Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more!
Python · Apache-2.0
Issues
Please add Snowpark support
#290 opened by achrusciel - 0
Drop support for Python 3.9 by the EoY.
#342 opened by fdosani - 5
Fixes for numpy 2.0 support
#323 opened by fdosani - 3
Add column names to column summary
#334 opened by MariusMerkleQC - 3
pandas df with datetime.date cols returns false differences if ignore_case=True
#327 opened by sundeep-longledge - 0
switch to ruff for linting and all the things.
#289 opened by fdosani - 9
Polars v1 support
#324 opened by fdosani - 3
SparkSQLCompare only checks for instance "pyspark.sql.DataFrame", but not for instance "pyspark.sql.connect.dataframe.DataFrame"
#320 opened by achrusciel - 4
SparkSQLCompare only works when grpcio and protobuf are installed manually outside of datacompy
#319 opened by shreya-goddu - 1
SparkCompare fails on Databricks DBR Spark clusters with Unity Catalog enabled
#312 opened by lingeshr-db - 10
datacompy v0.12 spark sample with 5 rows only takes more than a minute to execute on databricks
#300 opened by satniks - 14
Are there plans to support Python 3.12.1?
#262 opened by RicardoEscobar - 1
Fugue support for extra helper functions from core
#214 opened by fdosani - 2
[Discussion] Deprecate the native Spark implementation in favour of Fugue or Pandas on Spark
#274 opened by fdosani - 0
Just going to add a note here for future, currently seeing a small difference in pandas vs spark report sample rows when there are rows only in one dataframe.
#288 opened by fdosani - 1
SparkCompare [PARSE_SYNTAX_ERROR] if a non-join column name contains unicode symbols
#284 opened by kformanowicz-dotdata - 2
SparkCompare [PARSE_SYNTAX_ERROR] if column name contains unicode symbols
#280 opened by kformanowicz-dotdata - 0
Add list of dissimilar columns to report
#235 opened by janinebp - 4
Abstract base class for native Compare functionality
#260 opened by fdosani - 12
Python 3.11 support
#227 opened by fdosani - 0
edgetest is broken and needs some investigating.
#267 opened by fdosani - 9
Issue in writing report
#256 opened by rangav07 - 0
Snowflake and SQL support via Fugue
#264 opened by fdosani - 2
who can help make the result significantly
#255 opened by swloveydp - 1
Add mypy to the project
#247 opened by aguiddir - 2
confused about df_unq_rows
#243 opened by swloveydp - 6
No objects to concatenate issue with Fugue
#218 opened by fdosani - 3
The intersection logic of Compare has problems.
#221 opened by goodwanghan - 4
Datacompare for Date field is not working
#230 opened by RRajavel - 1
SparkCompare() not working for dask - dropDuplicates
#233 opened by hb0313 - 4
Speed up spark unit tests
#223 opened by krishnanravi - 0
Pandas 2.0 support
#213 opened by fdosani - 1
documentation about fugue functionality
#203 opened by fdosani - 0
modernize docs
#204 opened by fdosani - 2
Fugue Phase 2 functionality
#206 opened by fdosani