Fugue support for extra helper functions from core
fdosani opened this issue · 2 comments
fdosani commented
The core Pandas code currently has a number of helper functions which I think are generally very helpful.
We need to spend some time exploring them and seeing which ones can be mirrored/included in the Fugue implementation.
This is a list of most of the functions. Not all of them will make sense to move over, but we should investigate which ones do:
- df1_unq_columns (#217)
- df2_unq_columns (#217)
- intersect_columns (#217)
- all_columns_match (#219)
- all_rows_overlap (#244)
- count_matching_rows (#294)
- intersect_rows_match
- matches (is_match)
- subset
- sample_mismatch
- all_mismatch
- columns_equal
- compare_string_and_date_columns
Need to look into these a bit more. Low priority for now.
- get_merged_columns
- temp_column_name
- calculate_max_diff
- generate_id_within_group
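For context, here is a rough sketch of what the first four column-level helpers in the list compute, written against plain pandas DataFrames. It is only an illustration: the standalone function signatures taking `df1` and `df2` are assumptions made for this example, and it is not the existing core or Fugue implementation.

```python
import pandas as pd


def df1_unq_columns(df1: pd.DataFrame, df2: pd.DataFrame) -> list:
    """Columns that appear only in df1, in df1's column order."""
    other = set(df2.columns)
    return [c for c in df1.columns if c not in other]


def df2_unq_columns(df1: pd.DataFrame, df2: pd.DataFrame) -> list:
    """Columns that appear only in df2, in df2's column order."""
    other = set(df1.columns)
    return [c for c in df2.columns if c not in other]


def intersect_columns(df1: pd.DataFrame, df2: pd.DataFrame) -> list:
    """Columns present in both frames, in df1's column order."""
    other = set(df2.columns)
    return [c for c in df1.columns if c in other]


def all_columns_match(df1: pd.DataFrame, df2: pd.DataFrame) -> bool:
    """True when neither frame has a column the other lacks."""
    return not df1_unq_columns(df1, df2) and not df2_unq_columns(df1, df2)


# Hypothetical sample data, purely for illustration.
df1 = pd.DataFrame({"id": [1, 2], "a": [1.0, 2.0], "b": ["x", "y"]})
df2 = pd.DataFrame({"id": [1, 2], "a": [1.0, 2.5], "c": [True, False]})

print(df1_unq_columns(df1, df2))
print(df2_unq_columns(df1, df2))
print(intersect_columns(df1, df2))
print(all_columns_match(df1, df2))
```

These only inspect column names, never row data, which is probably why they were the easiest ones to port first.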
fdosani commented
@goodwanghan @kvnkho FYI, no pressure to contribute, but this is something in our backlog that I'm thinking about to ensure full parity in terms of functionality etc.
goodwanghan commented
Sounds good, let's chat about it