[FEATURE] Supporting Aggregation metrics for a group
theajay87 opened this issue · 0 comments
theajay87 commented
Is your feature request related to a problem? Please describe.
We have a use case where we need to generate aggregation metrics such as SUM and MEAN, and scannable metrics such as MAX, MIN, MIN-LENGTH and MAX-LENGTH, for each group defined by a column (or columns) in a DataFrame.
Describe the solution you'd like
Currently, the ScanShareableFrequencyBasedAnalyzer has implementations only for CountDistinct, Distinctness, Entropy, Uniqueness and UniqueValueRatio. I would like to extend a similar implementation to all the other scannable and aggregation metrics so that each metric can be computed at the group level.
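To make the requested behaviour concrete, here is a minimal sketch in plain Python of the per-group result I have in mind. It is not the library's API: the function `grouped_metrics` and its parameters `group_cols`, `value_col` and `text_col` are hypothetical names used only for illustration, and rows are modelled as plain dictionaries rather than a Spark DataFrame. The point it shows is that all of these metrics can be accumulated per group in a single pass over the data.

```python
def grouped_metrics(rows, group_cols, value_col, text_col):
    """Compute SUM, MEAN, MIN, MAX, MIN-LENGTH and MAX-LENGTH per group.

    Hypothetical illustration only: `rows` is an iterable of dicts,
    `group_cols` names the grouping columns, `value_col` is a numeric
    column and `text_col` is a string column.
    """
    acc = {}
    # Single pass: update the running state of the group each row belongs to.
    for row in rows:
        key = tuple(row[c] for c in group_cols)
        value = row[value_col]
        length = len(row[text_col])
        state = acc.get(key)
        if state is None:
            acc[key] = {
                "count": 1,
                "sum": value,
                "min": value,
                "max": value,
                "min_length": length,
                "max_length": length,
            }
        else:
            state["count"] += 1
            state["sum"] += value
            state["min"] = min(state["min"], value)
            state["max"] = max(state["max"], value)
            state["min_length"] = min(state["min_length"], length)
            state["max_length"] = max(state["max_length"], length)

    # Derive the final metrics (the mean needs both sum and count).
    return {
        key: {
            "sum": s["sum"],
            "mean": s["sum"] / s["count"],
            "min": s["min"],
            "max": s["max"],
            "min_length": s["min_length"],
            "max_length": s["max_length"],
        }
        for key, s in acc.items()
    }
```

Calling `grouped_metrics(rows, ["country"], "amount", "name")` would return one metrics dictionary per distinct `country` value, keyed by a tuple of the grouping values.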
Describe alternatives you've considered
- One option is to run the groupBy clause on the DataFrame externally and split it into one DataFrame per group, then iterate over the groups and call the analyzer on each one.
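The workaround above can be sketched roughly as follows, again in plain Python rather than against the real library. `split_by_group`, `analyze` and `analyze_per_group` are hypothetical helpers; `analyze` is only a stand-in for running an analyzer over one complete dataset.

```python
from collections import defaultdict


def split_by_group(rows, group_cols):
    """Split rows (dicts) into one list per distinct combination of group values."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[c] for c in group_cols)].append(row)
    return dict(groups)


def analyze(rows, column):
    """Hypothetical stand-in for one analyzer run over a whole dataset."""
    values = [row[column] for row in rows]
    return {
        "sum": sum(values),
        "mean": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
    }


def analyze_per_group(rows, group_cols, column):
    """Run the analyzer once per group, after materialising every group."""
    return {
        key: analyze(group, column)
        for key, group in split_by_group(rows, group_cols).items()
    }
```

The drawback is visible in the shape of the code: the data is first split into separate groups, and the analyzer is then invoked once per group, so the work grows with the number of groups instead of being shared in one scan.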