[Feature]: Clustering: Optimal k
bdiptesh opened this issue · 0 comments
Is your feature request related to a problem? Please describe.
A clustering module that clusters any given data (categorical/continuous/ordinal) and returns the optimal clustering solution.
Describe the solution you'd like
Compute the optimal number of clusters (k) using the gap statistic.
Methods:
- First SE
- Maximum Gap
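For reference, a minimal sketch of how the two methods could pick k once the within-cluster dispersion has been computed for the data and for B reference datasets. It assumes the standard definitions, Gap(k) = mean_b(log W*_kb) - log W_k and s_k = sd_k * sqrt(1 + 1/B), and reads "First SE" as the smallest k with Gap(k) >= Gap(k+1) - s_(k+1). The function names and the method strings `"first_se"` / `"max_gap"` are hypothetical, not part of the existing package, and the clustering step that produces the dispersions is left out.

```python
import math
from statistics import fmean, pstdev
from typing import List, Sequence, Tuple


def gap_and_se(
    log_w: Sequence[float],
    log_w_ref: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Return (gaps, ses) for k = 1..K.

    log_w[i]     : log within-cluster dispersion of the data for k = i + 1
    log_w_ref[i] : the same quantity on B reference datasets for k = i + 1
    """
    gaps, ses = [], []
    for observed, reference in zip(log_w, log_w_ref):
        b = len(reference)
        gaps.append(fmean(reference) - observed)
        # Population standard deviation, inflated for simulation error.
        ses.append(pstdev(reference) * math.sqrt(1 + 1 / b))
    return gaps, ses


def select_k(gaps: Sequence[float], ses: Sequence[float],
             method: str = "first_se") -> int:
    """Pick k from gap values; method names here are hypothetical."""
    if method == "max_gap":
        # k with the largest gap (smallest such k on ties).
        return max(range(len(gaps)), key=gaps.__getitem__) + 1
    if method == "first_se":
        # Smallest k whose gap is within one SE of the next k's gap.
        for i in range(len(gaps) - 1):
            if gaps[i] >= gaps[i + 1] - ses[i + 1]:
                return i + 1
        # Assumption: fall back to the largest k tried if no k qualifies.
        return len(gaps)
    raise ValueError(f"unknown method: {method!r}")
```

With `gaps = [0.1, 0.5, 0.45, 0.6]` and `ses = [0.05, 0.05, 0.1, 0.05]`, the first-SE rule stops at k = 2 while the maximum-gap rule picks k = 4, which is why the choice of method should be exposed to the caller.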
Expected input(s)
df: pandas.DataFrame
x_var: List[str]
max_cluster: int
method: str
Expected output(s)
opt_k: int
Additional context
No response
Acceptance criteria
Integration tests:
- Categorical variables only
- Continuous variables only
- Ordinal variables only
- Combination of categorical/ordinal/continuous
Version
v0.4.0