multiclass dataset
Is your proposed hybrid approach applicable to multi-class datasets?
Oversampling more than two classes is not currently possible.
Thanks for your quick answer. I am working on a multi-class imbalanced dataset and am trying to balance it with SCUT: https://www.researchgate.net/publication/301453161_SCUT_Multi-Class_Imbalanced_Data_Classification_using_SMOTE_and_Cluster-based_Undersampling
But I've run into problems with the implementation, because the paper gives no further details about the strategy it uses.
If you're interested, we could collaborate and propose a new hybrid technique for handling multi-class data.
Thank you, but I'm not interested at the moment. In the future, multi-class support may be added to k-means SMOTE.
Multi-class support is now available with release 0.1.0.
Thanks, Felix. Where can I find it, and do you have a description of your extension?
@rawiasammout, you can simply pip install kmeans-smote as shown in the readme. The latest version (0.1.0) implements the multi-class case.
To use kmeans-smote on multi-class data, there is nothing special to consider: just call fit_sample on your multi-class data and tune the algorithm's hyperparameters (see docs) to your needs. There is also a small test implemented to oversample a multi-class toy dataset.
From an algorithmic perspective, the second step of the algorithm (see readme > about) is repeated for each minority class. The paper referenced in the readme should provide you with a thorough explanation of how the algorithm works for the binary case - the multi-class case is only a small modification.
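To illustrate what "repeated for each minority class" can look like, here is a minimal sketch in plain Python. It is not the kmeans-smote implementation: the function name `oversample_per_class` is hypothetical, the k-means clustering and filtering step is left out entirely, and new points are simply interpolated between two randomly chosen samples of the same class (a simplified SMOTE-style step without nearest-neighbour search). It only shows the outer loop that brings every minority class up to the majority-class count.

```python
import random
from collections import Counter, defaultdict


def oversample_per_class(X, y, seed=0):
    """Bring every minority class up to the majority-class count.

    Simplified illustration (hypothetical helper, not the library code):
    new points are interpolated between two random same-class samples,
    and the clustering step of k-means SMOTE is omitted.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the majority class

    # Group the sample rows by class label.
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    X_out, y_out = list(X), list(y)
    # The generation step runs once per class; for the majority class
    # the deficit is zero, so nothing is generated.
    for label, rows in by_class.items():
        for _ in range(target - len(rows)):
            a, b = rng.choice(rows), rng.choice(rows)
            t = rng.random()
            # Point on the segment between a and b. A class with a single
            # sample can only be duplicated this way.
            X_out.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
            y_out.append(label)
    return X_out, y_out


X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3],  # class 0 (majority)
     [5.0, 5.0], [5.2, 5.1],                          # class 1
     [9.0, 0.0]]                                      # class 2
y = [0, 0, 0, 0, 1, 1, 2]

X_res, y_res = oversample_per_class(X, y)
print(Counter(y_res))  # every class now has as many samples as class 0
```

In the actual library the per-class step additionally clusters the data and decides how many samples to generate in each cluster; the readme and the referenced paper describe that part.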