How do I use this package with image data?
apatra9 opened this issue · 5 comments
I want to use this kind of package with image data. How should I provide the input? Would converting the images into a numpy array and feeding that in work? I am also having issues with the 'Data' and 'Target' labels.
Hi @apatra9, all the functionality implemented in the package is for binary/multiclass classification problems, and it operates on feature vectors (data) and the corresponding class labels (target). Applying it directly to image data is not recommended and is not an intended use case.
If the problem at hand is classifying images, and one of the classes has significantly fewer elements than the other, you can use this package in the following way.
First, you need to represent the images as relatively low-dimensional feature vectors (on the order of 10 to 1000 dimensions) by extracting various image descriptors or using autoencoders. Once you have the feature vector representation of the images and the corresponding class labels, you can feed them to the oversampling techniques implemented in the package to get a balanced dataset. That balanced dataset can then be used to train a classifier, which can be expected to perform better than a classifier trained on the imbalanced dataset.
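To make the workflow above concrete, here is a minimal sketch that uses only numpy. It assumes PCA as the feature extractor (a simple stand-in for image descriptors or an autoencoder) and a hand-rolled SMOTE-style interpolation as a stand-in for the package's oversamplers. The helper names `pca_features` and `oversample_minority` are hypothetical and not part of the package, and the images are random synthetic data.

```python
import numpy as np

def pca_features(images, n_components):
    """Flatten the images and project them onto the top principal components."""
    X = images.reshape(len(images), -1).astype(float)
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return (X - mean) @ vt[:n_components].T

def oversample_minority(X, y, k=5, seed=0):
    """SMOTE-style interpolation for binary labels (hypothetical helper).

    Needs at least two minority samples. New points are drawn on the line
    segment between a minority sample and one of its k nearest minority
    neighbors until both classes have the same size.
    """
    rng = np.random.default_rng(seed)
    labels, counts = np.unique(y, return_counts=True)
    minority = labels[np.argmin(counts)]
    X_min = X[y == minority]
    n_new = counts.max() - counts.min()
    k = min(k, len(X_min) - 1)

    # Pairwise distances inside the minority class; a sample is never its own neighbor.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]

    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbors[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    X_new = X_min[base] + gap * (X_min[picked] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])

rng = np.random.default_rng(42)
images = rng.random((60, 16, 16))      # synthetic stand-in for real images
y = np.array([0] * 50 + [1] * 10)      # class 1 is the minority class

X = pca_features(images, n_components=10)   # data: one feature vector per image
X_bal, y_bal = oversample_minority(X, y)    # target: one class label per image
print(X_bal.shape, np.bincount(y_bal))      # (100, 10) [50 50]
```

In practice the interpolation step would be replaced by one of the package's oversamplers, and `X_bal`, `y_bal` would then be passed to the classifier of your choice.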
@gykovacs I have used LLE_smote on image data. LLE maps high-dimensional data into a lower-dimensional space, so LLE_smote first brings the high-dimensional image data into a lower dimension and applies SMOTE there. It then maps the oversampled data back to the original dimension. I feel this is useful. Let me know if I can create a PR to add that example.
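The reduce, oversample, map-back idea can be sketched roughly as follows. This is not the LLE_smote implementation: PCA is swapped in for LLE because it has a simple linear inverse, and the interpolation uses random pairs of minority samples rather than nearest neighbors. The image data is synthetic and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
minority_images = rng.random((12, 8, 8))   # synthetic stand-in for minority-class images
X = minority_images.reshape(len(minority_images), -1)

# Step 1: reduce the dimension (PCA via SVD, used here instead of LLE).
mean = X.mean(axis=0)
_, _, vt = np.linalg.svd(X - mean, full_matrices=False)
components = vt[:5]                        # keep 5 principal directions
Z = (X - mean) @ components.T              # low-dimensional representation

# Step 2: oversample in the low-dimensional space by interpolating
# between randomly chosen pairs of minority samples.
n_new = 20
i = rng.integers(0, len(Z), size=n_new)
j = rng.integers(0, len(Z), size=n_new)
gap = rng.random((n_new, 1))
Z_new = Z[i] + gap * (Z[j] - Z[i])

# Step 3: map the synthetic points back to the original pixel space.
X_new = Z_new @ components + mean
synthetic_images = X_new.reshape(-1, 8, 8)
print(synthetic_images.shape)              # (20, 8, 8)
```

The reconstructed images are only approximations, and their pixel values can fall slightly outside the original range, so clipping may be needed before they are used as images.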
@sakethbachu Sounds very interesting, sure, please go ahead, and add the example!
I will create a PR by this weekend, thank you :)