Question - Is there any benchmarking of `ClassImbalance.jl`'s ability to handle datasets of different sizes?

Opened this issue 5 years ago · 1 comment

Describe the bug

This is just a question about ClassImbalance.jl's ability to handle datasets of different sizes. I was working with the Python imbalanced-learn package, and it keeps crashing when I give it a dataset of more than 2-3 million rows. With imbalanced data, datasets this large are to be expected, since it takes so many negative examples to get a single positive one. I can find creative ways to "thin" the dataset, but I was wondering whether there are any tests of how the Julia package handles larger datasets.
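For concreteness, one simple way to "thin" such a dataset is to randomly undersample the majority class before resampling. The sketch below is only an illustration under stated assumptions: it uses the Python standard library, the helper name `thin_majority` and the synthetic data are hypothetical, and it is not part of imbalanced-learn or ClassImbalance.jl.

```python
import random

def thin_majority(rows, labels, majority_label, keep_fraction, seed=0):
    """Keep every non-majority row and a random fraction of majority rows.

    Hypothetical helper for illustration; not an API of any library.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the thinning is reproducible
    kept_rows, kept_labels = [], []
    for row, label in zip(rows, labels):
        # Minority rows are always kept; majority rows are kept with
        # probability keep_fraction.
        if label != majority_label or rng.random() < keep_fraction:
            kept_rows.append(row)
            kept_labels.append(label)
    return kept_rows, kept_labels

# Synthetic data (an assumption for the example): 10,000 rows, 1% positives.
labels = [1 if i % 100 == 0 else 0 for i in range(10_000)]
rows = [[float(i)] for i in range(10_000)]

thin_rows, thin_labels = thin_majority(
    rows, labels, majority_label=0, keep_fraction=0.1
)
print(len(thin_rows), thin_labels.count(1))
```

All positive rows survive, while roughly 10% of the negatives do, so the class ratio improves and the dataset shrinks at the same time.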

Thanks.

Desktop (please complete the following information):

OS: Ubuntu 18.04 LTS x64
Browser: Firefox
Version: 71

Answer 1 · 2019-12-22T22:17:04.000Z

We currently don't have any benchmarks, but a pull request to add some benchmarks would be welcome!

I think there is room to improve the performance of this package.
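
As a rough idea of what a size-scaling benchmark could measure, here is a minimal sketch of a timing harness. It is written in Python purely for illustration and makes several assumptions: `random_oversample` is a naive stand-in resampler (not SMOTE and not anything from ClassImbalance.jl or imbalanced-learn), `time_by_size` is a hypothetical helper, and the data is synthetic with a 1% positive rate. A benchmark for this package would be written in Julia against its own functions.

```python
import random
import time

def random_oversample(labels, minority_label, seed=0):
    """Stand-in resampler: return row indices in which minority rows are
    re-drawn with replacement until both classes have the same size."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority = [i for i, y in enumerate(labels) if y != minority_label]
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra

def time_by_size(resample, sizes, positive_rate=0.01):
    """Time `resample` on synthetic label vectors of increasing size."""
    results = []
    for n in sizes:
        n_pos = max(1, int(n * positive_rate))
        labels = [1] * n_pos + [0] * (n - n_pos)
        start = time.perf_counter()
        out = resample(labels, 1)
        elapsed = time.perf_counter() - start
        results.append((n, len(out), elapsed))
    return results

results = time_by_size(random_oversample, sizes=[1_000, 10_000, 100_000])
for n, n_out, seconds in results:
    print(f"{n:>9} rows -> {n_out:>9} rows in {seconds:.4f} s")
```

Extending `sizes` into the millions and recording peak memory alongside wall-clock time would show where a given resampler stops scaling.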