Question - Is there any benchmarking of `ClassImbalance.jl`'s ability to handle datasets of different sizes?
Describe the bug
This is just a question about `ClassImbalance.jl`'s ability to handle datasets of different sizes. I was working with the Python imbalanced-learn package, and it keeps crashing when I give it a dataset of more than 2-3 million rows. With imbalanced data, datasets this large are to be expected, since it takes so many negative examples to get a single positive one. I can find creative ways to "thin" the dataset, but I was wondering whether there are any tests of how the Julia package handles larger datasets.
Thanks.
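For reference, one simple way to "thin" such a dataset is random undersampling of the larger class. The sketch below is only an illustration under assumptions: it uses the Python standard library rather than any imbalanced-learn or `ClassImbalance.jl` API, and the function name `undersample_majority` and its `ratio` parameter are hypothetical.

```python
import random
from collections import defaultdict


def undersample_majority(rows, labels, ratio=1.0, seed=0):
    """Randomly thin larger classes to at most ratio * (smallest class size).

    rows and labels are parallel sequences. Returns new (rows, labels)
    lists; the inputs are not modified. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    minority_size = min(len(idx) for idx in by_class.values())
    cap = max(1, int(ratio * minority_size))

    keep = []
    for idx in by_class.values():
        # Classes at or under the cap are kept whole; larger ones are sampled.
        keep.extend(idx if len(idx) <= cap else rng.sample(idx, cap))

    keep.sort()  # preserve the original row order
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Synthetic example: 1,000 negatives and 10 positives.
labels = [0] * 1000 + [1] * 10
rows = list(range(len(labels)))
thin_rows, thin_labels = undersample_majority(rows, labels, ratio=5.0)
```

With `ratio=5.0` the negative class is cut to five times the size of the positive class, while every positive row is kept. Whether discarding that many negatives is acceptable depends on the model and the data.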
Desktop:
- OS: Ubuntu 18.04 LTS x64
- Browser: Firefox
- Version: 71
Smartphone: NA
Additional context
We currently don't have any benchmarks, but a pull request to add some benchmarks would be welcome!
I think there is room to improve the performance of this package.
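As a starting point for anyone picking this up, the general shape of a size-scaling benchmark is sketched below: generate synthetic imbalanced data at increasing sizes and time one resampling call at each size. This is an assumption-laden illustration in standard-library Python, not code for this package. `make_imbalanced`, `random_oversample`, and `benchmark` are hypothetical names, and `random_oversample` is a naive stand-in for whatever resampler is being measured.

```python
import random
import time


def make_imbalanced(n_rows, positive_rate=0.01, seed=0):
    """Synthetic data: roughly positive_rate of the rows get label 1."""
    rng = random.Random(seed)
    labels = [1 if rng.random() < positive_rate else 0 for _ in range(n_rows)]
    rows = [(rng.random(), rng.random()) for _ in range(n_rows)]
    return rows, labels


def random_oversample(rows, labels, seed=0):
    """Hypothetical stand-in: duplicate minority rows until classes match."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    if not minority:
        # Nothing to duplicate; return copies unchanged.
        return list(rows), list(labels)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = list(range(len(labels))) + extra
    return [rows[i] for i in idx], [labels[i] for i in idx]


def benchmark(resampler, sizes, repeats=3):
    """Return {size: best wall-clock seconds over `repeats` runs}."""
    results = {}
    for n in sizes:
        rows, labels = make_imbalanced(n)  # data generation is not timed
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            resampler(rows, labels)
            timings.append(time.perf_counter() - start)
        results[n] = min(timings)
    return results


timings = benchmark(random_oversample, sizes=[1_000, 10_000, 100_000])
for n, seconds in timings.items():
    print(f"{n:>9,} rows: {seconds:.4f} s")
```

Taking the minimum over a few repeats reduces noise from other processes. A benchmark contributed to the Julia package itself would more likely be written in Julia, for example with BenchmarkTools.jl, and would also want to track memory use, since running out of memory is a likely cause of crashes at millions of rows.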