bcbi/ClassImbalance.jl

Question: Is there any benchmarking of `ClassImbalance.jl`'s ability to handle datasets of different sizes?

Opened this issue · 1 comment

Describe the bug

This is just a question about ClassImbalance.jl's ability to handle datasets of different sizes. I was working with the Python imbalanced-learn package, and it keeps crashing when I give it a dataset of more than 2-3 million rows. With imbalanced data, datasets this large are to be expected, since it takes so many negative examples to get a positive one. I can find creative ways to "thin" the dataset, but I was wondering whether there are any tests of how the Julia package handles larger datasets?

Thanks.

Expected behavior

Screenshots

Desktop (please complete the following information):

  • OS: Ubuntu 18.04 LTS x64
  • Browser: Firefox
  • Version: 71

Smartphone (please complete the following information):

NA

Additional context

We currently don't have any benchmarks, but a pull request to add some benchmarks would be welcome!

I think there is room to improve the performance of this package.
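
In the meantime, the "thinning" workaround mentioned in the question can be done without any resampling library. The following is only a minimal sketch in plain Python, assuming the rows and labels fit in memory as lists. The function name `undersample_majority` and its `ratio` parameter are hypothetical and are not part of ClassImbalance.jl or imbalanced-learn.

```python
import random
from collections import defaultdict


def undersample_majority(rows, labels, ratio=1.0, seed=0):
    """Randomly thin every class down to at most ratio * (smallest class size).

    Hypothetical helper for illustration only; not a library API.
    """
    rng = random.Random(seed)  # fixed seed so the thinning is reproducible

    # Group row indices by class label.
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    # The smallest class sets the cap for all larger classes.
    minority_size = min(len(ix) for ix in by_class.values())
    cap = max(1, int(ratio * minority_size))

    keep = []
    for ix in by_class.values():
        if len(ix) > cap:
            ix = rng.sample(ix, cap)  # sample without replacement
        keep.extend(ix)

    keep.sort()  # preserve the original row order
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Example: 1000 negatives and 10 positives, keeping 3 negatives per positive.
labels = [0] * 1000 + [1] * 10
rows = list(range(len(labels)))
X_small, y_small = undersample_majority(rows, labels, ratio=3.0)
```

Thinning this way before oversampling keeps every minority row and shrinks only the larger classes, so the input handed to a resampler is much smaller. Whether that is enough to avoid the crash described above would still need to be measured.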