scikit-learn-contrib/DESlib

Performance of DS methods is not the same when the pool changes

jayahm opened this issue · 4 comments

Hi,

I used the same DS method, but with different pools.

What happened was that the DS methods did not show the same performance in terms of ranking.

For example, DS Method X was the best on Pool A but not on Pool B.

That is normal: performance always depends on the pool, so changing the pool will always change the results. That is why it is important to estimate an average performance by running multiple simulations, such as cross-validation or multiple hold-out splits, and then to measure whether the difference in performance is statistically significant using the proper statistical tools.
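For illustration, here is a minimal sketch of that procedure comparing DESlib's KNORA-E and KNORA-U over repeated hold-out splits. The synthetic dataset, the number of splits, the split sizes, and the choice of a Wilcoxon signed-rank test are all assumptions made for the example, not prescriptions from this thread:

```python
import numpy as np
from scipy.stats import wilcoxon
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from deslib.des import KNORAE, KNORAU

# Synthetic data, purely for illustration
X, y = make_classification(n_samples=1000, random_state=0)

scores_a, scores_b = [], []
for seed in range(30):  # 30 repeated hold-out splits
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed)
    # Reserve part of the training data as DSEL, the set DS methods
    # use to estimate the competence of each base classifier
    X_train, X_dsel, y_train, y_dsel = train_test_split(
        X_train, y_train, test_size=0.33, random_state=seed)

    pool = BaggingClassifier(n_estimators=10, random_state=seed)
    pool.fit(X_train, y_train)

    for method, scores in ((KNORAE(pool), scores_a),
                           (KNORAU(pool), scores_b)):
        method.fit(X_dsel, y_dsel)
        scores.append(method.score(X_test, y_test))

print(f"KNORA-E: {np.mean(scores_a):.3f} +/- {np.std(scores_a):.3f}")
print(f"KNORA-U: {np.mean(scores_b):.3f} +/- {np.std(scores_b):.3f}")
# Paired test over the 30 splits. The splits overlap, so treat the
# p-value as indicative rather than exact.
stat, p = wilcoxon(scores_a, scores_b)
print(f"Wilcoxon signed-rank p-value: {p:.4f}")
```

Swapping the hold-out loop for `RepeatedStratifiedKFold` gives the cross-validation variant; either way, the point is to compare the distributions of scores rather than a single run.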

OK. But how do we reach a conclusion about which method is the best?

I ran both heterogeneous and homogeneous pools of classifiers, but I cannot conclude which is better.

Also, I have seen many papers proposing new DS methods whose method outperforms existing methods even though the datasets used were diverse in nature. How could a particular method show superiority in this case?

Any idea what statistical test can be used in this situation?

You should check the machine learning literature on model selection and on how to compare learning algorithms. Some suggested readings are:

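When the comparison spans many datasets, a common recipe in that literature is to rank each method on every dataset and apply the Friedman test, followed by a post-hoc test (e.g. Nemenyi) if the null hypothesis of equivalent performance is rejected. A hedged sketch, where the accuracy matrix is fabricated purely for illustration:

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# rows = datasets, columns = methods (e.g. KNORA-E, KNORA-U, META-DES);
# these numbers are fabricated purely for illustration
accuracies = np.array([
    [0.91, 0.89, 0.92],
    [0.85, 0.86, 0.84],
    [0.78, 0.75, 0.80],
    [0.88, 0.88, 0.90],
    [0.95, 0.93, 0.94],
    [0.82, 0.80, 0.83],
    [0.90, 0.91, 0.89],
])

# Friedman test: null hypothesis = all methods perform equivalently.
# Each argument is one method's scores across all datasets.
stat, p = friedmanchisquare(*accuracies.T)
print(f"Friedman chi-square = {stat:.3f}, p = {p:.4f}")

# Average rank per method (lower = better); when the Friedman null is
# rejected, these ranks feed a post-hoc test such as Nemenyi.
ranks = np.array([rankdata(-row) for row in accuracies])
print("average ranks:", ranks.mean(axis=0))
```

If you need the post-hoc step, packages such as scikit-posthocs implement the Nemenyi test for Friedman-type data.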
Since this space is for reporting bugs and discussing changes/new features in the library, and not for comparing models or for how to properly use statistical tests in machine learning, I'm closing this issue.

Thanks! I appreciate that.