Inconsistent performance of DS methods
jayahm opened this issue · 4 comments
Hi,
I ran some experiments on multiple datasets using several DS methods.
I am confused about why the performance of each DS method is not consistent across datasets.
For example, a given method sometimes ranks X, sometimes Y, sometimes Z (even when a method is the best, it is not the best on all datasets).
This makes it hard for me to draw a conclusion.
Is this normal?
Yes, it is normal. This is the no free lunch theorem: the best model depends on the dataset.
That's why it is very important to use an appropriate evaluation protocol, such as cross-validation to estimate average performance, together with proper statistical tests when comparing multiple machine learning models.
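As a minimal sketch of that workflow (the specific DS methods and dataset here are just placeholders, not a recommendation): repeated cross-validation gives a distribution of scores per method instead of a single, possibly lucky, split, and a Friedman test checks whether the ranking differences are beyond chance. Note that Demšar (2006) recommends running this kind of comparison over multiple datasets rather than over folds of a single one.

```python
import numpy as np
from scipy.stats import friedmanchisquare
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

from deslib.des import KNORAE, KNORAU, METADES

# Toy dataset purely for illustration.
X, y = make_classification(n_samples=1000, random_state=0)

# Example DS methods; any scikit-learn compatible estimators work here.
methods = {
    "KNORA-U": KNORAU(random_state=0),
    "KNORA-E": KNORAE(random_state=0),
    "META-DES": METADES(random_state=0),
}

# Repeated stratified CV averages out the luck of any single split.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=0)

scores = {}
for name, clf in methods.items():
    # With pool_classifiers=None, DESlib fits a default pool internally.
    scores[name] = cross_val_score(clf, X, y, cv=cv)
    print(f"{name}: {scores[name].mean():.3f} +/- {scores[name].std():.3f}")

# Friedman test over the matched folds: a small p-value indicates the
# methods' rankings differ beyond chance (follow up with a post-hoc
# test, e.g. Nemenyi, to find which pairs differ).
stat, p = friedmanchisquare(*scores.values())
print(f"Friedman chi-square = {stat:.3f}, p = {p:.4f}")
```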
Can you explain more about the cross-validation part and the statistical tests?
I mean, not how to do it, but how these two can help with the analysis when the performance is not consistent?
Unfortunately, I can't, since it is a very long subject with plenty of nuances to cover, and here is not the place for that (especially since it is completely out of scope for this project). I can, however, suggest some readings:
- Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning: https://arxiv.org/pdf/1811.12808
- Statistical comparisons of classifiers over multiple data sets: http://www.jmlr.org/papers/volume7/demsar06a/demsar06a.pdf
- Japkowicz, Nathalie, and Mohak Shah. Evaluating Learning Algorithms: A Classification Perspective. Cambridge University Press, 2011.
Thanks! I appreciate that.