Test multi-outcome datasets

Question

Closed this issue 4 years ago · 2 comments

One of our JOSS reviewers stated that run_ml() "doesn't seem to be able to support multi-label classifications." (openjournals/joss-reviews#3073 (comment))

We include a multi-outcome dataset (otu_mini_multi) in the package and results from running it with glmnet (otu_mini_multi_results_glmnet), but I just noticed that we never wrote any unit tests for it!

Answer 1 · 2021-04-30T19:56:53.000Z

Multi-label classification is when each sample/instance can be assigned multiple labels at once, as opposed to a multi-class outcome, where each sample gets exactly one label out of several possible classes. I don't think caret allows this functionality, so it's probably out of scope for this package. We can add it as a limitation to the manuscript and vignette. Thoughts @kelly-sovacool @BTopcuoglu?

I checked, and it doesn't seem like tidymodels supports it. Someone asked about it, but it doesn't look like anyone answered:
https://community.rstudio.com/t/tidymodels-multi-label-classification/102478

I can add in a test for a multi-outcome dataset either way, but I don't think this is what the reviewer was referencing.
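To make the terminology concrete, here is a rough sketch in plain Python (not mikropml or caret code). The class names `a`, `b`, `c` and the helper functions `to_indicator_matrix` and `is_single_outcome` are made up for illustration. It only shows how the two kinds of targets differ in shape: a multi-class outcome has exactly one label per sample, while a multi-label outcome can have zero, one, or several.

```python
# Hypothetical class names, for illustration only.
classes = ["a", "b", "c"]

# Multi-class outcome: exactly one label per sample, drawn from 3 classes.
multiclass_y = ["a", "c", "b", "a"]

# Multi-label outcome: each sample holds a set of labels (possibly empty).
multilabel_y = [{"a"}, {"a", "c"}, set(), {"b"}]


def to_indicator_matrix(label_sets, classes):
    """Encode each sample's label set as one 0/1 column per class."""
    return [[int(c in labels) for c in classes] for labels in label_sets]


def is_single_outcome(rows):
    """True if every sample has exactly one positive class."""
    return all(sum(row) == 1 for row in rows)


# Wrap each multi-class label in a one-element set to reuse the encoder.
multiclass_rows = to_indicator_matrix([{y} for y in multiclass_y], classes)
multilabel_rows = to_indicator_matrix(multilabel_y, classes)

print(is_single_outcome(multiclass_rows))
print(is_single_outcome(multilabel_rows))
```

A single outcome column with several classes can always be written as the first kind of matrix; the second kind cannot be collapsed back into one column without losing information, which is why it needs different modeling support.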

Answer 2 · 2021-04-30T20:08:40.000Z

@zenalapp Ohhh I see, I confused the similar terminology.

Let's note the limitation in one of the vignettes, maybe the intro one? I don't know that we need to add it to the paper, since it's really out of scope and neither caret nor tidymodels supports it.

But yes, either way, we do actually need a unit test for the multi-outcome dataset.