SchlossLab/mikropml

NaN as character causes some problems

Closed this issue · 3 comments

When I run mikropml with any model (this may depend on the dataset), the positive predictive value, negative predictive value, and F1 score can come back as NaN stored as a character. When I then combine the performance results from each model iteration into a single tibble, dplyr::bind_rows() throws an error saying that double and character can't be combined.

The error backtrace is attached below.

[Image: screenshot of the error backtrace]

Hi @sskoldas, can you explain a bit more about how you're getting this error? I see it looks like you're calling mikropml with a Makefile. Can you share the R code that got you here?

NaN values are numeric (you can verify this by running class(NaN)), so dplyr::bind_rows() can combine them with doubles. Maybe there are actually NA_character_ values in your performance tibbles? Or are there literal character strings like "NaN" that need to be changed to NaN with mutate() and if_else()?
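To illustrate the distinction, here is a minimal sketch. The tibbles and column names are made up for illustration and are not taken from the reporter's data; it only shows that a numeric NaN combines fine, while the string "NaN" does not.

```r
library(dplyr)

class(NaN)  # "numeric"

# Hypothetical per-iteration performance tibbles (column names are illustrative)
perf_ok  <- tibble(F1 = 0.8, Pos_Pred_Value = 0.75)
perf_nan <- tibble(F1 = NaN, Pos_Pred_Value = NaN)      # numeric NaN
perf_chr <- tibble(F1 = "NaN", Pos_Pred_Value = "NaN")  # character "NaN"

# Works: both tibbles hold double columns
combined <- bind_rows(perf_ok, perf_nan)

# Fails: a character column cannot be combined with a double column
err <- tryCatch(bind_rows(perf_ok, perf_chr), error = function(e) e)
inherits(err, "error")
```

If the second call is the one that matches your error, the values are strings rather than real NaN.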

I think there are characters like "NaN", because the error goes away when I run this:
performance <- iterative_run_ml_results %>% lapply(pluck, "performance")
map_dfr(performance, ~ mutate(.x, across(where(is.character), as.logical))) %>%
  write_tsv(glue("{root}_performance.tsv"))
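One caveat with the workaround above: as.logical() turns every character value into NA, so any genuinely textual column would be wiped out as well, and "NaN" becomes NA rather than NaN. A possible alternative, sketched here on made-up data, is to convert only the metric columns with as.numeric(), which parses the string "NaN" into a numeric NaN. The `performance` list, the `method` column, and the names in `metric_cols` are assumptions for illustration, not confirmed mikropml output.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-ins for the per-iteration performance tibbles
performance <- list(
  tibble(method = "glmnet", F1 = 0.8, Pos_Pred_Value = 0.75),
  tibble(method = "glmnet", F1 = "NaN", Pos_Pred_Value = "NaN")
)

# Assumed names of the metric columns that may arrive as character
metric_cols <- c("F1", "Pos_Pred_Value")

# Coerce only the metric columns, then row-bind the iterations
performance_tbl <- map_dfr(
  performance,
  ~ mutate(.x, across(all_of(metric_cols), as.numeric))
)
```

This keeps the text column intact and leaves the undefined metrics as numeric NaN, so they can still be filtered with is.nan() later.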

Is your dataset very small, or does it have highly imbalanced outcomes? In those cases the performance may be so bad that some metrics can't be calculated. (Related to #311)
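As a rough illustration of why a metric can be undefined, here is a small sketch with invented confusion-matrix counts, using the standard definitions of positive predictive value and F1. If the model never predicts the positive class, the PPV denominator is zero and the result is NaN, which then propagates into F1.

```r
# Hypothetical counts for a model that never predicts the positive class
tp <- 0   # true positives
fp <- 0   # false positives
fn <- 5   # false negatives
tn <- 95  # true negatives

ppv    <- tp / (tp + fp)  # 0 / 0 gives NaN
recall <- tp / (tp + fn)  # 0 / 5 gives 0
f1     <- 2 * ppv * recall / (ppv + recall)  # NaN propagates into F1

c(ppv = ppv, recall = recall, f1 = f1)
```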