stekhoven/missForest

Error: NA not permitted in predictors


Hi,

I am running into this error and found a workaround that I could not find anywhere online. Here is a reproducible example:

library(tibble)
library(missForest)  # provides prodNA() and missForest()

# attempt 1: passing a tibble directly does not work at all
d <- tibble(
  var = rnorm(n = 100),
  var2 = rbinom(n = 100, size = 100, p = 0.1),
  var3 = rnorm(n = 100),
  var4 = rnorm(n = 100),
)

d$var2 <- as.factor(d$var2)
dmiss <- prodNA(d)
dimprf <- missForest(
  dmiss, 
  variablewise = TRUE, 
  ntree = 1000, 
  decreasing = TRUE)


# attempt 2: converting to a matrix throws the error mentioned in the title
d <- tibble(
  var = rnorm(n = 100),
  var2 = rbinom(n = 100, size = 100, p = 0.1),
  var3 = rnorm(n = 100),
  var4 = rnorm(n = 100),
)

d$var2 <- as.factor(d$var2)
dmiss <- prodNA(d)
# note: as.matrix() on mixed numeric/factor columns returns a character matrix
dimprf <- missForest(
  as.matrix(dmiss),
  variablewise = TRUE,
  ntree = 1000,
  decreasing = TRUE)

# attempt 3: only converting to a data.frame first works
d <- tibble(
  var = rnorm(n = 100),
  var2 = rbinom(n = 100, size = 100, p = 0.1),
  var3 = rnorm(n = 100),
  var4 = rnorm(n = 100),
)

d <- as.data.frame(d)
d$var2 <- as.factor(d$var2)
dmiss <- prodNA(d)
dimprf <- missForest(
  dmiss, 
  variablewise = TRUE, 
  ntree = 1000, 
  decreasing = TRUE)
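
In case it helps others, the workaround can be wrapped in a small guard that is called right before missForest. This is only a sketch: `to_plain_df` is a hypothetical helper name, not part of missForest, and rejecting matrices outright is my own assumption (it avoids the character-matrix situation from attempt 2, but it would also reject purely numeric matrices).

```r
# Hypothetical helper (not part of missForest): make sure the input is a
# base data.frame before it is handed to missForest().
to_plain_df <- function(x) {
  # Assumption: refuse anything that is not data-frame-like, e.g. the
  # result of as.matrix() on mixed numeric/factor columns.
  if (!is.data.frame(x)) {
    stop("expected a data.frame or tibble, got: ", class(x)[1])
  }
  # A tibble inherits from data.frame; as.data.frame() drops the extra
  # tbl_df/tbl classes and leaves the columns unchanged.
  as.data.frame(x)
}

# Usage, assuming the missForest package is installed and `dmiss` is the
# tibble with missing values from the example above:
# dimprf <- missForest(to_plain_df(dmiss), variablewise = TRUE)
```

Calling `class(dmiss)` before and after the conversion is a quick way to see whether some earlier step in a script has turned a data frame into a tibble.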

missForest is not reliable with tibbles; we will look into that.

Thanks for opening this issue, @HenrikEckermann. I was using data frames and everything was working fine. Then at some point in my script I started using a function that converted my data frame to a tibble, and missForest stopped working (Error in { : task 1 failed - "NA not permitted in predictors"). I could not figure out the source of this error and was totally puzzled; I spent a lot of time trying to troubleshoot variables individually. So thanks for providing the workaround of converting to a data frame.

Google seems to bring people to #14, but they should really come here for the solution.