florianhartig/DHARMa

DHARMa: residuals and predictor do not have the same length.

Murphorado opened this issue · 7 comments

Hi Florian,

Hope you are well. After diagnosing the following linear mixed model with the DHARMa package:

fed_mod.cat.tree <- lmer(fed~best_site_avg_temps+deviations+log1p(caterpillars_per_tree)+site_avg_fbb+fbb_deviations+(1|year)+(1|site)+(1|ring), fed_best_window.cat.tree, REML=TRUE)

I got the following output:

[Figure: Rplot]

I don't think this indicates the model is hugely problematic. However, I also wanted to plot the residuals against specific predictors from my dataset. When I attempt to do this with plotResiduals(simulationOutput, form = best_site_avg_temps), I am met with the following error message:

Error in ensurePredictor(simulationOutput, form) :
DHARMa: residuals and predictor do not have the same length. The issue is possibly that you have NAs in your predictor that were removed during the model fit. Remove the NA values from your predictor.

I have removed all NA values from the "best_site_avg_temps" column within the "fed_best_window.cat.tree" dataset, yet I am still getting the same error message, and I am not sure why. Could it be related to my nested experimental design, in which different first egg dates (my response variable, fed) occur within the same site and therefore correspond to the same value of "best_site_avg_temps"?

Any help would be much appreciated.

All the best,
Murphy

Update: I averaged across "site_year" to check the nested design wasn't a problem. I got a much better model fit but I'm still getting the error message when I try to plot the residuals against the predictor variables.

Hello,

This is very likely because the model removes an observation if there is an NA in ANY of the predictors, not just in your focal predictor.

The easiest way to get the right predictors is to run

x = model.frame(fed_mod.cat.tree)

which should extract the data that is used in your model, and then use

x$best_site_avg_temps

as your predictor.
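To see the effect in isolation, here is a minimal sketch using base R's lm and the built-in airquality data, which stand in for the lmer model and dataset above and are assumptions for illustration only. It shows a predictor with no NAs of its own (Wind) ending up longer than the residuals because rows were dropped for NAs in other variables, and the model frame giving a predictor of matching length. The DHARMa calls at the end only run if the package is installed.

```r
# Illustration with built-in data (not the data from this thread).
# In airquality, Ozone and Solar.R contain NAs; Wind and Temp do not.
fit <- lm(Ozone ~ Solar.R + Wind + Temp, data = airquality)

n_data  <- nrow(airquality)        # rows in the original data
n_resid <- length(residuals(fit))  # rows actually used in the fit

# Wind itself has no NAs, yet the raw column is longer than the residuals,
# because rows with an NA in Ozone or Solar.R were removed during fitting.
raw_matches <- length(airquality$Wind) == n_resid  # FALSE

# model.frame() returns exactly the rows the model used.
mf <- model.frame(fit)
frame_matches <- length(mf$Wind) == n_resid        # TRUE

# Optional: plot DHARMa residuals against the matching predictor.
if (requireNamespace("DHARMa", quietly = TRUE)) {
  sim <- DHARMa::simulateResiduals(fit)
  DHARMa::plotResiduals(sim, form = mf$Wind)
}
```

Note that transformed terms are stored in the model frame under their formula name, so for the model in this thread the caterpillar term would presumably be accessed with backticks, as in x$`log1p(caterpillars_per_tree)`.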

Ok - thanks Florian!

I'm having the same issue, but in my case it occurs when plotting the residuals against my fixed factors (Width, Sex and Season). My model looks like this:

par.full <- glm(Paramikrocytos ~ Width + Sex + Site/Season,
data = crab, family = "binomial")

I have no NA values in my dataset. I had no issues getting scaled quantile residuals, checking normality and zero inflation. This is what I try to run before the error message:
plotResiduals(Par.qr,
form = crab$Width,
xlab = "Rank-transformed Width")