JenniNiku/gllvm

Fourth corner model: "Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric"

Opened this issue · 7 comments

Thank you for a useful package. It promises to offer the gains in efficiency we've been looking for.

I have successfully fitted a probit model without traits, but when I try the fourth corner model I get "Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric".

My traits (body length in 3 bins) are binary, and I have ensured that they are stored in a data frame as numeric variables:

traits <- as.data.frame(apply(traits, 2, function(x) as.numeric(x)))
X.data <- data.frame(basin = as.character(meta$basin), time = as.numeric(meta$days))
mod <- gllvm(y = taxa, X = X.data, formula = ~ basin * time, family = binomial(link = "probit")) # Works fine
mod.tr <- gllvm(y = taxa, X = X.data, TR = traits, formula = y ~ (basin * time) + (basin * time):(len.sma + len.med + len.lar), family = binomial(link = "probit")) # Produces the error

Any advice on what might be causing the issue?

Great that you're trying to use the package! Would you mind providing a reproducible example, possibly with some simulated data? Without a reproducible example it's more difficult to find out what's going on.
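
For example, something along these lines would already help (a purely illustrative simulation with arbitrary sizes and no built-in effects, using the variable names from your post; not meant to reproduce your data):

# Illustrative simulated data only: 50 sites, 20 species, two predictors and
# three binary trait bins, with no true effects built in.
set.seed(1)
n <- 50; p <- 20
X.sim  <- data.frame(basin = factor(sample(c("A", "B"), n, replace = TRUE)),
                     time  = rnorm(n))
TR.sim <- data.frame(len.sma = rbinom(p, 1, 0.3),
                     len.med = rbinom(p, 1, 0.3),
                     len.lar = rbinom(p, 1, 0.3))
y.sim  <- matrix(rbinom(n * p, 1, 0.4), n, p,
                 dimnames = list(NULL, paste0("sp", 1:p)))
rownames(TR.sim) <- colnames(y.sim)

library(gllvm)
fit <- gllvm(y = y.sim, X = X.sim, TR = TR.sim,
             formula = y ~ (basin * time) + (basin * time):(len.sma + len.med + len.lar),
             family = binomial(link = "probit"))

If a simulation like that runs fine while your data triggers the error, that already narrows things down to some property of your dataset.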

Hi all,

I've actually seen this issue in my own binomial('probit') adventures, but I've had trouble recreating it with the spider data. I get the same error with the CRAN version, and on the dev version I get an elf_dynamic_array_reader.h(64) tag not found error that crashes R.

Interestingly, when fitting negative binomial models to count response data (allowing NAs in the responses), I would get the same "Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric" with the CRAN version. Switching to the dev version completely fixed the issue there, though.
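
For anyone else who wants to try the dev version, I installed it roughly like this (assuming devtools is available; remotes::install_github() works the same way):

# Install the development version of gllvm from GitHub.
# install.packages("devtools")   # if not already installed
devtools::install_github("JenniNiku/gllvm")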

I honestly think some of us might just need to share obscured versions of our datasets to really figure out what's going on, as it's not apparent to me what properties of my data could be simulated to reproduce the error.
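
One generic thing that might be worth ruling out first, given that the error is thrown by colMeans(), is whether anything non-numeric is reaching the model. A quick check could look like this (object names are placeholders for whatever you pass to gllvm):

# Purely illustrative type checks: the colMeans() error suggests something
# non-numeric ends up in an internal colMeans() call.
str(X)                      # any unexpected character columns in the predictors?
sapply(TR, is.numeric)      # are all trait columns numeric?
is.numeric(as.matrix(Y))    # does the response matrix coerce to numeric?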

Does changing starting.val to "zero" solve your issues?

Sorry for the delay, Bert! For more detail, the model in question doesn't have any latent variables--does starting.val still affect non-LV models?

Yes.

Hi Bert--here's my delayed update. In short, it doesn't seem that changing the starting value has an effect: I've tried both 'zero' and 'random', but both still give elf_dynamic_array_reader.h(64) tag not found errors that lead R to abort.

Here's the model I tried:

test <- gllvm(Y,
              X,
              TR,
              formula = ~ cat_6 + cat_2 + log(num),
              family = binomial('probit'),
              num.lv = 0,
              gradient.check = TRUE,
              control = list(reltol = 1e-16),
              control.start = list(starting.val = 'zero',
                                   n.init = 5))

Where cat_6 is a six-level categorical variable, cat_2 has two levels, and num is a numeric variable. TR is a spoofed matrix as suggested by @tanharri in #109 to get the community-wide response--perhaps something's breaking down on this end? The same model works fine when the spoofed TR is removed.

As mentioned previously, my negative binomial model can handle the spoofed matrix approach just fine (same formula, minus the numeric variable), but the dev version is needed for this.
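
For clarity, by "spoofed" TR I mean (as I understood the suggestion in #109) a trait matrix whose rows are identical for every species, so the trait-by-environment terms collapse to community-wide responses. Roughly, something like this (the column name and construction here are mine, purely to illustrate):

# Hypothetical construction of a constant ("spoofed") trait column: one row per
# species (one per column of Y), identical for all species.
TR <- data.frame(dummy = rep(1, ncol(Y)))
rownames(TR) <- colnames(Y)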

Note that using "n.init" with starting.val = "zero" is pointless, since it will give the same result on every fit.

Sorry, I cannot draw any conclusions from this. My suspicion is that the problem is highly dependent on the dataset in question. Even when I take the same approach, with a different dataset that has a two-level categorical variable and a numerical variable, I still cannot reproduce the issue. So it will require a fully reproducible example, with simulations or a dataset, for me to look into this.