Does GPBoost support more than two levels?
Closed this issue · 1 comments
Dear Fabio,
not sure whether this is the right place to ask and sorry if this is reported elsewhere, but does GPBoost support modeling data with more than two levels (e.g., students nested within schools nested within regions)?
Kind regards,
Dirk
Thanks for your interest in GPBoost!
Yes, it does. You can have as many random effects as you like (crossed and/or nested). If you have more than two levels, you need to do the enumeration of the low level random effects with the helper function get_nested_categories
twice. Below is demo code for how this can be done in R (see also the demos in R or Python for the use of get_nested_categories
with only two levels):
region <- c(1,1,1,1,2,2,2,2)
school <- c(1,1,2,2,1,1,2,2)
student <- c(1,2,1,2,1,2,1,2)
school_nested <- get_nested_categories(region, school)
student_nested <- get_nested_categories(school_nested, student)
group_data <- cbind(region, school_nested, student_nested)
gp_model <- GPModel(group_data = group_data, y = y, X = X, likelihood = "gaussian")
## continue with GPBoost algorithm...
## or for a GLMM
gp_model <- fitGPModel(group_data = group_data, y = y, X = X,
likelihood = likelihood, params = list(std_dev = TRUE))
> group_data
region school_nested student_nested
[1,] 1 1 1
[2,] 1 1 2
[3,] 1 2 3
[4,] 1 2 4
[5,] 2 3 5
[6,] 2 3 6
[7,] 2 4 7
Note that for multiple random effects, nelder_mead
can sometimes be faster. If speed is an issue, I suggest you also try
gp_model <- GPModel(group_data = group_data, y = y, X = X, likelihood = "gaussian")
gp_model$set_optim_params(params = list(optimizer_cov = "nelder_mead"))
## continue with GPBoost algorithm...
## or for a GLMM
gp_model <- fitGPModel(group_data = group_data, y = y, X = X,
likelihood = "gaussian", params = list(std_dev = TRUE, optimizer_cov = "nelder_mead"))