hfgolino/EGAnet

Leiden algorithm doesn't run in bootEGA using riEGA


The Leiden algorithm runs fine in bootEGA with a regular EGA, but not with riEGA.



I believe this issue is related to the Leiden algorithm's default objective function, the Constant Potts Model (CPM). The CPM resolution parameter has a different effective range than the one used with modularity: it corresponds to a maximum edge weight, so 1 is the absolute highest value it can take. At that value every node ends up in its own community, and these singleton communities break things in several places downstream.

There's some work that needs to be done on determining the most appropriate parameter for general application with the Constant Potts Model.

The algorithm would likely work with algorithm.args = list(objective_function = "modularity").

If you want to keep the CPM objective function, you could set algorithm.args = list(resolution_parameter = 0.10) as a test. It should produce more usable results without an error; if it still errors, try an even smaller value. With CPM, resolution_parameter = 0 guarantees a unidimensional solution, no matter the network.
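The effect of the CPM resolution parameter can be seen outside EGAnet on a small weighted graph. The following is a minimal sketch, assuming the igraph package is installed and that its cluster_leiden() accepts objective_function and resolution_parameter (newer igraph releases may prefer the name resolution and warn about the older one). The toy network and its weights are made up for illustration and are not taken from the wmt2 data.

```r
library(igraph)

# Hypothetical toy network: two triangles joined by one weak edge.
# All weights are below 1, as they would be for partial correlations.
edges <- data.frame(
  from   = c("a", "a", "b", "d", "d", "e", "c"),
  to     = c("b", "c", "c", "e", "f", "f", "d"),
  weight = c(0.40, 0.35, 0.45, 0.50, 0.30, 0.40, 0.05)
)
g <- graph_from_data_frame(edges, directed = FALSE)

set.seed(1)

# CPM with resolution 1: no edge weight reaches 1,
# so merging nodes never improves the objective
cpm_default <- cluster_leiden(
  g, objective_function = "CPM",
  weights = E(g)$weight, resolution_parameter = 1
)

# CPM with a smaller resolution, as suggested for testing
cpm_small <- cluster_leiden(
  g, objective_function = "CPM",
  weights = E(g)$weight, resolution_parameter = 0.10
)

# Modularity as the objective function
mod <- cluster_leiden(
  g, objective_function = "modularity",
  weights = E(g)$weight
)

# Number of communities found under each setting
n_communities <- c(
  cpm_default = length(unique(membership(cpm_default))),
  cpm_small   = length(unique(membership(cpm_small))),
  modularity  = length(unique(membership(mod)))
)
print(n_communities)
```

With resolution 1 each node should stay in its own community, which is the singleton pattern described above; the smaller resolution and the modularity objective should both group nodes together.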

In future versions of the package, the ellipsis (...) will be used in place of model.args and algorithm.args (although functions will still accept these arguments for legacy support).

Keeping this issue open as a reminder to check again now that I'm working on riEGA

Linking source for CPM resolution parameter for posterity

Traag, V. A., Van Dooren, P., & Nesterov, Y. (2011). Narrow scope for resolution-limit-free community detection. Physical Review E, 84(1), 016114. https://doi.org/10.1103/PhysRevE.84.016114

Thanks, Alex. I will try your suggested workarounds the next time I run this and let you know how I make out. I do recall that even when I got the Leiden algorithm to run, the result was singletons, which matches your thinking on this.

Here's a workable update using Leiden with riEGA in bootEGA:

# Load package
library(EGAnet)

# Bootstrap EGA
boot <- bootEGA(
  data = wmt2[,7:24],
  type = "parametric", iter = 500,
  algorithm = "leiden", EGA.type = "riEGA",
  plot.typicalStructure = FALSE
)

# Print/summary
boot

# Plot
plot(boot)

# Item Stability
itemStability(boot, IS.plot = TRUE)

With output:

> boot <- bootEGA(
+   data = wmt2[,7:24],
+   type = "parametric", iter = 500,
+   algorithm = "leiden", EGA.type = "riEGA",
+   plot.typicalStructure = FALSE
+ )
The random-intercept model converged. Wording effects likely. Results are only valid if data are unrecoded.
/ [=================] 100% elapsed: 35s ~remaining:  0s
> # Print/summary
> boot
Model: GLASSO (EBIC with gamma = 0.25)
Correlations: auto
Algorithm:  Leiden with Constant Potts Model
Unidimensional Method:  Louvain (Most Common for 1000 iterations)

----

EGA Type: riEGA 
Bootstrap Samples: 500 (Parametric)
             
            0
Frequency:  1

Median dimensions: 0 [0, 0] 95% CI
> # Plot
> plot(boot)
> # Item Stability
> itemStability(boot, IS.plot = TRUE)
Error: The 'structure' provided contains all missing values. Check the empirical structure. If you did not provide an empirical structure, then check your `bootEGA` output:`your_output$EGA$wc`

If all memberships are `NA`, then your network may be empty or different settings need to be applied in the community detection algorithm of `bootEGA`

Note the error, which is expected because the membership from the empirical EGA is all NA. Switching to objective_function = "modularity" gives the full output:

> boot <- bootEGA(
+   data = wmt2[,7:24],
+   type = "parametric", iter = 500,
+   algorithm = "leiden", EGA.type = "riEGA",
+   objective_function = "modularity",
+   plot.typicalStructure = FALSE
+ )
The random-intercept model converged. Wording effects likely. Results are only valid if data are unrecoded.
/ [====================] 100% elapsed: 37s ~remaining:  0s
> # Print/summary
> boot
Model: GLASSO (EBIC with gamma = 0.25)
Correlations: auto
Algorithm:  Leiden with Modularity
Unidimensional Method:  Louvain (Most Common for 1000 iterations)

----

EGA Type: riEGA 
Bootstrap Samples: 500 (Parametric)
                                   
                3     4     2     5
Frequency:  0.688 0.292 0.014 0.006

Median dimensions: 3 [2.02, 3.98] 95% CI
> # Plot
> plot(boot)
> # Item Stability
> itemStability(boot, IS.plot = TRUE)
EGA Type: riEGA 
Bootstrap Samples: 500 (Parametric)

Proportion Replicated in Empirical Dimensions:

 wmt1  wmt2  wmt3  wmt4  wmt5  wmt6  wmt7  wmt8  wmt9 wmt10 
0.306 0.846 0.846 0.134 0.902 0.940 0.956 0.960 0.856 0.860 
wmt11 wmt12 wmt13 wmt14 wmt15 wmt16 wmt17 wmt18 
0.312 0.350 0.230 0.654 0.126 0.162 0.312 0.416

This dataset is not actually appropriate for riEGA, but it works for demonstration. Results are fully reproducible: running this same code gives exactly these results, thanks to the seeds set in C++ in this update (38963d9).
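The all-NA failure from the CPM run can be caught before calling itemStability. This is a minimal sketch: has_valid_structure is a hypothetical helper, not part of EGAnet, and the $EGA$wc path is taken from the error message shown earlier. Mock lists stand in for real bootEGA output so the snippet runs on its own.

```r
# Hypothetical helper (not part of EGAnet): TRUE when the empirical
# membership vector exists and is not entirely NA
has_valid_structure <- function(boot_obj) {
  wc <- boot_obj$EGA$wc
  !is.null(wc) && !all(is.na(wc))
}

# Mock objects standing in for bootEGA output (illustration only)
mock_failed <- list(EGA = list(wc = c(wmt1 = NA, wmt2 = NA, wmt3 = NA)))
mock_ok     <- list(EGA = list(wc = c(wmt1 = 1, wmt2 = 1, wmt3 = 2)))

has_valid_structure(mock_failed)
has_valid_structure(mock_ok)
```

With a real result, the check would wrap the call, for example if (has_valid_structure(boot)) itemStability(boot, IS.plot = TRUE).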

Closing this issue as it will be resolved with the next branch merge from https://github.com/hfgolino/EGAnet/tree/major-changes