hfgolino/EGAnet

Leiden algorithm in EGAnet::bootEGA requires specifying algorithm.args


Hello and thanks for your work!

The Leiden algorithm in bootEGA gave odd community solutions (NA memberships in this example; an error inside the bootstrap loop with another data set) when no additional algorithm.args were provided. Perhaps a warning could be issued when none are specified.

An R script with a reproducible example is attached to this issue (as a .csv file): bootEGA-leiden-example.csv

The default objective function is the Constant Potts Model ("CPM"), which needs a resolution parameter to be specified.

  • If no algorithm.args are supplied, it can fail with an error (other data set) or return NA in later bootstraps (this reproducible example).

  • If algorithm.args = list(objective_function = "modularity"), the community solution approaches the Louvain solution.

  • A useful resolution parameter was 0.05: algorithm.args = list(objective_function = "CPM", resolution_parameter = 0.05)
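
The three settings above can be sketched outside bootEGA by calling igraph::cluster_leiden directly on a toy weighted network. This is only an illustration under stated assumptions: the toy weight matrix and object names are made up, and whether bootEGA passes algorithm.args to this function unchanged is an assumption, not something confirmed here.

```r
library(igraph)

# Hypothetical toy network: two blocks of four nodes with strong within-block
# weights (0.4) and weak between-block weights (0.02), loosely mimicking the
# magnitude of partial correlations.
w <- matrix(0.02, nrow = 8, ncol = 8)
w[1:4, 1:4] <- 0.4
w[5:8, 5:8] <- 0.4
diag(w) <- 0
g <- graph_from_adjacency_matrix(w, mode = "undirected", weighted = TRUE)

set.seed(123)  # Leiden visits nodes in random order

# 1. CPM with igraph's default resolution_parameter (1): every edge weight is
#    below the resolution, so no node gains from joining another node.
cpm_default <- cluster_leiden(g, objective_function = "CPM")

# 2. Modularity objective, the setting that approaches Louvain.
mod <- cluster_leiden(g, objective_function = "modularity")

# 3. CPM with a resolution between the weak and strong weights.
cpm_tuned <- cluster_leiden(g, objective_function = "CPM",
                            resolution_parameter = 0.05)

# Number of communities found under each setting.
n_communities <- c(
  cpm_default = length(unique(membership(cpm_default))),
  modularity  = length(unique(membership(mod))),
  cpm_tuned   = length(unique(membership(cpm_tuned)))
)
print(n_communities)
```

On this toy network the first call leaves all eight nodes as singletons, because no edge weight reaches the resolution of 1. That may be related to the NA solutions seen when no algorithm.args are supplied, though that link is a guess.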

In the data set I'm working with, it gave a theoretically sound and stable community solution (the same solution for resolution values in [0.025, 0.06]; above that, too many groups).

I was trying to further understand CPM here:
Traag, V. A., Van Dooren, P., & Nesterov, Y. (2011). Narrow scope for resolution-limit-free community detection. Physical Review E, 84(1), 016114

Maybe there is a stable community solution between each-variable-is-a-group and all-in-one-group (future work, or literature I'm not yet aware of).

Cheers!

Hi @E-Mendez,

Thanks for the reproducible example.

The Leiden algorithm is a variant of the Louvain algorithm, so it will almost always resemble Louvain (when using objective_function = "modularity"). The main difference between Leiden and Louvain is a set of additional procedures intended to find partitions with higher modularity, if possible. More information on Leiden can be found here:

Traag, V. A., Waltman, L., & Van Eck, N. J. (2019). From Louvain to Leiden: Guaranteeing well-connected communities. Scientific Reports, 9(1), 1–12. https://doi.org/10.1038/s41598-019-41695-z

The Constant Potts Model is a different objective function that will require different values for its resolution parameter. From the above article:

The interpretation of the resolution parameter γ is quite straightforward. The parameter functions as a sort of threshold: communities should have a density of at least γ, while the density between communities should be lower than γ. Higher resolutions lead to more communities and lower resolutions lead to fewer communities, similarly to the resolution parameter for modularity.
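
For reference, the threshold reading of γ follows from the CPM quality function given in the same article. In the notation used there, $e_c$ is the number of edges (or the total edge weight) inside community $c$ and $n_c$ is its number of nodes. Applying it to a weighted correlation network, and the symbol $w_{ij}$ for a single edge weight, are assumptions made here for illustration.

$$
\mathcal{H}_{\mathrm{CPM}} = \sum_{c} \left[ e_c - \gamma \binom{n_c}{2} \right]
$$

Merging two single-node communities $i$ and $j$ changes the quality by

$$
\Delta \mathcal{H} = w_{ij} - \gamma ,
$$

so the merge only pays off when $w_{ij} > \gamma$. If the resolution is left at 1 (the default of igraph's cluster_leiden, assuming that is what is used when no algorithm.args are passed) and every edge weight is below 1, as with correlations or partial correlations, no merge improves the quality and each variable stays in its own community.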

We have not yet performed simulations with the Leiden algorithm, so we cannot offer any recommendations; we would prefer that people look into the algorithm before using it.