kharchenkolab/leidenAlg

Test failures with the development version of igraph

krlmlr opened this issue · 14 comments

We're changing the layout of igraph objects. This leads to check failures in your package, see https://github.com/igraph/rigraph/blob/f-igraph-t-idx-revdepcheck/revdep/problems.md and igraph/rigraph#789 for details.

To reproduce, please install the development version of igraph via

# install.packages("pak")
pak::pak("igraph/rigraph")

and run R CMD check on your package.

We plan to release an igraph update on June 12, two weeks from now. Can you please send an update to CRAN that fixes the checks?

This package seems to use its own copy of igraph. The error happens in R_SEXP_to_igraph() which still assumes the old format. Please retrieve an edgelist on the R side and construct your igraph object on the C/C++ side.

Happy to help.

This package accesses the internals of the igraph object through the R_SEXP_to_igraph() function, which was directly borrowed from an older version of igraph. This approach is fragile, as we make no guarantees about what the internal layout of igraph objects is. In fact, this layout will change in the next version, breaking this package.

The ideal solution here is to only use public API to access igraph objects. For example, convert the R-side igraph object to an edge list (as_edgelist(..., names=F)), and use that edge list to create an igraph_t in C (igraph_create()) that you can use with the Leiden code. Don't forget to convert the 1-based vertex IDs returned by R/igraph to 0-based ones before passing them to igraph_create().

Both interpretations resonate with me:

  • You can recreate the igraph object from an edgelist
  • You can avoid using igraph objects entirely by creating "your own" graph object

Thanks Kirill @krlmlr and Szabolcs @szhorvat
Returning to this now, and I've tried to follow all of the changes in the R igraph package v1.5.0---I might ask a few questions to save time (as you know the codebase better), if that's ok. I'm suffering from bandwidth problems with other projects.

Please retrieve an edgelist on the R side and construct your igraph object on the C/C++ side.

I guess we’ll have to pass edgelist, weights, directed attribute explicitly to find_partition and then use some constructor on C++ side to recreate a graph structure.

Current Rcpp wrapper function

find_partition <- function(graph, edge_weights, resolution = 1.0, niter = 2L) {
    .Call('_leidenAlg_find_partition', PACKAGE = 'leidenAlg', graph, edge_weights, resolution, niter)
}

Not to break our other R packages which uses this function, it sounds like the best plan of attack here would be to pass the graph variable via R into the R function find_partition(), extract the edgelist + weights + directed boolean from graph within R, and then pass these via C++ into the find_partition(). This avoids us having to use R_SEXP_to_igraph.

then use some constructor on C++ side to recreate a graph structure.

This C++ constructor should still be valid, right?
https://github.com/kharchenkolab/leidenAlg/blob/main/src/leidenalg/include/GraphHelper.h

Have I understood the situation correctly @krlmlr ?

Best, Evan

Thanks, Evan. This sounds about right.

We only changed the internal representation of the igraph object. As long as you avoid relying on it, you should be able to use the rest of your code unchanged.

Do you need help with the implementation?

@krlmlr

We only changed the internal representation of the igraph object. As long as you avoid relying on it, you should be able to use the rest of your code unchanged.

Perfect, thank you for the help!

Do you need help with the implementation?

You're welcome to do a PR for me :) Otherwise, I'll try to do it this weekend. CRAN hasn't complained yet. It doesn't sound that bad. (Time is limited so I wanted to make sure I wasn't spinning my wheels.)

Hi @evanbiederstedt
I assume CRAN has complained by now since leidenAlg fails its test

For CRMetrics, I have until 3rd July to get my tests running that depend on leidenAlg/igraph. Did you have a chance to look into this?

@rrydbirk
They never complained to me. I've been waiting.

Could you forward the email?

@evanbiederstedt
You should have it now.

@szhorvat @krlmlr

Thanks for the help.

I think I'm close to following Szabolcs's advice on this branch: https://github.com/kharchenkolab/leidenAlg/tree/bugfix/leidenalg

(Last commit: https://github.com/kharchenkolab/leidenAlg/blob/752a088524acaed433db9e0edb23d70882f5d8b0/src/leiden.cpp )

i.e.

The ideal solution here is to only use public API to access igraph objects. For example, convert the R-side igraph object to an edge list (as_edgelist(..., names=F)), and use that edge list to create an igraph_t in C (igraph_create()) that you can use with the Leiden code. Don't forget to convert the 1-based vertex IDs returned by R/igraph to 0-based ones before passing them to igraph_create().

Compilation works, but a segfault remains.

use that edge list to create an igraph_t in C

Do you have a canonical way to do this? @szhorvat

Do you need help with the implementation?

@krlmlr If you quickly see the remaining issue, that might save time :)

Hi @evanbiederstedt,

igraph data structures must be explicitly initialized and destroyed. Before you can do anything with an igraph_vector_t, you must initialize it with one of these functions: https://igraph.org/c/html/latest/igraph-Data-structures.html#vector-constructors-and-destructors Typically, igraph_vector_init(). When you are done using the vector, it must be destroyed using igraph_vector_destroy(), otherwise there is a memory leak.

The same goes for graphs. You can create them using igraph_create(), but when they are no longer needed, they must be destroyed using igraph_destroy().

I hope this helps.

Hi @szhorvat

I'm really grateful for the help here; sorry, I missed the details how these objects were allocated and destroyed.

I was thinking there were problems with the type casting, but that's been discussed elsewhere: https://stackoverflow.com/questions/4272175/avoiding-too-much-casting

Thanks! It looks like everything is working now, and it gives the same results as an environment with earlier versions ( igraph-1.4.0 and leidenAlg-1.0.5)

I'll reversion and put on CRAN. Let me know when the more efficient solutions mentioned in igraph/rigraph#864 are available.

I also think my way of converting between std::vector<int> to igraph_vector_t is substandard.