neo4j-labs/graph

Partitioning?

Closed this issue · 4 comments

I just saw your FOSDEM presentation and am curious if you think partitioning is in-scope for this project, and if so, if you've thought about how/where it should go? My interests are in k-way partitioning similar to packages like METIS and SCOTCH, with graph coarsening and KL/FM refinement, as well as the closely related problem of computing nested dissection orderings for sparse direct solvers.
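For readers less familiar with the terminology: a k-way partitioner assigns each node to one of k parts, trying to cut few edges while keeping the parts balanced. The sketch below only evaluates those two quantities for a given assignment. It is not code from METIS, SCOTCH, or the graph crate; the `Csr` type and the functions `edge_cut` and `imbalance` are hypothetical names introduced for illustration.

```rust
/// Undirected graph in compressed sparse row (CSR) form.
/// Hypothetical type for illustration; not the `graph` crate's API.
struct Csr {
    offsets: Vec<usize>, // neighbors of v are targets[offsets[v]..offsets[v + 1]]
    targets: Vec<usize>, // concatenated neighbor lists
}

impl Csr {
    /// Builds a CSR graph from an undirected edge list over `n` nodes.
    fn from_edges(n: usize, edges: &[(usize, usize)]) -> Self {
        let mut degree = vec![0usize; n];
        for &(u, v) in edges {
            degree[u] += 1;
            degree[v] += 1;
        }
        // Prefix sum over the degrees gives the start of each neighbor list.
        let mut offsets = vec![0usize; n + 1];
        for v in 0..n {
            offsets[v + 1] = offsets[v] + degree[v];
        }
        // `next[v]` is the next free slot in v's neighbor list.
        let mut next = offsets.clone();
        let mut targets = vec![0usize; offsets[n]];
        for &(u, v) in edges {
            targets[next[u]] = v;
            next[u] += 1;
            targets[next[v]] = u;
            next[v] += 1;
        }
        Csr { offsets, targets }
    }

    fn node_count(&self) -> usize {
        self.offsets.len() - 1
    }

    fn neighbors(&self, v: usize) -> &[usize] {
        &self.targets[self.offsets[v]..self.offsets[v + 1]]
    }
}

/// Number of edges whose endpoints lie in different parts.
fn edge_cut(g: &Csr, part: &[usize]) -> usize {
    let mut cut = 0;
    for u in 0..g.node_count() {
        for &v in g.neighbors(u) {
            // Each undirected edge is stored twice; count it once via u < v.
            if u < v && part[u] != part[v] {
                cut += 1;
            }
        }
    }
    cut
}

/// Largest part size divided by the ideal size n / k (1.0 is perfectly balanced).
/// Assumes k >= 1 and a non-empty `part` slice.
fn imbalance(part: &[usize], k: usize) -> f64 {
    let mut sizes = vec![0usize; k];
    for &p in part {
        sizes[p] += 1;
    }
    let max = *sizes.iter().max().unwrap() as f64;
    max * k as f64 / part.len() as f64
}
```

Multi-level schemes with coarsening and KL/FM-style refinement can be viewed as heuristics for lowering `edge_cut` while keeping `imbalance` under a chosen tolerance.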

s1ck commented

Are you thinking of a stand-alone partitioning algorithm on the same level as the current algorithms, such as PageRank or WCC, or are you imagining a partitioning as an additional input to a computation for better load balancing?

I can see both options happening. I am not up to date with partitioning algorithms, though. Last time I checked, multi-level algorithms with coarsening and refinement phases were pretty good in terms of both result quality and performance. IIRC, METIS had some of those in its library; I'm not familiar with SCOTCH. A few years back, I also looked into diffusion-based algorithms that work similarly to label propagation community detection, with a size constraint on the communities.
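The size-constrained label propagation idea mentioned above could look roughly like the following sequential sketch. It is an illustration under assumptions, not the algorithm from any particular paper or library: the function name `constrained_label_propagation`, the adjacency-list input, and the parameters `max_size` and `max_rounds` are all hypothetical. Each node moves to the part that holds most of its neighbors, but only if that part still has room.

```rust
/// Size-constrained label propagation (sequential sketch).
///
/// `adj[v]` lists the neighbors of node `v`. `part[v]` is the current part of
/// `v` and is refined in place. A node only moves into a part that currently
/// holds fewer than `max_size` nodes. Returns the total number of moves.
fn constrained_label_propagation(
    adj: &[Vec<usize>],
    part: &mut [usize],
    k: usize,
    max_size: usize,
    max_rounds: usize,
) -> usize {
    // Current size of every part.
    let mut sizes = vec![0usize; k];
    for &p in part.iter() {
        sizes[p] += 1;
    }

    let mut counts = vec![0usize; k];
    let mut total_moves = 0;

    for _ in 0..max_rounds {
        let mut moved = 0;
        for v in 0..adj.len() {
            // Count how many neighbors of v sit in each part.
            counts.iter_mut().for_each(|c| *c = 0);
            for &u in &adj[v] {
                counts[part[u]] += 1;
            }

            // Pick the part with strictly more neighbors than the current
            // choice, skipping parts that are already full.
            let current = part[v];
            let mut best = current;
            for p in 0..k {
                if p != current && sizes[p] < max_size && counts[p] > counts[best] {
                    best = p;
                }
            }

            if best != current {
                sizes[current] -= 1;
                sizes[best] += 1;
                part[v] = best;
                moved += 1;
            }
        }
        total_moves += moved;
        if moved == 0 {
            break; // converged: no node wants to move
        }
    }
    total_moves
}
```

On two triangles joined by a single edge, starting from `[0, 0, 1, 0, 1, 1]` with `k = 2` and `max_size = 4`, this sketch recovers the two triangles in two moves. With `max_size = 3` both parts are already full, so no node can move; a strict bound blocks swaps, which is one reason to allow some slack. A parallel version would need more care, since concurrent moves can violate the size bound.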

Are you interested in implementing such an algorithm in graph?

I'm reasonably familiar with the algorithms, but don't have sufficient bandwidth at the moment. My research group mostly works on C/etc. libraries for scientific computing, but we've been doing some porting to Rust, and graph partitioning is one of the key libraries needed to hit critical mass such that Rust becomes viable for production use. The "easy" thing will be to bind METIS or SCOTCH and move on, but a native partitioner would help with distribution and enabling research. My question is mostly for longer-term planning and what ideas to seed with students: is a partitioner something that you'd like to see developed in this package, versus in a separate package?

s1ck commented

I agree that a Rust-native partitioner would be nice to have and I can see many applications for it. The main goal of the graph crate is to have implementations that leverage multi-core systems, i.e. parallel implementations that run on very large graphs. If your envisioned implementation falls into that category, I would be happy to accept a PR and merge it into graph. If you want to start with prototyping a (potentially) single-threaded implementation, I suggest doing this in a separate project with a dependency on graph_builder. I did something similar with subgraph matching (https://github.com/s1ck/subgraph-matching). That way, you can move fast with the development without being blocked by us reviewing PRs, and once you're in a state where you think it's "done", we can talk about merging the result into graph. wdyt?

s1ck commented

@jedbrown closing this for now. Let me know if you would like to discuss further.