/louvain-leiden

Implementation of the Louvain and Leiden community detection algorithms.

Primary LanguageJupyter Notebook

Louvain and Leiden Algorithms

A python implementation of the Louvain and Leiden community detection algorithms.


Table of Contents

About

This project is an implementation of the Louvain and Leiden algorithms for community detection in graphs.
The implementation was conducted and tested using Python version 3.10.12.

Getting Started

In order to try out this implementation, do the following:

  1. Check out this repository, or download the latest version and unzip it somewhere.
  2. Install the dependencies, for example using pip install -r requirements.txt.
  3. Run jupyter notebook and explore the available notebooks.

Dependencies

To use this implementation, you only need the NetworkX graph library – it was implemented and tested with version 3.0:

networkx==3.0

For running the notebooks and working with the example datasets, you will need the following packages in addition to NetworkX:

jupyter_core==5.2.0
notebook==6.5.2
pandas==1.5.3
matplotlib==3.7.0

Lastly, to run the tests, generate code coverage information and run static analysis, the following packages are required:

pytest==7.2.1
pytest-cov==4.0.0
black==23.1.0
mypy==1.0.1
ruff==0.0.290

The python dependencies are specified in the requirements.txt file and can be installed (preferrably in a virtual environment) using the command pip install -r requirements.txt.
Alternatively, if you're running nix and if you are so inclined, you can also run nix-shell to install the dependencies.

Caligraphic Symbols in the Source Code

To keep the Python code as close as possible to the pseudocode and the notation used, Unicode symbols are used for some identifiers. The Python specification allows certain Unicode symbols to be used in identifiers, for example Greek symbols such as $\theta$, or special variants of Latin symbols such as $\mathcal{H}$ and $\mathcal{P}$. This functionality is described in more detail in the Python Enhancement Proposal 3131. In particular, one can replace uses of the special variants such as 𝓗 by their "normal" variants, such as H here. If you cannot use the unicode version of the code, for example due to rendering issues, you may also use the ascii branch instead.

Sample Notebooks

For demonstration purposes of this library, the following notebooks are included in this repository:

Tests and Static Analysis

Running the tests

This project has unit tests for almost all individual utility functions and the quality functions. In addition to that, there are tests which execute the community detection algorithms on known graphs and check the results.

The tests are located in the tests directory and can be executed running the pytest tool as follows from a shell in the repository root, generating branch coverage information in the process:

pytest --cov --cov-report=html

The coverage information will be placed in a directory called htmlcov.

Running static code analysis

Apart from the tests, this project employs a number of static analysis tools:

Linter: ruff

The tool ruff is used to lint the code and look for any problems with the code. A variety of checks is configured, among them lints which check for public functions having documentation blocks in the correct format, all functions being annotated with type information, etc.
To check that the code satisfies the linting rules, run the following command:

ruff .

It will list problems it has found in the code. If no lint finds an issue, then this command produces no output.

Formatter: black

The black formatter ensures that the python code is formatted according to its code style. To reformat the code according to the code style, run the following command:

black .

You should see the line 15 files left unchanged. when running it on a fresh checkout.

Type checker: mypy

Python is a dynamically typed language, meaning that the type of a variable may change over its lifetime and that type checking is done at runtime.
The code uses type hints for all functions, in order to add static types, which can be checked using a type checker, such as mypy. The type hints serve as further documentation of a function's usage and allow for a type checker to verify the types of variables and functions in a program, so that it may calls to functions with arguments of an incorrect type, for example.
To check this project using mypy, run the following command:

mypy .

You should see the message Success: no issues found in 15 source files.

Sources

The following papers were used as sources for the implementation:

The notebooks in this repository use the following datasets: