A simple and efficient Python implementation of the Mapper algorithm for Topological Data Analysis
- Installation: pip install tda-mapper
- Documentation: https://tda-mapper.readthedocs.io/en/main/
- Demo App: https://tda-mapper-app.streamlit.app/

The Mapper algorithm is a well-known technique in topological data analysis that represents a dataset as a graph. It is used in fields such as machine learning, data mining, and the social sciences because the graph preserves topological features of the underlying space and gives a visual summary that is easy to explore and interpret. For in-depth coverage of Mapper, see the original paper.
This library implements Mapper with open covers built on vp-trees (vantage-point trees) for improved performance and scalability. The methodology is described in our preprint.
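To give a rough idea of why a vp-tree helps here: building a cover comes down to many "which points lie within distance eps of this center?" queries, and a vp-tree can answer them without scanning every point. The following is a minimal sketch of a generic vantage-point tree with ball queries, written with the standard library only. It is not tda-mapper's implementation; the names `VPTree` and `ball_search` are hypothetical and chosen for illustration.

```python
import math
import random


def euclidean(a, b):
    return math.dist(a, b)


class VPTree:
    """Minimal vantage-point tree supporting ball (range) queries."""

    def __init__(self, points, dist=euclidean, rng=None):
        self.dist = dist
        rng = rng or random.Random(0)
        self.root = self._build(list(points), rng)

    def _build(self, points, rng):
        if not points:
            return None
        # Pick a random vantage point and remove it from the working set.
        i = rng.randrange(len(points))
        points[i], points[-1] = points[-1], points[i]
        vp = points.pop()
        if not points:
            return (vp, 0.0, None, None)
        # Split the remaining points at the median distance from vp.
        dists = sorted(((self.dist(vp, p), p) for p in points),
                       key=lambda t: t[0])
        mid = len(dists) // 2
        radius = dists[mid][0]
        inner = [p for _, p in dists[:mid]]  # distance to vp <= radius
        outer = [p for _, p in dists[mid:]]  # distance to vp >= radius
        return (vp, radius, self._build(inner, rng), self._build(outer, rng))

    def ball_search(self, center, eps):
        """Return all stored points within distance eps of center."""
        found = []
        stack = [self.root]
        while stack:
            node = stack.pop()
            if node is None:
                continue
            vp, radius, inner, outer = node
            d = self.dist(center, vp)
            if d <= eps:
                found.append(vp)
            # Triangle inequality: inner points are at least d - radius away.
            if d - eps <= radius:
                stack.append(inner)
            # Triangle inequality: outer points are at least radius - d away.
            if d + eps >= radius:
                stack.append(outer)
        return found


# Usage: compare the tree query against a brute-force scan.
rng = random.Random(42)
pts = [(rng.random(), rng.random()) for _ in range(200)]
tree = VPTree(pts)
center, eps = (0.5, 0.5), 0.2
in_ball = tree.ball_search(center, eps)
brute_force = [p for p in pts if euclidean(center, p) <= eps]
```

The two pruning tests are what let the search skip whole subtrees; they only require the distance to satisfy the triangle inequality, so the same structure works for any metric.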
Step 1 | Step 2 | Step 3 | Step 4 |
---|---|---|---|
Choose lens | Cover image | Run clustering | Build graph |
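
The four steps in the table above can be sketched from scratch in a few lines. This is a simplified illustration under stated assumptions, not the code used by tda-mapper: the lens is one-dimensional, the cover is made of equal-width intervals widened so that neighbors share a strip of `overlap_frac * width` (the library may define the overlap differently), and clustering is plain single linkage at a fixed threshold `eps` instead of DBSCAN. All function names are hypothetical.

```python
import math
from itertools import combinations


def interval_cover(values, n_intervals, overlap_frac):
    """Step 2: overlapping intervals covering the image of the lens."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_intervals
    pad = width * overlap_frac / 2  # assumption: widen both sides equally
    return [(lo + i * width - pad, lo + (i + 1) * width + pad)
            for i in range(n_intervals)]


def threshold_clusters(indices, points, eps):
    """Step 3: connected components of the eps-neighborhood graph."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        cluster, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in remaining
                    if math.dist(points[i], points[j]) <= eps}
            remaining -= near
            cluster |= near
            frontier.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper(points, lens, n_intervals, overlap_frac, eps):
    lens_values = [lens(p) for p in points]  # Step 1: apply the lens
    nodes = []
    for a, b in interval_cover(lens_values, n_intervals, overlap_frac):
        preimage = [i for i, v in enumerate(lens_values) if a <= v <= b]
        nodes.extend(threshold_clusters(preimage, points, eps))
    # Step 4: connect two clusters whenever they share at least one point.
    edges = [(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]]
    return nodes, edges


# Usage: 100 points on the unit circle, with the x coordinate as lens.
circle = [(math.cos(2 * math.pi * k / 100), math.sin(2 * math.pi * k / 100))
          for k in range(100)]
nodes, edges = mapper(circle, lens=lambda p: p[0],
                      n_intervals=4, overlap_frac=0.3, eps=0.2)
degree = {u: 0 for u in range(len(nodes))}
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
```

With these parameters the two middle intervals each split into an upper and a lower arc, so the resulting graph is a single cycle, which matches the topology of the circle.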
The example below can be used to kickstart your analysis. In this toy example we use a two-dimensional dataset of two concentric circles. The Mapper graph is a topological summary of the whole point cloud.
import numpy as np

from sklearn.datasets import make_circles
from sklearn.decomposition import PCA
from sklearn.cluster import DBSCAN

from tdamapper.core import MapperAlgorithm
from tdamapper.cover import CubicalCover
from tdamapper.plot import MapperLayoutInteractive

X, y = make_circles(                # load a labelled dataset
    n_samples=5000,
    noise=0.05,
    factor=0.3,
    random_state=42)
lens = PCA(2).fit_transform(X)

mapper_algo = MapperAlgorithm(
    cover=CubicalCover(
        n_intervals=10,
        overlap_frac=0.3),
    clustering=DBSCAN())
mapper_graph = mapper_algo.fit_transform(X, lens)

mapper_plot = MapperLayoutInteractive(
    mapper_graph,
    colors=y,                       # color according to categorical values
    cmap='jet',                     # Jet colormap, for classes
    agg=np.nanmean,                 # aggregate on nodes according to mean
    dim=2,
    iterations=60,
    seed=42,
    width=600,
    height=600)

fig_mean = mapper_plot.plot()
fig_mean.show(config={'scrollZoom': True})

mapper_plot.update(                 # reuse the plot with the same positions
    colors=y,
    cmap='viridis',                 # viridis colormap, for ranges
    agg=np.nanstd,                  # aggregate on nodes according to std
)

fig_std = mapper_plot.plot()
fig_std.show(config={'scrollZoom': True})
[Figures: Dataset, Mapper graph (average), Mapper graph (deviation)]
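
The `agg` argument controls how per-point values (here the class labels `y`) are turned into one color per node. As a rough illustration, the sketch below aggregates labels over the points contained in each node, using the standard library's `fmean` and `pstdev` as stand-ins for `np.nanmean` and `np.nanstd` (without NaN handling). The toy graph and the helper `aggregate_on_nodes` are hypothetical and do not reflect the library's internals.

```python
from statistics import fmean, pstdev


def aggregate_on_nodes(nodes, values, agg):
    """Map each node id to agg(values of the points it contains)."""
    return {node_id: agg([values[i] for i in members])
            for node_id, members in nodes.items()}


# Hypothetical toy graph: three nodes over six labelled points.
nodes = {0: [0, 1, 2], 1: [2, 3], 2: [4, 5]}
labels = [0, 0, 0, 1, 1, 1]

node_mean = aggregate_on_nodes(nodes, labels, fmean)   # {0: 0.0, 1: 0.5, 2: 1.0}
node_std = aggregate_on_nodes(nodes, labels, pstdev)   # {0: 0.0, 1: 0.5, 2: 0.0}
```

A mean close to 0 or 1 indicates a node made of a single class, while a nonzero standard deviation flags nodes that mix points from both classes.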
More examples can be found in the documentation.
You can also run the demo app locally:
pip install -r app/requirements.txt
streamlit run app/streamlit_app.py
If you want to use tda-mapper in your work or research, you can cite the archive uploaded on Zenodo, pointing to the specific version of the software used in your work.
If you want to cite the methodology on which tda-mapper is based, you can use the preprint.