This repository contains the code accompanying the manuscript "Topologically Regularized Data Embeddings".
We provide a conda_env.yml file listing the required packages. You can install them by creating a new conda environment:
conda env create -f topembedding/conda_env.yml -p ./topo_env
conda activate topo_env/
(1) Install TopologyLayer
pip install git+https://github.com/bruel-gabrielsson/TopologyLayer.git
(2) Install Aleph by following the instructions on their GitHub.
git clone https://github.com/Pseudomanifold/Aleph.git
cd Aleph && mkdir build && cd build && cmake ../ && make && make test
cd bindings/python/aleph
python setup.py install
(3) Optional: Install DioDe to run the pseudotime analysis in the CellCycle notebook.
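After the steps above, a quick check from Python can confirm that the packages are visible in the active environment. This is only a minimal sketch: the import names `topologylayer`, `aleph`, and `diode` are assumptions about how the three packages expose themselves to Python (this README does not state them), and `find_missing` is a hypothetical helper, so adjust the names if your installation differs.

```python
import importlib.util

# Assumed import names; they are not stated in this README.
MODULES = ("topologylayer", "aleph", "diode")


def find_missing(modules=MODULES):
    """Return the names in `modules` that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All modules found.")
```

Since DioDe is optional, a missing `diode` entry only matters for the pseudotime analysis.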
To execute the R scripts in the Scripts folder, you need the following R packages:
- TDA
- ggplot2
- latex2exp
- gridExtra
Try out the example config files in Code/config by providing one of them to main.py:
python main.py Code/config/synthetic_random_optimize.yaml
Note that 'cell_cycle.yaml' takes about 5 minutes to run. Upon completion, the final embeddings are shown.
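To run several of the example configurations in sequence, the invocation above can be wrapped in a short script. The following is a sketch and not part of the repository: it assumes the config files carry a `.yaml` extension (as the two named above do) and that it is started from the directory containing main.py. The helper name `build_commands` is hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(config_dir="Code/config", script="main.py"):
    """Build one `python main.py <config>` command per YAML file, sorted by name."""
    configs = sorted(Path(config_dir).glob("*.yaml"))
    return [[sys.executable, script, str(cfg)] for cfg in configs]


if __name__ == "__main__":
    for cmd in build_commands():
        print("Running:", " ".join(cmd))
        # check=True stops at the first config that fails.
        subprocess.run(cmd, check=True)
```

Because each run shows the final embeddings on completion, the loop may wait on one run before starting the next.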
- Synthetic data are generated using Data/datasets.py
- Cell trajectory (source): included as Data/CellCycle.rds
- Cell bifurcation (source): included as Data/CellBifurcation.rds
- Karate[1]: partially (graph) loaded from networkx and partially (weights) from Data/Karate.txt
- Harry Potter (source): included in Data/HarryPotter
In this folder you will find one notebook for every dataset to reproduce the results in the experiments section. You can open the .html version to see the code with output.
Folder "Scripts": contains two Jupyter notebooks and two R scripts to reproduce the visualizations of Section (2) in the manuscript.
The content of this repository is an extension of the code in topembedding developed by Robin Vandaele.
[1]: W.W. Zachary. An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33:452–473, 1977.