
A Jupyter widget for comparing two embeddings with shared labels by their confusion, neighborhoods, and size.


Comparative Embedding Visualization with cev

ISMB BioVis 2023 Poster

cev is an interactive Jupyter widget for comparing a pair of 2D embeddings with shared labels.
Its novel metric surfaces differences in label confusion, neighborhood composition, and label size.
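To give a feel for what "label confusion" can mean in an embedding, here is a minimal sketch in plain Python. It is not cev's metric, which this README does not specify. It is a simplified stand-in that assumes confusion can be approximated by the fraction of each point's k nearest neighbors that carry a different label. The function name knn_label_confusion and the toy coordinates are made up for illustration.

```python
import math
from collections import defaultdict


def knn_label_confusion(points, labels, k=3):
    """Per label, the mean fraction of a point's k nearest neighbors
    that carry a different label (0.0 means fully separated).

    Illustrative stand-in only; NOT cev's actual metric.
    """
    per_label = defaultdict(list)
    for i, p in enumerate(points):
        # Brute-force distances from point i to every other point.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors = [j for _, j in dists[:k]]
        mismatches = sum(labels[j] != labels[i] for j in neighbors)
        per_label[labels[i]].append(mismatches / len(neighbors))
    return {lab: sum(v) / len(v) for lab, v in per_label.items()}


# Toy data (made up): the same labels laid out in two different 2D embeddings.
labels = ["A"] * 4 + ["B"] * 4
separated = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10), (11, 11)]
mixed = [(0, 0), (1, 1), (2, 0), (3, 1),
         (0, 1), (1, 0), (2, 1), (3, 0)]

conf_separated = knn_label_confusion(separated, labels)
conf_mixed = knn_label_confusion(mixed, labels)
```

In the separated layout every neighbor shares the point's label, so the score is zero for both labels. In the interleaved layout most neighbors belong to the other label, so the score rises. Comparing such per-label scores between two embeddings is the general idea behind a confusion-style comparison.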


Teaser

The figure shows data from Mair et al. (2022) that were analyzed with Greene et al.'s (2021) FAUST method.
The embeddings were generated with Greene et al.'s (2021) annotation transformation and UMAP.


cev is implemented with anywidget and builds upon jupyter-scatter.

Installation

Warning: cev is new and under active development. It is not yet ready for production and APIs are subject to change.

pip install cev
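
To confirm that the installation succeeded, the installed version can be queried with Python's standard importlib.metadata module. This is a minimal sketch; installed_version is a hypothetical helper and not part of cev.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if cev is installed in this environment, else None.
print(installed_version("cev"))
```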

Getting Started

import pandas as pd
from cev.widgets import Embedding, EmbeddingComparisonWidget

umap_embedding = Embedding.from_ozette(df=pd.read_parquet("../data/mair-2022-tissue-138-umap.pq"))
ozette_embedding = Embedding.from_ozette(df=pd.read_parquet("../data/mair-2022-tissue-138-ozette.pq"))

umap_vs_ozette = EmbeddingComparisonWidget(
    umap_embedding,
    ozette_embedding,
    titles=["Standard UMAP", "Annotation-Transformed UMAP"],
    metric="confusion",
    selection="synced",
    auto_zoom=True,
    row_height=320,
)
umap_vs_ozette

Figure: User interface of cev's comparison widget

See notebooks/getting-started.ipynb for the complete example.
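
Because the comparison relies on labels shared by both embeddings, it can help to check label overlap and relative label sizes before building the widget. The following sketch uses plain Python on lists of label strings. The helper compare_labels and the example labels are hypothetical, and the size difference shown here (change in the fraction of points per label) is an assumption for illustration, not necessarily how cev computes label size.

```python
from collections import Counter


def compare_labels(labels_a, labels_b):
    """Summarize shared labels and relative-abundance differences."""
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    shared = sorted(set(counts_a) & set(counts_b))
    only_a = sorted(set(counts_a) - set(counts_b))
    only_b = sorted(set(counts_b) - set(counts_a))
    # Change in the fraction of points per shared label (b minus a).
    size_diff = {
        lab: counts_b[lab] / len(labels_b) - counts_a[lab] / len(labels_a)
        for lab in shared
    }
    return shared, only_a, only_b, size_diff


# Made-up labels standing in for the label columns of two embeddings.
labels_a = ["T cell", "T cell", "B cell", "NK cell"]
labels_b = ["T cell", "B cell", "B cell", "Monocyte"]

shared, only_a, only_b, size_diff = compare_labels(labels_a, labels_b)
```

Labels that appear in only one embedding have no counterpart to compare against, so spotting them early avoids surprises when reading the widget.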

Development

First, create a virtual environment with all the required dependencies. We highly recommend using hatch, which installs and syncs all dependencies from pyproject.toml automatically.

hatch shell

Alternatively, you can also use conda.

conda create -n cev python=3.11
conda activate cev

Next, install cev with all development assets.

pip install -e ".[notebooks,dev]"

Finally, you can now run the notebooks with:

jupyter lab

Commands Cheatsheet

If you use the hatch CLI, the following commands are available in the default environment:

Command              Action
hatch run fix        Format project with black . and apply linting with ruff --fix .
hatch run fmt        Format project with black . and apply linting with ruff --fix .
hatch run check      Check formatting and linting with black --check . and ruff .
hatch run test       Run unit tests with pytest in the base environment
hatch run test:test  Run unit tests with pytest in all supported environments

Alternatively, you can develop cev by manually creating a virtual environment and managing dependencies with pip.

Our CI linting/formatting checks are configured with pre-commit. We recommend installing the git hook scripts to allow pre-commit to run automatically on git commit.

pre-commit install # run this once to install the git hooks

This will ensure that code pushed to CI meets our linting and formatting criteria. Code that does not comply will fail in CI.

Release

Releases are triggered via tagged commits:

git tag -a vX.X.X -m "vX.X.X"
git push --follow-tags
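
Since a release depends on the tag name, a small pre-flight check can catch typos before pushing. The sketch below assumes that vX.X.X stands for the letter v followed by three dot-separated integers. The README only shows the placeholder, so the exact pattern is an assumption, and is_release_tag is a hypothetical helper.

```python
import re

# Assumed tag format: "v" plus three dot-separated integers, e.g. v0.1.0.
TAG_PATTERN = re.compile(r"v\d+\.\d+\.\d+")


def is_release_tag(tag):
    """Return True if the whole tag matches the assumed vX.X.X format."""
    return TAG_PATTERN.fullmatch(tag) is not None


for candidate in ["v0.1.0", "0.1.0", "v1.2"]:
    print(candidate, is_release_tag(candidate))
```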

License

cev is distributed under the terms of the Apache License 2.0.

Citation

If you use cev in your research, please cite the following preprint:

@article{manz2024cev,
 title={A General Framework for Comparing Embedding Visualizations Across Class-Label Hierarchies},
 url={osf.io/puxnf},
 DOI={10.31219/osf.io/puxnf},
 publisher={OSF Preprints},
 author={Manz, Trevor and Lekschas, Fritz and Greene, Evan and Finak, Greg and Gehlenborg, Nils},
 year={2024},
 month={Apr}
}