Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

LatInt: a Python package to help Interpret your Latent Space

The goal of this package is to facilitate the interpretation and quality control of low-dimensional latent spaces generated by dimensionality reduction methods, including those trained on heterogeneous input feature types (e.g. numerical as well as image-based inputs).

In development:

This package is the result of a collective effort in our lab, prompted by an immediate need for some of these functions, so the functions are closely tailored to our workflows. It is under active development: extra modules, generalization and documentation are all still underway. If you use a module that requires components from the training of your latent space, such as a model or a dataset, it is assumed that training was done with PyTorch. We plan to generalize this in the future.

Modules:

Getting started:

  • Install package:
pip install latint
  • Load your latent space by calculating it from your trained model and input data:
from latint.load import getLatentFromModel

latent = getLatentFromModel(model, data)
  • However you load your latent space, the result should be either a NumPy array or an AnnData object with the latent space stored in adata.obsm under a key of your choice (e.g. adata.obsm["latent"])
  • Note: The rest of this package assumes you pass in one of these two. Some functions specifically need the AnnData object, because they also require the input data or metadata.
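
The two accepted input forms can be handled uniformly with a small check like the one below. This is only a sketch: `resolve_latent` and its `key` parameter are hypothetical names for illustration and are not part of latint. It assumes the AnnData-like object exposes a mapping at `.obsm`, as described above.

```python
import numpy as np


def resolve_latent(obj, key="latent"):
    """Return the latent space as a 2-D NumPy array.

    `obj` is either a NumPy array or an AnnData-like object whose
    `.obsm` mapping holds the latent space under `key`.
    Hypothetical helper for illustration; not part of latint.
    """
    if isinstance(obj, np.ndarray):
        latent = obj
    elif hasattr(obj, "obsm"):
        if key not in obj.obsm:
            raise KeyError(f"no latent space stored under obsm[{key!r}]")
        latent = np.asarray(obj.obsm[key])
    else:
        raise TypeError("expected a NumPy array or an AnnData-like object")

    # Rows are observations, columns are latent dimensions.
    if latent.ndim != 2:
        raise ValueError("latent space must be 2-D: (n_observations, n_latent_dims)")
    return latent
```

Calling `resolve_latent(latent)` on an array returns it unchanged, while `resolve_latent(adata, key="latent")` reads the array from `adata.obsm["latent"]`.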

Optional:

  • Encapsulate latent space and input data in an anndata object
import anndata as ad

adata = ad.AnnData(data, dtype=data.dtype)
adata.obsm["latent"] = latent
  • Add metadata of the input data to the adata object
import pandas as pd

metadata_df = pd.read_csv("/path/to/your/metadata.csv")
addMetadataFromPandas(adata, metadata_df)
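
How addMetadataFromPandas matches metadata rows to observations is not described here, so it is worth making sure the metadata is in the same order as the rows of your input data before attaching it. The sketch below shows one way to do that with plain pandas. The function `align_metadata`, the column name `sample_id` and the example values are all hypothetical and only illustrate the idea.

```python
import pandas as pd


def align_metadata(obs_names, metadata_df, id_column):
    """Reorder metadata rows so they follow the order of `obs_names`.

    Hypothetical helper for illustration; not part of latint.
    """
    indexed = metadata_df.set_index(id_column)
    if not indexed.index.is_unique:
        raise ValueError(f"duplicate identifiers in column {id_column!r}")

    missing = [name for name in obs_names if name not in indexed.index]
    if missing:
        raise KeyError(f"metadata missing for observations: {missing}")

    # .loc with a list returns the rows in exactly the requested order.
    return indexed.loc[list(obs_names)]


# Made-up example: observations are stored in a different order
# than the rows of the metadata table.
obs_names = ["cell_2", "cell_0", "cell_1"]
metadata_df = pd.DataFrame(
    {"sample_id": ["cell_0", "cell_1", "cell_2"], "batch": ["a", "a", "b"]}
)
aligned = align_metadata(obs_names, metadata_df, id_column="sample_id")
```

After this step, row i of `aligned` describes observation i, so the table can be attached to the data without mixing up samples.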

Documentation

The generation of the documentation website is a work in progress, but all functional modules have clearly written docstrings.

Dependencies:

  • numpy
  • pandas
  • tqdm
  • matplotlib
  • torch
  • scanpy
  • anndata==0.8 (pinned to this version because of a mismatch between anndata and scanpy)