
Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).

SpaTrio v1.0.0

Revealing spatial multimodal heterogeneity in tissues with SpaTrio

Penghui Yang, Lijun Jin, Jie Liao, ..., Xiaohui Fan*

SpaTrio is a computational tool based on optimal transport that aligns single-cell multi-omics data in space while preserving the spatial topology of the tissue section and the local geometry of each modality.


Requirements and Installation

This toolkit is written in both R and Python. The core optimal transport algorithm is implemented in Python, while the initial data preparation and downstream multimodal analysis are implemented in R.

Installation of SpaTrio (Python package)

deep-forest 0.1.5
numpy 1.22.4
pandas 1.4.3
scikit-learn 1.2.0
scipy 1.5.2
scanpy 1.9.1
anndata 0.7.5
igraph 0.9.11
louvain 0.7.1
matplotlib 3.5.2

# We recommend using Anaconda, and then you can create a new environment.
# Create and activate Python environment
conda create -n spatrio python=3.8
conda activate spatrio

# Install requirements
cd SpaTrio-main
pip install -r requirements.txt

# Install spatrio
python setup.py build
python setup.py install
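
After installation, it can be useful to confirm that the active environment matches the pinned versions listed above. The following is a minimal sketch that uses only the standard library's importlib.metadata. The helper name check_pinned is hypothetical and not part of SpaTrio, and the PyPI distribution names are assumed to match the names in the dependency list.

```python
from importlib import metadata

# Versions pinned in this README's Python dependency list.
PINNED = {
    "deep-forest": "0.1.5",
    "numpy": "1.22.4",
    "pandas": "1.4.3",
    "scikit-learn": "1.2.0",
    "scipy": "1.5.2",
    "scanpy": "1.9.1",
    "anndata": "0.7.5",
    "igraph": "0.9.11",
    "louvain": "0.7.1",
    "matplotlib": "3.5.2",
}


def check_pinned(pinned):
    """Return {package: (expected, found)} for each package whose installed
    version differs from the pinned one; found is None if it is missing."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Print one line per mismatch; no output means every pin is satisfied.
    for name, (expected, found) in check_pinned(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means the environment matches the list. A mismatch is not necessarily an error, but it is the first thing to check if the installation misbehaves.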

Installation of SpaTrio (R package)

R >4.0

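install.packages("doParallel")
# BiocManager is required to install Bioconductor packages
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("ConsensusClusterPlus")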

# Install SpaTrio package from local file
install.packages("SpaTrio_1.0.0.tar.gz", repos = NULL, type = "source")

Quick Start

SpaTrio takes formatted .csv files as input (i.e., files that can be read with pandas).

  • multi_rna.csv/spatial_rna.csv (The gene expression matrix of cells/spots)
           Cell1  ···  Celln
    Gene1  0      ···  1
    ···    ···    ···  ···
    Genem  2      ···  1
  • multi_meta.csv/spatial_meta.csv (The meta information matrix of cells/spots)
           id     type
    Cell1  Cell1  A
    ···    ···    ···
    Celln  Celln  B
  • emb.csv (The low-dimensional embedding matrix of cells)
           emb1   ···  embk
    Cell1  1.997  ···  -0.307
    ···    ···    ···  ···
    Celln  2.307  ···  2.119
  • pos.csv (The spatial location matrix of spots)
           x     y
    Cell1  0.28  10.65
    ···    ···   ···
    Celln  5.98  2.16
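
The file layouts above can be read and cross-checked with pandas before running the alignment. The following is a minimal sketch, not SpaTrio's own loader. The helper names read_table and load_modality are hypothetical, and the sketch assumes that the first column of every file holds the row labels, as in the tables above.

```python
from pathlib import Path

import pandas as pd


def read_table(path):
    """Read a .csv whose first column holds the row labels."""
    table = pd.read_csv(path, index_col=0)
    table.index = table.index.astype(str)
    return table


def load_modality(data_dir, prefix, extra_file):
    """Load <prefix>_rna.csv, <prefix>_meta.csv and one extra table
    (emb.csv for cells, pos.csv for spots), checking that the ids agree.

    The expression matrix is genes x cells/spots, so its column names are
    the ids that must appear as row labels in the other two tables.
    """
    data_dir = Path(data_dir)
    rna = read_table(data_dir / f"{prefix}_rna.csv")
    meta = read_table(data_dir / f"{prefix}_meta.csv")
    extra = read_table(data_dir / extra_file)

    ids = rna.columns
    for name, table in ((f"{prefix}_meta.csv", meta), (extra_file, extra)):
        missing = ids.difference(table.index)
        if len(missing) > 0:
            raise ValueError(f"{name} lacks {len(missing)} ids, e.g. {missing[0]}")

    # Reorder rows so every table follows the expression matrix.
    return rna, meta.loc[ids], extra.loc[ids]
```

For the single-cell multi-omics data, a call would look like load_modality(data_dir, "multi", "emb.csv"); for the spatial data, load_modality(data_dir, "spatial", "pos.csv"). Here data_dir stands for whichever folder holds the files.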

SpaTrio also accepts an optional file that specifies the number of cells in each spot.

  • expected_num.csv (The number of cells contained in each spot)
           cell_num
    Spot1  5
    ···    ···
    Spotj  2

In some simulated datasets, the number of cells of each cell type in each spot is given instead (ref_counts.csv). These data are converted to expected_num before use.

  • ref_counts.csv (The number of cells of each cell type contained in each spot)
           Celltype 1  ···  Celltype i
    Spot1  0           ···  2
    ···    ···         ···  ···
    Spotj  1           ···  0
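
This README does not spell out how ref_counts is converted to expected_num. One natural reading, assumed here, is that the expected number of cells in a spot is the sum of its per-cell-type counts. The sketch below illustrates that assumption with pandas; the function name ref_counts_to_expected_num is hypothetical and the small table is made-up illustration data.

```python
import pandas as pd


def ref_counts_to_expected_num(ref_counts):
    """Collapse a spots x cell-types count table into a one-column table
    (cell_num) by summing the counts in each row.

    Assumption: a spot's expected cell number is the total across cell types.
    """
    return ref_counts.sum(axis=1).astype(int).to_frame(name="cell_num")


# Toy table in the ref_counts.csv layout (values are illustrative only).
ref_counts = pd.DataFrame(
    {"Celltype 1": [0, 1], "Celltype 2": [2, 0]},
    index=["Spot1", "Spot2"],
)
expected_num = ref_counts_to_expected_num(ref_counts)
print(expected_num)
```

The result has the same row labels as ref_counts and a single cell_num column, matching the expected_num.csv layout shown earlier.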

We have included two test datasets (demo1 & demo2) in the tutorial/data/ directory of this repository as examples of how to use SpaTrio to align cells to space.

Simulated data in the stripe pattern:

Simulated data in the ring pattern:

The core functions written in Python can also be called directly from R, which facilitates downstream analysis.

DBiT-seq mouse embryo datasets (Google Drive):

10x Visium+ADT mouse liver datasets (Google Drive):

Tutorials

We have applied SpaTrio to different tissues from multiple species, and we provide step-by-step tutorials for all application scenarios. The preprocessed datasets used in the tutorials can be downloaded from Google Drive.

About

Should you have any questions, please feel free to contact the author of the manuscript, Mr. Penghui Yang (yangph@zju.edu.cn).

References

Penghui Yang, et al. Revealing spatial multimodal heterogeneity in tissues with SpaTrio, Cell Genomics, 2023, https://doi.org/10.1016/j.xgen.2023.100446.