CIForm is a Transformer-based model for annotating cell types in single-cell RNA-seq (scRNA-seq) data.
Instructions and examples are provided in the following tutorials.
Python 3.9.12
PyTorch >= 1.5.0
numpy
pandas
scipy
scikit-learn (imported as sklearn)
Scanpy
random (Python standard library, no installation needed)
reference dataset.
cell type label of reference dataset.
query dataset.
After training, the CIForm model is saved to "log/CIForm.tar".
The model's predictions are saved to "log/y_predicts.npy".
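The saved predictions can be inspected with NumPy. The following is a minimal sketch, not part of CIForm itself: the helper names `load_predictions` and `summarize` are hypothetical, and it assumes the file holds one predicted label per query cell. `allow_pickle=True` is passed only in case the labels were stored as an object array.

```python
from collections import Counter
from pathlib import Path

import numpy as np


def load_predictions(path="log/y_predicts.npy"):
    """Load the saved prediction array and flatten it to one dimension."""
    # Assumption: allow_pickle is only needed if labels were saved as Python objects.
    preds = np.load(path, allow_pickle=True)
    return np.asarray(preds).ravel()


def summarize(preds):
    """Count how many query cells were assigned to each predicted label."""
    return Counter(preds.tolist())


if __name__ == "__main__":
    path = Path("log/y_predicts.npy")
    if path.exists():
        predictions = load_predictions(path)
        # Print labels from most to least frequent.
        for label, count in summarize(predictions).most_common():
            print(label, count)
```

The per-label counts give a quick sanity check on the predicted cell-type composition of the query dataset.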
import CIForm as CI
pred_result = CI.ciForm(s, referece_datapaths, Train_names, Testdata_path, Testdata_name)
where
- s = the length of each sub-vector
- referece_datapaths = the path of the annotated scRNA-seq datasets
- Train_names = the name of the annotated scRNA-seq datasets
- Testdata_path = the path of the query scRNA-seq datasets
- Testdata_name = the name of the query scRNA-seq datasets
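To give a rough sense of what a sub-vector length means, the sketch below cuts one cell's gene expression vector into consecutive pieces of length `s`. It is an illustration only: the function name `split_into_subvectors` is hypothetical, and zero-padding the final piece is an assumption. CIForm's own preprocessing may treat the remainder differently.

```python
import numpy as np


def split_into_subvectors(expression, s):
    """Split a 1-D expression vector into rows of length s.

    Assumption for illustration: if the number of genes is not a
    multiple of s, the last sub-vector is padded with zeros.
    """
    expression = np.asarray(expression, dtype=float)
    n_sub = -(-expression.size // s)  # ceiling division
    padded = np.zeros(n_sub * s)
    padded[: expression.size] = expression
    return padded.reshape(n_sub, s)


# Toy example: 10 "genes" split into sub-vectors of length 4.
cell = np.arange(10)
subs = split_into_subvectors(cell, s=4)
print(subs.shape)  # (3, 4)
print(subs[-1])    # last sub-vector holds genes 8 and 9, then two zeros
```

A larger `s` yields fewer, longer sub-vectors per cell. A smaller `s` yields more, shorter ones.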
It is recommended that the label file be placed in the same directory as the corresponding dataset and be named Labels.csv. The label file should be a single-column vector (n rows × 1 column).
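A label file of this shape can be written with Python's standard `csv` module. This is a minimal sketch under stated assumptions: the helper `write_labels` is hypothetical, and the text above does not say whether a header row is expected, so the header is optional here and omitted by default. Check the tutorials for the exact layout CIForm reads.

```python
import csv
from pathlib import Path
from typing import Optional, Sequence


def write_labels(labels: Sequence[str], dataset_dir: str,
                 header: Optional[str] = None) -> Path:
    """Write one label per row to <dataset_dir>/Labels.csv and return its path."""
    out_path = Path(dataset_dir) / "Labels.csv"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="") as fh:
        writer = csv.writer(fh)
        # Assumption: a header row is written only when explicitly requested.
        if header is not None:
            writer.writerow([header])
        for label in labels:
            writer.writerow([label])
    return out_path
```

For instance, `write_labels(["alpha", "beta", "alpha"], "data/my_reference")` would create `data/my_reference/Labels.csv` with three rows. The directory name here is purely illustrative.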
- Cell Type annotation on Intra-datasets
- Cell Type annotation on Inter-datasets
- Cell Type annotation on Inter-datasets using multi-source
- Pancreas datasets (Baron Mouse, Baron Human, Xin, Muraro, Segerstolpe), TM (Tabula Muris), Zhang 68K, AMB.
- Immune datasets (Oetjen, Dahlin, pbmc_10k_v3, and Sun [52]) and Brain datasets (Rosenberg, Zeisel, Saunders).
CIForm as a Transformer-based model for cell-type annotation of large-scale single-cell RNA-seq data https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbad195/7169137