Tools for generating single-cell gene expression data

CellBin

CellBin is highly accurate software for generating single-cell gene expression data from high-resolution spatial transcriptomics.

Installation

  • Download the dev branch of the CellBin repository and install the packages listed in requirements.txt in a Python 3.8 environment.
# python3.8 in conda env
conda create --name=CellBin python=3.8
conda activate CellBin
cd CellBin-dev
pip install -r requirements.txt  # install
  • The pyvips package needs to be installed separately. The following instructions are adapted from the pyvips documentation.

On Windows, first install pyvips with pip:

$ pip install --user pyvips==2.2.1

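Then download the compiled libvips library (vips-dev-8.12) and unzip it. To add it to PATH from within Python, put something like this at the start of your script, before importing pyvips:

import os
# Folder where the vips-dev archive was unzipped; adjust to your own location and version.
vipshome = 'c:\\vips-dev-8.12\\bin'
os.environ['PATH'] = vipshome + ';' + os.environ['PATH']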

On Linux, install pyvips with conda:

$ conda install --channel conda-forge pyvips==2.2.1
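For the Windows setup described above, the PATH change can be wrapped in a small helper that fails early when the folder is wrong. This is only a sketch: `add_vips_to_path` is a hypothetical name that is not part of CellBin or pyvips, and the folder in the usage comment is a placeholder for wherever you unzipped the vips-dev archive.

```python
import os


def add_vips_to_path(vips_bin, environ=None):
    """Prepend the libvips bin folder to PATH (hypothetical helper).

    Raises FileNotFoundError when the folder does not exist and leaves
    PATH unchanged when the folder is already listed.
    """
    environ = os.environ if environ is None else environ
    if not os.path.isdir(vips_bin):
        raise FileNotFoundError(f"libvips bin folder not found: {vips_bin}")
    entries = [e for e in environ.get("PATH", "").split(os.pathsep) if e]
    if vips_bin not in entries:
        environ["PATH"] = os.pathsep.join([vips_bin] + entries)
    return environ["PATH"]


# Usage (placeholder folder; call this before `import pyvips`):
# add_vips_to_path(r"c:\path\to\vips-dev\bin")
```

Passing `environ` explicitly is optional; it is there so the helper can be tried on a plain dictionary without touching the real environment.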
  • Download the weight files and move them to the specified paths (create the folders manually if they do not exist).
weight file                                  PWD    specified path
sold2_wireframe.tar                          nJY4   cellbin\iqc\trackCross_net\sold2\ckpt
stereocell_bcdu_cell_256x256_220926.pth      nJY4   cellbin\weights
stereocell_bcdu_tissue_512x512_220822.onnx   nJY4   cellbin\weights
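Before running the pipeline, it can help to confirm that the three weight files listed above are in place. The following is a minimal sketch, not part of CellBin: `missing_weights` is a hypothetical helper, and it assumes the specified paths are relative to the CellBin-dev folder.

```python
from pathlib import Path

# Weight files and their target folders, copied from the list above.
# Assumption: the folders are relative to the CellBin-dev folder.
EXPECTED_WEIGHTS = {
    "sold2_wireframe.tar": Path("cellbin/iqc/trackCross_net/sold2/ckpt"),
    "stereocell_bcdu_cell_256x256_220926.pth": Path("cellbin/weights"),
    "stereocell_bcdu_tissue_512x512_220822.onnx": Path("cellbin/weights"),
}


def missing_weights(repo_root):
    """Create the target folders if needed and list weight files not yet in place."""
    missing = []
    for filename, folder in EXPECTED_WEIGHTS.items():
        target_dir = Path(repo_root) / folder
        target_dir.mkdir(parents=True, exist_ok=True)
        if not (target_dir / filename).is_file():
            missing.append((folder / filename).as_posix())
    return missing


# Usage, run from the CellBin-dev folder:
# for path in missing_weights("."):
#     print("missing:", path)
```

An empty result means every file was found; the folders are created as a side effect, so the downloaded files can be copied in straight away.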

Tutorials

Test Data

A mouse brain data set generated by BGI STOmics is provided for testing. You only need to download the spatial gene expression data S200000135TL_D1.gem.gz (PWD: FMYk) and the tiles SS200000135TL_D1.tif.gz (PWD: n2Ge). We recommend creating a new "data" folder under CellBin-dev and decompressing "SS200000135TL_D1.tif.gz" into it. "S200000135TL_D1.gem.gz" does not need to be decompressed.
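The decompression step can also be done from Python with the standard library. This is a sketch under one assumption: the .tif.gz download is a plain gzip file wrapping a single file. `gunzip_into` is a hypothetical helper name, not a CellBin function.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_into(archive, out_dir):
    """Decompress a .gz file into out_dir (created if needed) and return the new path."""
    archive = Path(archive)
    if archive.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got: {archive.name}")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / archive.stem  # "x.tif.gz" becomes "x.tif"
    with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
        # Copies in chunks, so a large image is never held in memory at once.
        shutil.copyfileobj(src, dst)
    return target


# Usage, run from the CellBin-dev folder:
# gunzip_into("SS200000135TL_D1.tif.gz", "data")
```

The gene expression file is left untouched, since the pipeline takes it in compressed form.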

This data set is open-sourced to promote algorithmic research on spatial single-cell data in the life sciences. STOmics reserves the right to interpret it.

Command Line

You can run the whole CellBin pipeline in one step, or run image quality control, image stitching, image registration, tissue segmentation, nuclei segmentation, nuclei mask filtering and molecule labeling independently.

The one-stop CellBin pipeline is run from the command line with the following arguments:

  • --tiles_path The path of all tiles.
  • --gene_exp_data The compressed file of spatial gene expression data.
  • --output_path The output path.
  • --chip_no Chip number of the Stereo-seq data.
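cd scripts

# CellBin (the trailing backslashes continue the command in bash;
# in the Windows command prompt, put everything on one line)
python cellbin.py \
    --tiles_path /data/SS200000135TL_D1 \
    --gene_exp_data /data/SS200000135TL_D1.gem.gz \
    --output_path /data/result \
    --chip_no SS200000135TL_D1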
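When the pipeline has to be run for several chips, the same command can be assembled and launched from Python. The sketch below uses only the four arguments documented above; `build_cellbin_command` and `run_cellbin` are hypothetical wrapper names, not part of CellBin.

```python
import subprocess
import sys


def build_cellbin_command(tiles_path, gene_exp_data, output_path, chip_no):
    """Return the argument list for the one-stop cellbin.py command."""
    return [
        sys.executable, "cellbin.py",
        "--tiles_path", str(tiles_path),
        "--gene_exp_data", str(gene_exp_data),
        "--output_path", str(output_path),
        "--chip_no", str(chip_no),
    ]


def run_cellbin(scripts_dir, **arguments):
    """Run cellbin.py inside scripts_dir; raises CalledProcessError on a non-zero exit."""
    command = build_cellbin_command(**arguments)
    return subprocess.run(command, cwd=scripts_dir, check=True)


# Usage, with the paths from the command above:
# run_cellbin(
#     "scripts",
#     tiles_path="/data/SS200000135TL_D1",
#     gene_exp_data="/data/SS200000135TL_D1.gem.gz",
#     output_path="/data/result",
#     chip_no="SS200000135TL_D1",
# )
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces, and `sys.executable` keeps the run inside the active conda environment.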

More about output

GUI

For some low-quality input data, the CellBin pipeline may produce wrong results. For this scenario we have developed a manual tool based on PyQt5, which you can get from the cloud disk (PWD: 6bnz). Installation and operation are described in the User Manual, which will help you get started quickly. The main window is shown in the figure below.

Fig 1 Main window of CellBinStudio

License and Citation

CellBin is released under the MIT license. Please cite CellBin in your publications if it helps your research:

@article {Li2023.02.28.530414,
	author = {Li, Mei and Liu, Huanlin and Li, Min and Fang, Shuangsang and Kang, Qiang},
	title = {CellBin enables high accuracy single cell segmentation for spatial transcriptomic dataset},
	year = {2023},
	URL = {https://www.biorxiv.org/content/early/2023/03/01/2023.02.28.530414},
	journal = {bioRxiv}
}

Reference

https://github.com/matejak/imreg_dft
https://github.com/rezazad68/BCDU-Net
https://github.com/libvips/pyvips