Co-accessibility network from single-cell ATAC-seq data. Python code, based on Cicero package (R).


CIRCE: Cis-regulatory interactions between chromatin regions

Description

This repo contains a Python package for inferring co-accessibility networks from single-cell ATAC-seq data, using skggm for the graphical lasso and scanpy for data processing.

It is based on the pipeline and hypotheses presented in the manuscript "Cicero Predicts cis-Regulatory DNA Interactions from Single-Cell Chromatin Accessibility Data" by Pliner et al. (2018). The Cicero R package is available here.


The metacell computation may differ between the two packages, but the scores are nearly identical when both are applied to the same metacells (see the comparison below). Circe should also run significantly faster than Cicero (e.g. a running time of about 5 s instead of 17 min on dataset 2).

If you have any suggestions, don't hesitate to share them! This package is still a work in progress :)

Installation

The package can be installed using pip:

pip install circe-py

or directly from GitHub:

pip install "git+https://github.com/cantinilab/circe.git"

Warning: If you clone the repo, do not run your scripts from inside the repo directory. Python would then import the local, non-compiled Cython sources instead of the installed package (probable error: circe.pyquic does not have a quic function).
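To check which copy of a package Python actually picked up, one option is to compare the module's file location with the current working directory. The helper below is a minimal sketch of that idea; `imported_from_directory` is a hypothetical name used for illustration and is not part of Circe.

```python
from pathlib import Path


def imported_from_directory(module_file, directory):
    """Return True if module_file is located somewhere under directory."""
    module_path = Path(module_file).resolve()
    base = Path(directory).resolve()
    # Path.parents lists every ancestor directory of the module file.
    return base in module_path.parents


# Usage, assuming circe is installed:
# import circe
# if imported_from_directory(circe.__file__, Path.cwd()):
#     print("circe was imported from the working directory; run from elsewhere.")
```

Note that a virtual environment located under the working directory would also trigger this check, so treat a positive result as a hint rather than proof.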

Minimal example

import anndata as ad
import circe as ci

# Load the data
atac = ad.read_h5ad('atac_data.h5ad')
atac = ci.add_region_infos(atac)

# Compute the co-accessibility network
ci.compute_atac_network(atac)

# Extract the network and find CCANs modules
circe_network = ci.extract_atac_links(atac)
ccans_module = ci.find_ccans(atac)

Visualisation

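# Plot the connections in a genomic window of chromosome 1.
# `atac` is the AnnData object from the minimal example above.
ci.plot_connections(
    atac,
    chromosome="chr1",
    start=1e7,
    end=1.3e7
)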

Comparison to Cicero R package


Both packages were run on the same metacells, obtained from the Cicero code.

All tests can be found in the circe benchmark repo

Real dataset 2 - subsample of 10x PBMC (2021)

  • Pearson correlation coefficient: 0.999958
  • Spearman correlation coefficient: 0.999911
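For readers who want to run this kind of comparison themselves, the sketch below computes both coefficients for two aligned lists of scores using only the standard library. The function names and the example values are made up for illustration; this is not necessarily how the figures above were produced.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up scores for the same five peak pairs, listed in the same order.
cicero_scores = [0.10, 0.35, -0.20, 0.80, 0.05]
circe_scores = [0.11, 0.34, -0.19, 0.79, 0.06]
r_pearson = pearson(cicero_scores, circe_scores)
r_spearman = spearman(cicero_scores, circe_scores)
```

The two score lists must refer to the same peak pairs in the same order, otherwise the coefficients are meaningless.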

Performance on real dataset 2:

  • Runtime: ~100x faster
  • Memory usage: ~5x less
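Runtime and memory figures of this kind can be approximated from within Python. The sketch below times a single call and records peak traced memory with the standard library; `measure` is a hypothetical helper, and it is an assumption that this resembles the benchmark setup. `tracemalloc` does not count allocations in native code that bypass Python's allocator, so the memory figure is a lower bound.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once; return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    finally:
        # Always record the measurements and stop tracing, even on error.
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak


# Stand-in workload; in practice this would be the network computation.
result, elapsed, peak = measure(sum, range(1_000_000))
print(f"elapsed: {elapsed:.3f} s, peak traced memory: {peak / 1e6:.2f} MB")
```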

Coming:

  • Calculate metacells.
  • Add stats on similarity on large datasets.
  • Add stats on runtime, memory usage.
  • Implement multithreading, which should speed things up even more.
  • Fix seed for reproducibility.

Usage

It is currently developed to work with AnnData objects. See Example1.ipynb for a simple usage example.

Citation

Trimbour Rémi (2024). Circe: Co-accessibility network from ATAC-seq data in python (based on Cicero package). Package version 0.2.0.