/Full_length_transcripts_analysis

Computational pipeline for full-length transcripts with poly(A) tails.

Full-length transcript analysis

This repository contains the pipeline and scripts for analysing Nanopore and PacBio datasets to characterize isoform-specific poly(A) tail length genome-wide. It includes a Snakemake workflow and IPython notebooks for generating the paper figures.

Software Dependencies

The Snakemake workflow is implemented in Python 3 and requires minimap2 and bedtools as additional software. The required Python packages are listed below:

  • pysam
  • numpy
  • pandas
  • matplotlib
  • pyranges
  • click
  • scipy
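
Before running the workflow, it can help to confirm that the dependencies above are present. The following is a minimal sketch, not part of this repository: the helper names `missing_packages` and `missing_tools` are hypothetical, and it assumes each package's import name matches the name listed above and that minimap2 and bedtools are expected on the PATH under those executable names.

```python
import importlib.util
import shutil

# Names are taken from the dependency list above. The check itself is an
# illustrative helper (an assumption), not code shipped with the pipeline.
PYTHON_PACKAGES = [
    "pysam", "numpy", "pandas", "matplotlib", "pyranges", "click", "scipy",
]
EXTERNAL_TOOLS = ["minimap2", "bedtools"]


def missing_packages(names):
    """Return the packages from `names` that cannot be located for import."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def missing_tools(names):
    """Return the executables from `names` that are not found on PATH."""
    return [n for n in names if shutil.which(n) is None]


if __name__ == "__main__":
    pkgs = missing_packages(PYTHON_PACKAGES)
    tools = missing_tools(EXTERNAL_TOOLS)
    print("Missing Python packages:", ", ".join(pkgs) or "none")
    print("Missing external tools:", ", ".join(tools) or "none")
```

The check only looks for presence; it does not verify versions, since none are specified here.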

P.S. The code for “polyAcaller” is available at https://github.com/zhailab/polyACaller.

Data availability

All data generated in this study were deposited in the National Genomics Data Center (https://bigd.big.ac.cn/) under accession number PRJCA003923.