A pipeline for the analysis of Non-coding RNA profiling by high throughput sequencing

Primary language: Python. License: GNU General Public License v2.0 (GPL-2.0).

#pynoncode

##Installation

Clone this repository and then:

	$ cd pynoncode/
	$ python setup.py install

This installs the scripts found in the pyrnatools/scripts directory. For more information on an individual script, run it with the --help option.

##Dependencies

  • numpy - This can be installed using:
	pip install numpy
  • cython - This can be installed using:
	pip install cython
  • bowtie version 1. Link
  • samtools version: 0.1.19 or greater. Link
  • An R installation.
  • DESeq2 Link.
  • The remaining Python dependencies should be installed automatically.
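
Before running the pipeline, it can help to confirm that the external tools and Python modules listed above are visible from your environment. The following is a minimal sketch, not part of pynoncode: the executable names (`bowtie`, `samtools`, `Rscript`) and module names are assumptions based on the list above, and the check does not verify versions or whether DESeq2 is installed inside R.

```python
import importlib.util
import shutil

# Assumed names, taken from the dependency list above; adjust them if your
# installation exposes these tools under different names.
REQUIRED_EXECUTABLES = ["bowtie", "samtools", "Rscript"]
REQUIRED_MODULES = ["numpy", "Cython"]


def missing_dependencies(executables=REQUIRED_EXECUTABLES, modules=REQUIRED_MODULES):
    """Return the names of executables and modules that cannot be found."""
    # shutil.which returns None when an executable is not on the PATH.
    missing = [exe for exe in executables if shutil.which(exe) is None]
    # find_spec returns None when a top-level module cannot be imported.
    missing += [mod for mod in modules if importlib.util.find_spec(mod) is None]
    return missing


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All checked dependencies were found.")
```

An empty result only means the names were found; samtools must still be version 0.1.19 or greater and bowtie must be version 1.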

##Core Pipeline

  • pynon_align.py: Trim adapters before using this program, for example with cutadapt. The script converts FASTQ to FASTA and runs the bowtie aligner to extract both uniquely mapped and multi-mapped sequences. If the -c option is specified, pynon_align.py trims tRNA CCA ends from the unmapped reads and then reruns the alignment step. It then uses a GTF file to annotate the mapped fragments using HTSeq. The resulting files are written to the pynoncode directory and are called unique_mapped.BED and multi_mapped.BED.

  • pynon_count.py: This uses the uniquely mapped reads and, if -m is specified, also the multi-mapped reads to create count files for both transcripts and fragments. Multi-mapped reads are distributed among transcripts in proportion to each transcript's share of the unique counts.

  • pynon_ucsc.py: This creates a UCSC-formatted bigWig file for use in the UCSC Genome Browser. The bigWig is called pynoncode.bw and can be found in the results directory.

  • pynon_diff.py: This uses DESeq2 to perform differential expression analysis on transcripts and fragments. For examples of configuration files, please see here

  • pynon_report.py: Plots profiles of transcripts and creates an HTML report of the differentially expressed fragments and transcripts. For examples of configuration files, please see here
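
To illustrate how pynon_count.py is described as handling multi-mapped reads, here is a minimal sketch of proportional distribution. It is not the script's actual code: the function name `distribute_multimapped`, the dictionary layout, and the even split used when no candidate has unique counts are all assumptions made for the example.

```python
def distribute_multimapped(unique_counts, candidate_transcripts):
    """Split one multi-mapped read across the transcripts it aligns to.

    unique_counts maps transcript name -> number of uniquely mapped reads.
    Each candidate receives a share proportional to its unique count.
    """
    if not candidate_transcripts:
        return {}
    total = sum(unique_counts.get(t, 0) for t in candidate_transcripts)
    if total == 0:
        # Assumption for this sketch: with no unique evidence, split evenly.
        share = 1.0 / len(candidate_transcripts)
        return {t: share for t in candidate_transcripts}
    return {t: unique_counts.get(t, 0) / total for t in candidate_transcripts}


# Hypothetical transcripts: tRNA-A has 30 unique reads, tRNA-B has 10.
shares = distribute_multimapped({"tRNA-A": 30, "tRNA-B": 10}, ["tRNA-A", "tRNA-B"])
print(shares)  # {'tRNA-A': 0.75, 'tRNA-B': 0.25}
```

Under this scheme a read that maps to both transcripts adds 0.75 to the first count and 0.25 to the second, so the total across transcripts still equals one read.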

##Annotation

  • Included in this package are mouse and human noncoding GTF files. For more information, see here.
  • However, if you wish to use your own annotation, please make sure it is in GTF format. For more information, see here.
  • Also, please note the chromosome names in your GTF file and use the options to convert them to UCSC notation if required.
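
As an illustration of what converting chromosome names to UCSC notation involves, the sketch below rewrites the first column of a GTF line. It is not the conversion built into the pynoncode scripts: the helper names are hypothetical, and it assumes Ensembl-style input (`1`, `X`, `MT`) where only the `chr` prefix and the mitochondrial name differ. Unplaced scaffolds and patches generally need a lookup table instead.

```python
def to_ucsc_chrom(name):
    """Convert an Ensembl-style chromosome name (e.g. "1", "X", "MT") to UCSC style."""
    if name.startswith("chr"):
        return name  # already in UCSC notation
    if name == "MT":
        return "chrM"  # UCSC names the mitochondrial genome chrM
    return "chr" + name


def convert_gtf_line(line):
    """Rewrite the seqname (first) column of one GTF line.

    Comment lines and blank lines are returned unchanged.
    """
    if line.startswith("#") or not line.strip():
        return line
    fields = line.split("\t")
    fields[0] = to_ucsc_chrom(fields[0])
    return "\t".join(fields)


print(to_ucsc_chrom("1"))   # chr1
print(to_ucsc_chrom("MT"))  # chrM
```

Applying `convert_gtf_line` to every line of an annotation file and writing the results to a new file gives a GTF whose chromosome names match a UCSC-style bowtie index.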