NCLscan-hybrid: A Shell repository from TreesLab

Manual of NCLscan-hybrid

Version: 1.1.0

NCLscan-hybrid is a tool that uses long-read sequencing (PacBio/Nanopore) to validate non-co-linear (NCL) transcripts (fusion, trans-splicing, and circular RNA).

Requirements

Python
bedtools==v2.25.0
samtools
minimap2
seqtk

We recommend using conda to install the dependencies.
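
Before running the pipeline, it can help to confirm that the required tools are on your PATH. The snippet below is a minimal sketch, not part of NCLscan-hybrid: `check_tools` is a hypothetical helper, and the executable name `python` is an assumption (it may be `python3` on your system). It does not check the bedtools version.

```shell
#!/bin/sh
# check_tools: print the name of every listed tool that is NOT found on PATH.
# Hypothetical helper for illustration only; not shipped with NCLscan-hybrid.
check_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Tool names are taken from the Requirements list above.
missing=$(check_tools python bedtools samtools minimap2 seqtk)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
fi
```

If nothing is printed, all five executables were found.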

Installation

git clone https://github.com/TreesLab/NCLscan-hybrid.git

Usage

./NCLscan-hybrid.sh \
    -long [input long read fasta/fastq file] \
    -long_type [pb or ont] \
    -nclscan [NCLscan result file] \
    -c [config file] \
    -o [out_prefix_name] \
    -t [number of threads]

Parameters

Parameter	Description
-long FILE	Long-read dataset (FASTA or FASTQ).
-long_type TYPE	Type of the long-read dataset ('pb' for PacBio or 'ont' for Nanopore).
-nclscan FILE	The results file from NCLscan.
-c CONFIG_FILE	Config file.
-o PREFIX	Prefix for output files.
-t INT	Number of threads.

The format of NCLscan results

#	Column
1	chr (donor)
2	pos (donor)
3	strand (donor)
4	chr (acceptor)
5	pos (acceptor)
6	strand (acceptor)
7	gene_symbol (donor)
8	gene_symbol (acceptor)
9	is_intragenic

The remaining columns generated by NCLscan are optional for NCLscan-hybrid.
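
Since only the first nine columns are required, a quick check of the input file can catch truncated lines early. This is a minimal sketch, assuming the NCLscan result file is tab-delimited; `check_ncl_columns` is a hypothetical helper name and not part of NCLscan-hybrid.

```shell
#!/bin/sh
# check_ncl_columns FILE
# Report every non-empty line with fewer than 9 tab-separated fields.
# Returns 0 if all lines are fine, 1 otherwise.
# Assumption: the file is tab-delimited.
check_ncl_columns() {
    awk -F '\t' '
        NF > 0 && NF < 9 { printf "line %d: %d columns\n", NR, NF; bad = 1 }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}
```

For example, `check_ncl_columns my_sample.result` (file name is a placeholder) prints nothing and returns 0 when every line has at least nine columns.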

Outputs

PREFIX.long_intra.result
PREFIX.long_inter.result

PREFIX.long_intra.result

#	Column	Description
1	NCL_event_id
2	#supporting_reads
3	has_reads_out_of_circle
4	#reads_out_of_circle
5	has_reads_rolling_circle
6	#reads_rolling_circle
7 ~ N		The remaining columns are from the original input file.

PREFIX.long_inter.result

#	Column	Description
1	NCL_event_id
2	#supporting_reads
3 ~ N		The remaining columns are from the original input file.
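
In both result files, column 2 holds the number of supporting reads, so events can be filtered by read support with a short awk call. This is a minimal sketch, assuming the result files are tab-delimited; `filter_by_support` is a hypothetical helper, and lines whose second column is not a plain integer (such as a header line, if present) are skipped.

```shell
#!/bin/sh
# filter_by_support FILE MIN
# Print the rows of a result file whose #supporting_reads (column 2)
# is at least MIN.
# Assumption: the file is tab-delimited.
filter_by_support() {
    awk -F '\t' -v min="$2" '$2 ~ /^[0-9]+$/ && $2 + 0 >= min + 0' "$1"
}
```

For example, `filter_by_support PREFIX.long_inter.result 3` keeps only the events supported by at least three long reads.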

Visualization

To visualize the alignments of the reads supporting an NCL event, upload the BED files in the following directories to the UCSC Genome Browser.

pass2_intra_BrowserView/
WithinCircle_events_BrowserView/
pass2_inter_BrowserView/
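
To collect the files to upload, the directories above can be scanned in one pass. This is a minimal sketch with two assumptions: the three directories sit directly under a base directory that you pass in (default: the current directory), and the BED files use the `.bed` extension. `list_browser_beds` is a hypothetical helper name.

```shell
#!/bin/sh
# list_browser_beds [BASE_DIR]
# List BED files found in the three BrowserView directories under BASE_DIR.
# Assumptions: directory location and the ".bed" extension.
list_browser_beds() {
    base=${1:-.}
    for dir in pass2_intra_BrowserView \
               WithinCircle_events_BrowserView \
               pass2_inter_BrowserView; do
        # Skip directories that were not produced in this run.
        if [ -d "$base/$dir" ]; then
            find "$base/$dir" -type f -name '*.bed' | sort
        fi
    done
}
```

Running `list_browser_beds .` from the output location prints one path per line.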