NanoCircle

The GitHub repository for NanoCircle (2020), a tool that is still under development. NanoCircle identifies the coordinates of both simple and chimeric circular molecules sequenced using long-read sequencing.

Preparatory steps to perform before running NanoCircle

STEP 1 - Trimming and preprocessing

Adapter and barcode trimming

porechop -i bc05.reads.fastq -b bc05.barcode_trim -t 8

Use fastq-stats to obtain information regarding the sequences
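
As a rough illustration of the kind of summary such a tool reports, the sketch below computes the read count, total bases, mean read length and N50 from FASTQ text. It is not fastq-stats itself: the function name fastq_length_stats is hypothetical, and the code assumes plain four-line FASTQ records.

```python
from typing import Dict, Iterable


def fastq_length_stats(lines: Iterable[str]) -> Dict[str, float]:
    """Summarise read lengths from four-line FASTQ records (hypothetical helper)."""
    # The sequence is the second line of every four-line record.
    lengths = [len(line.strip()) for i, line in enumerate(lines) if i % 4 == 1]
    if not lengths:
        return {"reads": 0, "bases": 0, "mean_length": 0.0, "n50": 0}
    total = sum(lengths)
    # N50: the length L such that reads of length >= L hold at least half of all bases.
    running, n50 = 0, 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "reads": len(lengths),
        "bases": total,
        "mean_length": total / len(lengths),
        "n50": n50,
    }


# Tiny made-up example with three reads of length 10, 4 and 6.
example = [
    "@read1", "ACGTACGTAC", "+", "IIIIIIIIII",
    "@read2", "ACGT", "+", "IIII",
    "@read3", "ACGTAC", "+", "IIIIII",
]
stats = fastq_length_stats(example)
print(stats)
```

For a real file, an open file handle can be passed in place of the list, since the function only iterates over lines.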

STEP 2 - Alignment of sequence reads

Creating the index

minimap2 -t 6 -x map-ont -d GRCh37.mmi GRCh37.fa

Alignment

minimap2 -t 8 -ax map-ont --secondary=no hg19.25chr.mmi read_file.fastq | samtools sort - > barcode.aln_hg19.bam
# -ax map-ont = preset for Oxford Nanopore genomic reads, with SAM output
# --secondary=no = do not output secondary alignments (SAM flag 0x100)
# hg19.25chr.mmi = minimizer index for the reference
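
To make the flag mentioned above concrete, here is a minimal sketch of how the secondary bit (0x100) can be tested in the FLAG field of a SAM record. The helper names are hypothetical. The supplementary bit (0x800), which marks split alignments, is a separate bit and is shown only for contrast.

```python
SECONDARY = 0x100      # "secondary alignment" bit in the SAM FLAG field
SUPPLEMENTARY = 0x800  # "supplementary alignment" bit (split alignments)


def is_secondary(flag: int) -> bool:
    """Return True if the secondary-alignment bit is set in a SAM FLAG value."""
    return bool(flag & SECONDARY)


def is_supplementary(flag: int) -> bool:
    """Return True if the supplementary-alignment bit is set in a SAM FLAG value."""
    return bool(flag & SUPPLEMENTARY)


# FLAG values as they would appear in column 2 of a SAM record.
for flag in (0, 16, 256, 272, 2048, 2064):
    print(flag, is_secondary(flag), is_supplementary(flag))
```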

STEP 3 - Identifying representative regions

bedtools genomecov + merge

bedtools genomecov -bg -ibam barcode_hg19.bam | bedtools merge -d 1000 -i stdin | sort -V -k1,1 -k2,2n > barcode_1000_cov.bed
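
The merging step can be pictured with the sketch below, which joins covered intervals on the same chromosome when the gap between them is at most 1000 bp, mirroring the -d 1000 option. It is a simplified stand-in written for illustration, not the bedtools implementation: the function name merge_intervals and the example coordinates are made up, and chromosomes are ordered by plain string sorting rather than the version sort used in the command.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), BED-style


def merge_intervals(intervals: Iterable[Interval], max_gap: int = 1000) -> List[Interval]:
    """Merge intervals on the same chromosome that lie within max_gap of each other."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] <= max_gap:
            # Close enough to the previous region: extend it.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            # Different chromosome or gap too large: start a new region.
            merged.append((chrom, start, end))
    return merged


# Made-up coverage intervals.
covered = [
    ("chr1", 100, 200),
    ("chr1", 900, 1500),   # 700 bp after the previous interval, so merged
    ("chr1", 3000, 3200),  # 1500 bp after the merged region, so kept separate
    ("chr2", 50, 80),
]
regions = merge_intervals(covered, max_gap=1000)
print(regions)
```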

Running NanoCircle to identify the eccDNA coordinates

STEP 4 - Classify the soft-clipped reads supporting Simple eccDNA and the soft-clipped reads supporting Chimeric eccDNA

python NanoCircle_arg.py Classify -i barcode_hg19.bam -d temp_reads

The classified reads are saved in the folder temp_reads, which contains both the simple and the chimeric reads in .bam format.
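
For readers unfamiliar with soft-clipping, the sketch below shows how the soft-clipped portion of a read appears in the CIGAR string of an alignment (the S operations at either end). It only illustrates the concept. The criteria NanoCircle applies when classifying reads are not described here, and the helper name soft_clip_lengths and the example CIGAR strings are made up.

```python
import re
from typing import Tuple

# A soft clip (S) sits at either end of the CIGAR, optionally outside a hard clip (H).
_LEFT_CLIP = re.compile(r"^(?:\d+H)?(\d+)S")
_RIGHT_CLIP = re.compile(r"(\d+)S(?:\d+H)?$")


def soft_clip_lengths(cigar: str) -> Tuple[int, int]:
    """Return the (left, right) soft-clip lengths encoded in a CIGAR string."""
    left = _LEFT_CLIP.search(cigar)
    right = _RIGHT_CLIP.search(cigar)
    return (
        int(left.group(1)) if left else 0,
        int(right.group(1)) if right else 0,
    )


# Made-up CIGAR strings.
for cigar in ("1200S3400M", "3400M800S", "30S500M20S10H", "100M"):
    print(cigar, soft_clip_lengths(cigar))
```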

Create a .bai index for each of the read .bam files

samtools index temp_reads/Simple_reads.bam
samtools index temp_reads/Chimeric_reads.bam

STEP 5 - Identify Simple eccDNA using the coverage file and classified reads

python NanoCircle_arg.py Simple -i barcode_1000_cov.bed -b temp_reads/Simple_reads.bam -q 60 -o barcode_Simple_circles.bed

STEP 6 - Identify Chimeric eccDNA using the coverage file and classified reads

python NanoCircle_arg.py Chimeric -i barcode_1000_cov.bed -b temp_reads/Chimeric_reads.bam -q 60 -o barcode_Chimeric_circles.bed

The output is a bed file with possible configurations of several chimeric eccDNA, since the identification extracts reads originating from specific regions.

STEP 7 - Merge Chimeric eccDNA configurations using the Chimeric eccDNA bed file from STEP 6

python NanoCircle_arg.py Merge -i barcode_Chimeric_circles.bed -o barcode_Merged_chimeric.bed
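
Steps 4 to 7 can be collected into one place with a small helper such as the sketch below. It only builds the command lines shown above and prints them. The function name nanocircle_commands and its parameters are hypothetical and not part of NanoCircle, and the value 60 is simply passed through to -q as in the examples.

```python
import shlex
from typing import List


def nanocircle_commands(bam: str, coverage_bed: str, prefix: str,
                        read_dir: str = "temp_reads", q: int = 60) -> List[List[str]]:
    """Build the STEP 4 to STEP 7 command lines as argument lists (hypothetical helper)."""
    script = ["python", "NanoCircle_arg.py"]
    simple_bam = f"{read_dir}/Simple_reads.bam"
    chimeric_bam = f"{read_dir}/Chimeric_reads.bam"
    chimeric_bed = f"{prefix}_Chimeric_circles.bed"
    return [
        # STEP 4: classify reads, then index both resulting .bam files
        script + ["Classify", "-i", bam, "-d", read_dir],
        ["samtools", "index", simple_bam],
        ["samtools", "index", chimeric_bam],
        # STEP 5: simple eccDNA
        script + ["Simple", "-i", coverage_bed, "-b", simple_bam,
                  "-q", str(q), "-o", f"{prefix}_Simple_circles.bed"],
        # STEP 6: chimeric eccDNA
        script + ["Chimeric", "-i", coverage_bed, "-b", chimeric_bam,
                  "-q", str(q), "-o", chimeric_bed],
        # STEP 7: merge chimeric configurations
        script + ["Merge", "-i", chimeric_bed, "-o", f"{prefix}_Merged_chimeric.bed"],
    ]


commands = nanocircle_commands("barcode_hg19.bam", "barcode_1000_cov.bed", "barcode")
for command in commands:
    print(shlex.join(command))
```

To execute the steps rather than print them, each argument list could be passed to subprocess.run(command, check=True), so that a failing step stops the run.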