SLAMseq

A customized SLAM-seq analysis pipeline

Primary language: Shell. License: MIT.

Pipeline scripts

run_full_pipeline.sh # a bash script to run the main pipeline
vcf2reads.pl # an auxiliary Perl script

Pipeline Description

Briefly, raw RNA reads were trimmed with Trimmomatic (v0.39; options: 'ILLUMINACLIP:adapter.fa:2:30:10 SLIDINGWINDOW:5:20 MINLEN:36'). The trimmed R2 reads were then reverse complemented with Seqtk (v1.3). The mismatch-permissive read aligner NextGenMap (v0.5.5; options: -t 8 -n 1 --strata --bam --slam-seq 2) was used to align the trimmed R1 reads and the reverse-complemented R2 reads to the reference genome. Samblaster (v0.1.26), Samtools (v1.9), and Slamdunk (v0.4.3; options: filter -mq 2 -mi 0.8 -nm -1) were used to sort, index, and filter the resulting BAM files. The BAM file was then converted to an mpileup file with Samtools (options: -B -A --output-QNAME) for downstream analysis. VarScan (v2.3.9; options: mpileup2snp --strand-filter 1 --min-var-freq 0.2 --min-coverage 10 --variants 1) was used to call variants into a VCF file. The in-house Perl script vcf2reads.pl (shipped with this pipeline) was used to screen the mpileup file together with the VCF file and identify reads with at least two T->C base conversion events. These base-converted reads were extracted from the raw read sets with Seqtk and mapped again with NextGenMap for downstream visualization and further analysis.
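
The "at least two T->C conversion events" criterion can be illustrated with a minimal shell sketch. This is not the logic of vcf2reads.pl, which works on the mpileup and VCF files directly. The function name `tc_reads`, the simplified three-column input (read name, reference base, read base, one row per mismatch), and the read names are all assumptions made for illustration. Strand handling and the exclusion of called variants are left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this repository.
# Input (stdin): tab-separated rows "read_name<TAB>ref_base<TAB>read_base",
#   one row per mismatch in a read (an assumed, simplified format).
# Argument 1: minimum number of T>C mismatches per read (default: 2).
# Output: sorted names of reads that reach the threshold.
tc_reads() {
  local min_conv="${1:-2}"
  awk -F'\t' -v min="$min_conv" '
    # Count only mismatches where the reference is T and the read shows C.
    toupper($2) == "T" && toupper($3) == "C" { n[$1]++ }
    # Report reads whose T>C count reaches the threshold.
    END { for (r in n) if (n[r] >= min) print r }
  ' | sort
}

# Example with made-up read names: readA has 2 T>C, readB has 1, readC has 3.
printf 'readA\tT\tC\nreadA\tT\tC\nreadB\tT\tC\nreadB\tA\tG\nreadC\tT\tC\nreadC\tT\tC\nreadC\tT\tC\n' \
  | tc_reads 2
```

A list of read names like this could then be passed to a read-extraction step such as `seqtk subseq`. Whether the pipeline uses that exact subcommand is an assumption; see run_full_pipeline.sh for the actual commands.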