QISeq

A bioinformatics pipeline to process Quantitative Insertion-site Sequencing (QIseq) samples

Primary language: Perl. License: GNU General Public License v3.0 (GPL-3.0).

These scripts are part of the article: Quantitative Insertion-site Sequencing (QIseq): A new tool for high throughput phenotyping of transposon mutants XX XX XX

Prerequisites

The following tools are expected to be installed and available in the PATH:

sga - to perform the correction of the reads
bwa - to perform the mapping
samtools - version 1.2 to work with cram files
picard - to mark duplicates
java - to start picard
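
A quick check that the command-line tools listed above can be found might look like the following. This is a minimal sketch and not part of the pipeline: the function name `missing_tools` is hypothetical, and picard is left out because it is started through java rather than called as a command of its own.

```bash
#!/usr/bin/env bash
# Print the name of every tool in the argument list that is not found in the PATH.
# (missing_tools is a hypothetical helper, not a script shipped with QISeq.)
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Command-line tools named in the list above.
missing=$(missing_tools sga bwa samtools java)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing >&2
fi
```

Note that this only tests whether each command exists; it does not verify the samtools version.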

IMPORTANT: We assume that the recipe of the Illumina machine knows about the adapter. The adapter should also already be trimmed in the BAM files that are given to the pipeline.

Possible variables

Please ensure that all tools are in the PATH and that Picard's MarkDuplicates is in the CLASSPATH environment variable.

Variables to set
Some of the parameters can be set through environment variables, e.g. B=8; export B in bash.
BWA_THREADS # number of threads to be used for bwa mapping - default 1
MIN_UNIQUE # number of unique reads required to call an insertion site - default 4
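
As an illustration of how such variables behave, the sketch below applies the documented defaults when a variable is unset. How the pipeline scripts actually read these variables is not shown here, so the function name `apply_defaults` is an assumption made for this example only.

```bash
#!/usr/bin/env bash
# Fall back to the documented defaults when a variable is unset or empty.
# (apply_defaults is a hypothetical helper, not part of the QISeq scripts.)
apply_defaults() {
    BWA_THREADS="${BWA_THREADS:-1}"   # threads for bwa mapping
    MIN_UNIQUE="${MIN_UNIQUE:-4}"     # unique reads needed to call an insertion site
    export BWA_THREADS MIN_UNIQUE
}

apply_defaults
echo "bwa threads: $BWA_THREADS, minimum unique reads: $MIN_UNIQUE"
```

Setting a value before the run, for example `BWA_THREADS=8; export BWA_THREADS`, overrides the default.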

Running

Assuming that the BAM file is in the correct format, start HISeq.start.sh once for each direction (5'/3'), giving it the BAM file, the direction, the reference, and the offset in bases to the adapter (4 for HiSeq and 5 for MiSeq).
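
The choice of offset and the two calls per sample can be sketched as a dry run that only prints the commands. Several things here are assumptions: the helper `adapter_offset`, the file names `sample.bam` and `reference.fasta`, and the direction label `5prim` (only `3prim` appears in this README). The argument order follows the description above.

```bash
#!/usr/bin/env bash
# Map the sequencing platform to the offset (in bases) to the adapter.
# (adapter_offset is a hypothetical helper, not part of the QISeq scripts.)
adapter_offset() {
    case "$1" in
        HiSeq) echo 4 ;;
        MiSeq) echo 5 ;;
        *) echo "unknown platform: $1" >&2; return 1 ;;
    esac
}

bam="sample.bam"             # hypothetical input file
reference="reference.fasta"  # hypothetical reference
offset=$(adapter_offset HiSeq)

# Dry run: print one command per direction instead of executing it.
# "5prim" is an assumed label; only "3prim" is shown in this README.
for direction in 5prim 3prim; do
    echo HISeq.start.sh "$bam" "$direction" "$reference" "$offset"
done
```

Removing the `echo` would run the commands, provided the script is in the PATH and the labels match what it expects.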

For large runs we use a for loop like:
for ((i=48;$i<94;i++)) ; do
bsub.py 8 QISeq.$i QISeq.sh 17240_8#$i 3prim /lustre/scratch108/parasites/tdo/Pfalciparum/NF54/ICORN_Feb2014_150bp/PfNF54.fasta 4;
done

The results can be collected with the QISeq.join.pl script.