breakpoint-assembly: A Python repository from dliang5

Everything outside of the folders is a tool needed for clustering.

cluster.py (and reading3_1.cpp, once it's fixed) - the files that cluster the reads
1. reading3_1.cpp and cluster.py do virtually the same thing
search_trans.py - post-processing for dmel files; matches TEs with the breakpoints so that the reported inversions are more likely to be true positives
fastq2fa_qual.pl and parse_reads.pl - convert fastq to fasta and quality files, and parse the read IDs
running3-1.sh - how I ran the entire program, except for the phrap assembly, which is left out to avoid wasting space

python2.7 or python3 cluster.py < *.sam > (ex. python3 cluster.py < 857.sam > )
1. output: good_<>-result - full view of the clusters, with both forward and reverse clusters
2. output: summary_<>-result
python2.7 search_trans.py * (ex. python2.7 search_trans.py 857)
perl fastq2fa_qual.pl < $fastq > $1 2> $2
1. ex. perl fastq2fa_qual.pl < ${fastq_file} > ${fasta_file} 2> ${quality_file}
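
For readers who want to see what this conversion step involves, here is a minimal Python sketch of a FASTQ to FASTA-plus-quality conversion. It is not a port of fastq2fa_qual.pl: the function name `fastq_to_fasta_qual`, the Phred+33 offset, and the space-separated integer quality format are all assumptions made for illustration.

```python
def fastq_to_fasta_qual(lines, phred_offset=33):
    """Yield (fasta_record, quality_record) strings for each 4-line FASTQ record.

    Assumptions (not taken from fastq2fa_qual.pl): qualities are encoded as
    Phred+33, and the quality record lists one integer score per base,
    separated by spaces, under the same ">name" header as the sequence.
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError("expected a FASTQ header, got: " + header)
        try:
            seq = next(it).rstrip("\n")
            next(it)  # the '+' separator line carries no data we need
            qual = next(it).rstrip("\n")
        except StopIteration:
            raise ValueError("truncated FASTQ record: " + header) from None
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch: " + header)
        name = header[1:]
        scores = " ".join(str(ord(ch) - phred_offset) for ch in qual)
        yield ">" + name + "\n" + seq + "\n", ">" + name + "\n" + scores + "\n"


# Small in-memory example; in a pipeline the input would be sys.stdin, with the
# FASTA records written to stdout and the quality records written to stderr.
example = ["@read1\n", "ACGT\n", "+\n", "IIII\n"]
for fasta_record, quality_record in fastq_to_fasta_qual(example):
    print(fasta_record, end="")
    print(quality_record, end="")
```

Writing the sequence to one stream and the scores to another mirrors the `> ${fasta_file} 2> ${quality_file}` redirection in the example above.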
parse_reads.pl - the input is piped in using gzip -dc on the fastq file
1. use "cut -f2 "
2. ex. gzip -dc $SRRfastq | perl parse_reads.pl $idSUBMITFILE > or >> $fastq_file
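
The sketch below shows one way to pull selected reads out of a FASTQ stream by ID in Python. It assumes, without having confirmed it against parse_reads.pl, that the ID file holds one read ID per line and that a read matches when the first whitespace-delimited token of its header equals one of those IDs. The names `load_ids` and `filter_fastq` are hypothetical.

```python
def load_ids(lines):
    """Collect read IDs, assuming one ID per line; blank lines are ignored."""
    return {line.strip() for line in lines if line.strip()}


def filter_fastq(fastq_lines, wanted_ids):
    """Yield whole 4-line FASTQ records whose read ID is in wanted_ids.

    Assumption: the read ID is the first whitespace-delimited token of the
    header line, without the leading '@'.
    """
    it = iter(fastq_lines)
    for header in it:
        if not header.strip():
            continue  # skip stray blank lines
        # Read the sequence, '+' and quality lines that complete the record.
        rest = [next(it, "") for _ in range(3)]
        tokens = header[1:].split()
        read_id = tokens[0] if tokens else ""
        if read_id in wanted_ids:
            yield header + "".join(rest)


# In-memory example; in the pipeline above the FASTQ lines would come from
# stdin (fed by gzip -dc) and the IDs from the file given on the command line.
ids = load_ids(["read2\n"])
fastq = ["@read1\n", "AC\n", "+\n", "II\n", "@read2 extra\n", "GG\n", "+\n", "II\n"]
for record in filter_fastq(fastq, ids):
    print(record, end="")
```

Keeping the IDs in a set makes each lookup constant-time, which matters when the fastq file is streamed once and is much larger than the ID list.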
phrap - phrap -vector_bound 0 -forcelevel 10 $fasta_name (quality file has to be of the same name)
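
Because phrap needs the quality file to be named to match the fasta file, it can help to check for it before starting the assembly. The Python sketch below is only an illustration: the helper names are hypothetical, and it assumes the matching quality file is the fasta file name with ".qual" appended, which should be verified against the phrap documentation for your installation.

```python
import os
import subprocess


def phrap_command(fasta_name):
    """Build the phrap invocation used in this README as an argument list."""
    return ["phrap", "-vector_bound", "0", "-forcelevel", "10", fasta_name]


def expected_quality_file(fasta_name):
    """Assumption: the matching quality file is '<fasta_name>.qual'."""
    return fasta_name + ".qual"


def run_phrap(fasta_name):
    """Run phrap after confirming the quality file is where we expect it."""
    quality_file = expected_quality_file(fasta_name)
    if not os.path.exists(quality_file):
        raise FileNotFoundError("missing quality file: " + quality_file)
    # check=True raises CalledProcessError if phrap exits with an error.
    return subprocess.run(phrap_command(fasta_name), check=True)


# Print the command for a hypothetical input name without running anything.
print(" ".join(phrap_command("857.fasta")))
```

Passing the command as a list rather than a single string avoids shell quoting problems if a file name contains spaces.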

External tools used:
fastq-dump
bwa mem
phrap