Better explanation of Assemble log file
mikessh opened this issue · 0 comments
mikessh commented
Explain the READS* terms in the assemble.log.txt:
- I'm not sure exactly where reads are being dropped. The forward and reverse reads for each MIG are separately assembled and reads with too many mismatches are dropped. Are the remaining reads READS_GOOD_FASTQ1 and READS_GOOD_FASTQ2?
- How are READS_TOTAL and READS_DROPPED_WITHIN_MIG calculated?
I also noticed that READS_TOTAL is less than the number of reads with the master sequence reported in checkout.log.txt. What filtering is occurring between checkout and assembly?
I think the MIG* statistics make more sense. Is the following correct? After MIG assembly, if the FASTQ1 MIG is of size greater than MIG_COUNT_THRESHOLD, then it is counted in MIGS_GOOD_FASTQ1. The same applies to FASTQ2. MIGS_GOOD_TOTAL is then less than both MIGS_GOOD_FASTQ1 and MIGS_GOOD_FASTQ2 because a MIG is only kept if it passes the size threshold in both FASTQ1 and FASTQ2.
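To make the question concrete, here is a minimal sketch of the counting I have in mind. It is only my reading of the log, not taken from the tool's source: the function name `count_good_migs`, the input layout (a UMI mapped to the read counts of its FASTQ1 and FASTQ2 MIGs), and the example values are all assumptions made for illustration.

```python
# Hypothetical sketch of the MIG counting described above.
# This is an interpretation of the log columns, not the tool's actual code.

def count_good_migs(mig_sizes, mig_count_threshold):
    """Count MIGs that pass the size threshold.

    mig_sizes maps a UMI to a (FASTQ1 MIG size, FASTQ2 MIG size) pair,
    where size means the number of reads in the MIG (assumed layout).
    """
    counts = {
        "MIGS_GOOD_FASTQ1": 0,
        "MIGS_GOOD_FASTQ2": 0,
        "MIGS_GOOD_TOTAL": 0,
    }
    for size1, size2 in mig_sizes.values():
        # A MIG is "good" when its size is strictly greater than the threshold.
        good1 = size1 > mig_count_threshold
        good2 = size2 > mig_count_threshold
        if good1:
            counts["MIGS_GOOD_FASTQ1"] += 1
        if good2:
            counts["MIGS_GOOD_FASTQ2"] += 1
        # The total only counts MIGs that pass in both files.
        if good1 and good2:
            counts["MIGS_GOOD_TOTAL"] += 1
    return counts


# Made-up example data: four UMIs with their per-file MIG sizes.
example = {
    "AAAC": (6, 7),  # passes in both files
    "CCGT": (6, 2),  # passes in FASTQ1 only
    "GGTA": (1, 8),  # passes in FASTQ2 only
    "TTAG": (2, 1),  # passes in neither
}
counts = count_good_migs(example, mig_count_threshold=4)
print(counts)
```

If this reading is right, MIGS_GOOD_TOTAL can never exceed either per-file count, which would match what I see in the log.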