Build log - identifying how many kmers were added from different files
alexa-hks opened this issue · 0 comments
Hi there,
First off, thank you for the amazing software and excellent documentation!
To give you some background on my project: I am trying to identify variants from a multi-sample ldbg of a human chromosome. I have included "decontaminated" unmapped reads as well as reads that previously aligned to the chromosome, so that I can use McCortex to de novo assemble this multi-sample graph. I'd like to know to what extent the previously unmapped reads were incorporated into the graph.
Per sample, I am building the graph from 7 sequence files, 2 of which contain unmapped reads. In the output, it seems as though all reads came from the first file; however, I know that the SE read count shown for the first file is actually the sum of the reads in all the files (see full output attached).
Why does it appear this way? Is there something I could do so that I get this output split by the input file?
Kind regards and many thanks in advance,
Alexa
mccortex63 build -m 32G -k 51 --sample Xhosa_SAHGP034_SAHGP --seq2 SAHGP034.notCombined_1.fastq.gz:SAHGP034.notCombined_2.fastq.gz --seq SAHGP034_R1_se.fq.gz --seq SAHGP034_R2_se.fq.gz --seq SAHGP034.extendedFrags.fastq.gz --seq LP6005857-DNA_H03-R1_cont.fq.gz --seq LP6005857-DNA_H03-R2_cont.fq.gz Xhosa_SAHGP034_SAHGP.ctx
[05 Jan 2019 15:45:08-New][task] input: SAHGP034.notCombined_1.fastq.gz colour: 0
[05 Jan 2019 15:45:08-New] SE reads: 124,936,653 PE reads: 0
[05 Jan 2019 15:45:08-New] good reads: 124,927,096 bad reads: 9,557
[05 Jan 2019 15:45:08-New] dup SE reads: 0 dup PE pairs: 0
[05 Jan 2019 15:45:08-New] bases read: 12,599,990,753 bases loaded: 12,598,372,994
[05 Jan 2019 15:45:08-New] num contigs: 124,927,214 num kmers: 6,352,012,294 novel kmers: 780,695,391
[05 Jan 2019 15:45:08-New][task] input: SAHGP034.notCombined_2.fastq.gz colour: 0
[05 Jan 2019 15:45:08-New] SE reads: 0 PE reads: 0
[05 Jan 2019 15:45:08-New] good reads: 0 bad reads: 0
[05 Jan 2019 15:45:08-New] dup SE reads: 0 dup PE pairs: 0
[05 Jan 2019 15:45:08-New] bases read: 0 bases loaded: 0
[05 Jan 2019 15:45:08-New] num contigs: 0 num kmers: 0 novel kmers: 0
[05 Jan 2019 15:45:08-New][task] input: SAHGP034_R1_se.fq.gz colour: 0
[05 Jan 2019 15:45:08-New] SE reads: 0 PE reads: 0
[05 Jan 2019 15:45:08-New] good reads: 0 bad reads: 0
[05 Jan 2019 15:45:08-New] dup SE reads: 0 dup PE pairs: 0
[05 Jan 2019 15:45:08-New] bases read: 0 bases loaded: 0
[05 Jan 2019 15:45:08-New] num contigs: 0 num kmers: 0 novel kmers: 0
[05 Jan 2019 15:45:09-New][task] input: SAHGP034_R2_se.fq.gz colour: 0
[05 Jan 2019 15:45:09-New] SE reads: 0 PE reads: 0
[05 Jan 2019 15:45:09-New] good reads: 0 bad reads: 0
[05 Jan 2019 15:45:09-New] dup SE reads: 0 dup PE pairs: 0
[05 Jan 2019 15:45:09-New] bases read: 0 bases loaded: 0
[05 Jan 2019 15:45:09-New] num contigs: 0 num kmers: 0 novel kmers: 0
[05 Jan 2019 15:45:09-New][task] input: SAHGP034.extendedFrags.fastq.gz colour: 0
[05 Jan 2019 15:45:09-New] SE reads: 0 PE reads: 0
[05 Jan 2019 15:45:09-New] good reads: 0 bad reads: 0
[05 Jan 2019 15:45:09-New] dup SE reads: 0 dup PE pairs: 0
[05 Jan 2019 15:45:09-New] bases read: 0 bases loaded: 0
[05 Jan 2019 15:45:09-New] num contigs: 0 num kmers: 0 novel kmers: 0
[05 Jan 2019 15:45:09-New][task] input: LP6005857-DNA_H03-R1_cont.fq.gz colour: 0
[05 Jan 2019 15:45:09-New] SE reads: 0 PE reads: 0
[05 Jan 2019 15:45:09-New] good reads: 0 bad reads: 0
[05 Jan 2019 15:45:09-New] dup SE reads: 0 dup PE pairs: 0
[05 Jan 2019 15:45:09-New] bases read: 0 bases loaded: 0
[05 Jan 2019 15:45:09-New] num contigs: 0 num kmers: 0 novel kmers: 0
[05 Jan 2019 15:45:09-New][task] input: LP6005857-DNA_H03-R2_cont.fq.gz colour: 0
[05 Jan 2019 15:45:09-New] SE reads: 0 PE reads: 0
[05 Jan 2019 15:45:09-New] good reads: 0 bad reads: 0
[05 Jan 2019 15:45:09-New] dup SE reads: 0 dup PE pairs: 0
[05 Jan 2019 15:45:09-New] bases read: 0 bases loaded: 0
[05 Jan 2019 15:45:09-New] num contigs: 0 num kmers: 0 novel kmers: 0
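For anyone who wants to compare the per-input blocks above side by side, here is a minimal sketch that tabulates them. It assumes the log format is exactly as pasted above; `parse_build_log` and both regular expressions are hypothetical helpers of my own, not part of McCortex. It only reorganises what the log already reports, so it cannot recover per-file counts that the log attributes to the first input.

```python
import re

# Assumed format of the "[task]" header line, taken from the log above.
TASK_RE = re.compile(r"\[task\] input: (\S+) colour: (\d+)")
# Assumed format of the statistics, e.g. "SE reads: 124,936,653".
STAT_RE = re.compile(r"([A-Za-z][A-Za-z ]*?): ([\d,]+)")


def parse_build_log(text):
    """Return one dict per "[task] input:" block found in the log text."""
    tasks = []
    current = None
    for line in text.splitlines():
        header = TASK_RE.search(line)
        if header:
            current = {"input": header.group(1), "colour": int(header.group(2))}
            tasks.append(current)
            continue
        if current is None:
            continue  # skip anything before the first task block
        # Drop the leading "[timestamp]" prefix before reading the statistics.
        body = line.split("]", 1)[-1]
        for key, value in STAT_RE.findall(body):
            current[key.strip()] = int(value.replace(",", ""))
    return tasks


if __name__ == "__main__":
    # Two blocks copied from the log above, used as demo input.
    demo = """\
[05 Jan 2019 15:45:08-New][task] input: SAHGP034.notCombined_1.fastq.gz colour: 0
[05 Jan 2019 15:45:08-New] SE reads: 124,936,653 PE reads: 0
[05 Jan 2019 15:45:08-New] num contigs: 124,927,214 num kmers: 6,352,012,294 novel kmers: 780,695,391
[05 Jan 2019 15:45:08-New][task] input: SAHGP034.notCombined_2.fastq.gz colour: 0
[05 Jan 2019 15:45:08-New] SE reads: 0 PE reads: 0
[05 Jan 2019 15:45:08-New] num contigs: 0 num kmers: 0 novel kmers: 0
"""
    for task in parse_build_log(demo):
        print(task["input"], task.get("SE reads"), task.get("novel kmers"))
```

Calling `parse_build_log` on the full log text gives one entry per input file, which makes it easy to see that only the first entry has non-zero counts.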