prophyle/prophex

Paired-end reads

simonepignotti opened this issue · 0 comments

Add support for paired-end reads to the query command.

Updated specification

Usage:   prophex query [options] <index_prefix> <in1.fq> [in2.fq]
...

Behavior

Each pair should be concatenated, with the two reads separated by an N character.
The k-mers overlapping that position should carry a specific marker in the output, e.g. C (concatenation).
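
To make the intended behavior concrete, here is a minimal Python sketch (not prophex code). The names `kmer_markers` and `run_lengths` and the placeholder label `Q` for ordinary k-mers are assumptions made for illustration; a real implementation would report the matched reference instead of `Q`.

```python
from itertools import groupby


def kmer_markers(read1, read2, k, sep="N"):
    """Label each k-mer of read1 + sep + read2.

    'C' means the k-mer overlaps the separator; 'Q' (hypothetical label)
    means it is an ordinary k-mer that would be queried against the index.
    """
    seq = read1 + sep + read2
    sep_pos = len(read1)  # index of the separator in the concatenated sequence
    markers = []
    for start in range(len(seq) - k + 1):
        spans_sep = start <= sep_pos < start + k
        markers.append("C" if spans_sep else "Q")
    return markers


def run_lengths(markers):
    """Collapse consecutive identical labels into (label, count) blocks."""
    return [(label, len(list(group))) for label, group in groupby(markers)]


print(run_lengths(kmer_markers("ACGT", "TGCA", 4)))
# prints [('Q', 1), ('C', 4), ('Q', 1)]
```

With ACGT, TGCA and k=4, the concatenated sequence ACGTNTGCA has six k-mers: one from the first read, four overlapping the N, and one from the second read.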

Example

k=4

in1.fq:

@read1/1
ACGT
+
!!!!
...

in2.fq:

@read1/2
TGCA
+
!!!!
...

Extended Kraken format

output:

U    read1    0    8    ref1:1 C:4 ref1:1

Bitmask output format (#14)

The hit and coverage masks should not contain the concatenation k-mers, but the two reads should be separated by a pipe (|).

read1	ref1	8	2	8	1|1
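
A rough sketch of how the last column could be assembled, assuming each read's mask is available as a sequence of per-k-mer hit flags. The function name `paired_hit_mask` and the input representation are hypothetical; the exact mask layout is whatever #14 defines.

```python
def paired_hit_mask(hits1, hits2):
    """Join two per-read masks with a pipe, leaving out concatenation k-mers.

    hits1 and hits2 are assumed to be sequences of booleans, one per k-mer
    of the respective read (hypothetical representation).
    """
    def to_bits(hits):
        return "".join("1" if hit else "0" for hit in hits)

    return to_bits(hits1) + "|" + to_bits(hits2)


print(paired_hit_mask([True], [True]))
# prints 1|1
```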

Alternative solutions

If there is a cleaner way to obtain the same result without concatenating the reads with N, we should consider it (e.g. query the two parts independently).
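
One possible shape for the independent variant, as a Python sketch rather than a proposal for the actual C code: query each read's k-mers separately and emit a fixed block of k concatenation markers in between. It assumes both reads are at least k bases long, in which case exactly k k-mers would overlap the N in the concatenated sequence. The helper names `kmers` and `split_query_plan` are hypothetical.

```python
def kmers(read, k):
    """Return all k-mers of a single read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def split_query_plan(read1, read2, k):
    """Return (k-mers of read1, number of C markers, k-mers of read2).

    Assumption: both reads have length >= k, so the concatenated sequence
    would contain exactly k k-mers overlapping the separator.
    """
    if len(read1) < k or len(read2) < k:
        raise ValueError("this sketch assumes both reads are at least k long")
    return kmers(read1, k), k, kmers(read2, k)


print(split_query_plan("ACGT", "TGCA", 4))
# prints (['ACGT'], 4, ['TGCA'])
```

For the k=4 example this gives one k-mer, four C markers, and one k-mer, which matches the block sizes in the extended Kraken output without ever building the N-joined sequence.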