Output is not sorted with --sort-output
AntonBogun opened this issue · 2 comments
AntonBogun commented
Reproduction:
- Place the release themisto binary in the repository root
- Build my_index as shown in the README:
./themisto_binary build -k 31 -i example_input/coli_file_list.txt --index-prefix my_index --temp-dir temp --mem-gigas 2 --n-threads 4 --file-colors
- Pseudoalign a .fastq file with the following contents:
@Example
CTTTGTGCGCTTCACTCATGTTCCACGCCACCATCAACAGCAGGGCAGCCATGGCGGAAAGCGGCAGCCAGGAGAGCAGCGGTGCCAGTACCAGCAGGGC
+
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
(the last line must end with a newline, otherwise the file cannot be parsed)
with the following command:
./themisto_binary pseudoalign -i my_index -q "example.fastq" -o "out.txt" --temp-dir temp --threshold 0.7 --sort-output
- The output file has the following contents:
0 0 2 1
The values are not sorted; they should be "0 1 2".
jnalanko commented
Hi,
Sorry for the slow response. The --sort-output option actually just sorts the lines in the output file so the reads are listed in the same order as in the input. We might want to add another option to also sort the color identifiers within a line.
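Until such an option exists, one possible workaround is to post-process the output file. Below is a minimal sketch, assuming each output line is the read's index followed by space-separated color identifiers (as the "0 0 2 1" example above suggests). The function names are hypothetical and not part of Themisto.

```python
def sort_colors_in_line(line: str) -> str:
    """Keep the first field and sort the remaining fields numerically.

    Assumption: the first field is the read index and the rest are
    integer color identifiers.
    """
    fields = line.split()
    if not fields:
        return ""
    read_id = fields[0]
    colors = sorted(int(c) for c in fields[1:])
    return " ".join([read_id] + [str(c) for c in colors])


def sort_colors_in_file(in_path: str, out_path: str) -> None:
    """Write a copy of in_path with the color identifiers sorted on every line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(sort_colors_in_line(line) + "\n")
```

For example, calling sort_colors_in_file("out.txt", "out_sorted.txt") would rewrite the line "0 0 2 1" as "0 0 1 2". The order of the lines themselves is left unchanged, so this can be combined with --sort-output.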