jenniferlu717/KrakenTools

Error when using --include-children

nblouin opened this issue · 2 comments

Hi there-
I am sure I am missing something trivial, but I am stuck and am looking for some help.

When I run the following command without "--include-children -r /project/bishalab/trimmedcaptured/UWLAR2177trim.krakenreportTesterSorted", everything runs as expected; however, when I attempt to include children, I get the following error. I have tried sorted and unsorted versions of the report. I have also pasted the first 10 lines of the report (sorted and unsorted) below in case there is an error in its format, or some other error. Thanks for any help.

┌─[10:24:50]─[nblouin@blog1]─[~]
└──> /project/bishalab/cbouley/software/KrakenTools/extract_kraken_reads.py -k /project/bishalab/trimmedcaptured/UWLAR2177trim.outputTester -s /project/bishalab/trimmedcaptured/UWLAR2177trim.fq.gz -o UWLAR2177.fasta -t 142786 --include-children -r /project/bishalab/trimmedcaptured/UWLAR2177trim.krakenreportTesterSorted
PROGRAM START TIME: 03-28-2023 16:27:25

STEP 0: PARSING REPORT FILE /project/bishalab/trimmedcaptured/UWLAR2177trim.krakenreportTesterSorted
Traceback (most recent call last):
File "/project/bishalab/cbouley/software/KrakenTools/extract_kraken_reads.py", line 446, in <module>
main()
File "/project/bishalab/cbouley/software/KrakenTools/extract_kraken_reads.py", line 236, in main
while level_num != (prev_node.level_num + 1):
AttributeError: 'int' object has no attribute 'level_num'

┌─[10:41:35]─[nblouin@blog1]─[~]
└──> head /project/bishalab/trimmedcaptured/UWLAR2177trim.krakenreportTesterSorted
0.00 1 0 687329 Anelloviridae
0.00 3 0 2840056 Naldaviricetes
0.09 3671 0 2759 Eukaryota
0.11 4687 6 2157 Archaea
0.21 8819 0 10239 Viruses
46.59 1928974 1928974 U 0 unclassified
52.73 2183274 17083 2 Bacteria
53.41 2211736 10342 1 root
0.00 46 0 G 6 Azorhizobium
0.00 46 0 S 7 Azorhizobium caulinodans

┌─[10:41:44]─[nblouin@blog1]─[~]
└──> head /project/bishalab/trimmedcaptured/UWLAR2177trim.krakenreportTester
46.59 1928974 1928974 U 0 unclassified
53.41 2211736 10342 1 root
52.95 2192550 918 1 131567 cellular organisms
52.73 2183274 17083 2 Bacteria
29.20 1208947 27530 P 1224 Proteobacteria
11.46 474378 40277 C 1236 Gammaproteobacteria
5.20 215316 0 O 2887326 Moraxellales
5.20 215316 1083 F 468 Moraxellaceae
3.85 159397 63555 G 469 Acinetobacter
1.44 59753 58939 S 40214 Acinetobacter johnsonii

┌─[10:41:56]─[nblouin@blog1]─[~]

Hi, I get the same error, and it also occurs when using --include-parents.

yeemey commented

I noticed that the rank codes are missing from a few lines in your report, e.g. the Eukaryota, Archaea, and Viruses lines, but are present in others, e.g. "G" for Azorhizobium and "S" for Azorhizobium caulinodans. I had a similar issue that raised errors when I used --include-children, and resolved it by manually adding the correct rank codes to the kraken2 report. There's an open issue on kraken2's repo about this: DerrickWood/kraken2#759
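
To find the affected lines before editing the report by hand, something like the following rough sketch might help. It is not part of KrakenTools; the function name `find_bad_rank_lines` is made up for illustration. It assumes the standard six-column, tab-separated kraken2 report (percent, clade reads, direct reads, rank code, taxid, name) and rank codes made of one capital letter with an optional number (e.g. "U", "R1", "G"). Reports written with extra minimizer columns, or older reports that use "-" for unranked taxa, would need the check adjusted.

```python
import re

# Assumed rank-code shape: one capital letter, optionally followed by digits.
RANK_CODE = re.compile(r"^[A-Z]\d*$")


def find_bad_rank_lines(lines):
    """Return (line_number, raw_line) pairs whose rank-code column looks wrong.

    A line is flagged if it does not split into exactly six tab-separated
    fields, or if its fourth field is empty or not shaped like a rank code.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 6 or not RANK_CODE.match(fields[3].strip()):
            bad.append((number, line))
    return bad
```

Passing an open file handle for the report, e.g. `find_bad_rank_lines(open(report_path))`, returns the line numbers to inspect. It only reports suspicious lines; choosing the correct rank code for each one is still a manual step.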