There is an error when I use -gb option.
Thank you for enabling me to use your fantastic tool.
However, I get an error whenever I use the '-gb' option.
I've tried installing it with both conda and Singularity, but I hit the same error each time, even when I use the example dataset and command from the tutorial.
I am using version 1.4.2.
Can you provide a solution?
The command and error message are as follows.
mummer2circos -l -r genomes/NZ_CP008827.fna -q genomes/*.fna -gb GCF_000281535_merged.gbk -b VF.faa
Traceback (most recent call last):
  File "/opt/conda/bin/mummer2circos", line 10, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.7/site-packages/mummer2circos/__init__.py", line 68, in main
    force_data_dir=args.force)
  File "/opt/conda/lib/python3.7/site-packages/mummer2circos/mummer2circos.py", line 118, in __init__
    minus_file, plus_file = self.gbk2circos_data(gbk2orf, minus_file=f"{self.circos_data_dir}/circos_orf_minus.txt", plus_file=f"{self.circos_data_dir}/circos_orf_plus.txt")
  File "/opt/conda/lib/python3.7/site-packages/mummer2circos/mummer2circos.py", line 1184, in gbk2circos_data
    start = str(feature.location.start + self.contigs_add[record.id][0])
KeyError: 'NZ_CP008828.1'
Hi @fsysy
The error occurs because the record IDs in the GenBank file carry version numbers (e.g. 'NZ_CP008828.1'), while the record IDs in the FASTA file do not (e.g. 'NZ_CP008828', without the '.1'). The KeyError is raised when a GenBank record ID is looked up among the FASTA IDs and not found. To map records between the FASTA and GenBank files, both files must use the same IDs.
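To see which records are affected before running the tool, a quick check along these lines may help. This is only a sketch, not part of mummer2circos: the helper names (`fasta_ids`, `genbank_ids`, `missing_in_fasta`) are made up for illustration, and it assumes that the record ID used for the GenBank file is the accession on the `VERSION` line and that the FASTA ID is the first token of each header. The sample lines are illustrative, not taken from the example dataset.

```python
def fasta_ids(lines):
    """Return the first whitespace-delimited token of every FASTA header."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and line[1:].split()
    }


def genbank_ids(lines):
    """Return the accession.version found on each GenBank VERSION line.

    Assumption: this is the value the tool sees as the record ID.
    """
    ids = set()
    for line in lines:
        if line.startswith("VERSION"):
            fields = line.split()
            if len(fields) > 1:
                ids.add(fields[1])
    return ids


def missing_in_fasta(fasta_lines, genbank_lines):
    """List GenBank IDs that have no identically named FASTA record."""
    return sorted(genbank_ids(genbank_lines) - fasta_ids(fasta_lines))


# Illustrative in-memory input reproducing the mismatch described above.
fasta_example = [">NZ_CP008827", "ACGT", ">NZ_CP008828", "TTGA"]
genbank_example = ["VERSION     NZ_CP008827.1", "VERSION     NZ_CP008828.1"]

for record_id in missing_in_fasta(fasta_example, genbank_example):
    print("no FASTA record named", record_id)
```

With real files, the same functions can be fed `open(path)` objects instead of the lists; any ID printed is one that would likely trigger the KeyError.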
You can fix this by adding version numbers to the fasta file, either manually or with sed:
sed -ri 's/^>(.*)/>\1.1/' genomes/NZ_CP008827.fna
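If the FASTA headers may contain a description after the ID, or some of them already carry a version, a slightly more careful variant of the same fix could look like the sketch below. It is an illustration only: `add_version` is a hypothetical helper, and it assumes, like the sed command, that '.1' is the right version for every unversioned record. Unlike the sed command, it appends the version to the first token of the header instead of the end of the line.

```python
import re

# An ID counts as versioned if it ends in a dot followed by digits.
VERSION_SUFFIX = re.compile(r"\.\d+$")


def add_version(line, version="1"):
    """Append '.<version>' to the ID of a FASTA header line if it has none.

    `line` is expected without its trailing newline. Sequence lines and
    headers whose ID already ends in a version are returned unchanged.
    """
    if not line.startswith(">"):
        return line  # sequence line
    parts = line[1:].split(None, 1)
    if not parts:
        return line  # empty header, nothing to version
    record_id = parts[0]
    description = parts[1] if len(parts) > 1 else ""
    if not VERSION_SUFFIX.search(record_id):
        record_id = f"{record_id}.{version}"
    return ">" + record_id + (" " + description if description else "")


# Illustrative headers; the description text is made up.
for header in [">NZ_CP008828", ">NZ_CP008828.1", ">NZ_CP008828 plasmid"]:
    print(add_version(header))
```

To apply it to a file, read the lines, pass each through `add_version` after stripping the newline, and write the result to a new file so the original stays untouched.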
Thank you for reporting this problem; I will fix the example dataset.