jodyphelan/TBProfiler

`tb-profiler update_tbdb --match_ref` fails

Closed this issue · 11 comments

On v6.2.0, I run:
tb-profiler update_tbdb --match_ref myref.fna

I get:

ValueError: Command Failed:
/bin/bash -c set -o pipefail; tb-profiler create_db --prefix tbdb --csv mutations.csv --watchlist watchlist.csv --rules rules.txt --match_ref /test/tbdb/myref.fna --load

...

File "pysam/libcfaidx.pyx", line 121, in pysam.libcfaidx.FastaFile.__cinit__
  File "pysam/libcfaidx.pyx", line 153, in pysam.libcfaidx.FastaFile._open
OSError: file `/test/tbdb/myref.fna` not found

It looks like it's looking for the reference in the tbdb directory rather than in the parent directory where the file is actually located.

Thanks for your help and tool!

Hi @schorlton-bugseq

Ah, I think you found a bug there. Try passing the full path to your reference file and it should work. I'll patch this in the next release.
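
For anyone scripting this workaround, here is a minimal sketch of resolving the reference to an absolute path before it is handed to the command. The helper name `build_update_command` is hypothetical and not part of TBProfiler; only the `update_tbdb` subcommand and the `--match_ref` flag come from this thread.

```python
from pathlib import Path
import shlex


def build_update_command(ref):
    """Hypothetical helper: build the update_tbdb call with an absolute reference path."""
    # resolve() anchors a relative path to the current working directory,
    # so the path should stay valid even if a later step runs from inside tbdb/.
    abs_ref = Path(ref).expanduser().resolve()
    return ["tb-profiler", "update_tbdb", "--match_ref", str(abs_ref)]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be checked first.
    print(shlex.join(build_update_command("myref.fna")))
```

The resulting list can be passed to `subprocess.run` as-is; building it as a list rather than a string avoids quoting problems if the path contains spaces.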

Hi! Supplying the full path does fix that particular error, but I then get a different error downstream. It is a KeyError raised when the process looks up the header of my reference and does not find it. For example, this FASTA:

>test
ATGGC

gives the error:

Traceback (most recent call last):
  File "/home/tom/micromamba/bin/tb-profiler", line 583, in <module>
    args.func(args)
  File "/home/tom/micromamba/bin/tb-profiler", line 242, in main_create_db
    pp.create_db(args,extra_files=extra_files)
  File "/home/tom/micromamba/lib/python3.10/site-packages/pathogenprofiler/db.py", line 505, in create_db
    write_bed(
  File "/home/tom/micromamba/lib/python3.10/site-packages/pathogenprofiler/db.py", line 120, in write_bed
    if genome_end > chrom_lengths[gene_info[gene].chrom]:
KeyError: 'test'

This can be rectified by renaming the header to match the original tbprofiler reference (>chromosome).

Thank you!
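
The header rename described above could be done with something like the following sketch. It assumes a single-record FASTA small enough to read into memory, and takes the target name `chromosome` from the comment above; `rename_fasta_header` is a hypothetical helper, not a TBProfiler function.

```python
def rename_fasta_header(fasta_text, new_name="chromosome"):
    """Hypothetical helper: replace the record name in a single-record FASTA string."""
    lines = fasta_text.splitlines()
    # Locate header lines; refuse to guess if the file is not single-record.
    headers = [i for i, line in enumerate(lines) if line.startswith(">")]
    if len(headers) != 1:
        raise ValueError(f"expected exactly one FASTA record, found {len(headers)}")
    # Note: any description after the original name is dropped as well.
    lines[headers[0]] = f">{new_name}"
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(rename_fasta_header(">test\nATGGC\n"), end="")
```

This only changes the sequence name; it does not check that the sequence itself matches the reference the database was built against.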

Just checking - are you using this reference genome: https://www.ncbi.nlm.nih.gov/nuccore/NC_000962.3?

Yes, it is. It also has the same number of base pairs as the original TBProfiler reference.

Hi @jodyphelan, I get the same KeyError when trying to use the --match_ref flag.

OK, it looks like the issue arises when `tb-profiler update_tbdb` is run first without --match_ref and then with it. Try removing the tbdb directory that was downloaded, then run `tb-profiler update_tbdb --match_ref /path/to/ref.fa` and see if that works.

Hi @jodyphelan, I gave this a try but I ran into the same error.

Thanks

It still caused a (different) error when I ran it, but I was able to get it to work.

With my reference at ~/reference.fa, I ran `tb-profiler update_tbdb --match_ref reference.fa --commit <tbdb_commit>`, which goes on to create ~/tbdb. I get an OSError: FileNotFound, so I mv reference.fa into tbdb and run it again, and it seems to work. If I remember correctly, this workaround didn't work when I tried it previously.

Oh yeah, the release version requires the full path to the reference file, but this is fixed in 1e4c872.

Sorry, yes, you're right. I tried that originally and forgot when I tried it with the new update. It seems to be working now. Thanks 😄

Great! I'll close this now, but if there are any more related issues feel free to reopen.