No output from example API
Closed this issue · 2 comments
Describe the bug
Hi, I'm trying to run the API example (E. coli K-12 MG1655), but it only returns an empty synteny_matched.tsv file.
Note: the Pynteny version installed through conda appears to be v1.0.0. Does v1.1.0 fix this issue?
Update 2: I have tried the Docker image (https://github.com/Robaina/Pynteny/pkgs/container/pynteny), which contains v1.1.0, and it does not fix the issue.
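Because the conda build and the Docker image may ship different releases, it can help to confirm which version is actually installed in the active environment before comparing results. The snippet below is a minimal sketch using only the Python standard library; the distribution name "pynteny" is an assumption based on the package name used in this report, and `installed_version` is a hypothetical helper, not part of Pynteny.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "pynteny" is assumed to be the distribution name in this environment.
    print("pynteny:", installed_version("pynteny"))
```

Running this inside both the conda environment and the container shows whether the two setups really use different releases.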
To Reproduce
Steps to reproduce the behavior:
mamba create -n pynteny -c bioconda -c conda-forge python=3.10 pynteny
conda activate pynteny
pynteny download --outdir pgap/hmms --unpack
mkdir example_api
wget https://github.com/Robaina/Pynteny/blob/main/tests/test_data/MG1655.gb
- Create api_example.py using the code below
from pathlib import Path
from pandas import DataFrame
from pynteny.filter import SyntenyHits
from pynteny import Search, Build, Download
Build(
    data="MG1655.gb",
    outfile="labelled_MG1655.fasta",
    logfile=None
).run()
# Initialize class
search = Search(
    data="labelled_MG1655.fasta",
    synteny_struc="<leuD 0 <leuC 1 <leuA",
    hmm_dir=None,
    hmm_meta=None,
    outdir="example_api/",
    prefix="",
    hmmsearch_args=None,
    gene_ids=False,
    logfile="example_api/pynteny.log",
    processes=20,
    unordered=False,
)
# Parse gene IDs in synteny structure according to PGAP HMM database metadata
parsed_struc = search.parse_genes(synteny_struc="<leuD 0 <leuC 1 <leuA")
# Update parsed synteny structure and run Pynteny search
search.update("synteny_struc", parsed_struc)
synhits: SyntenyHits = search.run()
synhits_df: DataFrame = synhits.hits
print(synhits_df.head())
python api_example.py
Expected behavior
A synteny_matched.tsv file containing the hits that match the synteny structure.
Screenshots
2023-09-05 14:05:13,670 | INFO: Building annotated peptide database
2023-09-05 14:05:14,061 | INFO: Parsing GenBank data.
2023-09-05 14:05:14,475 | INFO: Database built successfully!
2023-09-05 14:05:14,498 | INFO: Translated
"<leuD 0 <leuC 1 <leuA"
to
"<(TIGR00171.1|TIGR02084.1) 0 <(TIGR00170.1|TIGR02083.1) 1 <(TIGR00973.1|NF002084.0|TIGR00970.1)"
according to provided HMM database metadata
2023-09-05 14:05:14,555 | INFO: Searching database by synteny structure
2023-09-05 14:05:14,555 | INFO: Running Hmmer
2023-09-05 14:05:14,863 | INFO: Filtering results by synteny structure
2023-09-05 14:05:14,880 | INFO: Writing matching sequences to FASTA files
2023-09-05 14:05:14,880 | INFO: Finished!
Desktop (please complete the following information):
- OS: Ubuntu 22.04
Additional context
I can see the hmmsearch results, but no synteny hits are written out. The output is just an empty file containing only the headers, i.e.:
contig gene_id gene_number locus strand full_label hmm
I have also tried searching for other genes in the GenBank file, for example "thrL 0 thrA", which are the first two gene annotations, and it also returns nothing.
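To make the "headers only" symptom easy to check after each run, the following is a minimal sketch that counts the data rows in a tab-separated file, ignoring the header line. It uses only the standard library; `count_data_rows` is a hypothetical helper, and the path example_api/synteny_matched.tsv is an assumption derived from the outdir and file name mentioned in this report.

```python
import csv
from pathlib import Path


def count_data_rows(tsv_path) -> int:
    """Count non-empty rows after the header line of a tab-separated file."""
    with open(tsv_path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row, if present
        return sum(1 for row in reader if any(cell.strip() for cell in row))


if __name__ == "__main__":
    # Assumed output location: outdir "example_api/" with an empty prefix.
    output = Path("example_api") / "synteny_matched.tsv"
    if output.exists():
        print(f"{output}: {count_data_rows(output)} synteny hit row(s)")
    else:
        print(f"{output} not found")
```

A count of zero corresponds to the behavior described here: the file exists but holds only the column headers.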
Hi @wchow,
Thanks for reporting this bug. I can reproduce it with the latest available version (v1.1.1) as well.
Will have a look in the following days.
The latest Pynteny version, v1.1.2, solves this issue (#92). It is not yet available in Bioconda but can be obtained from the latest Docker image: docker pull ghcr.io/robaina/pynteny:main
Please, let me know if anything goes wrong. Closing the issue for now.
Thanks again for spotting this!