vanheeringen-lab/genomepy

search command fails with KeyError: 'submitter'

On a fresh conda/mamba installation with:

$ mamba create -n genomepy genomepy
  • version
(genomepy)$ genomepy --version
genomepy, version 0.16.0
  • Test search command
(genomepy)$ genomepy search e coli
12:46:31 | INFO | Downloading assembly summaries from GENCODE
12:46:37 | INFO | Downloading assembly summaries from UCSC
12:46:42 | INFO | Downloading assembly summaries from Ensembl
12:46:51 | INFO | Downloading assembly summaries from NCBI, this will take a while...
genbank_historical: 43.3k genomes [00:01, 36.7k genomes/s]
refseq_historical: 69.7k genomes [00:01, 41.8k genomes/s]
genbank: 1.69M genomes [00:25, 66.6k genomes/s]
refseq: 307k genomes [00:04, 61.7k genomes/s]
Traceback (most recent call last):
  File "/home/nikos/miniconda3/envs/genomepy/bin/genomepy", line 10, in <module>
    sys.exit(cli())
             ^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/genomepy/cli.py", line 412, in search
    for row in genomepy.search(term, provider, exact, size):
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/genomepy/providers/__init__.py", line 136, in search
    for row in p.search(term, exact, size):
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/genomepy/providers/base.py", line 394, in search
    for name in search_function(term, exact):
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/genomepy/providers/base.py", line 334, in _search_text
    texts = [name] + [str(metadata[f]) for f in self.description_fields]
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/genomepy/providers/base.py", line 334, in <listcomp>
    texts = [name] + [str(metadata[f]) for f in self.description_fields]
                          ~~~~~~~~^^^
KeyError: 'submitter'
  • Test example search command with xenopus tropicalis
(genomepy)$ genomepy search xenopus tro
name                 provider accession         tax_id annotation species                                  other_info
                                                        n r e k   <- UCSC options (see help)
UCB_Xtro_10.0        Ensembl  GCA_000004195.4     8364     ✓      Xenopus tropicalis                       2020-09-Ensembl/2021-02
xenTro1              UCSC     na                  8364  ✗ ✗ ✗ ✗   Xenopus tropicalis                       Oct. 2004 (JGI 3.0/xenTro1)
xenTro10             UCSC     GCF_000004195.4     8364  ✓ ✓ ✗ ✗   Xenopus tropicalis                       Nov. 2019 (UCB_Xtro_10.0/xenTro10)
xenTro2              UCSC     na                  8364  ✗ ✓ ✓ ✗   Xenopus tropicalis                       Aug. 2005 (JGI 4.1/xenTro2)
xenTro3              UCSC     GCA_000004195.1     8364  ✗ ✓ ✓ ✗   Xenopus tropicalis                       Nov. 2009 (JGI 4.2/xenTro3)
xenTro7              UCSC     GCA_000004195.2     8364  ✓ ✓ ✗ ✗   Xenopus tropicalis                       Sep. 2012 (JGI 7.0/xenTro7)
xenTro9              UCSC     GCA_000004195.3     8364  ✓ ✓ ✓ ✗   Xenopus tropicalis                       Jul. 2016 (Xenopus_tropicalis_v9.1/xenTro9)
Traceback (most recent call last):
  File "/home/nikos/miniconda3/envs/genomepy/bin/genomepy", line 10, in <module>
    sys.exit(cli())
             ^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/genomepy/cli.py", line 412, in search
    for row in genomepy.search(term, provider, exact, size):
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/genomepy/providers/__init__.py", line 136, in search
    for row in p.search(term, exact, size):
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/genomepy/providers/base.py", line 394, in search
    for name in search_function(term, exact):
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/genomepy/providers/base.py", line 334, in _search_text
    texts = [name] + [str(metadata[f]) for f in self.description_fields]
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/nikos/miniconda3/envs/genomepy/lib/python3.11/site-packages/genomepy/providers/base.py", line 334, in <listcomp>
    texts = [name] + [str(metadata[f]) for f in self.description_fields]
                          ~~~~~~~~^^^
KeyError: 'submitter'

This only happens with NCBI as the provider.
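The failing line in the traceback indexes `metadata` with every name in `description_fields`, so a single missing key aborts the whole search. The sketch below reproduces that pattern with a plain dictionary and contrasts it with a tolerant lookup. The field list, the metadata values, and both helper functions are hypothetical stand-ins, and this is not necessarily how the project fixed it.

```python
# Hypothetical stand-ins for a provider's description_fields and one genome's metadata.
description_fields = ["submitter", "organism_name"]
metadata = {"asm_submitter": "Example lab", "organism_name": "Escherichia coli"}


def search_texts_strict(name, metadata, fields):
    # Same shape as the line in the traceback: raises KeyError on a missing field.
    return [name] + [str(metadata[f]) for f in fields]


def search_texts_tolerant(name, metadata, fields):
    # Assumed alternative: fall back to an empty string when a field is absent.
    return [name] + [str(metadata.get(f, "")) for f in fields]


error = None
try:
    search_texts_strict("example_assembly", metadata, description_fields)
except KeyError as err:
    error = err  # the missing key is available as err.args[0]

tolerant = search_texts_tolerant("example_assembly", metadata, description_fields)
print(repr(error), tolerant)
```

A tolerant lookup keeps the search running but also hides a renamed column, so correcting the field name is still the actual fix.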

A quick check of assembly_summary_refseq_historical.txt from https://ftp.ncbi.nlm.nih.gov/genomes/ASSEMBLY_REPORTS/ shows that there is no submitter field, only asm_submitter. This may be relevant for this line in the NcbiProvider.
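A check like the one described above can be scripted. The sketch below reads the leading `#` comment lines of a tab-separated summary file and reports which expected fields are absent from the header. The sample text is a hypothetical, shortened stand-in for the real file, and `header_fields` is an illustrative helper, not part of genomepy.

```python
import io


def header_fields(handle):
    """Return the column names from the last '#' comment line before the data rows."""
    fields = []
    for line in handle:
        if not line.startswith("#"):
            break  # first data row reached; the header is complete
        fields = line.lstrip("#").strip().split("\t")
    return fields


# Hypothetical, shortened sample in the layout of an assembly summary file.
SAMPLE = (
    "# See README_assembly_summary.txt for a description of the columns\n"
    "# assembly_accession\torganism_name\tasm_submitter\n"
    "GCF_000000000.1\tExample organism\tExample lab\n"
)

fields = header_fields(io.StringIO(SAMPLE))
missing = [f for f in ("submitter",) if f not in fields]
print("columns:", fields)
print("missing:", missing)
```

To run it against the real data, pass an open handle to a downloaded copy of the summary file instead of the `io.StringIO` sample.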

It is, thank you for the warning! I'll try to get a hotfix out.

Hotfix 0.16.1 is out and should be available on Conda in an hour or so.

Closing the issue. Feel free to reopen this if the error was not fixed.