lmdu/pyfastx

Unable to index gzipped fasta file by name: KeyError: 'chrY does not exist in fasta file'

Opened this issue · 1 comment

I have loaded a few example gzipped fasta files, e.g. the following one:
hgdownload.cse.ucsc.edu/goldenPath/dm6/bigZips/dm6.fa.gz

I can load it, iterate and so on, and:

fasta = pyfastx.Fasta('./data/genomes/dm6.fa.gz')
fasta[0]
Out[49]: <Sequence> chr2L with length of 23513712

but

fasta['chr2L']
KeyError: 'chr2L does not exist in fasta file'

Also:

keys = fasta.keys()
keys[0]
Out[57]: 'chr2L'

but

fasta[keys[0]]
KeyError: 'chr2L does not exist in fasta file'

I sincerely hope that I have not misunderstood anything; I went by what is listed in the docs:

>>> # get sequence like dictionary
>>> s1 = fa['JZ822577.1']
>>> s1
<Sequence> JZ822577.1 with length of 333

I am in a conda environment, and installed pyfastx 0.9.1 using pip.

  • Ubuntu 22.04.2 LTS
  • conda 23.1.0
  • python 3.8.16
  • pyfastx 0.9.1

lmdu commented

Try the latest version 1.0.0. We have fixed this issue.
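
For anyone checking whether their environment already has the fix, here is a minimal sketch. It assumes pyfastx was installed with pip (so `importlib.metadata` can see it) and that versions are plain dotted integers such as 0.9.1 or 1.0.0. The helper names `parse_version`, `has_fix` and `installed_pyfastx_version` are made up for this example and are not part of pyfastx.

```python
from importlib import metadata


def parse_version(text):
    """Turn '0.9.1' into (0, 9, 1). Assumes plain dotted integers only."""
    return tuple(int(part) for part in text.split("."))


def has_fix(installed, fixed_in="1.0.0"):
    """True if the installed version is at least the release named in this thread."""
    return parse_version(installed) >= parse_version(fixed_in)


def installed_pyfastx_version():
    """Return the installed pyfastx version string, or None if it is not installed."""
    try:
        return metadata.version("pyfastx")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_pyfastx_version()
    if current is None:
        print("pyfastx is not installed in this environment")
    elif has_fix(current):
        print(f"pyfastx {current} should include the fix")
    else:
        print(f"pyfastx {current} predates 1.0.0; try: pip install --upgrade pyfastx")
```

If the script reports a version older than 1.0.0, upgrading inside the active conda environment and then retrying `fasta['chr2L']` is the obvious next step. Pre-release or suffixed version strings would need a more robust parser than this one.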