SystemError: <class 'Fasta'> returned a result with an error set; on loading gzipped fasta
jolo2486 commented
When loading e.g. the below gzipped fasta file:
hgdownload.cse.ucsc.edu/goldenPath/dp3/bigZips/dp3.fa.gz
I get:
fasta = pyfastx.Fasta('./data/genomes/dp3.fa.gz')
RuntimeError: get seq count and length error
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
Cell In[61], line 1
fasta = pyfastx.Fasta('./data/genomes/dp3.fa.gz')
SystemError: <class 'Fasta'> returned a result with an error set
I am in a conda environment, and installed pyfastx 0.9.1 using pip.
- Ubuntu 22.04.2 LTS
- conda 23.1.0
- python 3.8.16
- pyfastx 0.9.1
lmdu commented
First, delete the previously generated index file dp3.fa.gz.fxi, and then use pyfastx.Fasta to reindex it. If this does not work, please let me know.
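The suggestion above could be scripted roughly as follows. This is only a sketch: it assumes the index sits next to the FASTA file with a `.fxi` suffix (as in dp3.fa.gz.fxi), and the helper names `remove_stale_index` and `reindex` are made up for illustration, not part of pyfastx.

```python
import os


def remove_stale_index(fasta_path):
    """Delete '<fasta_path>.fxi' if present; return True if a file was removed."""
    # Assumption: the index lives next to the FASTA file with a '.fxi' suffix.
    index_path = fasta_path + ".fxi"
    if os.path.exists(index_path):
        os.remove(index_path)
        return True
    return False


def reindex(fasta_path):
    """Remove any old index, then let pyfastx.Fasta build a fresh one."""
    import pyfastx  # imported lazily so the helper above works without pyfastx

    remove_stale_index(fasta_path)
    return pyfastx.Fasta(fasta_path)
```

Calling `reindex('./data/genomes/dp3.fa.gz')` would then rebuild the index from scratch.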
maximilianmordig commented
Still fails.
The issue persists even after deleting the index file (the .fxi file). To reproduce, download:
curl https://s3-us-west-2.amazonaws.com/human-pangenomics/T2T/CHM13/assemblies/analysis_set/chm13v2.0.fa.gz -O
It seems to be related to the index though:
import pyfastx; pyfastx.Fasta("chm13v2.0.fa.gz", build_index=False)
works, but
import pyfastx; pyfastx.Fasta("chm13v2.0.fa.gz")
does not.
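Until the indexing problem is fixed, a possible stopgap is to try the indexed load first and fall back to `build_index=False` when it raises. This is a sketch, not a fix: `open_fasta` and its `loader` parameter are hypothetical names, and the caught exception types are simply the two shown in the traceback above.

```python
def open_fasta(path, loader=None):
    """Return (fasta, indexed): try an indexed load, else fall back to streaming.

    `loader` is a hypothetical hook for testing; it defaults to pyfastx.Fasta.
    """
    if loader is None:
        import pyfastx  # imported lazily so the function can be tested without it

        loader = pyfastx.Fasta
    try:
        return loader(path), True
    except (RuntimeError, SystemError):
        # Index building failed; reopen without an index (sequential access only).
        return loader(path, build_index=False), False
```

Without an index, the object can only be iterated sequentially (for example `for name, seq in fasta:`), so random access by sequence name is not available in the fallback case.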
I am using pyfastx version 1.1.0.