Full Fasta info object without index building
oschwengers opened this issue · 1 comments
oschwengers commented
Hi and thanks a lot for this super-fast python library.
We'd like to use this in our tools like Bakta and Platon. Maybe I've overlooked something, but we need a way to parse FASTA files in the fastest possible way, i.e. w/o building an index, but with access to the sequence ID, description and sequence.
According to the README, there is:
import pyfastx
for name, seq in pyfastx.Fasta('test.fa.gz', build_index=False):
    print(name, seq)
and:
import pyfastx
for seq in pyfastx.Fasta('test.fa.gz'):
    print(seq.name)
    print(seq.seq)
    print(seq.description)
But what we actually need is:
import pyfastx
for seq in pyfastx.Fasta('test.fa.gz', build_index=False):
    print(seq.name)
    print(seq.seq)
    print(seq.description)
Also, it would be best if the description already excluded the FASTA ID. I think this use case would be interesting for many other users as well.
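To make the request concrete, here is a minimal sketch of the record shape we have in mind, written in plain Python. It is not pyfastx's API: `Record`, `split_header` and `to_records` are hypothetical names, and the sketch assumes that something upstream yields `(full header line, sequence)` tuples. Whether pyfastx can emit the full header line when no index is built is an assumption to check against its documentation.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Record(NamedTuple):
    """Hypothetical record type: ID, description without the ID, sequence."""
    name: str
    description: str
    seq: str


def split_header(header: str) -> Tuple[str, str]:
    """Split a FASTA header into (ID, description).

    The ID is everything up to the first whitespace; the description is
    the remainder, so it no longer contains the ID.
    """
    # Drop an optional leading '>' and surrounding whitespace,
    # then split once on the first run of whitespace.
    parts = header.lstrip(">").strip().split(None, 1)
    if not parts:
        return "", ""
    name = parts[0]
    description = parts[1] if len(parts) > 1 else ""
    return name, description


def to_records(pairs: Iterable[Tuple[str, str]]) -> Iterator[Record]:
    """Wrap (full header, sequence) tuples into Record objects lazily."""
    for header, seq in pairs:
        name, description = split_header(header)
        yield Record(name, description, seq)
```

With such a wrapper, a loop like `for rec in to_records(pairs)` would give access to `rec.name`, `rec.description` and `rec.seq` without any index, at the cost of one string split per record.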
Thanks again and best regards!
lmdu commented
Thank you for your suggestion. I will consider adding this feature in the next major version.