Possible memory leak in keys
sanjaysrikakulam opened this issue · 4 comments
sanjaysrikakulam commented
Hi,
I downloaded the TrEMBL dataset from UniProtKB (48 GiB) and created an index (23 GiB).
from pyfastx import Fasta

# Create the index
fobj = Fasta('uniprot_trembl.fasta.gz', build_index=True)

# Extract keys
sample_ids = fobj.keys()

# Apply a length filter (keep sequences of length >= 11)
sample_ids.filter(sample_ids >= 11)

# Iterate over the filtered sample ids; before the iteration finishes,
# a total of 12.4 GiB of memory is consumed
dummy_count = 0
for idx, key in enumerate(sample_ids, start=1):
    dummy_count += 1
The iteration above is included for reproducibility and to demonstrate that the memory is consumed regardless of what the loop body does. Can you please take a look at this and suggest a solution?
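For anyone wanting to quantify growth like this themselves, here is a minimal, generic sketch (not pyfastx-specific) that measures process peak RSS around a loop using only the standard library. `resource.getrusage` captures memory allocated by C extensions as well, which `tracemalloc` would miss; on Linux `ru_maxrss` is reported in KiB. The list comprehension is just a stand-in for the real `keys()` iteration.

```python
import resource


def peak_rss_kib() -> int:
    # ru_maxrss is the peak resident set size: KiB on Linux, bytes on macOS
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


before = peak_rss_kib()

# Stand-in for the pyfastx keys() iteration; replace with the real loop
# to measure how much memory it retains
data = [bytes(100) for _ in range(100_000)]

after = peak_rss_kib()
print(f"peak RSS grew by ~{after - before} KiB during the loop")
```

Because `ru_maxrss` is a high-water mark, it only ever increases, so a large jump across the loop is a strong hint that the iteration itself (or something it retains) is allocating heavily.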
Thanks in advance!
P.S:
System info:
OS: CentOS 7
Python 3.7.7
pyfastx version: 0.8.3
lmdu commented
Thank you for reporting this issue. I have found the bug and will fix it in the next few days.
sanjaysrikakulam commented
Great, thank you!
lmdu commented
We have fixed it in the new version. Thanks!
sanjaysrikakulam commented
Thank you! :-)