lmdu/pyfastx

[BUG] Unable to get sequences after encountering `KeyError`

joverlee521 opened this issue · 3 comments

I'm trying to access sequences with a try/except block like so:

import pyfastx

# sequences.fasta contains three sequences: 'seq_a', 'seq_b', 'seq_c'
sequences = pyfastx.Fasta('sequences.fasta')
for seq_id in ['seq_a', 'bad_sequence', 'seq_a', 'seq_b', 'seq_c']:
    try:
        sequence_record = sequences[seq_id]
    except KeyError:
        ...

Once I encounter a KeyError, every subsequent attempt to get a sequence fails with a KeyError.
In my example, I am able to access seq_a the first time, but I get an error for any of the sequences after bad_sequence.

Python version 3.7
pyfastx version 0.8.4

I'm not super familiar with C or the SQLite C interface, so I'm making a guess from my brief reading of the docs.
It seems like there needs to be a reset with sqlite3_reset(self->seq_stmt); within the else block that raises the KeyError:

pyfastx/src/index.c

Lines 536 to 539 in c5023a7

} else {
    PyErr_Format(PyExc_KeyError, "%s does not exist in fasta file", name);
    return NULL;
}

Yeah, that was my assessment too. Same issue exists with integer-based indexing as well. I'm preparing a new test case and patch.

#51 works for me locally.