Early stopping inference
JEF1056 opened this issue · 1 comment
JEF1056 commented
Shouldn't there be a function that lets the user stop inference? It could be implemented as a callback, as in whisper.rn's realtimeInference().
jhen0409 commented
context.stopCompletion() is a way to stop inference.
It was designed on the assumption that a context performs only one completion at a time. Now that parallel decoding is possible, that may change in the future.
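For anyone landing here, a rough sketch of how early stopping could be wired up from the token callback. The `CompletionContext` interface below is only an assumed minimal shape (a `completion` call that takes a per-token callback, plus `stopCompletion()`); check the library's typings for the real signatures. `completeUntil` and `createFakeContext` are hypothetical helpers written for this example, and the fake context exists only so the snippet runs without a model.

```typescript
// Minimal shape ASSUMED for this sketch; the real context type may differ.
interface CompletionContext {
  completion(
    params: { prompt: string },
    onToken?: (data: { token: string }) => void,
  ): Promise<{ text: string }>;
  stopCompletion(): Promise<void>;
}

// Hypothetical helper: run a completion and stop it once `shouldStop` returns true.
async function completeUntil(
  context: CompletionContext,
  prompt: string,
  shouldStop: (textSoFar: string) => boolean,
): Promise<string> {
  let text = "";
  let stopRequested = false;
  await context.completion({ prompt }, ({ token }) => {
    if (stopRequested) return; // ignore tokens that arrive after the stop request
    text += token;
    if (shouldStop(text)) {
      stopRequested = true;
      // Fire-and-forget: the completion promise resolves once generation halts.
      void context.stopCompletion();
    }
  });
  return text;
}

// Stand-in context so the sketch runs without a model; NOT part of any library.
function createFakeContext(tokens: string[]): CompletionContext {
  let stopped = false;
  return {
    async completion(_params, onToken) {
      stopped = false;
      let text = "";
      for (const token of tokens) {
        if (stopped) break;
        await Promise.resolve(); // yield between tokens, as a native callback would
        text += token;
        onToken?.({ token });
      }
      return { text };
    },
    async stopCompletion() {
      stopped = true;
    },
  };
}
```

With a real context, the same pattern would be `completeUntil(context, prompt, (t) => t.includes("\n\n"))` or a flag toggled by a "Stop" button. Since a context currently runs one completion at a time, a single `stopCompletion()` call is enough to identify what to stop.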