tazz4843/whisper-rs

Current implementation/wrapping of Context and State seem to prevent transfer between threads/tasks

Closed this issue · 4 comments

Caveat: this may just be my Rust skills.

It seems that, because of the lifetime link between WhisperContext and WhisperState, it is not possible to call Whisper in stateful blocks from a separate thread. I hit lifetime issues both when creating a common struct that encapsulates the two and when passing them into (and back from) a separate thread.

This is a use case that comes up when bridging async and blocking code. You create the WhisperContext and WhisperState in an async block, read in a segment of audio, then run the blocking code through an async spawn_blocking call. Once that completes, control returns to the async block, which extracts the segments, reads in another block of audio, and so on. Rinse and repeat.

I figure you're using tokio::task::spawn_blocking here? In that case, yeah there's not really a great way to do it, as it expects a lifetime of 'static on everything. However, we can't change the lifetimes on the WhisperState, as it could result in UB if the model itself happens to get dropped before the State.
My workaround has been just storing a single model in a global static (using a OnceCell) where it's known to live for 'static and then being able to use tokio::task::spawn_blocking as the lifetime requirements are now satisfied.
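For anyone reading along, here is a minimal sketch of that workaround using only the standard library. It is not the whisper-rs API: Model and State are hypothetical stand-ins, std::sync::OnceLock stands in for OnceCell, and std::thread::spawn stands in for tokio::task::spawn_blocking (both require the closure to be 'static). The point is only that a state borrowed from a global model gets a 'static lifetime, so it can be created and used inside the spawned closure.

```rust
use std::sync::OnceLock;
use std::thread;

// Hypothetical stand-in for a loaded model; not the whisper-rs type.
struct Model {
    name: String,
}

// Hypothetical stand-in for per-call state that borrows the model.
struct State<'a> {
    model: &'a Model,
    segments: Vec<String>,
}

impl Model {
    fn create_state(&self) -> State<'_> {
        State { model: self, segments: Vec::new() }
    }
}

impl<'a> State<'a> {
    // Pretend to transcribe: record one "segment" describing the input.
    fn run(&mut self, audio: &[f32]) {
        self.segments
            .push(format!("{}: {} samples", self.model.name, audio.len()));
    }
}

// The model lives in a global, so references to it are 'static.
static MODEL: OnceLock<Model> = OnceLock::new();

fn model() -> &'static Model {
    MODEL.get_or_init(|| Model { name: "base".to_string() })
}

// The closure owns `audio` and only borrows the global model, so it
// satisfies the 'static bound on thread::spawn.
fn transcribe_blocking(audio: Vec<f32>) -> Vec<String> {
    let handle = thread::spawn(move || {
        let mut state = model().create_state(); // State<'static>
        state.run(&audio);
        state.segments
    });
    handle.join().expect("worker thread panicked")
}
```

Calling transcribe_blocking(vec![0.0; 16000]) runs the stand-in transcription on a worker thread and hands the segments back. The trade-off is the one described above: the model is never dropped, and you are limited to whatever set of models you keep in globals.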

Yes, that is correct - pretty much what I am trying to do.

Do you have any examples of this? It might be a nice-to-have to include one in the repo. I would imagine I am not the only one who wants to avoid blocking, or at least to parallelise multiple invocations.

Sorry it took me a while to get back to you here. Currently the only real example I have is my personal use case over at https://github.com/scripty-bot/stt-service/blob/master/stts_speech_to_text/src/lib.rs

The code's a bit spaghetti, but the relevant parts are static MODEL, fn load_models, fn get_new_model, and SttStreamingState::finish_stream. They should hopefully roughly show an idea of how to use it, but I will look into adding an example. I have one half made on my desktop right now, just need to clean it up and push it.

Any thoughts about wrapping the Context in an Arc and storing that in the State struct?

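use std::sync::Arc;

// Owns the raw context pointer; freed once the last Arc is dropped.
pub struct WhisperInnerContext {
    ctx: *mut whisper_rs_sys::whisper_context,
}

impl Drop for WhisperInnerContext { ... }

pub struct Context {
    inner: Arc<WhisperInnerContext>,
}

// No lifetime parameter: the Arc keeps the context alive for as long as the State exists.
pub struct State {
    ctx: Arc<WhisperInnerContext>,
    ptr: *mut whisper_rs_sys::whisper_state,
}

impl Drop for State { ... }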

This is a pattern I've seen elsewhere, for instance in the rusb crate. Removing the lifetime parameter from State would make it much easier to use.
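To make the idea concrete, here is a small self-contained sketch of the ownership pattern with the FFI parts mocked out. Everything in it is hypothetical: InnerContext holds a plain String instead of a raw pointer, and the freed flag exists only so the drop order can be observed. With real raw pointers, the types would also need unsafe impl Send (and possibly Sync), and whether that is sound depends on the C library, which this sketch does not address.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the owner of the native context handle.
struct InnerContext {
    model_name: String,
    // Set when this value is dropped, so callers can observe the drop order.
    freed: Arc<AtomicBool>,
}

impl Drop for InnerContext {
    fn drop(&mut self) {
        // A real wrapper would free the native context here.
        self.freed.store(true, Ordering::SeqCst);
    }
}

struct Context {
    inner: Arc<InnerContext>,
}

// No lifetime parameter: the Arc keeps the context alive.
struct State {
    ctx: Arc<InnerContext>,
    samples_seen: usize,
}

impl Context {
    fn new(model_name: &str, freed: Arc<AtomicBool>) -> Self {
        let inner = InnerContext { model_name: model_name.to_string(), freed };
        Context { inner: Arc::new(inner) }
    }

    fn create_state(&self) -> State {
        State { ctx: Arc::clone(&self.inner), samples_seen: 0 }
    }
}

impl State {
    fn feed(&mut self, audio: &[f32]) -> String {
        self.samples_seen += audio.len();
        format!("{}: {} samples", self.ctx.model_name, self.samples_seen)
    }
}

// Returns (summary, freed before the state was dropped, freed after).
fn demo() -> (String, bool, bool) {
    let freed = Arc::new(AtomicBool::new(false));
    let context = Context::new("base", Arc::clone(&freed));
    let mut state = context.create_state();

    // Dropping the Context first is fine: the State still holds an Arc.
    drop(context);
    let freed_while_state_alive = freed.load(Ordering::SeqCst);

    // State has no borrowed lifetime, so it can move into a spawned thread.
    let summary = thread::spawn(move || state.feed(&[0.0; 480]))
        .join()
        .expect("worker thread panicked");

    // The State was dropped when the closure finished, releasing the last Arc.
    let freed_after_state_dropped = freed.load(Ordering::SeqCst);
    (summary, freed_while_state_alive, freed_after_state_dropped)
}
```

In this mock, the context is released only after the last State goes away, which is the guarantee the lifetime parameter currently provides at compile time. The cost is an atomic reference count per state, which seems negligible next to inference.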