davabase/whisper_real_time

Receive sound data from any application along with the mic in Linux. PR or use your code?


Hi, I wrote a hook that allows sound data from any application with active sound (or from several applications simultaneously), together with the mic data, to be streamed to Whisper. I'm using it for an application-agnostic live-transcription/LLM "real time" assistance application.

  • It's not that pretty: I use third-party sound libraries for Linux (PulseAudio) to create a virtual sound device from a bash script. The user then manually redirects the sound of the app/mic into this virtual device using the PulseAudio GUI.
  • I wrote a small implementation of the sr.AudioSource abstract class that reads the audio stream from PulseAudio, which allowed me to connect to Whisper easily and enjoy all the sr features like background listening, sound adjustment, etc.
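
For readers who want a starting point, the two bullets above could be sketched roughly as follows. This is not the poster's code, which is not shown in this thread. It assumes a null sink was created beforehand with something like `pactl load-module module-null-sink sink_name=whisper_sink` (the name `whisper_sink` is made up for illustration), and that PulseAudio's `parec` utility is used to read the sink's monitor source. The class names `PulseMonitorSource` and `_PipeStream` are hypothetical. The attributes `SAMPLE_RATE`, `SAMPLE_WIDTH`, `CHUNK`, and `stream` are the ones the sr recognizer reads from an audio source.

```python
import subprocess

try:
    import speech_recognition as sr
    _Base = sr.AudioSource
except ImportError:
    # Fallback so the sketch can be read and exercised without the library.
    _Base = object


class _PipeStream:
    """Hypothetical helper: exposes read(frames) over a raw PCM byte pipe."""

    def __init__(self, pipe, sample_width):
        self.pipe = pipe
        self.sample_width = sample_width

    def read(self, frames):
        # The recognizer asks for a number of frames; convert to bytes.
        return self.pipe.read(frames * self.sample_width)


class PulseMonitorSource(_Base):
    """Hypothetical audio source that reads mono 16-bit PCM from PulseAudio."""

    def __init__(self, device="whisper_sink.monitor", sample_rate=16000,
                 chunk=1024, command=None):
        # sr.AudioSource.__init__ is abstract, so it is deliberately not called.
        self.SAMPLE_RATE = sample_rate
        self.SAMPLE_WIDTH = 2   # bytes per sample for s16le
        self.CHUNK = chunk      # frames per read
        # "device" is an assumed name: the monitor of the virtual sink.
        self.command = command or [
            "parec", f"--device={device}", "--format=s16le",
            f"--rate={sample_rate}", "--channels=1",
        ]
        self.process = None
        self.stream = None

    def __enter__(self):
        self.process = subprocess.Popen(self.command, stdout=subprocess.PIPE)
        self.stream = _PipeStream(self.process.stdout, self.SAMPLE_WIDTH)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stream = None
        self.process.terminate()
        self.process.wait()
        self.process.stdout.close()
```

An instance of such a class could then be passed to `sr.Recognizer().listen_in_background(source, callback)` in place of `sr.Microphone`. The optional `command` argument is only there so that a different capture command can be swapped in.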

Now I'm ready to push the code, and I wonder whether (and how) I should credit you, or whether I should create a PR to add my hook after I prettify and test it.

Thanks for sharing your code, it's the best I've tried for real-time Whisper usage.

The point of this project is to remain small and educational so that other people can use it as a jumping-off point for their own projects. For that reason, I generally do not accept PRs that add functionality. The code itself is public domain and can be used in any project for any purpose.

My recommendation would be to start a new project, either by forking this one or starting from scratch and copying the code that you want, and then share your results with the rest of the community on the Whisper discussion forums.

I do appreciate the kind words and I am glad you found this project useful!