Advanced Voice Recognition and Tagging System for Multi-Speaker Audio Files
This enhancement would be particularly useful for transcribing meetings, interviews, gaming sessions, and podcasts with multiple speakers, letting users easily tell who is speaking at any given time.
Speaker Recognition/Diarization is at the top of the to-do list, right after we finish some new Assistant features.
What do you mean by "Advanced"? And how do you see the "tagging system"?
Cheers!
Mainly these:
Advanced Speaker Recognition: Utilizes high-end technology for precise identification of individual speakers in complex audio.
Tagging System: Automatically labels audio segments with speaker names for easy tracking in recordings.
I've added Speaker Detection in version 0.23.0. It is available via the Ingest or Transcription window.
As I mentioned in the docs, the model is basic, but it does try to match speakers throughout the transcription. A more advanced implementation will come at some point!
Cheers
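For readers wondering what "matching speakers throughout the transcription" could look like in principle, here is a minimal sketch. It is not StoryToolkitAI's actual implementation: the `tag_speakers` function, the `embedding` field, and the 0.8 threshold are all hypothetical, and it assumes some voice-embedding model has already produced one vector per transcript segment. Each segment is compared against the speakers seen so far by cosine similarity and either reuses a label or gets a new one.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tag_speakers(segments, threshold=0.8):
    """Label each segment with a speaker tag.

    segments: list of dicts with "text" and "embedding" keys.
    The embedding is assumed to come from a voice-embedding model
    (hypothetical here); the threshold is an illustrative value.
    """
    known = []   # one reference embedding per speaker seen so far
    tagged = []
    for seg in segments:
        scores = [cosine(seg["embedding"], ref) for ref in known]
        # Index of the most similar known speaker, or None if there are none yet.
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is None or scores[best] < threshold:
            # No sufficiently similar voice so far: register a new speaker.
            known.append(seg["embedding"])
            best = len(known) - 1
        tagged.append({"speaker": f"Speaker {best + 1}", "text": seg["text"]})
    return tagged


segments = [
    {"text": "Hello", "embedding": [1.0, 0.0]},
    {"text": "Hi there", "embedding": [0.0, 1.0]},
    {"text": "How are you?", "embedding": [0.9, 0.1]},
]
for row in tag_speakers(segments):
    print(f'{row["speaker"]}: {row["text"]}')
```

With these toy embeddings, the first and third segments end up with the same tag and the second gets its own. A real system would also need to split the audio into segments and handle overlapping speech, which this sketch ignores.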
Amazing! Thank you! Do I need to re-download to get this feature? I didn't see a way to update directly to version 0.23.0 from within Storytoolkit.
Right now, version 0.23.0 is only available by installing from source.
A standalone release is coming in a few days as an early release for Patreon members, and we'll make it publicly available with the next version release.
Cheers!