mobiusml/aana_sdk

[FEATURE REQUEST] Add optional speaker information with whisper transcription

Closed this issue 3 months ago · 0 comments

Jiltseb commented 4 months ago

Feature Summary

This feature adds a post-processing step that takes the outputs of the diarization and Whisper deployments and combines them into a single transcription annotated with speaker information.

Justification/Rationale

Diarized transcription attaches speaker information to the transcript, which helps the LLM infer speaker-related information and associate it with what was said.

Proposed Implementation (if any)

Add a speaker processor that provides speaker-related utility functions. One of these utility functions should handle post-processing for diarized transcription. Since this is an app with multiple deployments, a notebook should be created to show how it works.
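
To make the proposal concrete, here is a minimal sketch of what the post-processing utility could look like. It is not the SDK's implementation: the `SpeakerTurn` and `TranscriptSegment` classes, the function names, and the "largest time overlap" rule are all assumptions made for illustration, and the real output schemas of the diarization and Whisper deployments may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpeakerTurn:
    """Hypothetical diarization output: one speaker active over a time span."""
    start: float
    end: float
    speaker: str


@dataclass
class TranscriptSegment:
    """Hypothetical transcription output: text over a time span."""
    start: float
    end: float
    text: str
    speaker: Optional[str] = None


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    segments: List[TranscriptSegment], turns: List[SpeakerTurn]
) -> List[TranscriptSegment]:
    """Label each segment with the speaker whose turn overlaps it the most.

    Segments that overlap no turn keep speaker=None.
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = None, 0.0
        for turn in turns:
            ov = overlap(seg.start, seg.end, turn.start, turn.end)
            if ov > best_overlap:
                best_speaker, best_overlap = turn.speaker, ov
        labeled.append(TranscriptSegment(seg.start, seg.end, seg.text, best_speaker))
    return labeled


def format_transcript(segments: List[TranscriptSegment]) -> str:
    """Render labeled segments as 'SPEAKER: text', merging consecutive
    segments that share the same speaker label."""
    merged = []  # list of [label, text] pairs
    for seg in segments:
        label = seg.speaker or "UNKNOWN"
        text = seg.text.strip()
        if merged and merged[-1][0] == label:
            merged[-1][1] += " " + text
        else:
            merged.append([label, text])
    return "\n".join(f"{label}: {text}" for label, text in merged)


turns = [SpeakerTurn(0.0, 5.0, "SPEAKER_00"), SpeakerTurn(5.0, 10.0, "SPEAKER_01")]
segments = [
    TranscriptSegment(0.0, 2.0, " Hello."),
    TranscriptSegment(2.0, 4.5, " How are you?"),
    TranscriptSegment(4.8, 8.0, " Fine, thanks."),
    TranscriptSegment(11.0, 12.0, " Bye."),
]
labeled = assign_speakers(segments, turns)
print(format_transcript(labeled))
```

Assigning by largest overlap is only one possible rule. A segment that spans a speaker change gets a single label here, so a word-level variant (using word timestamps, if the transcription provides them) would likely be more accurate.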