This code demonstrates how to transcribe audio segments using the OpenAI Whisper ASR model. It covers the following steps:
Audio Download:
- The code starts by downloading an audio file from Google Drive using its file ID.
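A minimal sketch of the download step, assuming the file is shared publicly and small enough that Google Drive serves it without a confirmation step; the file ID and output filename below are placeholders (very large files may need an extra confirmation token or a helper such as `gdown`):

```python
import requests

def download_from_drive(file_id: str, dest_path: str) -> None:
    """Download a publicly shared Google Drive file by its file ID."""
    url = "https://drive.google.com/uc"
    # export=download asks Drive to serve the raw file bytes.
    response = requests.get(url, params={"id": file_id, "export": "download"}, stream=True)
    response.raise_for_status()
    with open(dest_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)

# Hypothetical file ID and output path:
download_from_drive("YOUR_FILE_ID", "call_recording.mp3")
```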
Audio Playback:
- The downloaded audio file is played using IPython to verify its content.
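In a Jupyter notebook, an inline player is enough to verify the download; the path below matches the placeholder filename used in the download sketch:

```python
from IPython.display import Audio, display

# Render an inline audio player to sanity-check the downloaded file.
display(Audio("call_recording.mp3"))
```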
Audio Splitting:
- The audio is split into two segments: caller and callee, each containing one half of the conversation.
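One common layout for call recordings is a stereo file with one speaker per channel; the sketch below assumes that layout and uses pydub to split the channels into separate caller and callee files (if the recording is laid out differently, e.g. split by time, the slicing logic would change accordingly):

```python
from pydub import AudioSegment

# pydub needs ffmpeg available on the system to decode MP3.
conversation = AudioSegment.from_file("call_recording.mp3")

# Assuming a stereo recording with one speaker per channel:
# split_to_mono() returns [left_channel, right_channel].
caller_audio, callee_audio = conversation.split_to_mono()

caller_audio.export("caller.wav", format="wav")
callee_audio.export("callee.wav", format="wav")
```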
Whisper ASR Model:
- The Whisper ASR model is loaded and used to transcribe each audio segment.
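Loading and running the model takes two calls; the model size and file paths below are placeholders (larger checkpoints such as "small", "medium", or "large" trade speed for accuracy):

```python
import whisper

# Load a Whisper checkpoint; "base" is a reasonable starting point.
model = whisper.load_model("base")

# transcribe() decodes the audio with ffmpeg and returns the full text
# plus timestamped segments under result["segments"].
caller_result = model.transcribe("caller.wav")
callee_result = model.transcribe("callee.wav")

print(caller_result["text"])
```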
VTT File Generation:
- The transcriptions are saved in WebVTT (VTT) format, suitable for captioning and subtitles.
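Whisper's timestamped segments map directly onto WebVTT cues; a small hand-rolled writer, sketched below using the results from the transcription step, avoids depending on version-specific helpers in the whisper package:

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{int(hours):02d}:{int(minutes):02d}:{secs:06.3f}"

def write_vtt(segments, vtt_path: str) -> None:
    """Write Whisper's timestamped segments as a WebVTT file."""
    with open(vtt_path, "w", encoding="utf-8") as f:
        f.write("WEBVTT\n\n")
        for seg in segments:
            f.write(f"{format_timestamp(seg['start'])} --> {format_timestamp(seg['end'])}\n")
            f.write(seg["text"].strip() + "\n\n")

write_vtt(caller_result["segments"], "caller.vtt")
write_vtt(callee_result["segments"], "callee.vtt")
```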
Conversation Data Storage:
- The transcriptions and the paths to the audio segments are stored in a JSON file for further analysis or use.
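The exact JSON schema is up to whatever consumes the file downstream; a minimal sketch with hypothetical keys, reusing the paths and transcription results from the steps above:

```python
import json

conversation_data = {
    "caller": {"audio_path": "caller.wav", "vtt_path": "caller.vtt",
               "transcript": caller_result["text"]},
    "callee": {"audio_path": "callee.wav", "vtt_path": "callee.vtt",
               "transcript": callee_result["text"]},
}

with open("conversation.json", "w", encoding="utf-8") as f:
    json.dump(conversation_data, f, ensure_ascii=False, indent=2)
```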
Before running this code, make sure you have the necessary packages installed:
- pydub
- requests
- openai
- openai-whisper (installs the `whisper` module)
You can install these packages using pip:
```bash
pip install pydub requests openai openai-whisper
```

Both pydub and Whisper also rely on ffmpeg being available on the system PATH for decoding audio.