yinruiqing/pyannote-whisper

Should Diarization take this long?

SLong97 opened this issue · 2 comments

Hi, how can I improve the diarization time? Currently, a clip of 40+ seconds takes nearly 3 minutes to diarize (see times below).

Load Model
2023-02-09 17:24:42.884215

Transcribe
2023-02-09 17:24:43.236359

Diarize
2023-02-09 17:24:48.702966

Combined
2023-02-09 17:27:40.335563
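
To make the gap between stages explicit, the timestamps above can be turned into per-stage durations. This is a minimal sketch that assumes each timestamp was printed at the start of the stage it labels (the post does not say so); `stamps` and `stage_durations` are illustrative names, not part of pyannote-whisper.

```python
from datetime import datetime

# Timestamps copied from the log above.
# Assumption: each one was printed when the named stage STARTED.
stamps = [
    ("Load Model", "2023-02-09 17:24:42.884215"),
    ("Transcribe", "2023-02-09 17:24:43.236359"),
    ("Diarize", "2023-02-09 17:24:48.702966"),
    ("Combined", "2023-02-09 17:27:40.335563"),
]


def stage_durations(stamps):
    """Return seconds elapsed between each stage's timestamp and the next one."""
    parsed = [
        (name, datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f"))
        for name, ts in stamps
    ]
    return {
        name: (next_start - start).total_seconds()
        for (name, start), (_, next_start) in zip(parsed, parsed[1:])
    }


durations = stage_durations(stamps)
for name, seconds in durations.items():
    print(f"{name}: {seconds:.2f} s")
```

Under that assumption, almost all of the wall-clock time falls between the "Diarize" and "Combined" timestamps, so the diarization call is the step worth optimizing.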

0.00-4.00 SPEAKER_02  Thanks to our partner Square, everything your business needs.
4.00-10.00 SPEAKER_02  Like payments, point of sale, e-commerce, inventory management, charging endlessly about the weather?
10.00-12.00 SPEAKER_02  Well, you're on your own there.
14.00-16.00 SPEAKER_01  Square, everything your business needs.
16.00-20.00 SPEAKER_01  Almost. Visit Square.com.
20.00-30.00 SPEAKER_00  This is an Irish Independent Podcast.
30.00-34.00 SPEAKER_03  I'm Adrian Wacler and this is the Big Tech Show.
34.00-37.00 SPEAKER_03  Is your traditional office doomed?
37.00-50.00 SPEAKER_03  Dropbox founder and CEO Drew Huyerson thinks so, and was in double in the week to show off the company's new revamped virtual first office in the series.

Here is my Python code. Any help optimizing it, or even implementing a better solution, would be greatly appreciated. Cheers.

from pyannote.audio import Pipeline
import whisper
from pyannote_whisper.utils import diarize_text

# Load the diarization pipeline from a local config file
pipeline = Pipeline.from_pretrained("Models/config.yaml")
tiny = "Models/tiny.pt"
audio = "Audio/WAV-CLIP-The-Big-Tech-Show_Dropbox-CEO.wav"

# Load the Whisper "tiny" checkpoint from a local path
model = whisper.load_model(tiny)
# Transcribe, diarize, then assign a speaker to each transcribed segment
asr_result = model.transcribe(audio)
diarization_result = pipeline(audio)
final_result = diarize_text(asr_result, diarization_result)

# Print one line per segment: start-end, speaker label, text
for seg, spk, sent in final_result:
    line = f'{seg.start:.2f}-{seg.end:.2f} {spk} {sent}'
    print(line)

Are you running on GPU or CPU? On CPU, in my experience, it can take that long.

Yes, on CPU. It's disappointing that it takes that long, but understandable.
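
For anyone who does have a CUDA-capable GPU, the GPU/CPU point above could be acted on roughly as follows. This is only a sketch: `pick_device` and `diarize_on_best_device` are hypothetical helper names, and it assumes a pyannote.audio version whose `Pipeline` has a `.to()` method accepting a `torch.device` (older releases may handle device placement differently).

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to query; fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def diarize_on_best_device(pipeline, audio_path):
    """Move an already-loaded pyannote pipeline to the chosen device and run it.

    Assumption: `pipeline.to(torch.device(...))` is supported by the
    installed pyannote.audio version.
    """
    import torch

    pipeline.to(torch.device(pick_device()))
    return pipeline(audio_path)
```

With the script from the original post, the call would be `diarization_result = diarize_on_best_device(pipeline, audio)`. On a machine without a GPU this changes nothing, so it does not help the CPU-only case described here.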