No longer diarizes
tristan-mcinnis commented
yinruiqing commented
This token is deactivated. You can use your own token.
tristan-mcinnis commented
I have changed the HF token to my own in the /cli/transcribe.py file...
Then I ran the example command:
python -m pyannote_whisper.cli.transcribe data/afjiv.wav --model tiny --diarization True
It still doesn't work. Am I missing something?
tristan-mcinnis commented
import whisper
from pyannote.audio import Pipeline
from pyannote_whisper.utils import diarize_text

# Load the diarization pipeline with a personal Hugging Face token.
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token="hf_xxxxx -replace with my own")
model = whisper.load_model("tiny.en")

# Transcribe and diarize the same file, then merge the two results.
asr_result = model.transcribe("data/afjiv.wav")
diarization_result = pipeline("data/afjiv.wav")
final_result = diarize_text(asr_result, diarization_result)

for seg, spk, sent in final_result:
    line = f'{seg.start:.2f} {seg.end:.2f} {spk} {sent}'
    print(line)
The code in the readme also doesn't work.
wagesj45 commented
@nexuslux Have you accepted the access conditions on the Hugging Face repositories? You need to agree to the terms for each repository that pyannote uses: pyannote/segmentation and pyannote/speaker-diarization.
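
One way to check this is to request a small file from each repository with your token and look at the HTTP status. The sketch below is only an illustration and rests on several assumptions: that each repository has a `config.yaml` at its root, that the Hugging Face `resolve/main/` URL answers 401 or 403 when the token is invalid or the terms have not been accepted, and that the token is passed as a Bearer header. The names `explain_status`, `check_access`, and `GATED_REPOS` are hypothetical and are not part of pyannote or pyannote-whisper.

```python
import urllib.error
import urllib.request

# Repositories named in the comment above.
GATED_REPOS = ["pyannote/segmentation", "pyannote/speaker-diarization"]


def explain_status(status: int) -> str:
    """Map an HTTP status code to a short troubleshooting hint."""
    if status == 200:
        return "access OK"
    if status in (401, 403):
        # Assumption: gated repos answer 401/403 without a valid, authorized token.
        return "no access: check the token and accept the terms on the model page"
    if status == 404:
        return "file or repository not found"
    return f"unexpected HTTP status {status}"


def check_access(repo_id: str, token: str, filename: str = "config.yaml") -> int:
    """Request one file from the repo and return the HTTP status code.

    Assumption: `filename` exists at the root of the repository.
    """
    url = f"https://huggingface.co/{repo_id}/resolve/main/{filename}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is what we want here.
        return err.code
```

Calling `explain_status(check_access(repo, token))` for each entry in `GATED_REPOS` should show which repository, if any, still rejects the token. If either one reports no access, `Pipeline.from_pretrained` will most likely fail as well.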