google/uis-rnn

Test model

Closed this issue · 1 comment

Sorry if this sounds like a dumb question. I am not an expert in either Python or speaker diarization. After I have trained the model, how can I use it to determine who is speaking in a wave file? I am trying to determine who is speaking in a single audio recording of a telephone conversation.

Could I, for example, use test_sequence = wavfile.read(mywav) as an input to
predicted_cluster_id = model.predict(test_sequence, args), and get a prediction of who spoke in this file?

My question is more about how to use the code. I hope you can help!

  1. The input must be speaker-discriminative embeddings computed using another library, not raw waveform signals.
  2. UIS-RNN must be trained before you can use it for prediction (see the sketch after this list).
  3. Please read the paper, or at least the README.md file first.
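For reference, a prediction call looks roughly like the sketch below, following the workflow in the README. The random placeholder embeddings and the saved-model file name are assumptions for illustration only; real inputs must be speaker-discriminative embeddings (e.g. d-vectors) produced by a separate speaker-embedding library.

```python
import numpy as np
import uisrnn

# Placeholder embeddings: in a real pipeline these must be speaker-discriminative
# embeddings (e.g. d-vectors) extracted from the wave file by another library.
# uis-rnn never sees the raw waveform.
observation_dim = 256  # must match the dimension used during training
test_sequence = np.random.rand(1000, observation_dim)  # (num_segments, observation_dim)

model_args, _, inference_args = uisrnn.parse_arguments()
model_args.observation_dim = observation_dim
model = uisrnn.UISRNN(model_args)

# Load weights that were saved with model.save(...) after model.fit(...);
# "saved_uisrnn_model.uisrnn" is an assumed file name, not something the repo ships.
model.load("saved_uisrnn_model.uisrnn")

# predict() returns one integer speaker label per embedding in test_sequence.
predicted_cluster_ids = model.predict(test_sequence, inference_args)
print(predicted_cluster_ids)
```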