akras14/speech-to-text

[HELP], Want to recognize the voice

IAmVinnnn opened this issue · 5 comments

I have used the Google Cloud Speech-to-Text API, which is working well, but I need to show the speaker just above each line. Suppose I have an audio file in which 4 people are speaking. I want each person's label to appear just before his/her text, like:
Person1:
Here is the text of person1.
Person2:
Here is the text of person2.
Person1:
Here is another line of text from person1.
Person3:
Here is the text of person3.
Can anyone let me know how I can get the speaker along with the text using the Google API?

Unfortunately, I don't think this is currently possible.

I'll leave the issue open for now, in case somebody might have a better suggestion.

@akras14 Thanks for the response. Let's see if anyone has a solution for this.

I am going to close this issue because I am now going with the IBM Speech to Text API, and it's working as per my requirements. You guys can check HERE

Speaker diarization is currently available in the Google API, but it's in beta. Since this project is using the SpeechRecognition module instead of the Google API directly, I'm not sure if you can modify the setting, but it's just two added fields in the request.

config = speech.types.RecognitionConfig(
    encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='en-US',
    enable_speaker_diarization=True,
    diarization_speaker_count=2)
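Once diarization is enabled, the remaining step is turning per-word speaker labels into the "Person1: / text" layout asked for at the top of this issue. Here is a rough sketch of that grouping step only. It assumes you have already pulled the words out of the response as (word, speaker_tag) pairs; as far as I understand the beta docs, each word in the response carries a `speaker_tag`, but treat that as an assumption and check it against your client version. `format_by_speaker` and the sample data are made up for illustration and are not part of any Google library.

```python
from itertools import groupby


def format_by_speaker(tagged_words):
    """Group consecutive (word, speaker_tag) pairs into 'PersonN:' blocks.

    tagged_words is a hypothetical, already-extracted list of
    (word, speaker_tag) tuples in spoken order.
    """
    lines = []
    # groupby only merges adjacent items, so a speaker who talks again
    # later gets a new block, which is what the requested layout needs.
    for tag, group in groupby(tagged_words, key=lambda pair: pair[1]):
        lines.append(f"Person{tag}:")
        lines.append(" ".join(word for word, _ in group))
    return "\n".join(lines)


# Made-up sample data standing in for words taken from an API response.
sample = [
    ("Hello", 1), ("there.", 1),
    ("Hi!", 2),
    ("How", 1), ("are", 1), ("you?", 1),
]
transcript = format_by_speaker(sample)
print(transcript)
```

With a real response you would build the list from the word entries of the recognition result instead of the hard-coded sample; the grouping logic stays the same.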

Google Cloud Speech Diarization

Can we convert a video file to text?