Language mismatch in transcription
felipe-pereira opened this issue · 2 comments
This is more of a question than an issue.
If we set the language parameter to "en" and the media language is "es" (or any other language, with the whole conversation taking place in that language alone), do we get poor confidences or an actual error?
Maybe one from this list?
"internal_processing" "download_failure" "duration_exceeded" "duration_too_short" "invalid_media" "empty_media" "transcription" "insufficient_balance" "invoicing_limit_exceeded"
Thanks!
Hi There Felipe,
I'm Luke, a Solutions Engineer here with Rev, and I'll hopefully be able to help you with this.
Right now, our API will not throw an error if you attempt recognition with a language mismatch. It will attempt recognition using the language you specify, and the confidence scores will be quite low (and the transcript will likely be pretty much word salad, as it attempts to match the acoustics of your source media with words in a different language).
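Since no error is raised, a mismatch has to be inferred from the confidence scores on the client side. Below is a minimal sketch of that check, not an official recipe. It assumes the JSON transcript has already been fetched and parsed into a dict with a `monologues` list whose `elements` carry `type`, `value`, and `confidence` fields; the function names and the 0.5 threshold are hypothetical and should be tuned against your own media.

```python
from statistics import mean
from typing import Optional


def mean_confidence(transcript: dict) -> Optional[float]:
    """Average the confidence of all word elements, or None if there are none.

    Assumes the transcript dict has the shape
    {"monologues": [{"elements": [{"type": "text", "confidence": 0.9, ...}]}]}.
    """
    scores = [
        element["confidence"]
        for monologue in transcript.get("monologues", [])
        for element in monologue.get("elements", [])
        # Punctuation elements carry no confidence, so only words are counted.
        if element.get("type") == "text" and "confidence" in element
    ]
    return mean(scores) if scores else None


def looks_like_language_mismatch(transcript: dict, threshold: float = 0.5) -> bool:
    """Flag transcripts whose average word confidence falls below the threshold.

    The default threshold is an illustrative guess, not a documented value.
    """
    average = mean_confidence(transcript)
    return average is not None and average < threshold
```

A low average is only a hint: noisy audio or heavy crosstalk can also pull confidences down, so a flagged job is worth a manual look or a retry with a different language setting rather than an automatic rejection.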
We are currently developing a language ID service, which will attempt to identify the language in an input media file and select the appropriate language; I expect we will be deploying this before the end of the year.
I hope that helps. If you have further questions, feel free to reach out to me directly at luke.gottlieb@rev.com.
Hi Luke, perfect, I understand, thanks for the reply.