Samagra-Development/ai-tools

Documenting how to access Bhashini Speech to text

Closed this issue · 0 comments

Documentation for using Bhashini models is provided here

You need to do the following:

  • Create an account on Bhashini ULCA
  • Sign in and create an API key
  • Convert your audio file into a base64 string
  • Send the base64 string to the API and use the output
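The base64 conversion step can be sketched as below. This is a minimal illustration using only the Python standard library, not the code from the notebook; the function name `wav_to_base64` and the file name in the usage comment are hypothetical.

```python
import base64


def wav_to_base64(path):
    """Read an audio file from disk and return its bytes as a base64 string.

    Hypothetical helper for illustration; the name is not from the notebook.
    """
    with open(path, "rb") as audio_file:
        raw_bytes = audio_file.read()
    # b64encode returns bytes; decode to str so it can go into a JSON body.
    return base64.b64encode(raw_bytes).decode("ascii")


# Example usage (assumes a file named "sample.wav" exists):
# audio_b64 = wav_to_base64("sample.wav")
```

The resulting string is what gets placed in the request body sent to the API; the exact request fields should be taken from the Bhashini documentation.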

I have also created a Colab notebook here with an example of the same. You need to provide your own API key in the notebook.

  1. Sign up here
  2. Fill out the registration form.
  3. Complete email authentication to enable login functionality.
  4. Log in using your authenticated email.
  5. Open the “My Profile” section.
  6. Create an API key using the “Generate” button under the “My Profile” section. Ensure that your app name uses lowercase words and underscores.
  7. Use the API provided in my Colab notebook to convert the WAV file to base64 and run it.
  8. In the Colab notebook, I have combined the pipeline APIs that are used to get the authorization and the 'model to hit' with the ASR model, so that ASR can be run quickly. I have also added batching so that it can handle larger WAV files.
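The batching idea for larger WAV files can be sketched roughly as follows. This is an assumption about how such batching might look, not the notebook's actual implementation: the function name `split_wav_to_base64` and the 20-second default chunk length are hypothetical, and the right chunk size depends on the limits of the ASR service.

```python
import base64
import io
import wave


def split_wav_to_base64(path, chunk_seconds=20.0):
    """Split a WAV file into chunks and return each one as a base64 string.

    Hypothetical helper: the 20-second default is an assumed value.
    Each chunk is re-written as a complete WAV file (header included),
    so it can be sent to the API on its own.
    """
    chunks = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = max(1, int(params.framerate * chunk_seconds))
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the audio
            buffer = io.BytesIO()
            with wave.open(buffer, "wb") as dst:
                # Keep the source format so every chunk decodes identically.
                dst.setnchannels(params.nchannels)
                dst.setsampwidth(params.sampwidth)
                dst.setframerate(params.framerate)
                dst.writeframes(frames)
            chunks.append(base64.b64encode(buffer.getvalue()).decode("ascii"))
    return chunks
```

Each returned string can then be sent as a separate ASR request and the transcripts joined in order. Splitting at fixed intervals may cut a word in half at a chunk boundary, so a silence-based split could give cleaner results.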