React app using the Watson Speech to Text service to transform voice audio into written text.

Speech to Text Code Pattern

Sample React app for playing around with the Watson Speech to Text service.

Demo: https://speech-to-text-code-pattern.ng.bluemix.net/

[Figure: architecture diagram]

Flow

  1. The user supplies audio input to the application (running locally, on IBM Cloud, or on IBM Cloud Pak for Data).
  2. The application sends the audio data to the Watson Speech to Text service through a WebSocket connection.
  3. As the data is processed, the Speech to Text service returns the extracted text and other metadata for the application to display.
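
The WebSocket exchange in step 2 can be sketched as follows. This is a minimal illustration, not the app's actual code: `buildRecognizeUrl` and `startMessage` are hypothetical helper names, the model name is only an example, and the `/v1/recognize` path and the `start`/`stop` action messages follow the service's documented WebSocket interface as best understood here, so check them against the current API reference. Authentication is deliberately left out.

```javascript
// Hypothetical helper: derive the WebSocket endpoint from the HTTPS service URL.
// Assumes the recognize endpoint lives at <service URL>/v1/recognize.
function buildRecognizeUrl(serviceUrl, model) {
  const base = serviceUrl.replace(/\/+$/, '');               // drop trailing slashes
  const wsBase = base.replace(/^http/, 'ws');                // https -> wss, http -> ws
  const url = new URL(wsBase + '/v1/recognize');
  url.searchParams.set('model', model);                      // language model to use
  return url.toString();
}

// Hypothetical helper: the JSON text message sent before any audio is streamed.
function startMessage(contentType) {
  return JSON.stringify({
    action: 'start',
    'content-type': contentType,   // format of the binary audio that follows
    interim_results: true,         // ask for partial transcripts while audio streams
  });
}

// JSON text message that tells the service no more audio is coming.
const STOP_MESSAGE = JSON.stringify({ action: 'stop' });
```

A client would open a socket to the URL, send the start message, stream binary audio chunks, and finish with `STOP_MESSAGE`.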

Steps

  1. Provision Watson Speech to Text
  2. Deploy the server
  3. Use the web app

1. Provision Watson Speech to Text

Note: You can skip this step if you use the Deploy to Cloud Foundry on IBM Cloud button below. That option automatically creates the service and binds it to the application, which provides the service credentials.

The instructions depend on whether you provision the service on IBM Cloud Pak for Data or on IBM Cloud.

Follow the section that matches your platform:

IBM Cloud Pak for Data

Install and provision

The service is not available by default. An administrator must install it on the IBM Cloud Pak for Data platform, and you must be given access to the service. To determine whether the service is installed, click the Services icon and check whether the service is enabled.

Gather credentials

  1. For production use, create a dedicated user for authentication. From the main navigation menu (☰), select Administer > Manage users and then + New user.
  2. From the main navigation menu (☰), select My instances.
  3. On the Provisioned instances tab, find your service instance, and then hover over the last column to find and click the ellipsis icon. Choose View details.
  4. Copy the URL to use as the SPEECH_TO_TEXT_URL when you configure credentials.
  5. Optionally, copy the Bearer token for development testing. Because that token does not expire, do not use it outside of testing and development.
  6. Use the Menu and select Users and + Add user to grant your user access to this service instance. This user's name and password are the SPEECH_TO_TEXT_USERNAME and SPEECH_TO_TEXT_PASSWORD you will use when you configure credentials to allow the Node.js server to authenticate.
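
As a rough sketch of how a Node.js server might pick up the values gathered above, the function below reads them from an environment-like object. `resolveCredentials` is a hypothetical helper, the `authType` labels are illustrative only, and `SPEECH_TO_TEXT_BEARER_TOKEN` is an assumed variable name (only the URL, username, and password names appear in these instructions).

```javascript
// Hypothetical helper: choose an authentication method from environment values.
function resolveCredentials(env) {
  const url = env.SPEECH_TO_TEXT_URL;
  if (!url) {
    throw new Error('SPEECH_TO_TEXT_URL is required');
  }

  // Preferred: the dedicated user created for the service instance.
  if (env.SPEECH_TO_TEXT_USERNAME && env.SPEECH_TO_TEXT_PASSWORD) {
    return {
      url,
      authType: 'cp4d', // illustrative label
      username: env.SPEECH_TO_TEXT_USERNAME,
      password: env.SPEECH_TO_TEXT_PASSWORD,
    };
  }

  // Development only: the non-expiring bearer token.
  // SPEECH_TO_TEXT_BEARER_TOKEN is an assumed name, not taken from this README.
  if (env.SPEECH_TO_TEXT_BEARER_TOKEN) {
    return {
      url,
      authType: 'bearertoken', // illustrative label
      bearerToken: env.SPEECH_TO_TEXT_BEARER_TOKEN,
    };
  }

  throw new Error('No Speech to Text credentials found');
}
```

Calling `resolveCredentials(process.env)` at startup lets the server fail fast with a clear message when a value is missing.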
IBM Cloud

Create the service instance

  • If you do not have an IBM Cloud account, register for a free trial account.
  • Create a Speech to Text instance:
    • Select a region.
    • Select a pricing plan (Lite is free).
    • Set your Service name or use the generated one.
    • Click Create.
  • Gather credentials: copy the API Key and URL from the service's Manage view.

If you need to find the service later, use the main navigation menu (☰) and select Resource list to find the service under Services. Click on the service name to get back to the Manage view (where you can collect the API Key and URL).

2. Deploy the server

Choose one of the following options for deploying the Node.js server:

  • local
  • OpenShift
  • Cloud Foundry (cf)

3. Use the web app

  • Select an input Language model (defaults to English).

  • Press the Play audio sample button to hear our example audio and watch as it is transcribed.

  • Press the Record your own button to transcribe audio from your microphone. Press the button again to stop (the button label becomes Stop recording).

  • Use the Upload file button to transcribe audio from a file.
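
While audio is transcribed, the service sends a stream of result messages, and the displayed text is built from them. The sketch below shows one way to reduce such messages to the finished transcript. `finalTranscript` is a hypothetical helper, and the `results`, `final`, `alternatives`, and `transcript` field names are assumed from the service's documented response shape rather than taken from this app's code.

```javascript
// Hypothetical helper: join the finalized transcript segments from a list of
// result messages, ignoring interim (not yet final) hypotheses.
function finalTranscript(messages) {
  return messages
    .flatMap((message) => message.results || [])
    .filter((result) => result.final && result.alternatives && result.alternatives.length > 0)
    .map((result) => result.alternatives[0].transcript.trim()) // best alternative first
    .join(' ');
}

// Example messages (made-up data for illustration).
const sampleMessages = [
  { results: [{ final: false, alternatives: [{ transcript: 'hello wor' }] }] },
  { results: [{ final: true, alternatives: [{ transcript: 'hello world ' }] }] },
  { results: [{ final: true, alternatives: [{ transcript: 'how are you ' }] }] },
];

console.log(finalTranscript(sampleMessages)); // prints "hello world how are you"
```

Interim results are skipped here because each one is superseded by a later message for the same stretch of audio.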

Developing and testing

See DEVELOPING.md and TESTING.md for more details about developing and testing this app.

License

This code pattern is licensed under the Apache License, Version 2. Separate third-party code objects invoked within this code pattern are licensed by their respective providers pursuant to their own separate licenses. Contributions are subject to the Developer Certificate of Origin, Version 1.1 and the Apache License, Version 2.

Apache License FAQ