
An all-audio children's storytelling chatbot that leverages OpenAI APIs and Gradio. Inspired by a Part Time Larry video.

Primary language: Python. License: MIT.

VocalTales: Children's Storytelling Audio Chatbot


See the Medium article for context and more information.

NOTE: This does NOT work on mobile iOS devices. See Gradio Issue #2987 for details.

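This is a Gradio UI application that takes a story request from the microphone and speaks an interactive Choose-Your-Own-Adventure style children's story. It leverages OpenAI's whisper for speech-to-text, gpt-3.5-turbo for generating the story, and either the Mac say command or Google Text-to-Speech for speaking it.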

Pricing

WARNING: This application uses paid API services. Create quotas and watch your usage.

At the time of writing, the pricing is as follows:

  • whisper: $0.006 / minute (rounded to the nearest second)
  • gpt-3.5-turbo: $0.002 / 1K tokens
  • Google Text-to-Speech:
    • 0 to 1 million bytes free per month
    • $0.000016 USD per byte ($16.00 USD per 1 million bytes)

Check each provider's pricing page, since these rates can change often. At the time of writing, light use costs less than one USD.
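To get a rough feel for what a month of use might cost, the rates listed above can be plugged into a small calculator. This is only a sketch: the function name and its parameters are hypothetical, the rates are copied from the list above and may be stale, and it assumes Text-to-Speech charges apply only to bytes beyond the free tier.

```python
# Rates copied from the pricing list above; they may be out of date.
WHISPER_USD_PER_MINUTE = 0.006
GPT35_USD_PER_1K_TOKENS = 0.002
TTS_FREE_BYTES_PER_MONTH = 1_000_000
TTS_USD_PER_BYTE = 0.000016


def estimate_monthly_cost(audio_seconds: float, tokens: int, tts_bytes: int) -> float:
    """Return an estimated monthly cost in USD (hypothetical helper)."""
    whisper = (audio_seconds / 60) * WHISPER_USD_PER_MINUTE
    chat = (tokens / 1000) * GPT35_USD_PER_1K_TOKENS
    # Assumption: only bytes beyond the monthly free tier are billed.
    billable_bytes = max(0, tts_bytes - TTS_FREE_BYTES_PER_MONTH)
    tts = billable_bytes * TTS_USD_PER_BYTE
    return round(whisper + chat + tts, 4)


if __name__ == "__main__":
    # 30 minutes of speech, 100K tokens, 500K bytes of TTS (inside the free tier).
    print(estimate_monthly_cost(audio_seconds=1800, tokens=100_000, tts_bytes=500_000))
```

With these example inputs the Text-to-Speech usage stays inside the free tier, so the estimate is driven entirely by the two OpenAI services.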

Both OpenAI and Google offer free credits for new users.

Setup

Note there are two ways to speak the story: the Mac say command or GCP Text-to-Speech. On a Mac, the say command is the easiest and fastest route to running this. It uses the System voice set up in the Accessibility settings. If you are not on a Mac, or if you prefer a more realistic voice, GCP Text-to-Speech may be used instead. This requires (a) a GCP project, (b) the TTS API enabled, and (c) your account authenticated in gcloud (or the GOOGLE_APPLICATION_CREDENTIALS environment variable set).
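The choice between the two speech routes could be sketched roughly as follows. This is not the code in config.py: only SpeechMethod.GCP appears in this README, so the MAC member and both helper functions are assumptions made for illustration.

```python
import platform
from enum import Enum
from typing import List


class SpeechMethod(Enum):
    MAC = "mac"  # assumed member name; only SpeechMethod.GCP appears in this README
    GCP = "gcp"


def default_speech_method(system_name: str) -> SpeechMethod:
    """Pick the Mac `say` command on macOS, otherwise fall back to GCP TTS."""
    return SpeechMethod.MAC if system_name == "Darwin" else SpeechMethod.GCP


def say_command(text: str) -> List[str]:
    """Build the argument list for the macOS `say` command (uses the system voice)."""
    return ["say", text]


if __name__ == "__main__":
    # platform.system() reports "Darwin" on macOS.
    print(default_speech_method(platform.system()))
```

On a Mac, the list returned by say_command could be handed to subprocess.run to speak the text aloud.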

This application has only been tested on a MacBook.

  1. Sign up at OpenAI and acquire an OpenAI API key.
  2. Add it to an environment variable with: export OPENAI_API_KEY="sk-xxxxxxxxxxxxxxx"
  3. Create a virtual environment.
  4. Run pip install -r requirements.txt
  5. If on Mac, install ffmpeg: brew install ffmpeg
    • Linux may also need ffmpeg installed, but this is untested.
  6. Review and update config in config.py as desired.
  7. If using GCP TTS:
    • Set in config.py: SPEECH_METHOD = SpeechMethod.GCP
    • Navigate to the Google API page and enable the API.
    • Confirm you are authenticated in gcloud and your account has access to that API.
  8. Run with: python storyteller.py
  9. Navigate to http://127.0.0.1:7860/ and have fun!
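Before running the app, the two prerequisites from the steps above that are easiest to forget (the API key and ffmpeg) can be checked with a short script. This is a minimal sketch and not part of the repository; the function name is hypothetical.

```python
import os
import shutil
from typing import List, Mapping, Optional


def preflight_problems(env: Mapping[str, str], ffmpeg_path: Optional[str]) -> List[str]:
    """Return human-readable setup problems; an empty list means none were found."""
    problems = []
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    if ffmpeg_path is None:
        problems.append("ffmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    # shutil.which returns None when the executable cannot be found.
    for problem in preflight_problems(os.environ, shutil.which("ffmpeg")):
        print("Setup issue:", problem)
```

The environment and the ffmpeg path are passed in as arguments so the check can be exercised without touching the real machine state.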

Running as Docker Container

Replace <image-name> with a name of your choice.

  1. Build Docker image: docker build -t <image-name> .
  2. Run locally with something similar to:
docker run -it --rm \
    -e GOOGLE_APPLICATION_CREDENTIALS=/tmp/creds.json \
    -v ${HOME}/.config/gcloud/application_default_credentials.json:/tmp/creds.json \
    -e OPENAI_API_KEY=<openai-api-key> \
    -p <port>:7860 \
    <image-name> \
    python storyteller.py \
    --address=0.0.0.0 \
    --port=7860 \
    --username=<username> \
    --password=<password>

Fill in <openai-api-key>, <port>, and the optional <username> and <password>. Once the container is running, navigate in a browser to 127.0.0.1:<port> and enter the username and password if you provided them.
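For readers wondering how the --address, --port, --username, and --password flags might reach Gradio, here is one plausible sketch. It is not the actual code in storyteller.py: the function names and defaults are assumptions, while server_name, server_port, and auth are keyword arguments that Gradio's launch() accepts.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the launch flags shown in the docker run command (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Storyteller launch options (sketch)")
    parser.add_argument("--address", default="127.0.0.1")
    parser.add_argument("--port", type=int, default=7860)
    parser.add_argument("--username", default=None)
    parser.add_argument("--password", default=None)
    return parser.parse_args(argv)


def launch_kwargs(args: argparse.Namespace) -> dict:
    """Map the flags onto keyword arguments for Gradio's launch()."""
    kwargs = {"server_name": args.address, "server_port": args.port}
    # Only enable login when both halves of the credential are supplied.
    if args.username and args.password:
        kwargs["auth"] = (args.username, args.password)
    return kwargs


if __name__ == "__main__":
    print(launch_kwargs(parse_args(["--address=0.0.0.0", "--port=7860"])))
```

Binding to 0.0.0.0 inside the container is what lets the published port reach the app from the host, which is why the docker run command passes --address=0.0.0.0.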

Deploying to Google Cloud Run

  1. Follow the directions above to create a local docker image.

  2. Tag and push (Note: Follow these directions to authenticate)

    docker tag <image-name> gcr.io/<project-id>/<image-name>
    docker push gcr.io/<project-id>/<image-name>
    
  3. Create a service account on your GCP project IAM page named: audio-storytelling-bot@<project-id>.iam.gserviceaccount.com

  4. Deploy with the following command, setting anything in <> appropriately:

    gcloud run deploy audio-storytelling-bot \
        --image gcr.io/<project-id>/<image-name> \
        --platform managed \
        --service-account=audio-storytelling-bot@<project-id>.iam.gserviceaccount.com \
        --set-env-vars=OPENAI_API_KEY=<openai-key-string> \
        --allow-unauthenticated \
        --port=7860 \
        --cpu=1 \
        --memory=512Mi \
        --min-instances=0 \
        --max-instances=3 \
        --command="python" \
        --args="storyteller.py,--address=0.0.0.0,--port=7860,--username=user,--password=storyteller"
    

Cloud Run will automatically scale the number of instances based on the incoming traffic. You can access the deployed Gradio application via the URL provided by the Cloud Run service.