Voice Assistant

A simple voice assistant that uses vosk for speech-to-text (STT) and silero for text-to-speech (TTS). The assistant is activated by a wake word, detected with picovoice. It can be easily extended with more features such as weather, news, etc.

Wake word detection
Configurable commands that can be set using a simple .json file
Possibility of executing shell commands or any other scripts you want
Easy ChatGPT integration
Text-to-speech using open source, multilingual, pretrained silero models

Prerequisites

`vosk` model for STT

Go to https://alphacephei.com/vosk/models and choose the model you want to use
I recommend first trying a small model such as vosk-model-small-en-us-0.15 or vosk-model-small-en-us-0.4
Download the model and extract it to the project folder
Add the path to the model in the .env file

`picovoice` for wake word detection

Go to https://picovoice.ai/platform/porcupine/
Register and create a wake word for the assistant
Grab the API key and add it to the .env file as the PICOVOICE_API_KEY variable
Download the .ppn file and add it to the project folder
Add the path to the .ppn file to the .env file as the PICOVOICE_KEYWORD_PATH variable
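
Before the first launch it can help to confirm that the picovoice variables are actually filled in. The following is a minimal sketch, not part of the project: `parse_env` and `missing_keys` are hypothetical helper names, and the parser only understands plain KEY=VALUE lines, which is an assumption about how the .env file is written.

```python
from typing import Dict, Iterable, List


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_keys(env: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


# Variable names come from the steps above; the values are placeholders.
example = """
# picovoice
PICOVOICE_API_KEY=your-access-key
PICOVOICE_KEYWORD_PATH=
"""
required = ["PICOVOICE_API_KEY", "PICOVOICE_KEYWORD_PATH"]
print(missing_keys(parse_env(example), required))  # ['PICOVOICE_KEYWORD_PATH']
```

To check a real file, pass the contents of .env (for example `open(".env").read()`) instead of the `example` string.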

`silero` for TTS

Go to the silero repo and choose the model you want to use
Add the needed variables to the .env file
The model cache will be created on the first launch in the .cache/torch/hub/snakers4_silero-models_master folder

Installation

Clone the repository
Install poetry using pip install poetry
Run poetry install to install the dependencies
Create a .env file. You can use the .env.example file as a template
Follow the instructions above to prepare the vosk model and picovoice wake word detection
Run the assistant using poetry run python main.py
Say the wake word and ask the assistant to tell you the current time
You are all set!

P.S. You can add more commands by following the guide below

Adding new commands

Simple commands

Create a directory in the commands folder with the name of the command
Create a config.json file with the following structure:

{
  "commands": [
    {
      "name": "hello",
      "action": "voice",
      "aliases": [
        "hey there",
        "hi",
        "hello"
      ],
      "responses": [
        "Hello, how can I help you?"
      ]
    }
  ]
}

That's basically it. You can add more commands to the commands array. When a command is triggered, the assistant randomly chooses one of the responses from its responses array.
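
To make the structure concrete, here is a rough sketch of how such a config could be read and answered. It is an illustration only: `find_command` and `pick_response` are hypothetical names, and exact, case-insensitive alias matching is an assumption rather than the assistant's actual matching logic.

```python
import json
import random
from typing import Any, Dict, Optional

CONFIG_TEXT = """
{
  "commands": [
    {
      "name": "hello",
      "action": "voice",
      "aliases": ["hey there", "hi", "hello"],
      "responses": ["Hello, how can I help you?"]
    }
  ]
}
"""


def find_command(config: Dict[str, Any], phrase: str) -> Optional[Dict[str, Any]]:
    """Return the first command that has an alias equal to the phrase."""
    normalized = phrase.strip().lower()
    for command in config.get("commands", []):
        aliases = [alias.lower() for alias in command.get("aliases", [])]
        if normalized in aliases:
            return command
    return None


def pick_response(command: Dict[str, Any]) -> str:
    """Pick one response at random, as described above."""
    return random.choice(command["responses"])


config = json.loads(CONFIG_TEXT)
command = find_command(config, "Hey there")
if command is not None:
    print(pick_response(command))  # Hello, how can I help you?
```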

Advanced commands

Create a directory in the commands folder with the name of the command
Create a config.json file with the following structure:

{
  "commands": [
    {
      "name": "open_browser",
      "action": "script",
      "aliases": [
        "open browser",
        "run browser",
        "open google",
        "open chrome"
      ],
      "params": {
        "url": "https://www.google.com"
      }
    }
  ]
}

The action field should be set to script if you want to execute a script or a shell command
Create a script.py file in the command directory
It should contain a function named script that looks like this:

def script(*args, **kwargs) -> None:
    ...
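
As an illustration, a script.py for the open_browser command above might look like the sketch below. It assumes that the entries of "params" reach the function as keyword arguments, which the source does not confirm; `DEFAULT_URL` is a hypothetical fallback.

```python
import webbrowser

# Hypothetical fallback used when no "url" parameter is supplied.
DEFAULT_URL = "https://www.google.com"


def script(*args, **kwargs) -> None:
    """Open a URL in the default browser."""
    # Assumption: "params" from config.json are passed as keyword arguments.
    url = kwargs.get("url", DEFAULT_URL)
    webbrowser.open(url)
```

Calling `script(url="https://www.youtube.com")` would then open YouTube with the same code.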

You can add a depends_on property to a command so that it uses another command's script. In this case you don't need to create a separate folder or a new config.

For example, you can add an open_youtube command that depends on the open_browser command.

This is useful because you don't have to write the same script multiple times: the open_youtube command can simply reuse the open_browser script to open YouTube. The config.json file should look like this:

{
  "commands": [
    {
      "name": "open_youtube",
      "action": "script",
      "aliases": [
        "open youtube",
        "run youtube"
      ],
      "depends_on": "open_browser",
      "params": {
        "url": "https://www.youtube.com"
      }
    }
  ]
}
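
The lookup implied by depends_on can be sketched as follows. This is an assumption about the behaviour, not the project's code: `resolve_script_owner` is a hypothetical helper that follows depends_on links until it reaches a command that owns its own script.

```python
from typing import Any, Dict, List, Optional


def resolve_script_owner(commands: List[Dict[str, Any]], name: str) -> Optional[str]:
    """Return the name of the command whose script.py would be executed."""
    by_name = {command["name"]: command for command in commands}
    seen = set()
    current = name
    while current in by_name and current not in seen:
        seen.add(current)
        parent = by_name[current].get("depends_on")
        if parent is None:
            return current  # this command has its own script
        current = parent
    return None  # unknown command or circular depends_on chain


commands = [
    {"name": "open_browser", "action": "script",
     "params": {"url": "https://www.google.com"}},
    {"name": "open_youtube", "action": "script", "depends_on": "open_browser",
     "params": {"url": "https://www.youtube.com"}},
]
print(resolve_script_owner(commands, "open_youtube"))  # open_browser
```

In this sketch the script comes from open_browser while the params stay those of open_youtube, so the same code opens a different URL.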

!NOT TESTED! You can use any structure you want inside the command directory, but the main script must be named script.py and contain a function named script, which is executed when the command is called.

ChatGPT integration

Go to https://platform.openai.com/api-keys and create an API key
Add the API key to the .env file as the OPENAI_API_KEY variable
Add the chat_gpt command to the commands folder
The config.json file should look like this:

{
  "commands": [
    {
      "name": "chat_gpt",
      "action": "chat_gpt",
      "aliases": [
        "tell me",
        "search for"
      ]
    }
  ]
}
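
The source does not say how the spoken phrase is turned into a prompt. One plausible approach, shown purely as a sketch, is to strip the matched alias and send the remainder; `split_alias` is a hypothetical helper and prefix matching is an assumption.

```python
from typing import List, Optional, Tuple

CHAT_GPT_ALIASES = ["tell me", "search for"]


def split_alias(phrase: str, aliases: List[str]) -> Optional[Tuple[str, str]]:
    """Return (alias, rest) if the phrase starts with an alias, else None."""
    # Lowercase and collapse repeated whitespace before comparing.
    normalized = " ".join(phrase.lower().split())
    # Try longer aliases first so the most specific one wins.
    for alias in sorted(aliases, key=len, reverse=True):
        if normalized == alias or normalized.startswith(alias + " "):
            return alias, normalized[len(alias):].strip()
    return None


print(split_alias("Tell me  a joke about robots", CHAT_GPT_ALIASES))
# ('tell me', 'a joke about robots')
```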

Possible issues

The current ChatGPT implementation is not perfect: it only handles successful responses. This will be improved ASAP
You may run into a problem with an incorrect input device. Check device_index in the _init_recorder function in the speech_to_text.py file
The open_browser and open_youtube commands log WARNING messages because the script execution returns an incorrect response. This will be fixed in the future
Any other issues can be reported in the issues section. I will try to help you as soon as possible

m3nd0r/voice_assistant