Allows users to summon and control ChatGPT with their voice!
First, install the dependencies via the npm install command.
Then obtain a free Porcupine AccessKey from https://console.picovoice.ai/ and add it as the ACCESS_KEY environment variable in a .env file in the root of the project.
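As a rough illustration, the .env file would contain a line of the form ACCESS_KEY=<your key>. The sketch below shows one way such a file could be parsed and checked at startup. The parseEnv and requireAccessKey helpers are hypothetical and are not taken from HeyChatGPT, which may well rely on a library such as dotenv instead.

```javascript
// Hypothetical helpers for illustration; not taken from the HeyChatGPT source.

// Parse the text of a .env file into a plain object.
// Blank lines and lines starting with "#" are skipped.
function parseEnv(text) {
  const env = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue; // not a KEY=value line
    const key = line.slice(0, eq).trim();
    const value = line.slice(eq + 1).trim();
    env[key] = value;
  }
  return env;
}

// Fail early with a readable message when the key is absent.
function requireAccessKey(env) {
  if (!env.ACCESS_KEY) {
    throw new Error("ACCESS_KEY is missing; add it to the .env file in the project root.");
  }
  return env.ACCESS_KEY;
}

// Example .env contents with a placeholder key (not a real AccessKey).
const exampleEnv = parseEnv("# Porcupine\nACCESS_KEY=your-access-key-here\n");
console.log(requireAccessKey(exampleEnv));
```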
On the first npm start run, you will have to log in to ChatGPT manually, but subsequent runs will reuse the same session.
This does mean that if you don't use HeyChatGPT for a while, you might have to log in again.
Once you've been taken to the ChatGPT site, you can use the following voice commands to interact with the page:
- Thanks GPT
  - Sends the currently typed out message
- Goodbye GPT
  - Minimizes the ChatGPT window and stops listening for commands
- Be Quiet GPT
  - Stops the current text-to-speech
- Stop GPT
  - Stops the generation of the current response
Additionally, several commands make crafting messages easier:
- Backspace
  - Removes the latest word
- Clear
  - Clears the text
- Comma
  - Adds a comma
- Dot/Period
  - Adds a period
- Question Mark
  - Adds a question mark
- Exclamation Mark/Point
  - Adds an exclamation point
- New Line
  - Adds a new line
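To make the behavior of the message-crafting commands concrete, here is a minimal sketch of how a recognized phrase might be applied to the current draft. The applyPhrase function and its spacing rules are assumptions made for illustration; the actual userscript may handle these commands differently.

```javascript
// Hypothetical sketch of the message-crafting commands; the real userscript may differ.
const PUNCTUATION = new Map([
  ["comma", ","],
  ["dot", "."],
  ["period", "."],
  ["question mark", "?"],
  ["exclamation mark", "!"],
  ["exclamation point", "!"],
]);

// Apply one recognized phrase to the current draft and return the new draft.
function applyPhrase(draft, phrase) {
  const spoken = phrase.trim().toLowerCase();
  if (spoken === "backspace") {
    // Remove the latest word along with the whitespace around it.
    return draft.replace(/\s*\S+\s*$/, "");
  }
  if (spoken === "clear") return "";
  if (spoken === "new line") return draft.trimEnd() + "\n";
  if (PUNCTUATION.has(spoken)) {
    // Attach punctuation directly to the preceding word.
    return draft.trimEnd() + PUNCTUATION.get(spoken);
  }
  // Anything else is treated as dictated text.
  const needsSpace = draft !== "" && !/\s$/.test(draft);
  return draft + (needsSpace ? " " : "") + phrase.trim();
}

let draft = "";
for (const phrase of ["hello world", "comma", "how are you", "question mark"]) {
  draft = applyPhrase(draft, phrase);
}
console.log(draft); // hello world, how are you?
```

Keeping the function pure (draft in, draft out) makes each command easy to check in isolation, independent of the page it would eventually run in.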
If you only wish to use HeyChatGPT to summon ChatGPT, you can pass the --summon-only flag via the npm run start -- --summon-only command.
Additionally, you can disable the text-to-speech with the --be-quiet flag via the npm run start -- --be-quiet command.
The wake word is currently set to "Hey Chat GPT", but it is trained specifically on my voice, so results for other voices will be mixed.
Thankfully, you can train your own wake word using the Porcupine Console and then use the --keyword-paths flag to specify the path(s) to the .ppn file(s), separated by commas if you wish to use multiple.
If you're fine with one of the built-in wake words, you can use the --keywords flag to specify one or more of the following, separated by commas:
ALEXA
AMERICANO
BLUEBERRY
BUMBLEBEE
COMPUTER
GRAPEFRUIT
GRASSHOPPER
HEY_GOOGLE
HEY_SIRI
JARVIS
OK_GOOGLE
PICOVOICE
PORCUPINE
TERMINATOR
If you use the --keywords flag, you will have to explicitly specify the --keyword-paths flag as well if you wish to use both built-in and custom wake words.
Both of these are additionally customizable via the KEYWORDS and KEYWORD_PATHS environment variables, respectively.
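The interplay between the two flags and their environment variables can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual code: resolveWakeWords, readFlag, and the default file name are hypothetical, and the choices to let a flag take precedence over its environment variable, to accept both "--flag value" and "--flag=value", and to upper-case keyword names are all assumptions.

```javascript
// Hypothetical sketch; names and precedence rules are assumptions, not the project's code.
const BUILTIN_KEYWORDS = new Set([
  "ALEXA", "AMERICANO", "BLUEBERRY", "BUMBLEBEE", "COMPUTER", "GRAPEFRUIT",
  "GRASSHOPPER", "HEY_GOOGLE", "HEY_SIRI", "JARVIS", "OK_GOOGLE", "PICOVOICE",
  "PORCUPINE", "TERMINATOR",
]);

// Split a comma-separated value into trimmed, non-empty entries.
function splitList(value) {
  return (value ?? "").split(",").map((s) => s.trim()).filter((s) => s !== "");
}

// Read "--flag value" or "--flag=value" from an argv-style array.
function readFlag(argv, flag) {
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] === flag) return argv[i + 1];
    if (argv[i].startsWith(flag + "=")) return argv[i].slice(flag.length + 1);
  }
  return undefined;
}

function resolveWakeWords(argv, env, defaultKeywordPath) {
  // Assumption: a command-line flag overrides the matching environment variable.
  const keywords = splitList(readFlag(argv, "--keywords") ?? env.KEYWORDS)
    .map((k) => k.toUpperCase());
  const unknown = keywords.filter((k) => !BUILTIN_KEYWORDS.has(k));
  if (unknown.length > 0) {
    throw new Error("Unknown built-in wake word(s): " + unknown.join(", "));
  }
  let keywordPaths = splitList(readFlag(argv, "--keyword-paths") ?? env.KEYWORD_PATHS);
  // The default custom wake word applies only when nothing else was requested,
  // so combining built-in and custom wake words requires both options.
  if (keywords.length === 0 && keywordPaths.length === 0) {
    keywordPaths = [defaultKeywordPath];
  }
  return { keywords, keywordPaths };
}

// "hey-chat-gpt.ppn" is a placeholder file name for the default wake word.
console.log(resolveWakeWords(["--keywords", "JARVIS,COMPUTER"], {}, "hey-chat-gpt.ppn"));
```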
The first time you run HeyChatGPT, it will ask which audio input device to use and will remember your choice for future runs by saving it to the .env file.
This can be changed at a later date by deleting the AUDIO_INPUT_DEVICE environment variable from the .env file.
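Saving and resetting the device choice amounts to adding or removing one line in the .env file. The sketch below illustrates that idea on the file's text. The setEnvVar and removeEnvVar helpers are hypothetical, and the value "1" is only a placeholder, since the format HeyChatGPT stores for AUDIO_INPUT_DEVICE is not specified here.

```javascript
// Hypothetical helpers operating on the text of a .env file; not the project's code.

// Split into lines, ignoring trailing newlines; an empty file yields no lines.
function toLines(text) {
  const trimmed = text.replace(/(\r?\n)+$/, "");
  return trimmed === "" ? [] : trimmed.split(/\r?\n/);
}

// Add KEY=value, or overwrite the existing line for KEY.
function setEnvVar(text, key, value) {
  const lines = toLines(text);
  const entry = `${key}=${value}`;
  const index = lines.findIndex((line) => line.startsWith(key + "="));
  if (index === -1) lines.push(entry);
  else lines[index] = entry;
  return lines.join("\n") + "\n";
}

// Drop the line for KEY so that the choice is forgotten.
function removeEnvVar(text, key) {
  const lines = toLines(text).filter((line) => !line.startsWith(key + "="));
  return lines.length === 0 ? "" : lines.join("\n") + "\n";
}

// Placeholder contents and device value.
const saved = setEnvVar("ACCESS_KEY=your-access-key-here\n", "AUDIO_INPUT_DEVICE", "1");
console.log(saved);
console.log(removeEnvVar(saved, "AUDIO_INPUT_DEVICE"));
```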
Tech used: JavaScript, Porcupine, Chrome, Web Speech API, Puppeteer
Porcupine is used to detect the wake word, then ChatGPT is opened in Chrome, and finally a userscript that uses the Web Speech API is injected into the page so that the user can enter messages and interact with ChatGPT using their voice.
Currently, the Web Speech API is only supported in Chrome, and using it comes at the cost of the audio being sent to Google's servers for processing.
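For readers unfamiliar with the Web Speech API, the in-page listening step might look roughly like this. It is a sketch, not the injected userscript itself: startDictation and its callback are hypothetical, and the "en-US" language setting is an assumption. The window object is passed in as a parameter so that the availability check is explicit.

```javascript
// Sketch of in-page dictation with the Web Speech API; the wiring is hypothetical.
function startDictation(win, onFinalTranscript) {
  // Chrome exposes the constructor with a webkit prefix.
  const Recognition = win.SpeechRecognition || win.webkitSpeechRecognition;
  if (!Recognition) {
    throw new Error("Speech recognition is not available in this browser.");
  }
  const recognition = new Recognition();
  recognition.continuous = true;      // keep listening across pauses
  recognition.interimResults = false; // only deliver finalized phrases
  recognition.lang = "en-US";         // assumed language
  recognition.onresult = (event) => {
    // Walk only the results that are new since the previous event.
    for (let i = event.resultIndex; i < event.results.length; i++) {
      const result = event.results[i];
      if (result.isFinal) onFinalTranscript(result[0].transcript.trim());
    }
  };
  recognition.start();
  return recognition;
}

// In the page, this could be started with:
//   startDictation(window, (text) => console.log(text));
```

Each finalized phrase could then be matched against the voice commands or appended to the message being composed.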
Improvements to the voice command UI are planned, giving users the ability to customize the commands, along with additional commands for tasks such as switching between conversations.
Additionally, a visual UI is planned that will show the current state of HeyChatGPT and let users interact with the new features without using their voice.