VRChat AI Assistant

OpenAI GPT-4 powered AI Assistant with a GUI that integrates with VRChat using OSC. This program is currently in an "it works on my machine" state, and will most likely not work on yours without a ton of tinkering. For example, it relies on VB-Audio VoiceMeeter Banana to play audio over the microphone. Either way, I'm uploading this just to have it up here.

Usage

Run either start_assistant.ps1 or .bat, which will automatically activate the virtual environment and start the program. If you for whatever reason are not using a virtual environment, just run python assistant.py.

The program will start listening when it detects either the parameters ChatGPT or ChatGPT_PB get triggered on your avatar. For example, you could trigger it either from the Action Menu, or using a Contact Sender/Receiver pair. Alternatively, double-tap the Right Control key to invoke it manually. Voice gets transcribed to text with Faster Whisper, which is forwarded to OpenAI, and the response is read out with Google Cloud TTS or optionally one of 11.ai voice synthesis, Google Translate, or Windows Default TTS. The response text is also fed into the VRChat Chatbox.

System commands are triggerable by saying "System" and the name of the command, which will also bypass sending it to OpenAI.

Installation

Copy .env.example to .env, get your API keys from OpenAI and from ElevenLabs, and put them in the file. Get your Google Cloud Authentication file and put it in the project directory, then add the path to it in .env.

Activate a virtual environment in the .\venv folder using python -m venv venv. This can be skipped, but is recommended to not conflict with globally installed packages. Install CUDA Toolkit and cuDNN and add their respective \bin folders to your PATH if you plan to use the GPU. Install the required Python packages listed below using pip. With GPU support, you may need to install the latest nightly version of PyTorch, or uninstall and re-install if you have an old version that doesn't work and/or wasn't compiled with CUDA support. An example command for installing PyTorch nightly on Windows using pip with CUDA 11.8 support is as follows:

pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu118

Requirements

Python 3.8 or higher with Pip. Highly recommended to use a venv.

There were recent breaking changes to the ElevenLabs library, for now you can force an old version with pip install --force-reinstall "elevenlabs==0.1.1"

Required libraries: audioop, python-dotenv, elevenlabs v0.1.1, faster-whisper, ffmpeg, google-cloud-texttospeech, gtts, openai, pynput, python-osc, pyttsx3, and customtkinter

Most likely requires an NVidia GPU. Not tested with AMD, but I doubt it will work. In that case, edit the file to use CPU instead of CUDA. To use Faster Whisper, you need both cuDNN and CUDA Toolkit 11.8 in PATH. Otherwise, use OpenAI Whisper or use CPU inference.

The following files need to be copied over from C:\Windows\Media as I can't upload them to Github due to them being owned by Microsoft:

Speech Misrecognition.wav
Speech Off.wav
Speech On.wav
Speech Sleep.wav

Copyright

The contents of this repository, including all code, documentation, and other materials, unless otherwise specified, are the exclusive property of MissingNO123 and are protected by copyright law. Unauthorized reproduction, distribution, or disclosure of the contents of this repository, in whole or in part, without the express written permission of MissingNO123 is strictly prohibited.

The original version of the Software was authored on the 17th of March, 2023.