whisperChat: A JavaScript repository from kuntal-c

📝 Table of Contents

About
Architecture
Getting Started
Deployment
Usage
Built Using
Authors
Acknowledgments
Want to Contribute?

🧐 About

WhisperChat is an Open Source application that allows communication with OpenAI's ChatGPT and comes with additional features, such as:

Fully voiced conversation, using OpenAI's Whisper API for speech-to-text (STT) and Google's text-to-speech (TTS).
The ability for ChatGPT to retain memory of past conversations, via Pinecone + LangChain.
- NOTE: Using this feature incurs incrementally larger costs, because a summary of prior conversations is generated and sent along with the user prompt in the same request.
Initialization with a directive, to instruct ChatGPT to talk in different ways and assume different personas.
Basic account creation, or usage via guest profile.

🏗️ Architecture

Notes:

Pinecone can be turned off via feature flag in .env of the backend Service.
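
As a minimal sketch of how such a flag could be read in a Node.js service: the helper names `isFeatureEnabled` and `describeMemoryMode` are hypothetical, and the actual backend may parse `PINECONE_ENABLED` differently.

```javascript
// Hypothetical helper: interprets an environment variable as a boolean flag.
// Environment variables are always strings, so "false" has to be compared
// explicitly instead of relying on truthiness.
function isFeatureEnabled(env, name) {
  const raw = env[name];
  return typeof raw === 'string' && raw.trim().toLowerCase() === 'true';
}

// Hypothetical example: report which mode the service would run in.
function describeMemoryMode(env) {
  return isFeatureEnabled(env, 'PINECONE_ENABLED')
    ? 'long-term memory via Pinecone'
    : 'stateless (no long-term memory)';
}

console.log(describeMemoryMode(process.env));
```

Comparing against the literal string avoids a common pitfall: `Boolean("false")` is `true` in JavaScript, so a naive truthiness check would leave the feature switched on.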

🏁 Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

Software/Libraries:

- Docker 4.16 with Docker Compose V2 enabled
- Node.js 18+
- React.js 18+
- MongoDB 5.0+ (can be run in Docker)

Credentials:
- OpenAI API key (You can get some free credits upon account creation)
- Google Cloud TTS API key (You get some allowance with a free account creation)
- Pinecone API key (There may be a waitlist for a free account)

Installing

Clone the repository:

git clone https://github.com/athrael-soju/whisperChat.git

Create API keys for:

- OpenAI API key
- Google Cloud Service Account JSON credentials
- Pinecone API key

Set Environment Variables for each service:

Rename frontend/.env.local to frontend/.env and set the values:

# OpenAI
OPENAI_API_KEY='YOUR_API_KEY'
# Backend (adjust port and address as needed)
SERVER_PORT=5000
SERVER_ADDRESS='http://localhost'
SERVER_MESSAGE_ENDPOINT='/message'
SERVER_SPEAK_ENDPOINT='/speak'
SERVER_LOGIN_ENDPOINT='/auth/login'
SERVER_REGISTER_ENDPOINT='/auth/register'
SERVER_GUEST_ENDPOINT='/auth/guest'
# Adjust as needed
AUDIO_DB_SENSITIVITY='-55'
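
The address, port, and endpoint values above presumably combine into full request URLs. The following is a minimal sketch of that composition; `buildEndpointUrl` and `exampleEnv` are hypothetical names, and the real frontend may assemble its URLs differently.

```javascript
// Hypothetical helper: joins the backend address, port, and an endpoint path
// taken from the frontend environment values.
function buildEndpointUrl(env, endpointKey) {
  const base = new URL(env.SERVER_ADDRESS);
  base.port = String(env.SERVER_PORT);
  return new URL(env[endpointKey], base).toString();
}

// Sample values copied from the frontend .env template.
const exampleEnv = {
  SERVER_ADDRESS: 'http://localhost',
  SERVER_PORT: '5000',
  SERVER_MESSAGE_ENDPOINT: '/message',
  SERVER_GUEST_ENDPOINT: '/auth/guest',
};

console.log(buildEndpointUrl(exampleEnv, 'SERVER_MESSAGE_ENDPOINT'));
// http://localhost:5000/message
```

Using the built-in `URL` class rather than string concatenation avoids doubled or missing slashes when the address or path is edited.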

Rename backend/.env.local to backend/.env and set the values:

NODE_ENV="dev"
# Adjust as needed
SERVER_PORT=5000
# OpenAI
# Setting OPENAI_ENABLED to false will respond with a generic message. Used for testing.
OPENAI_ENABLED=true
OPENAI_API_KEY="YOUR_API_KEY"
OPENAI_API_MODEL="gpt-3.5-turbo"
# Model Load Parameters
DIRECTIVE_ENABLED=false
# Choose a directive from the list of directives in the backend/src/data folder
MODEL_DIRECTIVE="directive"
# Google Cloud TTS (adjust as needed)
GOOGLE_CLOUD_TTS_LANGUAGE="en-US"
GOOGLE_CLOUD_TTS_NAME="en-US-Neural2-J"
GOOGLE_CLOUD_TTS_GENDER="MALE"
GOOGLE_CLOUD_TTS_ENCODING="MP3"
# DB & Cache (adjust as needed)
MONGO_URI="mongodb://admin:secret@mongodb:27017/myapp?authSource=admin"
# Secrets
JWT_SECRET="secret"
# Pinecone Vector Search (adjust the flag, port, top-k, and threshold as needed)
PINECONE_ENABLED=false
PINECONE_API_KEY="YOUR_API_KEY"
PINECONE_ADDRESS="http://pinecone"
PINECONE_PORT=4000
PINECONE_TOPK=5
PINECONE_THRESHOLD=0.95
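
How the backend applies PINECONE_TOPK and PINECONE_THRESHOLD is not spelled out here. One plausible reading is that a similarity query returns scored matches, at most PINECONE_TOPK of them are kept, and only those scoring at or above PINECONE_THRESHOLD are used as memory. The sketch below illustrates that reading with plain objects; `selectMemories` and the match shape `{ id, score, text }` are assumptions for illustration, not the project's actual code.

```javascript
// Hypothetical: keep the highest-scoring matches that clear the threshold.
function selectMemories(matches, topK, threshold) {
  return matches
    .filter((m) => m.score >= threshold) // drop weak matches
    .sort((a, b) => b.score - a.score)   // best match first
    .slice(0, topK);                     // cap the number of results
}

// Made-up sample data standing in for vector search results.
const sample = [
  { id: 'a', score: 0.97, text: 'User prefers short answers.' },
  { id: 'b', score: 0.91, text: 'User asked about Docker.' },
  { id: 'c', score: 0.99, text: 'User is named Sam.' },
];

const kept = selectMemories(sample, 5, 0.95);
console.log(kept.map((m) => m.id)); // [ 'c', 'a' ]
```

Under this reading, a lower threshold or a higher top-k sends more remembered text with each prompt, which raises the per-request cost.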

If you choose to use Pinecone, rename pinecone/.env.local to pinecone/.env and set the values:

# OpenAI
OPENAI_API_KEY="YOUR_API_KEY"
# Pinecone Vector Search (adjust address, port, namespace, and index as needed)
PINECONE_API_KEY="YOUR_API_KEY"
PINECONE_ADDRESS="http://pinecone"
PINECONE_PORT=4000
PINECONE_ENVIRONMENT="YOUR_PINECONE_ENV"
PINECONE_NAMESPACE="default"
PINECONE_INDEX="whisper-index"

Additionally, rename backend/credentials/google.api.local.json to backend/credentials/google.api.json and paste your Google Cloud service account JSON credentials into it.
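
With the credentials in place, the GOOGLE_CLOUD_TTS_* settings from backend/.env describe the voice to request. As a rough sketch, they map onto the `input`, `voice`, and `audioConfig` fields of a Google Cloud Text-to-Speech synthesize request as shown here. The helper name `buildTtsRequest` is hypothetical, and whether the backend builds its request this way is an assumption.

```javascript
// Hypothetical helper: maps the GOOGLE_CLOUD_TTS_* settings onto the fields
// of a Text-to-Speech synthesize request body.
function buildTtsRequest(env, text) {
  return {
    input: { text },
    voice: {
      languageCode: env.GOOGLE_CLOUD_TTS_LANGUAGE,
      name: env.GOOGLE_CLOUD_TTS_NAME,
      ssmlGender: env.GOOGLE_CLOUD_TTS_GENDER,
    },
    audioConfig: { audioEncoding: env.GOOGLE_CLOUD_TTS_ENCODING },
  };
}

// Sample values copied from the backend .env template.
const ttsRequest = buildTtsRequest(
  {
    GOOGLE_CLOUD_TTS_LANGUAGE: 'en-US',
    GOOGLE_CLOUD_TTS_NAME: 'en-US-Neural2-J',
    GOOGLE_CLOUD_TTS_GENDER: 'MALE',
    GOOGLE_CLOUD_TTS_ENCODING: 'MP3',
  },
  'Hello from WhisperChat'
);

console.log(JSON.stringify(ttsRequest, null, 2));
```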

Start a MongoDB Docker container (if you don't have MongoDB installed locally):
```
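# Publishes the default port and creates the admin user assumed by the sample MONGO_URI
docker run --name mongodb -d -p 27017:27017 -e MONGO_INITDB_ROOT_USERNAME=admin -e MONGO_INITDB_ROOT_PASSWORD=secret mongo:latest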
```
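
Before starting the backend, it can help to check that the MONGO_URI value points at the container you just started. The sketch below splits a connection string into its parts using Node's built-in `URL` parser. `describeMongoUri` is a hypothetical helper, and it only handles single-host URIs like the sample one (not comma-separated replica set hosts).

```javascript
// Hypothetical helper: splits a single-host MongoDB connection string into
// its parts using the built-in URL parser (no database driver required).
function describeMongoUri(uri) {
  const url = new URL(uri);
  return {
    host: url.hostname,
    port: url.port || '27017', // MongoDB's default port when none is given
    database: url.pathname.replace(/^\//, ''),
    username: decodeURIComponent(url.username),
    authSource: url.searchParams.get('authSource'),
  };
}

// Sample value copied from the backend .env template.
const info = describeMongoUri(
  'mongodb://admin:secret@mongodb:27017/myapp?authSource=admin'
);
console.log(info.host, info.port, info.database);
```

Note that the sample URI uses the host name `mongodb`, which typically resolves only from other containers on the same Docker network. If the backend runs directly on your machine, the host likely needs to be `localhost` instead.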
Run npm install in each service folder (frontend, backend, pinecone):
```
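# Install dependencies for each service; the subshells keep you in the repository root
(cd frontend && npm install)
(cd backend && npm install)
(cd pinecone && npm install)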
```

Alternatively, you can run them all with Docker (after running npm start once in the frontend to initialize the env.js file):

- docker-compose up --build -d (all services, using the docker-compose.yml file)
- docker-compose up --build -d frontend (frontend only)
- docker-compose up --build -d backend (backend only)
- docker-compose up --build -d pinecone (pinecone only)

You should be able to access the application at http://localhost:3000 (or whichever port you set in the frontend/.env file).

🎈 Usage

Once deployed, log in as a guest or create a basic account.

Voice Chat:
- Record starts a continuous discussion.
- Pause pauses recording; pressing Record again resumes it.
- Stop ends the ongoing discussion.

Text Chat:
- Text chat is available by sending requests to the endpoints directly, for example via Postman.
- A sample POST request can be sent to localhost:5000/message with form-data fields (username, message); the reply is returned in the response body.
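
The same text request can be scripted instead of sent from Postman. This is a minimal sketch for Node.js 18+, which provides `fetch` and `FormData` globally. The function names are hypothetical, the URL assumes the default SERVER_PORT of 5000, and the response is read as plain text because its exact format is not documented here.

```javascript
// Builds (but does not send) a POST request carrying the form-data fields.
function buildMessageRequest(username, message, baseUrl = 'http://localhost:5000') {
  const body = new FormData();
  body.append('username', username);
  body.append('message', message);
  return { url: `${baseUrl}/message`, options: { method: 'POST', body } };
}

// Sends the request and returns the raw response body as text.
async function sendMessage(username, message) {
  const { url, options } = buildMessageRequest(username, message);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}

// Usage (requires the backend to be running):
// sendMessage('guest', 'Hello there').then(console.log).catch(console.error);
```

Leaving the `Content-Type` header unset lets `fetch` generate the multipart boundary for the form-data body automatically.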

🚀 Deployment

No Deployments currently available.

⛏️ Built Using

Docker - Containerization and deployment.
React.js - Web framework for the frontend service.
Node.js - Server environment for the backend and pinecone services.
OpenAI Whisper API - ChatGPT and Whisper model integration for chatbot functionality.
Google TTS - Converts text into natural-sounding speech in a variety of languages and voices.
Whisper Hook by chengsokdara - React Hook for OpenAI Whisper API with speech recorder, real-time transcription and silence removal functionality.
Langchain - Framework for developing applications powered by language models.

✍️ Authors

@athrael-soju - Idea & Initial work

🎉 Acknowledgments

Whisper Hook by chengsokdara

Want to Contribute?

Fork the repo
Make your changes
Submit a pull request
I'll review it and merge it
