Create a virtual environment and activate it

python -m venv .venv

On Windows:

.venv\Scripts\activate

On macOS/Linux:

source .venv/bin/activate

Install dependencies

pip install -r requirements.txt

Run the program

python main.py

TODO list:

  1. Keep a transcript of everything the user has heard (filter out the garbage).
  2. Try an open-source TTS engine (or stick with ElevenLabs).
  3. Consider a voice converter, unless it is too resource-heavy. I want Lilas Ikuta for Japanese and Kafka for English.
  4. Plan the hardware required to support the software (the focus is on being lightweight; this is not AI, it relies solely on APIs).
  5. Implement a deafen feature that activates only when you flick a switch or press a button.
  6. Find a translation API.
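
A rough sketch of how TODO items 1 and 5 might fit together, not the project's actual implementation. The names Transcript, is_garbage, FILLER, and toggle_deafen are all hypothetical, and the definition of "garbage" (empty, very short, or filler-only text) is an assumption that would need tuning against real input. The deafen flag here only stops new lines from being recorded; wiring it to a physical switch or button is left open.

```python
import re
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical filler words treated as "garbage"; adjust for real input.
FILLER = {"uh", "um", "hmm", "er", "ah"}


def is_garbage(text: str, min_chars: int = 2) -> bool:
    """Return True for empty, very short, or filler-only utterances."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return True
    if len("".join(words)) < min_chars:
        return True
    return all(word in FILLER for word in words)


@dataclass
class Transcript:
    """In-memory log of what the user has heard, with a deafen toggle."""

    entries: list[tuple[str, str]] = field(default_factory=list)
    deafened: bool = False

    def toggle_deafen(self) -> bool:
        """Flip the deafen state (e.g. from a button press) and return it."""
        self.deafened = not self.deafened
        return self.deafened

    def add(self, text: str) -> bool:
        """Record text unless deafened or garbage; return True if stored."""
        if self.deafened or is_garbage(text):
            return False
        stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
        self.entries.append((stamp, text.strip()))
        return True
```

Each call to add() returns whether the line was kept, so the caller can decide whether to pass it on to translation or TTS. Entries are stored as (UTC timestamp, text) pairs and could be written to a file at shutdown.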