A Telegram Bot frontend for processing audio with AI

Primary language: Go. License: MIT.

Audio AI Telegram Bot

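This is a Telegram Bot frontend for processing audio with several AI tools: Coqui AI (text to speech), Whisper (speech to text), MDX23v2 (music and voice separation), RVC WebUI (retrieval based voice conversion), Audio WebUI (RVC training) and AudioCraft (Musicgen and Audiogen).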

The bot displays the progress (if available) and further information during processing by replying to the message that contains the prompt. Requests are queued, and only one is processed at a time.
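The queueing behaviour described above can be illustrated with a buffered channel drained by a single worker. This is only a minimal sketch, not the bot's actual implementation: the `request` type and the `processQueue` function are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// request is a hypothetical queued job; the bot's real request type
// is not shown in this README.
type request struct {
	id     int
	prompt string
}

// processQueue drains reqs one at a time, in arrival order, and returns
// the IDs in the order in which the requests were handled.
func processQueue(reqs <-chan request, handle func(request)) []int {
	var order []int
	for r := range reqs {
		handle(r) // the next request starts only after this call returns
		order = append(order, r.id)
	}
	return order
}

func main() {
	queue := make(chan request, 16) // the buffered channel acts as a FIFO queue

	var wg sync.WaitGroup
	var order []int
	wg.Add(1)
	go func() {
		defer wg.Done()
		order = processQueue(queue, func(r request) {
			fmt.Printf("processing request %d: %s\n", r.id, r.prompt)
		})
	}()

	queue <- request{id: 1, prompt: "hello"}
	queue <- request{id: 2, prompt: "world"}
	close(queue) // no more requests; the worker exits after draining
	wg.Wait()
	fmt.Println("processed order:", order)
}
```

Because a single goroutine reads from the channel, a long-running request simply delays the ones queued behind it.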

The bot uses the Telegram Bot API. Rendered data are not saved on disk. Tested on Linux, but should be able to run on other operating systems.

Compiling

You'll need a recent version of Go installed on your computer. Then:

go install github.com/nonoo/audio-ai-telegram-bot@latest

This will typically install audio-ai-telegram-bot into $HOME/go/bin.

Alternatively, run go build in the cloned Git repository directory.

Prerequisites

Create a Telegram bot using BotFather and get the bot's token.

Coqui AI

  • Follow the installation steps and make sure the tts command is available.

  • Create a shell script in the Coqui AI directory with the following contents:

  • Copy the scripts/tts.sh shell script to the repo directory

  • Set this shell script as the TTS binary for the bot using the -tts-bin command line argument.

Whisper

  • Follow the installation steps and make sure the whisper command is available.
  • Copy the scripts/whisper.sh shell script to the repo directory
  • Set this shell script as the STT binary for the bot using the -stt-bin command line argument.

MDX23v2

  • Clone the MDX23v2 repo
  • Enter the cloned directory
  • python3 -m venv env
  • pip install -r requirements.txt
  • Copy the scripts/mdx.sh shell script to the repo directory
  • Set this shell script as the MDX binary for the bot using the -mdx-bin command line argument.

RVC WebUI

  • Clone the RVC WebUI repo
  • Enter the cloned directory
  • python3 -m venv env
  • pip install -r requirements.txt
  • Copy the scripts/rvc.sh shell script to the repo directory
  • Set this shell script as the RVC binary for the bot using the -rvc-bin command line argument.
  • Set the path of the RVC model directory using the -rvc-model-path command line argument. It is usually located at rvc/assets/weights.

Audio WebUI

  • Clone the Audio WebUI repo
  • Follow the installation instructions
  • Copy the scripts/rvc-train.py and scripts/rvc-train.sh to the Audio WebUI directory
  • Set the rvc-train.sh shell script as the RVC train binary for the bot using the -rvc-train-bin command line argument.

AudioCraft

  • Follow the installation steps

Musicgen

  • Set the scripts/musicgen.sh shell script as the Musicgen binary for the bot using the -musicgen-bin command line argument.

Audiogen

  • Set the scripts/audiogen.sh shell script as the Audiogen binary for the bot using the -audiogen-bin command line argument.

Running

You can get the available command line arguments with -h. Mandatory arguments are:

  • -bot-token: set this to your Telegram bot's token
  • -tts-bin: path to the TTS binary
  • -stt-bin: path to the STT binary
  • -mdx-bin: path to the MDX binary
  • -rvc-bin: path to the RVC binary
  • -rvc-model-path: path to the RVC weights directory
  • -musicgen-bin: path to the Musicgen binary
  • -audiogen-bin: path to the Audiogen binary
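A startup check for the mandatory arguments could look roughly like the sketch below. It is an illustration under assumptions, not the bot's own validation code: `missingArgs` is a hypothetical helper, and the values in `main` are placeholders.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// missingArgs returns the names of mandatory arguments whose value is
// empty or whitespace only, sorted alphabetically for stable output.
// It is a hypothetical helper, not part of the bot.
func missingArgs(values map[string]string) []string {
	var missing []string
	for name, v := range values {
		if strings.TrimSpace(v) == "" {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	// Placeholder values; in a real program these would come from the
	// parsed command line or the environment.
	missing := missingArgs(map[string]string{
		"-bot-token": "example-token",
		"-tts-bin":   "",
		"-stt-bin":   "/opt/whisper/whisper.sh",
	})
	if len(missing) > 0 {
		fmt.Println("missing mandatory arguments:", strings.Join(missing, ", "))
	}
}
```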

Set your Telegram user ID as an admin with the -admin-user-ids argument. Admins will get a message when the bot starts.

Other user/group IDs can be set with the -allowed-user-ids and -allowed-group-ids arguments. IDs should be separated by commas.
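As an illustration of the comma-separated ID format, the following sketch parses such a list into integers. The `parseIDs` function is a hypothetical name, and storing the IDs as `int64` is an assumption; the bot's own parsing may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseIDs converts a comma-separated list such as "123,456" into int64
// values. Surrounding whitespace and empty entries are ignored.
// It is a hypothetical helper, not the bot's own parser.
func parseIDs(s string) ([]int64, error) {
	var ids []int64
	for _, part := range strings.Split(s, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		id, err := strconv.ParseInt(part, 10, 64)
		if err != nil {
			return nil, fmt.Errorf("invalid ID %q: %w", part, err)
		}
		ids = append(ids, id)
	}
	return ids, nil
}

func main() {
	// The IDs below are made-up example values.
	ids, err := parseIDs("1234567, 7654321")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("allowed IDs:", ids)
}
```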

You can get Telegram user IDs by writing a message to the bot and checking the app's log, as it logs all incoming messages.

All command line arguments can also be set through OS environment variables. Note that a command line argument overrides the value set by the corresponding environment variable. Available OS environment variables are:

  • BOT_TOKEN
  • ALLOWED_USERIDS
  • ADMIN_USERIDS
  • ALLOWED_GROUPIDS
  • TTS_BIN
  • TTS_DEFAULT_MODEL
  • STT_BIN
  • MDX_BIN
  • RVC_BIN
  • RVC_MODEL_PATH
  • RVC_DEFAULT_MODEL
  • RVC_TRAIN_BIN
  • RVC_TRAIN_DEFAULT_BATCH_SIZE
  • RVC_TRAIN_DEFAULT_EPOCHS
  • MUSICGEN_BIN
  • AUDIOGEN_BIN
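One common way to get this precedence in Go is to use the environment variable as the default value of the flag, so that an explicitly passed argument replaces it. The sketch below shows the idea for two of the settings; `parseConfig` is a hypothetical function and this is not necessarily how the bot implements it.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseConfig reads two settings. Each flag defaults to the value of its
// environment variable, so a flag given on the command line wins.
// getenv is injected to keep the function easy to test.
func parseConfig(args []string, getenv func(string) string) (token, ttsBin string, err error) {
	fs := flag.NewFlagSet("audio-ai-telegram-bot", flag.ContinueOnError)
	fs.StringVar(&token, "bot-token", getenv("BOT_TOKEN"), "Telegram bot token")
	fs.StringVar(&ttsBin, "tts-bin", getenv("TTS_BIN"), "path to the TTS binary")
	err = fs.Parse(args)
	return token, ttsBin, err
}

func main() {
	token, ttsBin, err := parseConfig(os.Args[1:], os.Getenv)
	if err != nil {
		os.Exit(2)
	}
	// Avoid printing the token itself; only report whether it is set.
	fmt.Println("bot token set:", token != "")
	fmt.Println("TTS binary:", ttsBin)
}
```

With this layout, running the program with `TTS_BIN` set in the environment and `-tts-bin` on the command line uses the command line value.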

Supported commands

  • /aaitts (-m [model]) [prompt] - text to speech
  • /aaitts-models - list text to speech models
  • /aaistt (-lang [language]) - speech to text
  • /aaimdx (-f) - music and voice separation (-f enables full output including instrument and bassline tracks)
  • /aairvc (model) (-m [model]) (-p [pitch]) (-method [method]) (-filter-radius [v]) (-index-rate [v]) (-rms-mix-rate [v]) - retrieval based voice conversion
  • /aairvc-train (model) (-m [model]) (-method [method]) (-batch-size [v]) (-epochs [v]) (-delete) - retrieval based voice conversion training
  • /aairvc-models - list RVC models
  • /aaimusicgen (-l [sec]) [prompt] - generate music based on a given audio file and prompt
  • /aaiaudiogen (-l [sec]) [prompt] - generate audio
  • /aaicancel - cancel the current request
  • /aaihelp - show this help

You can also use the ! command character instead of /.
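To illustrate how a message with either command character might be split into a command name and its arguments, here is a minimal sketch. `parseCommand` is a hypothetical function written for this example; the bot's real parser may handle more cases.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCommand splits a message into a command name and the remaining
// argument string. Both "/" and "!" are accepted as the command
// character; ok is false when the message is not a command.
func parseCommand(msg string) (name, args string, ok bool) {
	msg = strings.TrimSpace(msg)
	if msg == "" || (msg[0] != '/' && msg[0] != '!') {
		return "", "", false
	}
	name, args, _ = strings.Cut(msg[1:], " ")
	if name == "" {
		return "", "", false
	}
	return name, strings.TrimSpace(args), true
}

func main() {
	for _, m := range []string{
		"/aaitts -m mymodel hello there",
		"!aaicancel",
		"just text",
	} {
		name, args, ok := parseCommand(m)
		fmt.Printf("%q: name=%q args=%q ok=%v\n", m, name, args, ok)
	}
}
```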

You don't need to enter the /aaitts command if you send a prompt to the bot using a private chat.

Donations

If you find this bot useful then buy me a beer. :)