Summarize videos with a Telegram bot using large language models
Explore the docs »
View Demo
·
Report Bug
·
Request Feature
This is an LLM-powered video summarizer that uses a Telegram bot frontend to communicate with the user and the Whisper neural network to transcribe speech. It uses Aiogram as the bot framework, Langchain to communicate with language models, OpenAI for cloud inference and llama.cpp for local inference.
- Accepts YouTube links as well as video files
- Has tweakable settings for each user
- Adapts answer language for each user
- Outputs text as Markdown or PDF
- Supports local LLMs with llama.cpp
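Conceptually the pipeline is: accept a YouTube link or video file, transcribe the audio with Whisper, then ask a language model for a summary through Langchain. Below is a minimal, hedged sketch of the summarization step only; it assumes the langchain-openai package and an illustrative model name, and is not the bot's actual code (the real prompts and model settings come from the YAML configs):
# Hedged sketch of the summarization step, not the bot's actual code.
from langchain_openai import ChatOpenAI  # requires OPENAI_API_KEY in the environment

def summarize(transcript: str) -> str:
    llm = ChatOpenAI(model="gpt-4o-mini")  # illustrative model choice
    reply = llm.invoke(f"Summarize this video transcript:\n\n{transcript}")
    return reply.content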
To get it installed and running on your system, follow these simple steps:
- Python 3.9-3.10 (see: Managing python installations, Installing specific python version using pyenv). On Linux:
cd video_summarizer_bot
pyenv local 3.10.11
- Telegram access token from BotFather
- Access to a MongoDB server, either remote or local (see: How to install mongodb community server)
- OpenAI API key if you want to use cloud LLMs like GPT-4
- CUDA, ROCm or MPS if you want to use local LLMs
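Before installing, you can sanity-check the optional prerequisites with a short Python snippet. This is only a convenience sketch; it assumes pymongo and torch are already available, and the connection string is a local default you may need to adjust:
# Quick prerequisite check; assumes pymongo and torch are installed.
from pymongo import MongoClient
import torch

# MongoDB: ping the server the bot will connect to.
client = MongoClient("mongodb://localhost:27017", serverSelectionTimeoutMS=2000)
client.admin.command("ping")  # raises if the server is unreachable
print("MongoDB is reachable")

# GPU acceleration for local models.
print("CUDA available:", torch.cuda.is_available())         # Nvidia (and ROCm builds)
print("MPS available:", torch.backends.mps.is_available())  # Apple Silicon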
- Clone and cd to the repo:
git clone https://github.com/dvarkless/video_summarizer_bot.git
cd video_summarizer_bot
- Run the installer script:
chmod +x ./scripts/installer_linux.sh
./scripts/installer_linux.sh
Use manual installation if you are on a Windows system.
- Clone and cd to the repo:
git clone https://github.com/dvarkless/video_summarizer_bot.git
cd video_summarizer_bot
- Create and activate a virtual environment:
# pyenv local 3.10.11
python -m venv venv
source venv/bin/activate
on Windows:
py -3.10 -m venv venv
venv\Scripts\activate.bat
- Install general dependencies:
pip install -r requirements.txt
- (Optional) Install llama-cpp-python:
pip uninstall llama-cpp-python -y
# Uncomment the line for the acceleration backend you want to use:
# If you have Nvidia GPU:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python --no-cache-dir
# If you have AMD GPU:
# CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python --no-cache-dir
Visit the llama.cpp repo for more info.
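After the GPU build finishes, a local GGUF model can be loaded with its layers offloaded to the GPU. A minimal llama-cpp-python sketch follows; the model path and parameters are placeholders, and the bot reads its real model settings from the YAML configs:
# Minimal llama-cpp-python sketch; the model path and parameters are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/your-model.gguf",  # any GGUF model you have downloaded
    n_gpu_layers=-1,                        # offload all layers to the GPU
    n_ctx=4096,                             # context window size
)
output = llm("Summarize the following transcript: ...", max_tokens=256)
print(output["choices"][0]["text"])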
- (Optional) Install faster-whisper if you have CUDA:
pip uninstall torch torchaudio
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118 --no-cache-dir
pip install nvidia-cublas-cu11 nvidia-cudnn-cu11
pip install faster-whisper
- (Optional) Alternatively, install whisper:
pip install openai-whisper
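Either backend exposes a one-call transcription API. A hedged sketch using the CUDA build of faster-whisper installed above; "audio.mp3" is a placeholder file:
# Transcription sketch; "audio.mp3" is a placeholder file.
from faster_whisper import WhisperModel

model = WhisperModel("base", device="cuda", compute_type="float16")
segments, _info = model.transcribe("audio.mp3")
print(" ".join(segment.text for segment in segments))

# The plain openai-whisper variant looks like:
# import whisper
# result = whisper.load_model("base").transcribe("audio.mp3")
# print(result["text"])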
This bot is designed to be customizable using only the configuration files. You can change these YAML files at ./configs/.
Before you can start, you should edit the secrets.yml file.
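You can quickly confirm that the edited file parses as valid YAML before starting the bot. This small check assumes PyYAML is available and that secrets.yml sits under ./configs/ as described above; the required keys themselves are documented in the repo and not guessed here:
# Check that the edited config parses as YAML; path assumed from this README.
import yaml

with open("configs/secrets.yml") as fh:
    secrets = yaml.safe_load(fh)
print("secrets.yml parsed, top-level keys:", list(secrets or {}))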
- Configuring the bot:
Please refer to the model docs to configure models.
If you want the bot's responses in a different language, refer to the documentation.
Bot settings.
How to tweak LLM behaviour.
To use this bot, run:
./start_bot.sh
OR
./start_bot.bat
Or run manually with a minimal setup:
- Start the MongoDB service:
systemctl start mongodb.service
- Run:
source venv/bin/activate
# Find cudnn and cublas for faster-whisper
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
export PYTHONPATH=$(pwd)
python src/bot/bot.py
List of bot commands:
- /start - Start the bot
- /help - Print the help message
- /change_language - Change the bot's language
- /document_format - Change the output document format (Markdown or PDF)
- /document_language - Specify the document language
- /text_format - Tweak the output text composition
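For reference, registering a command like /start with Aiogram (v3-style API) looks roughly like the sketch below. This is not the repo's actual code: the real handlers live under src/bot/, and the token comes from secrets.yml rather than a literal string:
# Rough Aiogram v3-style sketch of a /start handler; not the repo's actual code.
import asyncio
from aiogram import Bot, Dispatcher
from aiogram.filters import Command
from aiogram.types import Message

dp = Dispatcher()

@dp.message(Command("start"))
async def cmd_start(message: Message) -> None:
    await message.answer("Send me a YouTube link or a video file to summarize.")

async def main() -> None:
    bot = Bot(token="YOUR_TELEGRAM_TOKEN")  # placeholder; the bot reads it from secrets.yml
    await dp.start_polling(bot)

if __name__ == "__main__":
    asyncio.run(main())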
- Add tests
- Add PDF document composition
- Edit prompts to get better results
- Question answering
Distributed under the MIT License. See LICENSE.txt for more information.
This project is possible thanks to these awesome open-source libraries:
- Langchain
- Aiogram
- Whisper and Faster-whisper
- llama.cpp and its python port
- And many others