video_summarizer_bot: A Python repository from dvarkless

AI Video Summarizer TG Bot

Summarize videos with a Telegram bot using large language models

Table of Contents

About The Project
Getting Started
Prerequisites
Installation
Usage
To-do
License
Acknowledgments

README in Russian

About The Project

This is an LLM-powered video summarizer that uses a Telegram bot frontend to communicate with the user and the Whisper neural network to transcribe speech. It uses Aiogram as the bot framework, LangChain to communicate with language models, OpenAI for cloud inference, and llama.cpp for local inference.

Features:

Accepts YouTube videos as well as video files
Has tweakable settings for each user
Adapts the answer language to each user
Outputs text as Markdown or PDF
Supports local LLMs with llama.cpp

Getting Started

To install the bot and get it running, follow these steps:

Prerequisites

Python 3.9-3.10

Managing Python installations

Linux/macOS:

Install a specific Python version with pyenv and select it for this repo:

cd video_summarizer_bot
pyenv install 3.10.11
pyenv local 3.10.11

Pyenv installation

Windows:

How to run multiple Python versions on Windows

Telegram access token from BotFather
Access to a MongoDB server, either remote or local
How to install MongoDB Community Server
OpenAI API key if you want to use cloud LLMs like GPT-4
CUDA, ROCm or MPS if you want to use local LLMs
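
Before installing, it can help to confirm that the active interpreter falls in the supported range. The following is a minimal sketch, assuming the supported versions are exactly the 3.9-3.10 listed above; the helper name `is_supported` is hypothetical and not part of the repository.

```python
import sys

# Supported (major, minor) pairs, taken from the prerequisite "Python 3.9-3.10".
SUPPORTED = {(3, 9), (3, 10)}


def is_supported(version_info=sys.version_info):
    """Return True if the major.minor part of version_info is in SUPPORTED."""
    return tuple(version_info[:2]) in SUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} is in the supported range")
    else:
        print(f"Python {current} is outside the supported range 3.9-3.10")
```

Running the script with the interpreter you plan to use for the virtual environment shows whether a different version needs to be selected first.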

Installation

Clone and cd to the repo:

git clone https://github.com/dvarkless/video_summarizer_bot.git
cd video_summarizer_bot

Run the installer script:

chmod +x ./scripts/installer_linux.sh
./scripts/installer_linux.sh

On Windows, use the manual installation instead.

Manual installation:

Clone and cd to the repo:

git clone https://github.com/dvarkless/video_summarizer_bot.git
cd video_summarizer_bot

Create and activate a virtual environment:

# pyenv local 3.10.11
python -m venv venv
source venv/bin/activate

On Windows:

py -3.10 -m venv venv
venv\Scripts\activate.bat

Install general dependencies:

pip install -r requirements.txt

(Optional) Install llama-cpp-python:

pip uninstall llama-cpp-python -y
# Uncomment the line for the acceleration backend you want to use:
# If you have Nvidia GPU:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python --no-cache-dir

# If you have AMD GPU:
# CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python --no-cache-dir

Visit the llama.cpp repo for more info.

(Optional) Install faster-whisper if you have CUDA:

pip uninstall torch torchaudio
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118 --no-cache-dir
pip install nvidia-cublas-cu11 nvidia-cudnn-cu11
pip install faster-whisper

(Optional) Or install whisper instead:

pip install openai-whisper

Configuration:

The bot is designed to be customized entirely through configuration files. You can edit these YAML files in ./configs/.

Before starting the bot, edit the secrets.yml file.

Configuring the bot:
Refer to the model docs to configure models.
To have the bot respond in a different language, refer to the documentation.
Bot settings.
How to tweak LLM behaviour.
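
A small preflight check can catch a missing configuration file before the bot is started. The sketch below only lists YAML files and looks for secrets.yml; it assumes that secrets.yml lives directly in ./configs/ alongside the other files, which the text above suggests but does not state. The function names are hypothetical, and no assumption is made about the keys inside any file.

```python
from pathlib import Path


def list_config_files(config_dir="configs"):
    """Return the sorted names of .yml/.yaml files directly inside config_dir."""
    root = Path(config_dir)
    if not root.is_dir():
        return []
    return sorted(
        path.name
        for path in root.iterdir()
        if path.is_file() and path.suffix in {".yml", ".yaml"}
    )


def has_secrets(config_dir="configs", name="secrets.yml"):
    """Return True if the secrets file exists in config_dir (assumed location)."""
    return (Path(config_dir) / name).is_file()


if __name__ == "__main__":
    print("Config files found:", ", ".join(list_config_files()) or "none")
    if not has_secrets():
        print("secrets.yml not found in ./configs/, create or edit it first")
```

Run it from the repository root so that the relative path resolves to ./configs/.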

Usage

To start the bot on Linux, run:

./start_bot.sh

On Windows, run:

.\start_bot.bat

Or run it manually with minimal setup:

Start the MongoDB service:

systemctl start mongodb.service

Run the bot:

source venv/bin/activate
# Find cudnn and cublas for faster-whisper
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
export PYTHONPATH=$(pwd)
python src/bot/bot.py
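
The LD_LIBRARY_PATH one-liner above fails with an import error if the nvidia-cublas-cu11 or nvidia-cudnn-cu11 packages are not installed, and it overwrites any existing value. As an illustration only, the same lookup can be written as a short script that skips missing packages and keeps the existing value. This is a sketch, not part of the repository: the function names are hypothetical, and appending the previous LD_LIBRARY_PATH is a deliberate difference from the original command.

```python
import importlib.util
import os


def package_dir(module_name):
    """Return the directory of an importable module, or None if it is missing."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # ImportError covers a missing parent package such as "nvidia".
        return None
    if spec is None:
        return None
    if spec.submodule_search_locations:
        # Packages expose their directory here.
        return list(spec.submodule_search_locations)[0]
    if spec.has_location:
        # Plain modules: use the directory that contains the file.
        return os.path.dirname(spec.origin)
    return None


def build_ld_library_path(module_names, existing=""):
    """Join the directories of the modules that were found, then the old value."""
    dirs = [d for d in map(package_dir, module_names) if d]
    if existing:
        dirs.append(existing)
    return os.pathsep.join(dirs)


if __name__ == "__main__":
    print(
        build_ld_library_path(
            ["nvidia.cublas.lib", "nvidia.cudnn.lib"],
            os.environ.get("LD_LIBRARY_PATH", ""),
        )
    )
```

If the script is saved under a name of your choice, its printed output can be assigned to LD_LIBRARY_PATH in place of the inline python3 -c command.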

List of bot commands:
/start - Start the bot
/help - Print the help message
/change_language - Change the bot's language
/document_format - Change the output document format (Markdown or PDF)
/document_language - Set the document language
/text_format - Tweak the composition of the output text

To-Do

Add tests
Add PDF document composition
Edit prompts to get better results
Question answering

License

Distributed under the MIT License. See LICENSE.txt for more information.

Acknowledgments

This project is possible thanks to these awesome open-source libraries: