Transform any video into a document format using LLM chains. Work in progress.

AI Video Summarizer TG Bot

Summarize videos with a Telegram bot using large language models
Readme in Russian

About The Project

This is an LLM-powered video summarizer that uses a Telegram bot frontend to communicate with the user and the Whisper neural network to transcribe speech. It uses Aiogram as the bot framework, LangChain to communicate with language models, OpenAI for cloud inference, and llama.cpp for local inference.

Features:

  • Accepts YouTube videos as well as video files
  • Has tweakable settings for each user
  • Adapts answer language for each user
  • Outputs text as Markdown or PDF
  • Supports local LLMs with llama.cpp

Getting Started

To install the bot on your system and get it running, follow these steps:

Prerequisites

  • Python 3.9-3.10
Managing Python installations

Linux/macOS:

Installing a specific Python version using pyenv on Linux:

cd video_summarizer_bot
pyenv local 3.10.11

Pyenv installation

Windows:

How to run multiple Python versions on Windows
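
Before creating the virtual environment, it can help to confirm that the interpreter on your PATH falls inside the supported range. The following is a minimal sketch, not part of the repository: the names `MIN_VERSION`, `MAX_VERSION`, and `is_supported` are hypothetical, and the range is taken from the prerequisite stated above.

```python
import sys

# Supported range taken from the prerequisites above: Python 3.9 through 3.10.
MIN_VERSION = (3, 9)
MAX_VERSION = (3, 10)


def is_supported(version=None):
    """Return True if the (major, minor) pair falls inside the supported range."""
    if version is None:
        version = sys.version_info[:2]
    return MIN_VERSION <= tuple(version[:2]) <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported():
        print(f"Python {current} is supported")
    else:
        print(f"Python {current} is outside the supported 3.9-3.10 range")
```

Run it with the interpreter you intend to use for the venv, for example after `pyenv local 3.10.11`.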

Installation

  1. Clone and cd to the repo:
git clone https://github.com/dvarkless/video_summarizer_bot.git
cd video_summarizer_bot
  2. Run the installer script:
chmod +x ./scripts/installer_linux.sh
./scripts/installer_linux.sh

Use the manual installation if you are on Windows.

Manual installation:

  1. Clone and cd to the repo:
git clone https://github.com/dvarkless/video_summarizer_bot.git
cd video_summarizer_bot
  2. Create and activate a virtual environment:
# pyenv local 3.10.11
python -m venv venv
source venv/bin/activate

On Windows:

py -3.10 -m venv venv
venv\Scripts\activate.bat
  3. Install general dependencies:
pip install -r requirements.txt
  4. (Optional) Install llama-cpp-python:
pip uninstall llama-cpp-python -y
# Uncomment the acceleration backend you want to use:
# If you have Nvidia GPU:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python --no-cache-dir

# If you have AMD GPU:
# CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python --no-cache-dir

Visit the llama.cpp repo for more info.

  5. (Optional) Install faster-whisper if you have CUDA:

pip uninstall torch torchaudio
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu118 --no-cache-dir
pip install nvidia-cublas-cu11 nvidia-cudnn-cu11
pip install faster-whisper

5.1 (Optional) OR install whisper

pip install openai-whisper
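
Since step 5 and step 5.1 are alternatives, a quick check can tell you which transcription package ended up in the environment. This is a sketch under assumptions: `pick_whisper_backend` is a hypothetical helper, and the bot's own backend selection logic may differ. It relies only on the fact that faster-whisper is imported as `faster_whisper` and openai-whisper as `whisper`.

```python
import importlib.util


def pick_whisper_backend(find_spec=importlib.util.find_spec):
    """Return the import name of the first installed transcription package, or None."""
    # faster-whisper is imported as `faster_whisper`; openai-whisper as `whisper`.
    for module_name in ("faster_whisper", "whisper"):
        if find_spec(module_name) is not None:
            return module_name
    return None


if __name__ == "__main__":
    print(pick_whisper_backend() or "no Whisper backend installed")
```

The `find_spec` parameter exists only so the lookup can be replaced in tests; in normal use, call the function with no arguments inside the activated venv.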

Configuration:

This bot is designed to be customizable using only its configuration files. You can edit these YAML files in ./configs/.

Before you start, edit the secrets.yml file.

  1. Configuring the bot:
    Refer to the model docs to configure models.
    To write the bot's responses in a different language, refer to the documentation.
    Bot settings.
    How to tweak LLM behaviour.
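
A small pre-flight check can catch a missing secrets.yml before the bot starts. The sketch below is illustrative only: `missing_config_files` is a hypothetical helper, and it assumes secrets.yml lives in ./configs/ alongside the other YAML files, which this README does not state explicitly. It does not inspect the file's contents, since the expected keys are documented elsewhere.

```python
from pathlib import Path


def missing_config_files(config_dir="./configs", required=("secrets.yml",)):
    """Return the required file names that are absent from config_dir."""
    directory = Path(config_dir)
    # Assumption: every required file sits directly inside config_dir.
    return [name for name in required if not (directory / name).is_file()]


if __name__ == "__main__":
    missing = missing_config_files()
    if missing:
        print("Missing config files: " + ", ".join(missing))
    else:
        print("All required config files are present")
```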

Usage

To use this bot, run:

./start_bot.sh

OR

./start_bot.bat

OR run manually with a minimal setup:

  1. Start the MongoDB service: systemctl start mongodb.service
  2. Run:
source venv/bin/activate
# Find cudnn and cublas for faster-whisper
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
export PYTHONPATH=$(pwd)
python src/bot/bot.py
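
The `LD_LIBRARY_PATH` one-liner above is dense, so here is the same idea written out as a readable script. It is a sketch, not code from the repository: `library_dirs` and `build_ld_library_path` are hypothetical names. Unlike the one-liner, it skips packages that are not installed instead of raising an error.

```python
import importlib
import os


def library_dirs(module_names):
    """Return the directory of each importable module, skipping missing ones."""
    dirs = []
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue  # package not installed, e.g. on a CPU-only machine
        module_file = getattr(module, "__file__", None)
        if module_file:
            dirs.append(os.path.dirname(module_file))
    return dirs


def build_ld_library_path(module_names=("nvidia.cublas.lib", "nvidia.cudnn.lib")):
    """Join the discovered directories the way LD_LIBRARY_PATH expects."""
    return ":".join(library_dirs(module_names))


if __name__ == "__main__":
    print(build_ld_library_path())
```

The dynamic loader reads `LD_LIBRARY_PATH` when a process starts, so the value has to be exported in the shell before launching the bot rather than set from inside the bot process. If you save the sketch under a name of your choice, export its printed output the same way the one-liner does.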

List of bot commands:
/start - Start the bot
/help - Print the help message
/change_language - Change the bot's language
/document_format - Change output document format such as Markdown or PDF
/document_language - Specify document language
/text_format - Tweak output text composition
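
To illustrate how such commands are typically recognized in a message, here is a standalone sketch that uses no bot framework. It is not the bot's Aiogram handler code: `BOT_COMMANDS`, `parse_command`, and the bot username in the usage line are hypothetical. It handles the `@botname` suffix that Telegram clients append to commands in group chats.

```python
# Command names and descriptions copied from the list above.
BOT_COMMANDS = {
    "start": "Start the bot",
    "help": "Print the help message",
    "change_language": "Change the bot's language",
    "document_format": "Change output document format such as Markdown or PDF",
    "document_language": "Specify document language",
    "text_format": "Tweak output text composition",
}


def parse_command(text):
    """Return the known command name at the start of a message, or None."""
    if not text.startswith("/"):
        return None
    first_token = text.split(maxsplit=1)[0]
    # Drop the leading slash and any "@botname" suffix added in group chats.
    name = first_token[1:].split("@", 1)[0].lower()
    return name if name in BOT_COMMANDS else None


def help_text():
    """Render the command list in the same format as above."""
    return "\n".join(f"/{name} - {desc}" for name, desc in BOT_COMMANDS.items())
```

For example, `parse_command("/document_format@my_bot pdf")` yields `"document_format"`, while plain text and unknown commands yield `None`.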

To-Do

  • Add tests
  • Add pdf document composition
  • Edit prompts to get better results
  • Question answering

License

Distributed under the MIT License. See LICENSE.txt for more information.

Acknowledgments

This project is possible thanks to these awesome open-source libraries:
