🐦ViDove: End-to-end Video Translation Toolkit

Primary language: Python. License: GNU General Public License v3.0 (GPL-3.0).


Transcribe and Translate Your Video with a Single Click
Official Website »

Try Demo · Report Bug · Request Feature

Table of Contents
  1. Release
  2. About The Project
  3. Getting Started
  4. Usage
  5. Contributing
  6. License
  7. Contact

Release

  • [12/20]🔥ViDove V0.1 Released: We are happy to release our initial version of ViDove: End-to-end Video Translation Toolkit.

(back to top)

About The Project

ViDove is an automated video machine translation toolkit built for professional domains. Developed by Pigeon.AI, it aims to deliver fast, accurate, and natural-sounding translations that support the workflow of subtitle groups and translation professionals. It is open source, which provides flexibility, transparency, and security, and its architecture is designed to scale and to be customized. With domain adaptation, ViDove can be adjusted to different professional fields, and its end-to-end pipeline turns a video link into captioned content with a single click. The goal is to make video translation across languages more accessible, efficient, and accurate.

Here's why:

  • End-to-End Pipeline (from video link to captioned video):
    • One-Click Deployment: Users can deploy the tool with just one click.
    • Video Link to Translated Video: Simply input a video link to generate a translated video with ease.
  • Domain Adaptation:
    • Our pipeline is adaptable to various professional fields (e.g., StarCraft II). Users can easily upload customized dictionaries and fine-tune models based on specific data corpora.
  • Open Source:
    • Our toolkit is entirely open source, and we warmly welcome and look forward to the participation of the broader developer community in the ongoing development of the toolkit.

(back to top)

Main Contributors

Web Dev: Tingyu Su

Getting Started

Installation

  1. Get an OpenAI API key at https://platform.openai.com/api-keys
  2. Clone the repo
    git clone https://github.com/project-kxkg/ViDove.git
    cd ViDove
  3. Install requirements
    conda create -n ViDove python=3.10 -y
    conda activate ViDove
    pip install --upgrade pip
    pip install -r requirements.txt
  4. Set your API key in bash
    export OPENAI_API_KEY="your_api_key" 
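
Before launching, it can help to confirm that the key from step 4 is actually visible to Python. The following is a minimal sketch, not part of ViDove: the helper name `require_api_key` is hypothetical, and it only checks that the variable is set and non-empty, not that the key is valid.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the check
    easy to try out without touching the real environment.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; run: export {name}=\"your_api_key\""
        )
    return key
```

Calling `require_api_key()` at the top of a wrapper script fails early with a readable message instead of failing later inside the pipeline.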

(back to top)

Usage

Quick Start with Gradio User Interface

python3 entries/app.py

Launch with configs

  • Start with YouTube link input:
    python3 entries/run.py --link "your_youtube_link"
  • Start with Video input:
    python3 entries/run.py --video_file path/to/video_file
  • Start with Audio input:
    python3 entries/run.py --audio_file path/to/audio_file
  • Terminal Usage:
    usage: run.py [-h] [--link LINK] [--video_file VIDEO_FILE] [--audio_file AUDIO_FILE] [--launch_cfg LAUNCH_CFG] [--task_cfg TASK_CFG]
    
    options:
      -h, --help            show this help message and exit
      --link LINK           youtube video link here
      --video_file VIDEO_FILE
                            local video path here
      --audio_file AUDIO_FILE
                            local audio path here
      --launch_cfg LAUNCH_CFG
                            launch config path
      --task_cfg TASK_CFG   task config path
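
When scripting several runs, the flags listed above can be assembled programmatically. The sketch below is an illustration rather than part of ViDove: `build_run_command` is a hypothetical helper, and requiring exactly one input source is an assumption, since the help text does not say how `run.py` handles zero or multiple inputs.

```python
import shlex
import sys


def build_run_command(link=None, video_file=None, audio_file=None,
                      launch_cfg=None, task_cfg=None):
    """Build an argument list for entries/run.py from the documented flags."""
    inputs = {"--link": link, "--video_file": video_file,
              "--audio_file": audio_file}
    chosen = [(flag, value) for flag, value in inputs.items()
              if value is not None]
    # Assumption: exactly one input source per run.
    if len(chosen) != 1:
        raise ValueError("pass exactly one of link, video_file, or audio_file")
    flag, value = chosen[0]
    cmd = [sys.executable, "entries/run.py", flag, str(value)]
    # Optional config paths are appended only when given.
    for opt, path in (("--launch_cfg", launch_cfg), ("--task_cfg", task_cfg)):
        if path is not None:
            cmd += [opt, str(path)]
    return cmd


# Show the command as a shell-quoted string without running it.
print(shlex.join(build_run_command(video_file="path/to/video_file",
                                   task_cfg="configs/task_config.yaml")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting problems with links that contain `&` or `?`.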

Configs

Pass "--launch_cfg" and "--task_cfg" to run.py to change the launch or task configuration.

  • configs/local_launch.yaml

    # launch config for local environment
    local_dump: ./local_dump # change local dump dir here
    environ: local
  • configs/task_config.yaml

    Copy this config and modify the copy to create a different configuration.

    # configuration for each task
    source_lang: EN
    target_lang: ZH
    field: General
    
    # ASR config
    ASR:
      ASR_model: whisper
      whisper_config:
        whisper_model: tiny
        method: stable
      
    # pre-process module config
    pre_process: 
      sentence_form: True
      spell_check: False
      term_correct: True
    
    # Translation module config
    translation:
      model: gpt-4
      chunk_size: 1000
    
    # post-process module config
    post_process: 
      check_len_and_split: True
      remove_trans_punctuation: True
    
    # output type that user receive
    output_type: 
      subtitle: srt
      video: True
      bilingual: True
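
One way to derive a variant of the task config without editing the original is to apply a small set of overrides to a copy. The sketch below is an assumption-laden illustration, not ViDove code: the config is written as a Python dict mirroring the YAML above, `with_overrides` is a hypothetical helper, and the override values are only examples that have not been checked against what ViDove accepts.

```python
import copy

# Python mirror of configs/task_config.yaml as shown above.
BASE_TASK_CONFIG = {
    "source_lang": "EN",
    "target_lang": "ZH",
    "field": "General",
    "ASR": {
        "ASR_model": "whisper",
        "whisper_config": {"whisper_model": "tiny", "method": "stable"},
    },
    "pre_process": {"sentence_form": True, "spell_check": False,
                    "term_correct": True},
    "translation": {"model": "gpt-4", "chunk_size": 1000},
    "post_process": {"check_len_and_split": True,
                     "remove_trans_punctuation": True},
    "output_type": {"subtitle": "srt", "video": True, "bilingual": True},
}


def with_overrides(base, overrides):
    """Return a deep copy of `base` with nested `overrides` merged in."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections key by key instead of replacing them.
            merged[key] = with_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative overrides only: monolingual subtitles, smaller chunks.
variant = with_overrides(BASE_TASK_CONFIG, {
    "output_type": {"bilingual": False},
    "translation": {"chunk_size": 500},
})
```

If PyYAML is installed, the variant could be written out with `yaml.safe_dump(variant, f)` and the resulting file passed to run.py through `--task_cfg`.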

(back to top)

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

(back to top)

License

Distributed under the GPL-3.0 license. See LICENSE for more information.

(back to top)

Contact

Developed by Pigeon.AI🐦 from Star Pigeon Fan-sub Group.

See Our Bilibili Account

Official Email: gggzmz@163.com

(back to top)