
Streamlit Voice Wizard is a versatile tool that simplifies transcription, translation, and text processing for audio files and YouTube videos.

Primary Language: Python

Streamlit Voice Wizard: Youtube2Text, Transcription, Translation, and Audio Processing

Streamlit Voice Wizard is an all-in-one application that transforms YouTube videos into actionable text insights through a streamlined process of downloading, transcribing, translating, and performing Natural Language Processing (NLP).

youtube2audio + Audio2text + Translate Spanish + NLP + Streamlit

Overview Image

Application Flow:

  1. Download: Retrieve video content from YouTube.
  2. Video-to-Audio: Convert video to audio format for processing.
  3. Audio-to-Text: Transcribe the audio content into text.
  4. Translate: Translate the text into Spanish or other languages.
  5. NLP Analysis: Perform sentiment analysis and entity recognition.
  6. Visualization: Display the processed data through an interactive Streamlit interface.
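The six steps above can be read as a chain of stages that each enrich a shared context. The sketch below is only an illustration of that ordering, not the project's actual code: `run_pipeline`, the stage functions, and every key and value they write are hypothetical placeholders, and no real download, transcription, or translation happens.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, object]
Stage = Callable[[Context], Context]


def run_pipeline(stages: List[Tuple[str, Stage]], context: Context) -> Context:
    """Run each stage in order and record the names of the completed stages."""
    result = dict(context)  # copy so the caller's dict is not mutated
    completed: List[str] = []
    for name, stage in stages:
        result = stage(result)
        completed.append(name)
    result["completed"] = completed
    return result


# Hypothetical placeholder stages: each one only records the kind of
# artifact a real implementation would produce at that step.
def download(ctx: Context) -> Context:
    return {**ctx, "video_path": "video.mp4"}


def video_to_audio(ctx: Context) -> Context:
    return {**ctx, "audio_path": "audio.wav"}


def audio_to_text(ctx: Context) -> Context:
    return {**ctx, "transcript": "placeholder transcript"}


def translate(ctx: Context) -> Context:
    return {**ctx, "translation": "placeholder translation"}


def nlp_analysis(ctx: Context) -> Context:
    return {**ctx, "sentiment": None, "entities": []}


STAGES: List[Tuple[str, Stage]] = [
    ("download", download),
    ("video_to_audio", video_to_audio),
    ("audio_to_text", audio_to_text),
    ("translate", translate),
    ("nlp_analysis", nlp_analysis),
]

if __name__ == "__main__":
    output = run_pipeline(STAGES, {"url": "https://www.youtube.com/watch?v=VIDEO_ID"})
    print(output["completed"])
```

Keeping each stage as a plain function makes it easy to swap a placeholder for a real implementation, or to skip the download stages when the user uploads an audio file directly. The visualization step is left out because it belongs to the Streamlit interface rather than the data flow.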


Key Features

  • Transcription: Convert video and audio content to text through a transcription API or a Whisper model.
  • Translation: Seamless translation of transcriptions to enhance understanding across languages.
  • Sentiment Analysis: Gauge the emotional tone behind the text.
  • Entity Recognition: Identify key entities within the text to extract meaningful insights.
  • Streamlit UI: A user-friendly interface that provides a comprehensive view of the multimedia content.

Getting Started


  • Environment: Python 3.x and pip installed.
  • Dependencies: Install all necessary libraries from requirements.txt.
  • API Keys: Obtain Hugging Face and YouTube Data API v3 keys for use by the code, and keep them secure.


  1. Clone the repository or download the source code.
  2. Navigate to the project directory in the terminal.
  3. Install dependencies: pip install -r requirements.txt
  4. Add API keys to the configuration file or environment variables as required.
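One way to handle the keys from step 4 is to read them from environment variables and fail early when one is missing. This is a minimal sketch under assumptions: the variable names `HUGGINGFACE_API_KEY` and `YOUTUBE_API_KEY` and the helper `load_api_keys` are hypothetical and may not match what the project's code expects.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical variable names; adjust them to whatever the project reads.
REQUIRED_KEYS = ("HUGGINGFACE_API_KEY", "YOUTUBE_API_KEY")


def load_api_keys(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required API keys, raising if any is missing or empty."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_KEYS if not source.get(name)]
    if missing:
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
    return {name: source[name] for name in REQUIRED_KEYS}
```

Calling `load_api_keys()` once at startup surfaces a missing key before any transcription request is made, and keeps the keys out of the source tree.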

Usage Guide

  1. Start the application from the terminal with streamlit run, followed by the path to the app's main script.
  2. Interface operations:
  • Upload an audio file (.wav) or paste a YouTube video URL.
  • Select the transcription method: API or Whisper Model.
  • Click "Transcribe" to initiate the transcription and subsequent translation process.
  3. View Results:
  • Original transcription, translation, and analyses will be dynamically generated and displayed.
  • Audio playback is available for uploaded files.
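Since the interface accepts either a .wav upload or a YouTube URL, it can help to validate the input before starting a transcription. The sketch below uses only the Python standard library. The helper names `extract_video_id` and `is_wav_upload` are hypothetical, and the URL check covers only the common `watch?v=` and `youtu.be` forms, which is an assumption about the URLs users will paste.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a common YouTube URL form, or None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None
    if host in ("youtube.com", "m.youtube.com") and parsed.path == "/watch":
        # Standard links carry the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


def is_wav_upload(filename: str) -> bool:
    """Check the file extension only; the audio content is not inspected."""
    return Path(filename).suffix.lower() == ".wav"
```

Rejecting an unsupported URL or file type up front lets the interface show a clear message instead of failing partway through the download or transcription step.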


Future Enhancements

  • Language Options: Incorporate multi-language support for translation.
  • Configuration Management: Streamline API configuration through environment variables.
  • NLP Support: Extend support for languages like Arabic within NLP analysis.
  • Usability Improvements: Address UI bugs and enhance file management features.
  • Analytics Expansion: Augment channel analytics and integrate advanced search capabilities.


This project is under the MIT License.