Streamlit Voice Wizard is a Streamlit application that turns YouTube videos into text insights by downloading the video, transcribing its audio, translating the transcript, and running Natural Language Processing (NLP) on the result.
- Download: Retrieve video content from YouTube.
- Video-to-Audio: Convert video to audio format for processing.
- Audio-to-Text: Transcribe the audio content into text.
- Translate: Translate the text into Spanish or other languages.
- NLP Analysis: Perform sentiment analysis and entity recognition.
- Visualization: Display the processed data through an interactive Streamlit interface.
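
The stages above form a linear chain in which each step consumes the previous step's output. The following is a minimal sketch of that chaining, assuming each stage can be written as a plain callable. The `run_pipeline` helper and the stand-in stage functions are hypothetical and are not taken from the project's code.

```python
from typing import Any, Callable, Dict, Iterable, Tuple

Stage = Callable[[Any], Any]


def run_pipeline(source: Any, stages: Iterable[Tuple[str, Stage]]) -> Dict[str, Any]:
    """Feed `source` through each stage in order and keep every intermediate result."""
    results: Dict[str, Any] = {}
    current = source
    for name, stage in stages:
        current = stage(current)
        results[name] = current
    return results


# Stand-in stages (hypothetical): a real app would call a downloader,
# an audio converter, and a speech-to-text model here.
demo_stages = [
    ("download", lambda url: f"video({url})"),
    ("to_audio", lambda video: f"audio({video})"),
    ("transcribe", lambda audio: f"text({audio})"),
]

if __name__ == "__main__":
    for name, value in run_pipeline("https://youtu.be/example", demo_stages).items():
        print(f"{name}: {value}")
```

Keeping every intermediate result, rather than only the final one, lets a UI show the transcription, the translation, and the analyses side by side.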
- Transcription: Quick and accurate conversion of video and audio content to text using AI models.
- Translation: Seamless translation of transcriptions to enhance understanding across languages.
- Sentiment Analysis: Gauge the emotional tone behind the text.
- Entity Recognition: Identify key entities within the text to extract meaningful insights.
- Streamlit UI: A user-friendly interface that provides a comprehensive view of the multimedia content.
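
One possible way to implement the sentiment and entity features is the `pipeline` helper from the Hugging Face `transformers` library. This sketch assumes `transformers` is installed and that the library's default models are acceptable. The source does not say which models the project uses, and the names `analyze_text` and `group_entities` are illustrative.

```python
from collections import defaultdict
from typing import Dict, List


def group_entities(entities: List[dict]) -> Dict[str, List[str]]:
    """Group NER results by label, keeping each distinct word once, in order."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ent in entities:
        label = ent["entity_group"]
        word = ent["word"]
        if word not in grouped[label]:
            grouped[label].append(word)
    return dict(grouped)


def analyze_text(text: str) -> dict:
    """Run sentiment analysis and entity recognition on `text`.

    Assumption: default pipeline models, which are downloaded on first use.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    sentiment = pipeline("sentiment-analysis")
    # aggregation_strategy="simple" merges sub-word tokens into whole entities
    # and reports the label under the "entity_group" key.
    ner = pipeline("ner", aggregation_strategy="simple")
    return {
        "sentiment": sentiment(text)[0],
        "entities": group_entities(ner(text)),
    }
```

Long transcripts can exceed a model's input limit, so a real implementation would likely split the text into chunks before calling `analyze_text`.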
- Environment: Python 3.x and pip installed.
- Dependencies: Install all necessary libraries from `requirements.txt`.
- API Keys: Obtain Hugging Face and YouTube Data API v3 keys for the services used in the code.
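
As a hedged illustration of supplying the two keys without hard-coding them, the sketch below reads them from environment variables and fails early if either is missing. The variable names `HUGGINGFACE_API_KEY` and `YOUTUBE_API_KEY` and the `load_api_keys` helper are assumptions, not names used by the project.

```python
import os
from typing import Dict, Mapping


class MissingKeyError(RuntimeError):
    """Raised when a required API key is not set in the environment."""


def load_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read both API keys from `env`.

    The variable names below are assumptions, not names used by the project.
    """
    required = {
        "huggingface": "HUGGINGFACE_API_KEY",
        "youtube": "YOUTUBE_API_KEY",
    }
    # Treat unset and empty variables the same way.
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise MissingKeyError("Missing environment variables: " + ", ".join(missing))
    return {name: env[var] for name, var in required.items()}
```

Accepting the environment as a parameter keeps the function easy to test with a plain dictionary.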
- Clone the repository or download the source code.
- Navigate to the project directory in the terminal.
- Install dependencies: `pip install -r requirements.txt`
- Add API keys to the configuration file or environment variables as required.
- Start the application from the terminal with `streamlit run`, passing the project's main script.
- Interface operations:
  - Upload an audio file (.wav) or paste a YouTube video URL.
  - Select the transcription method: API or Whisper Model.
  - Click "Transcribe" to initiate the transcription and subsequent translation process.
- View Results:
  - Original transcription, translation, and analyses will be dynamically generated and displayed.
  - Audio playback is available for uploaded files.
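
Because the interface accepts either a `.wav` upload or a YouTube URL, the app has to tell the two apart before transcribing. Below is a minimal sketch of that check using only the standard library. The helper names are hypothetical, and only the common `watch?v=` and `youtu.be/` URL forms are handled.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import parse_qs, urlparse

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def youtube_video_id(url: str) -> Optional[str]:
    """Return the video ID from a common YouTube URL form, or None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host not in YOUTUBE_HOSTS:
        return None
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if parsed.path == "/watch":
        # Standard links carry the ID in the "v" query parameter.
        return parse_qs(parsed.query).get("v", [None])[0]
    return None  # other forms (shorts, embeds) are not covered by this sketch


def classify_input(value: str) -> str:
    """Label user input as 'youtube', 'wav', or 'unsupported'."""
    if youtube_video_id(value):
        return "youtube"
    if Path(value).suffix.lower() == ".wav":
        return "wav"
    return "unsupported"
```

For example, `classify_input("talk.wav")` returns `"wav"`, while a URL on a non-YouTube host returns `"unsupported"`.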
- Language Options: Incorporate multi-language support for translation.
- Configuration Management: Streamline API configuration through environment variables.
- NLP Support: Extend support for languages like Arabic within NLP analysis.
- Usability Improvements: Address UI bugs and enhance file management features.
- Analytics Expansion: Augment channel analytics and integrate advanced search capabilities.
This project is licensed under the MIT License.