This project implements a web application where users can upload text or audio files through a HTML/JavaScript interface. The uploaded files are then processed using Flask, Whisper X for transcription and diarization, and OpenAI API for sentiment analysis. The processed results are then returned to the user via the web interface.
- User-friendly Interface: Provides a simple and intuitive interface for users to upload files.
- Text and Audio Support: Supports both text and audio file formats for processing.
- Automated Processing: Utilizes Whisper X for transcription and diarization, and OpenAI API for sentiment analysis.
- Result Visualization: Displays the processed results back to the user in a clear and understandable format.
- GPU Acceleration: Utilizes 1 NVIDIA T4 GPU for faster processing.
- Dockerized Deployment: The application is deployed using a Docker image on the Google Cloud Platform (GCP).
- Flask: Backend server framework for handling file uploads and processing.
- Whisper X: Tool for transcription and diarization of audio files.
- OpenAI API: API for sending requests to GPT-4 LLM.
- HTML/JavaScript: Frontend technologies for building the user interface.
- Docker: Containerization technology for packaging the application.
- Google Cloud Platform (GCP): Cloud platform for deployment.
- NVIDIA T4 GPUs: GPU hardware for acceleration.
- Github clone
- Huggingface
- OpenAI
-
Clone this repository:
git clone https://github.com/YashChopda/SentimentSpeakerAnalyser.git
-
Install dependencies:
pip install -r requirements.txt pip install git+https://github.com/m-bain/whisperx.git
-
Obtain API keys for Whisper X and OpenAI. Place these keys in the appropriate configuration file.
-
Run the Flask application:
python app.py
Alternatively, you can choose to build a Docker container using the Dockerfile provided and run it to access the API.
- The flask server was containerized using Docker.
- The Docker Image was uploaded to Artifact Registry.
- A GPU VM instance with CUDA was created.
- Docker and it's dependencies were installed inside GPU VM instance.
- The Docker Image was pulled from Artifact Registry.
- The Docker Image was served via the GPU VM instance.
- Access the web application in your browser. Follow the given steps:
- Open index.html with any web browser(tested with chrome and safari). An interface will open.
- Select and Upload a text or audio file using the provided interface.
- Wait for the processing to complete.
- View the processed sentiment analysis.
- To run it locally I used CPU, which made the processing quite slow, as the model training for first instance took longer.
- The Instruction and prompt template needed various trials as the output was not consistent each time.
- During the docker build for ease of deployment, multiple dependency errors especially for WhisperX (like installing ffmpeg) were faced.
- Deploying the Docker Image on GPU instance was also difficult.
- Whisper X Team: For providing the transcription and diarization capabilities.
- OpenAI Team: For providing the sentiment analysis API.
- Google Cloud Platform: For providing the cloud infrastructure for deployment.
For questions or feedback, please contact yash.chopda7@gmail.com.