This repository contains a Real-Time Speech-to-Text application built with Next.js for the frontend and Python Flask for the backend. The system captures audio input from the user and transcribes it into text in real time using speech recognition technologies.
The installation steps below are written for Ubuntu machines only.
- Clone the application.
git clone https://github.com/harshadladva97/RealTimeSpeechToText.git
cd RealTimeSpeechToText
- Install Flask and the other required Python packages in a virtual environment.
For Ubuntu:
pip install virtualenv
virtualenv venv
source ./venv/bin/activate
pip install flask flask-cors eventlet SpeechRecognition pyaudio # Install all the required packages.
- Run the backend using the following command:
python3 app.py
- Go to the frontend folder in the application.
cd frontend
- Install the npm packages using pnpm.
pnpm install
- Run the frontend application.
pnpm run dev
- Open the browser at http://127.0.0.1:3000.
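
If the backend fails to start, a quick first check is whether the packages from the pip install step are importable inside the active virtual environment. The following is a minimal sketch of such a check and is not part of the repository. The function name `missing_packages` and the pip-name-to-module mapping are assumptions made for illustration, and the script uses only the Python standard library.

```python
import importlib.util

# Pip package names mapped to the module names they are imported as.
REQUIRED = {
    "flask": "flask",
    "flask-cors": "flask_cors",
    "eventlet": "eventlet",
    "SpeechRecognition": "speech_recognition",
    "pyaudio": "pyaudio",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose modules cannot be found in this environment."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All backend dependencies are importable.")
```

Run it with the virtual environment activated. Any package it reports as missing can be installed again with pip.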
- The application uses the microphone to record speech and transcribes the recording into text.
- It transcribes the text in real time. (Note: If no text appears, check your microphone and the background noise, and speak clearly so that your words can be recognized.)
- Press the "Start Recording" button.
- Once recording starts, speak into the microphone.
- To stop recording, press the "Stop Recording" button.
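
To illustrate how the recording and transcription described above can work, here is a minimal sketch using the SpeechRecognition package installed earlier. It is not the code in app.py, which may work differently. The function names `listen_and_transcribe` and `describe`, the 5-second phrase limit, and the use of the `recognize_google` recognizer are assumptions made for this example.

```python
def describe(text):
    """Turn a transcription result into a message for the user."""
    if text is None:
        return "No speech recognized; check the microphone and background noise."
    return f"Transcribed: {text}"


def listen_and_transcribe(phrase_time_limit=5):
    """Capture one phrase from the default microphone; return its text or None."""
    # Imported here so that describe() can be used without the audio packages.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample the background noise for one second to calibrate the energy threshold.
        recognizer.adjust_for_ambient_noise(source, duration=1)
        audio = recognizer.listen(source, phrase_time_limit=phrase_time_limit)

    try:
        # Assumption: the free Google Web Speech recognizer, which needs internet access.
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # The audio was captured but could not be understood.
        return None
    except sr.RequestError as exc:
        raise RuntimeError("The speech recognition service could not be reached") from exc


if __name__ == "__main__":
    print("Speak now...")
    print(describe(listen_and_transcribe()))
```

Calibrating against ambient noise before listening is one way to reduce the missed transcriptions mentioned in the note above, which is why the sketch does it on every capture.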