audio-to-text

There are 61 repositories under the audio-to-text topic.

Audio-To-Text
The "Audio to Text Transcription with AssemblyAI and Streamlit" project is a web application that allows users to upload audio files and convert them into text using the AssemblyAI API.
Language:Python
Speech-to-text-Realtime-with-extension
"Speech-to-Text Realtime with Extension" is a browser extension that converts speech to text in real-time. It supports multiple languages, making it ideal for note-taking, customer service, and accessibility. Easy to install and use on popular browsers.
Language:Jupyter Notebook
LOKAL_for_Kafka
Event-driven AI > A Python-Kafka event-driven micro-services solution for distributed audio transcriptions.
Language:Python
Whisper-Subtitle-Generator
The Whisper Subtitle Generator leverages OpenAI's Whisper model to generate subtitles from audio and video files. This Python-based tool supports multiple languages and employs advanced audio processing techniques to ensure high accuracy in transcription.
Language:Python
audio-to-text
A simple backend project that uses whisper-rs.
Language:Rust
TranscribeTool
📼 A streamlit web interface designed to extract words from video/audio files into text • Python, FFmpeg, Whisper, YT-DLP
Language:Python
LOKAL_transcriptions
Edge AI > AI app to easily perform transcriptions on regular computers. Quality on par with on-cloud alternatives. Lower costs. Reduced privacy risks.
Language:Python
sml-lab2-2023-manfredi-meneghin
Scalable Machine Learning and Deep Learning, Lab2, 2023
Language:Jupyter Notebook
vialect
Streamline your video/audio intake by transforming multimedia content into navigable collections of transcribed text and summaries!
Language:Python
extract-text-from-image-and-audio-using-google-vision-api
I have used the Google Cloud Vision API to transcribe the audio file and extract the text from the image.
Language:HTML
Transcribe-Reels
Instagram Reels Transcription App is a web-based application built using Streamlit that allows users to transcribe Instagram Reels into text using the AssemblyAI API. The app downloads Instagram Reels, converts them into audio, and transcribes the audio with speaker labels and timestamps.
Language:Python
SpeechToText
Speech-to-Text using OpenAI's Whisper model
Language:CSS
whisper-transcriber
Use OpenAI's Whisper to transcribe audio files and diarize the speakers in the transcribed text
Language:Python
Meeting-Notes
Transcribe Bangla Audio into Text
Language:Jupyter Notebook
whisper_model_evaluator
A comparison of WER, MER, and WIL for the Whisper, Vosk, and Google transcribers
Language:Jupyter Notebook
whisper-timestamped
Timestamped ASR microservice
Language:Jupyter Notebook
whisper-large-v3
Whisper Large V3 is a pre-trained model developed by OpenAI and designed for tasks like automatic speech recognition (ASR), speech translation and language identification.
Language:Python
openai-whisper-large-v2
Whisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. In this template, we will import the Whisper model on Inferless Platform.
Language:Python
TranscriptGen
TranscriptGen is an application for transcribing audio and video files. Transcription output is .txt or .srt. Most audio and video formats supported (with ffmpeg).
Language:Python
audiotext
AudioTextPro: Convert audio to text accurately in real-time using our advanced AI speech recognition technology. 🐍
Language:Python
Easy-PaddleSpeech-Audio-Text-Converter
Convert between audio and text with an easy-to-use GUI desktop application built on PaddleSpeech and PySide6.
Language:Python
AwsTranscribeLambdaFunction
An AWS Lambda function that creates a transcribe job, which reads an mp3 file and converts it into text stored in a JSON file.
Language:Python
lectureNoteAssistant
A Windows desktop application that can generate subtitles, translations, and summaries for videos in 8 languages using APIs and SDKs from Tencent, Alibaba, and Baidu. You can use it to generate bilingual transcripts for videos and to summarise the key points of a transcript using LexRank.
Language:Go
react_app_collage
This application contains "Audio to text", "Dictation", and "Gender prediction" modules.
Language:SCSS
Audio-to-Text
Web app for transcribing audio files (.wav format) to text using the Google Cloud Speech API.
Language:HTML