vivekgoquest's Stars
descriptinc/melgan-neurips
GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis
Curated-Awesome-Lists/awesome-ai-music-generation
A curated compilation of AI-driven generative music resources and projects. Explore the blend of machine learning algorithms and musical creativity.
adefossez/demucs
Code for the paper Hybrid Spectrogram and Waveform Source Separation
IAHispano/Applio
A simple, high-quality voice conversion tool focused on ease of use and performance
aris-ai/Audio-and-text-based-emotion-recognition
A multimodal approach on emotion recognition using audio and text.
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.
huggingface/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
SYSTRAN/faster-whisper
Faster Whisper transcription with CTranslate2
Shahabks/my-voice-analysis
My-Voice Analysis is a Python library for analyzing voice (simultaneous speech, high entropy) without the need for a transcription. It breaks utterances and detects syllable boundaries, fundamental frequency contours, and formants.
Majdoddin/nlp
ancs21/awesome-openai-whisper
A curated list of awesome OpenAI Whisper resources
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
mshumer/ai-researcher
open-mmlab/mmtracking
OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
yuzhms/Streaming-Video-Model
[CVPR2023] Code for "Streaming Video Model"
mu4farooqi/whisperX
WhisperX: Automatic Speech Recognition with Accurate Word-level Timestamps.
MahmoudAshraf97/whisper-diarization
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
jarredou/MVSEP-MDX23-Colab_v2
Colab adaptation of MVSep Model for MDX23 music separation contest
nrl-ai/pautobot
🔥 Your private task assistant with GPT 🔥 - Ask questions about your documents.
justinjohn0306/Wav2Lip
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020.
justinjohn0306/video-retalking
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Human-Lambdas/human-lambdas
Open Source Human in the Loop platform for anyone to run their own private Mechanical Turk.
crowd-sh/crowd-sh
Mechanical Turk for Airtable