xanguera's Stars
tencent-ailab/persona-hub
Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"
VikParuchuri/surya
OCR, layout analysis, reading order, line detection in 90+ languages
lucasnewman/best-rq-pytorch
Implementation of BEST-RQ, a model for self-supervised learning of speech signals using a random projection quantizer, in PyTorch.
WangHelin1997/SpeechTasks
This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent speech tool development, and speech applications.
lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
MycroftAI/mycroft-core
Mycroft Core, the Mycroft Artificial Intelligence platform.
curlconverter/curlconverter
Transpile curl commands into Python, JavaScript and 27 other languages
sbs80/cnn-audio-classification
Audio classification with PyTorch using a convolutional neural network trained on the UrbanSound8K data set.
k2-fsa/k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
google-research/google-research
Google Research
schmiph2/pysepm
Python implementation of performance metrics in Loizou's Speech Enhancement book
joouha/euporie
Jupyter notebooks in the terminal
felixkreuk/UnsupSeg
Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation (INTERSPEECH 2020)
jsvine/pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
revdotcom/fstalign
An efficient OpenFST-based tool for calculating WER and aligning two transcript sequences.
yistLin/FragmentVC
Any-to-any voice conversion by end-to-end extracting and fusing fine-grained voice fragments with attention
idiap/apam
The APAM toolkit is built on PyTorch and provides recipes to adapt pretrained acoustic models with a variety of sequence-discriminative training criteria.
nlathia/kettle-cli
🎯 kettle is a CLI tool for creating and deploying cloud functions & docker containers for machine learning
arpankg/ctci-python-solutions
Cracking the Coding Interview in Python 3. The solutions all have detailed explanations with visuals.
idiap/pkwrap
A PyTorch wrapper for LF-MMI training and parallel training in Kaldi
coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
KarelVesely84/kaldi-io-for-python
Python functions for reading Kaldi data formats. Useful for rapid prototyping with Python.
perfall/Edyson
Flask-based web framework for visualisation and explorative listening of audio.
kamperh/recipe_bucktsong_awe
Unsupervised acoustic word embeddings evaluated on Buckeye English and NCHLT Xitsonga data in Python 2.7.
jaanus/voicebot
A simple web-based voice bot against a DialogFlow (formerly api.ai) backend.
josepatino/pyBK
Python speaker diarization system based on binary key speaker modelling
xanguera/BeamformIt
BeamformIt acoustic beamforming software
flashlight/wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit
tvandame/back-end-developer-interview-questions
A list of helpful back-end related questions you can use to interview potential candidates. Inspired by the git-repo https://github.com/darcyclarke/Front-end-Developer-Interview-Questions.git
dawsonice/KissProxy
NIO-based Android HTTP and HTTPS local proxy.