xanguera

xanguera's Stars

tencent-ailab/persona-hub
Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"
Language: Python, 80759
VikParuchuri/surya
OCR, layout analysis, reading order, line detection in 90+ languages
Language: Python, 10k stars, 650 forks
lucasnewman/best-rq-pytorch
Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.
Language: Python, 848
WangHelin1997/SpeechTasks
This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent speech tool development, and speech applications.
726
lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
Language: Python, 1.3k stars, 99 forks
MycroftAI/mycroft-core
Mycroft Core, the Mycroft Artificial Intelligence platform.
Language: Python, 6.5k stars, 1.3k forks
curlconverter/curlconverter
Transpile curl commands into Python, JavaScript and 27 other languages
Language: TypeScript, 7.5k stars, 917 forks
sbs80/cnn-audio-classification
Audio classification with PyTorch using a convolutional neural network trained on the UrbanSound8K data set.
Language: Python, 8 stars
k2-fsa/k2
FSA/FST algorithms, differentiable, with PyTorch compatibility.
Language: Cuda, 1.1k stars, 213 forks
google-research/google-research
Google Research
Language: Jupyter Notebook, 34k stars, 7.8k forks
schmiph2/pysepm
Python implementation of performance metrics in Loizou's Speech Enhancement book
Language: Python, 37986
joouha/euporie
Jupyter notebooks in the terminal
Language: Python, 1.6k stars, 38 forks
felixkreuk/UnsupSeg
Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation (INTERSPEECH 2020)
Language: Python, 13529
jsvine/pdfplumber
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
Language: Python, 6.5k stars, 657 forks
revdotcom/fstalign
An efficient OpenFST-based tool for calculating WER and aligning two transcript sequences.
Language: C++, 1457
yistLin/FragmentVC
Any-to-any voice conversion by end-to-end extracting and fusing fine-grained voice fragments with attention
Language: Python, 19738
idiap/apam
APAM toolkit is built on PyTorch and provides recipes to adapt pretrained acoustic models with a variety of sequence discriminative training criterions.
Language: Python, 141
nlathia/kettle-cli
🎯 kettle is a CLI tool for creating and deploying cloud functions & docker containers for machine learning
Language: Go, 321
arpankg/ctci-python-solutions
Cracking the Coding Interview in Python 3. The solutions all have detailed explanations with visuals.
1.1k stars, 159 forks
idiap/pkwrap
A pytorch wrapper for LF-MMI training and parallel training in Kaldi
Language: Python, 7312
coqui-ai/open-speech-corpora
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
1.3k stars, 137 forks
KarelVesely84/kaldi-io-for-python
Python functions for reading kaldi data formats. Useful for rapid prototyping with python.
Language: Python, 376119
perfall/Edyson
Flask-based web framework for visualisation and explorative listening of audio.
Language: JavaScript, 519
kamperh/recipe_bucktsong_awe
Unsupervised acoustic word embeddings evaluated on Buckeye English and NCHLT Xitsonga data in Python 2.7.
Language: Jupyter Notebook, 62
jaanus/voicebot
A simple web-based voice bot against DialogFlow (formerly api.ai) backend.
Language: JavaScript, 3934
josepatino/pyBK
Speaker diarization python system based on binary key speaker modelling
Language: Python, 6111
xanguera/BeamformIt
BeamformIt acoustic beamforming software
Language: C++, 347111
flashlight/wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit
Language: C++, 6.4k stars, 1k forks
tvandame/back-end-developer-interview-questions
A list of helpful back-end related questions you can use to interview potential candidates. Inspired by the git-repo https://github.com/darcyclarke/Front-end-Developer-Interview-Questions.git
897149
dawsonice/KissProxy
NIO based android http&https local proxy.
Language: Java, 12235