naymaraq's Stars
YerevaNN/ChemLactica
Fine-tuning Galactica and Gemma to operate on SMILES. Integrates into a molecular optimization algorithm.
lhotse-speech/lhotse
Tools for handling speech data in machine learning projects.
jonmay/ASTRAPOP-yer24
[Yerevan 24] Authorship Style Transfer with Policy Optimization
Picsart-AI-Research/StreamingT2V
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
tango4j/llm_speaker_tagging
SLT 2024 Challenge: Post-ASR-Speaker-Tagging
huggingface/open_asr_leaderboard
NVIDIA/NeMo-speech-data-processor
A toolkit for processing speech data and creating speech datasets
meta-llama/llama3
The official Meta Llama 3 GitHub site
jasonppy/VoiceCraft
Zero-Shot Speech Editing and Text-to-Speech in the Wild
lucidrains/ring-attention-pytorch
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
Vaibhavs10/insanely-fast-whisper
BUTSpeechFIT/DiaPer
dmlguq456/NeXt_TDNN_ASV
Official repository of NeXt-TDNN for speaker verification
huggingface/distil-whisper
Distilled variant of Whisper for speech recognition. 6x faster, 50% smaller, within 1% word error rate.
deep-privacy/x-vector-procrustes
Supervised/Unsupervised Alignment of Clear/Anonymized X-Vector with Procrustes/Wasserstein Procrustes
desh2608/dover-lap
Python package for combining diarization system outputs.
unum-cloud/usearch
Fast Open-Source Search & Clustering engine × for Vectors & 🔜 Strings × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
NVIDIA/NeMo-text-processing
NeMo text processing for ASR and TTS
Wadaboa/titanet
Speaker identification/verification models for Machine Learning for Computer Vision class at UNIBO
google/sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
burchim/EfficientConformer
[ASRU 2021] Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition
microsoft/onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
chenfei-wu/TaskMatrix
mlcommons/training
Reference implementations of MLPerf™ training benchmarks
flashlight/text
Text utilities, including beam search decoding, tokenizing, and more, built for use in Flashlight.
parlance/ctcdecode
PyTorch CTC Decoder bindings
kensho-technologies/pyctcdecode
A fast and lightweight Python-based CTC beam search decoder for speech recognition.
NVIDIA/NeMo
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
google-deepmind/alphatensor
openai/whisper
Robust Speech Recognition via Large-Scale Weak Supervision