harryjulian's Stars
neonbjb/tts-scores
Scripts for computing the Intelligibility and CLVP scores for evaluating TTS models
jbloomAus/SAELens
Training Sparse Autoencoders on Language Models
EleutherAI/mdl
Minimum Description Length probing for neural network representations
ARBML/tnkeeh
Arabic cleaning, normalization and segmentation library.
Sg4Dylan/libvits-ncnn
libvits-ncnn is an ncnn implementation of the VITS library that enables cross-platform GPU-accelerated speech synthesis. 🎙️💻
liutaocode/TTS-arxiv-daily
Automatically updates text-to-speech (TTS) papers daily using GitHub Actions (updates every 12 hours)
xingchensong/S3Tokenizer
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
Orange-OpenSource/Cool-Chic
Low-complexity neural image & video codec.
soerenab/AudioMNIST
facebookresearch/flow_matching
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
neuphonic/pyneuphonic
Python SDK for the Neuphonic TTS engine.
ElanaPearl/InterPLM
Discovering Interpretable Features in Protein Language Models via Sparse Autoencoders
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
jrgillick/laughter-detection
olsdavis/fisher-flow
Official implementation of Fisher-Flow Matching (NeurIPS 2024).
ccr-cheng/statistical-flow-matching
Official implementation of the NeurIPS 2024 paper on statistical flow matching (SFM) for discrete generation.
Jhomanik/Optimal-Flow-Matching
The official repository for the paper "Optimal Flow Matching: Learning Straight Trajectories in Just One Step" (NeurIPS 2024)
kohei0209/self-remixing
Official implementation of Self-Remixing
zju-pi/diff-sampler
An open-source toolbox for fast sampling of diffusion models. Official implementations of our works published in ICML, NeurIPS, CVPR.
ml-jku/HopCPT
Conformal Prediction for Time Series with Modern Hopfield Networks
google/XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Choddeok/EmoSpherepp
The official implementation of EmoSphere++
juanmc2005/diart
A Python package to build AI-powered real-time audio applications
halsay/ASR-TTS-paper-daily
Updates ASR papers every day
google/speaker-id
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
bytedance/uss
PyTorch implementation of Universal Source Separation with Weakly Labelled Data.
marl/crepe
CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)
alibabasglab/MossFormer2
Audio sample repository for the speech separation model "MossFormer2".
snakers4/silero-vad
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
revdotcom/fstalign
An efficient OpenFST-based tool for calculating WER and aligning two transcript sequences.