wsstriving

Shanghai Jiao Tong University

wsstriving's Stars

ggerganov/whisper.cpp
Port of OpenAI's Whisper model in C/C++
Language: C 34.8k 312 1.3k3.5k
tatsu-lab/stanford_alpaca
Code and documentation to train Stanford's Alpaca models, and generate the data.
Language: Python 29.4k 339 2684k
mli/paper-reading
Paragraph-by-paragraph close reading of classic and new deep learning papers
26.5k 725 02.4k
Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Language: Python 13.6k 115 1k1.2k
m-bain/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Language: Python 11.7k 135 6931.2k
modelscope/FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Language: Python 6.2k 58 1.1k658
facebookresearch/encodec
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
Language: Python 3.4k 57 70305
spotify/basic-pitch
A lightweight yet powerful audio-to-MIDI converter with pitch bend detection
Language: Python 3.4k 49 77261
openai/improved-diffusion
Release for Improved Denoising Diffusion Probabilistic Models
Language: Python 3.2k 123 133480
lucidrains/musiclm-pytorch
Implementation of MusicLM, Google's new SOTA model for music generation using attention networks, in Pytorch
Language: Python 3.1k 99 53254
microsoft/torchscale
Foundation Architecture for (M)LLMs
Language: Python 3k 46 77202
enhuiz/vall-e
An unofficial PyTorch implementation of the audio LM VALL-E
Language: Python 2.9k 87 97415
haoheliu/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Language: Python 2.4k 42 107221
lucidrains/audiolm-pytorch
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in Pytorch
Language: Python 2.4k 60 170255
lucidrains/lion-pytorch
🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in Pytorch
Language: Python 2k 15 2349
WenzheLiu-Speech/awesome-speech-enhancement
speech enhancement / speech separation / sound source localization
1k 43 1221
microsoft/tutel
Tutel MoE: An Optimized Mixture-of-Experts Implementation
Language: Python 715 14 6188
magenta/ddsp-vst
Realtime DDSP Neural Synthesizer and Effect
Language: C++ 709 40 4669
FuxiVirtualHuman/styletalk
Language: Python 501 59 2150
MasayaKawamura/MB-iSTFT-VITS
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform
Language: Python 417 17 2664
mpariente/pystoi
Python implementation of the Short Term Objective Intelligibility measure
Language: MATLAB 319 13 1960
zhangyongmao/VISinger2
VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer
Language: Python 309 12 2342
haoheliu/audioldm_eval
This toolbox aims to unify audio generation model evaluation for easier comparison.
Language: Python 291 5 931
interactiveaudiolab/penn
Pitch Estimating Neural Networks (PENN)
Language: Python 229 9 1221
adobe-research/convmelspec
Convmelspec: Convertible Melspectrograms via 1D Convolutions
Language: Python 131 11 59
tango4j/Auto-Tuning-Spectral-Clustering
This repo is for the SPL paper "Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap"
Language: Python 105 7 215
desh2608/gss
A simple package for Guided source separation (GSS)
Language: Python 104 5 813
BUTSpeechFIT/EEND
Language: Python 71 8 89
fss1t/CausalStarGANv2-VC
Language: Python 22 2 76
Nathan-Roll1/PSST
Prosodic Speech Segmentation with Transformers
Language: Jupyter Notebook 22 4 25
