wsstriving

Shanghai Jiao Tong University

wsstriving's Stars

Vision-CAIR/MiniGPT-4
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
Language: Python 25.3k 218 4592.9k
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
Language: Python 20.7k 205 3752.1k
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Language: Python 12k 170 233816
SJTU-IPADS/PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Language: C++ 7.9k 77 161406
pyannote/pyannote-audio
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Language: Jupyter Notebook 6k 71 989758
XuehaiPan/nvitop
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
Language: Python 4.7k 25 83148
togethercomputer/RedPajama-Data
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Language: Python 4.5k 76 89346
wenet-e2e/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Language: Python 4.1k 90 1k1.1k
Jittor/jittor
Jittor is a high-performance deep learning framework based on JIT compiling and meta-operators.
Language: Python 3.1k 64 354310
haoheliu/AudioLDM2
Text-to-Audio/Music Generation
Language: Python 2.3k 45 70177
lucidrains/naturalspeech2-pytorch
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch
Language: Python 1.3k 53 3199
descriptinc/descript-audio-codec
State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.
Language: Python 1.1k 27 74106
gemelo-ai/vocos
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Language: Python 771 33 4688
wenet-e2e/wespeaker
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
Language: Python 686 19 113116
DmitryRyumin/INTERSPEECH-2023-24-Papers
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!
634 88 442
LAION-AI/audio-dataset
Audio Dataset for training CLAP and other models
Language: Python 618 21 5853
ZhangXInFD/SpeechTokenizer
This is the code for the SpeechTokenizer presented in the SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models. Samples are presented on
Language: Python 435 15 1439
Zain-Jiang/Speech-Editing-Toolkit
It's a repository for implementations of neural speech editing algorithms.
Language: Python 187 9 2419
zhenye234/CoMoSpeech
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
Language: Python 177 12 1118
mct10/RepCodec
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
Language: Python 147 14 610
Grace9994/CoMoSVC
CoMoSVC: One-Step Consistency Model Based Singing Voice Conversion & Singing Voice Clone
Language: Python 126 3 1218
X-LANCE/UniCATS-CTX-vec2wav
[AAAI 2024] Code for CTX-vec2wav in UniCATS
Language: Python 117 10 916
X-LANCE/UniCATS-CTX-txt2vec
[AAAI 2024] CTX-txt2vec, the acoustic model in UniCATS
Language: Python 60 7 118
DigitalPhonetics/speaker-anonymization
Speaker anonymization pipeline for hiding the identity of the speaker of a recording by changing the voice in it.
Language: Shell 57 7 54
pengzhendong/pyannote-onnx
ONNX Inference of Pyannote Segmentation
Language: Python 57 5 715
TomJwYu/WenetSpeechSpeakerCluster
55 5 42
yuguochencuc/SF-Net
The implementation of "Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Real-Time Full-Band Speech Enhancement"
Language: Python 50 2 19
HuangZiliAndy/SSL_for_multitalker
ADAPTING SELF-SUPERVISED MODELS TO MULTI-TALKER SPEECH RECOGNITION USING SPEAKER EMBEDDINGS
Language: Shell 26 3 11
nan-bean/dcnn-sv
An implementation of dynamic convolution algorithm based system for speaker verification.
Language: Python 5 1 0
npuichigo/snake
Data loading with combined async Rust stream and Python
Language: Rust 5 2 0
