Pinned Repositories
asv-subtools
An open-source toolkit for speaker recognition
AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
cube-studio
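Cloud-native, one-stop machine learning platform: multi-tenancy, data assets, online notebook development, drag-and-drop task-flow orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference serving, multi-cluster scheduling, multi-project resource groups, edge computing, real-time training of large models, and an AI app store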
fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
kaldi_org
This is now the official location of the Kaldi project.
LeetcodeTop
A collection of high-frequency LeetCode problems commonly asked by major internet companies 🔥 Recommended practice site: https://www.lintcode.com/?utm_source=tf-github-codetop
wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
wetts
Production First and Production Ready End-to-End Text-to-Speech Toolkit
whisper-jax
Faster Whisper inference
zh-google-styleguide
Google open-source project style guides (Chinese edition)
donstang's Repositories
donstang/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
donstang/whisper-jax
Faster Whisper inference
donstang/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.
donstang/audio-diffusion-pytorch
Audio generation using diffusion models, in PyTorch.
donstang/audiocraft_meta
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
donstang/BELLE
BELLE: Be Everyone's Large Language model Engine (open-source Chinese conversational large language model)
donstang/CTranslate2
Fast inference engine for Transformer models
donstang/dcgm-exporter
NVIDIA GPU metrics exporter for Prometheus leveraging DCGM
donstang/DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
donstang/DiffSinger
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
donstang/g2pW
Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)
donstang/gradio
Create UIs for your machine learning model in Python in 3 minutes
donstang/KAN-TTS
KAN-TTS is a speech-synthesis training framework. Please try the demos posted at https://modelscope.cn/models?page=1&tasks=text-to-speech
donstang/Leaderboard
SpeechIO Leaderboard: a large, robust, comprehensive, benchmarking platform for Automatic Speech Recognition.
donstang/MOSS
An open-source tool-augmented conversational language model from Fudan University
donstang/PaddleSpeech
Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
donstang/RedPajama-Data-LLM
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
donstang/riffusion
Stable diffusion for real-time music generation
donstang/server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
donstang/so-vits-svc
SoftVC VITS Singing Voice Conversion
donstang/speechmetrics_tts_eval
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
donstang/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
donstang/SpokenNLP
NLP processing for meetings
donstang/tango
Code and model for the paper "Text-to-Audio Generation using Instruction Tuned LLM and Latent Diffusion Model"
donstang/tts-bark
🔊 Text-prompted Generative Audio Model
donstang/VALL-E-X
An open-source implementation of Microsoft's VALL-E X zero-shot TTS model. A demo is available at https://plachtaa.github.io
donstang/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
donstang/whisper
openai
donstang/Whisper-Finetune
Fine-tune the Whisper speech recognition model and accelerate inference
donstang/whisper.cpp
Port of OpenAI's Whisper model in C/C++