MisakaMikoto96's Stars
Significant-Gravitas/AutoGPT
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
PlexPt/awesome-chatgpt-prompts-zh
A Chinese-language guide to prompting ChatGPT. Usage guides for a variety of scenarios. Learn how to make it do what you tell it.
suno-ai/bark
🔊 Text-Prompted Generative Audio Model
svc-develop-team/so-vits-svc
SoftVC VITS Singing Voice Conversion
facebookresearch/audiocraft
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.
EmbraceAGI/awesome-chatgpt-zh
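ChatGPT Chinese guide 🔥: a Chinese-language prompting guide, instruction guide, application development guide, and curated resource list. Use ChatGPT better and send your productivity up, up, up! 🚀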
AIGC-Audio/AudioGPT
AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
THUDM/VisualGLM-6B
Chinese and English multimodal conversational language model
collabora/WhisperSpeech
An Open Source text-to-speech system built by inverting Whisper.
yuanzhoulvpi2017/zero_nlp
Chinese NLP solutions (large models, data, models, training, inference)
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
KaiyangZhou/CoOp
Prompt Learning for Vision-Language Models (IJCV'22, CVPR'22)
microsoft/SpeechT5
Unified-Modal Speech-Text Pre-Training for Spoken Language Processing
yzhuoning/Awesome-CLIP
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
TencentGameMate/chinese_speech_pretrain
Chinese speech pretrained models
cjyaddone/ChatWaifu
Combines ChatGPT with MoeGoe TTS to create a chatting waifu
yeyupiaoling/PPASR
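End-to-end Chinese speech recognition built on PaddlePaddle, from getting started to hands-on practice, with very simple introductory examples and highly practical enterprise projects. Supports the currently most popular models: DeepSpeech2, Conformer, and Squeezeformer.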
gitmylo/bark-voice-cloning-HuBERT-quantizer
The code for the bark-voicecloning model. Training and inference.
clue-ai/PromptCLUE
PromptCLUE, a zero-shot learning model that supports all Chinese-language tasks
facebookresearch/WavAugment
A library for speech data augmentation in time-domain
yangdongchao/AcademiCodec
AcademiCodec: An Open Source Audio Codec Model for Academic Research
auspicious3000/contentvec
speech self-supervised representations
sophiefy/Sovits
An unofficial implementation of the combination of Soft-VC and VITS
descriptinc/lyrebird-wav2clip
Official implementation of the paper "Wav2CLIP: Learning Robust Audio Representations from CLIP"
Rongjiehuang/GenerSpeech
PyTorch Implementation of GenerSpeech (NeurIPS'22): a text-to-speech model towards zero-shot style transfer of OOD custom voice.
hche11/VGGSound
VGGSound: A Large-scale Audio-Visual Dataset
microsoft/Pengi
An Audio Language model for Audio Tasks
aoifemcdonagh/audioset-processing
Toolkit for downloading and processing Google's AudioSet dataset.
Moon0316/T2A
Project page for "Improving Few-shot Learning for Talking Face System with TTS Data Augmentation" (ICASSP 2023)
gitmylo/bark-data-gen
Create training data for training a voice cloner for bark text to speech.