lunar333's Stars
asteroid-team/asteroid
The PyTorch-based audio source separation toolkit for researchers
resemble-ai/Resemblyzer
A python package to analyze and compare voices with deep learning
dtlnor/stable-diffusion-webui-localization-zh_CN
Simplified Chinese translation extension for AUTOMATIC1111's stable diffusion webui
bshall/hubert
HuBERT content encoders for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion
facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
THUDM/ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
resautu/chat-with-Elysia
serp-ai/bark-with-voice-clone
🔊 Text-prompted Generative Audio Model - With the ability to clone voices
vocaliodmiku/wav2vec2mdd
End-to-End Mispronunciation Detection via wav2vec2.0
b04901014/FT-w2v2-ser
Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition
Renovamen/Speech-Emotion-Recognition
Speech emotion recognition implemented in Keras (LSTM, CNN, SVM, MLP)
DemisEom/SpecAugment
An implementation of SpecAugment with TensorFlow & PyTorch, introduced by Google Brain
cageyoko/CTC-Attention-Mispronunciation
A Full Text-Dependent End to End Mispronunciation Detection and Diagnosis with Easy Data Augment Techniques
xmu-xiaoma666/External-Attention-pytorch
🍀 PyTorch implementation of various attention mechanisms, MLP, re-parameterization, and convolution modules, helpful for further understanding the papers.⭐⭐⭐
b04901014/FG-transformer-TTS
Official implementation for the paper Fine-grained style control in transformer-based text-to-speech synthesis.
CjangCjengh/vits
VITS implementation of Japanese, Chinese, Korean, Sanskrit and Thai
innnky/emotional-vits
An emotion-controllable speech synthesis model that requires no emotion annotations, based on VITS
KinglittleQ/GST-Tacotron
A PyTorch implementation of Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis
Plachtaa/VITS-fast-fine-tuning
This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
prophesier/diff-svc
Singing Voice Conversion via diffusion model
TParcollet/E2E-SincNet
E2E-SincNet: Toward fully end-to-end speech recognition
CjangCjengh/MoeGoe
Executable file for VITS inference