Pinned Repositories
Speech-Resources
Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome
academic-kickstart
My Academic Homepage
ASRdys
ASR for dysarthric speakers with Kaldi
AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
awesome-embodied-vision
Reading list for research topics in embodied vision
Awesome-Multimodal-Research
A curated list of Multimodal Related Research.
Awesome-Transformer-Attention
A comprehensive paper list on Vision Transformer/Attention, including papers, code, and related websites
DYGANVC
Source code for "DYGAN-VC: IMPROVING SPEECH CONTENT PRESERVATION FOR GAN VOICE CONVERSION USING DYNAMIC CONVOLUTION"
ppg-vc
PPG-Based Voice Conversion
Mortyzhou-Shef-BIT's Repositories
Mortyzhou-Shef-BIT/Awesome-Transformer-Attention
A comprehensive paper list on Vision Transformer/Attention, including papers, code, and related websites
Mortyzhou-Shef-BIT/ppg-vc
PPG-Based Voice Conversion
Mortyzhou-Shef-BIT/Speech-Resources
Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome
Mortyzhou-Shef-BIT/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Mortyzhou-Shef-BIT/awesome-embodied-vision
Reading list for research topics in embodied vision
Mortyzhou-Shef-BIT/Awesome-Multimodal-Research
A curated list of Multimodal Related Research.
Mortyzhou-Shef-BIT/DYGANVC
Source code for "DYGAN-VC: IMPROVING SPEECH CONTENT PRESERVATION FOR GAN VOICE CONVERSION USING DYNAMIC CONVOLUTION"
Mortyzhou-Shef-BIT/speech-synthesis-paper
List of speech synthesis papers.
Mortyzhou-Shef-BIT/Awesome-Cloud-Edge-AI
A curated list of research in Systems for Edge Intelligence and Computing (Edge MLSys), including frameworks, tools, repositories, etc. Paper notes are also provided.
Mortyzhou-Shef-BIT/CMU-MultimodalSDK
CMU MultimodalSDK is a machine learning platform for development of advanced multimodal models as well as easily accessing and processing multimodal datasets.
Mortyzhou-Shef-BIT/crank
A toolkit for non-parallel voice conversion based on vector-quantized variational autoencoder
Mortyzhou-Shef-BIT/dialog_evaluation_paper_list
Dialog Evaluation Paper List: covers multiple different dialog tasks
Mortyzhou-Shef-BIT/diffwave
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Mortyzhou-Shef-BIT/espnet_model_zoo
ESPnet Model Zoo
Mortyzhou-Shef-BIT/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Mortyzhou-Shef-BIT/FastVocoder
Includes Basis-MelGAN, MelGAN, HifiGAN and Multiband-HifiGAN, maybe NHV in the future.
Mortyzhou-Shef-BIT/gdown
Download a large file from Google Drive (curl/wget fails because of the security notice).
Mortyzhou-Shef-BIT/HiSD
Official PyTorch implementation of the paper "Image-to-image Translation via Hierarchical Style Disentanglement" (CVPR 2021 Oral).
Mortyzhou-Shef-BIT/Pytorch-MBNet
A PyTorch implementation of MBNET: MOS PREDICTION FOR SYNTHESIZED SPEECH WITH MEAN-BIAS NETWORK
Mortyzhou-Shef-BIT/reentry
Mortyzhou-Shef-BIT/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit.
Mortyzhou-Shef-BIT/speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
Mortyzhou-Shef-BIT/SpeechTransProgress
Tracking the progress in end-to-end speech translation
Mortyzhou-Shef-BIT/StarGANv2-VC
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
Mortyzhou-Shef-BIT/Talking-Face_PC-AVS
Code for Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021)
Mortyzhou-Shef-BIT/TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
Mortyzhou-Shef-BIT/tango
Code and model for the paper "Text-to-Audio Generation using Instruction Tuned LLM and Latent Diffusion Model"
Mortyzhou-Shef-BIT/transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.
Mortyzhou-Shef-BIT/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Mortyzhou-Shef-BIT/VQMIVC
Official implementation of VQMIVC: One-shot Voice Conversion @ Interspeech 2021