Mortyzhou-Shef-BIT

Living with attention is all we need.

UoS -> NUS & BIT

Pinned Repositories

Speech-Resources
Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome.
441 20 1660
academic-kickstart
My Academic Homepage
Language: Shell
00
ASRdys
ASR for dysarthric speakers with Kaldi
Language: Shell
00
AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Language: Python
0 0 00
awesome-embodied-vision
Reading list for research topics in embodied vision
0 0 00
Awesome-Multimodal-Research
A curated list of Multimodal Related Research.
Language: Python
0 0 00
Awesome-Transformer-Attention
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
1 0 00
DYGANVC
source code for "DYGAN-VC: IMPROVING SPEECH CONTENT PRESERVATION FOR GAN VOICE CONVERSION USING DYNAMIC CONVOLUTION"
Language: Python
0 0 00
ppg-vc
PPG-Based Voice Conversion
Language: Python
1 0 00
Speech-Resources
Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome.
1 0 00

Mortyzhou-Shef-BIT's Repositories

Mortyzhou-Shef-BIT/Awesome-Transformer-Attention
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
1 0 00
Mortyzhou-Shef-BIT/ppg-vc
PPG-Based Voice Conversion
Language: Python
1 0 00
Mortyzhou-Shef-BIT/Speech-Resources
Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome.
1 0 00
Mortyzhou-Shef-BIT/AudioLDM
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Language: Python
0 0 00
Mortyzhou-Shef-BIT/awesome-embodied-vision
Reading list for research topics in embodied vision
0 0 00
Mortyzhou-Shef-BIT/Awesome-Multimodal-Research
A curated list of Multimodal Related Research.
Language: Python
0 0 00
Mortyzhou-Shef-BIT/DYGANVC
source code for "DYGAN-VC: IMPROVING SPEECH CONTENT PRESERVATION FOR GAN VOICE CONVERSION USING DYNAMIC CONVOLUTION"
Language: Python
0 0 00
Mortyzhou-Shef-BIT/speech-synthesis-paper
List of speech synthesis papers.
0 0 00
Mortyzhou-Shef-BIT/Awesome-Cloud-Edge-AI
A curated list of research in System for Edge Intelligence and Computing (Edge MLSys), including Frameworks, Tools, Repository, etc. Paper notes are also provided.
0 0
Mortyzhou-Shef-BIT/CMU-MultimodalSDK
CMU MultimodalSDK is a machine learning platform for development of advanced multimodal models as well as easily accessing and processing multimodal datasets.
Language: Python
0 0
Mortyzhou-Shef-BIT/crank
A toolkit for non-parallel voice conversion based on vector-quantized variational autoencoder
Language: Python
0 0
Mortyzhou-Shef-BIT/dialog_evaluation_paper_list
Dialog Evaluation Paper List: include multiple different dialog tasks
0 0
Mortyzhou-Shef-BIT/diffwave
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Language: Python
0 0
Mortyzhou-Shef-BIT/espnet_model_zoo
ESPnet Model Zoo
Language: Python
0 0
Mortyzhou-Shef-BIT/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Language: Python
0 0
Mortyzhou-Shef-BIT/FastVocoder
Include Basis-MelGAN, MelGAN, HifiGAN and Multiband-HifiGAN, maybe NHV in the future.
Language: Python
0 0
Mortyzhou-Shef-BIT/gdown
Download a large file from Google Drive (curl/wget fails because of the security notice).
Language: Python
0 0
Mortyzhou-Shef-BIT/HiSD
Official pytorch implementation of paper "Image-to-image Translation via Hierarchical Style Disentanglement" (CVPR 2021 Oral).
Language: Python
0 0
Mortyzhou-Shef-BIT/Pytorch-MBNet
A pytorch implementation of MBNET: MOS PREDICTION FOR SYNTHESIZED SPEECH WITH MEAN-BIAS NETWORK
Language: Python
0 0
Mortyzhou-Shef-BIT/reentry
Language: Python
0 0
Mortyzhou-Shef-BIT/s3prl
Self-Supervised Speech Pre-training and Representation Learning Toolkit.
Language: Python
0 0
Mortyzhou-Shef-BIT/speechmetrics
A wrapper around speech quality metrics MOSNet, BSSEval, STOI, PESQ, SRMR, SISDR
Language: Python
0 0
Mortyzhou-Shef-BIT/SpeechTransProgress
Tracking the progress in end-to-end speech translation
0 0
Mortyzhou-Shef-BIT/StarGANv2-VC
StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
Language: Python
0 0
Mortyzhou-Shef-BIT/Talking-Face_PC-AVS
Code for Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation (CVPR 2021)
Language: Python
0 0
Mortyzhou-Shef-BIT/TalkNet-ASD
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
Language: Python
0 0
Mortyzhou-Shef-BIT/tango
Codes and Model of the paper "Text-to-Audio Generation using Instruction Tuned LLM and Latent Diffusion Model"
Language: Python
0 0
Mortyzhou-Shef-BIT/transformers
🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
Language: Python
0 0
Mortyzhou-Shef-BIT/TTS
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Language: Python
0 0
Mortyzhou-Shef-BIT/VQMIVC
Official implementation of VQMIVC: One-shot Voice Conversion @ Interspeech 2021
Language: Python
0 0