wnhsu
Research Scientist @ Facebook AI Research (FAIR). Former PhD Student @ MIT Spoken Language Systems Group
Pinned Repositories
FactorizedHierarchicalVAE
This repository contains the code to reproduce the core results from the paper "Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data"
PGLSTM_ASR
This repo contains code to reproduce the core results of "A Prioritized Grid Long Short-Term Memory RNN for Speech Recognition"
ResDAVEnet-VQ
Official code for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"
ReVISE
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement
ScalableFHVAE
This repository contains the code to reproduce the core results from the paper "Scalable Factorized Hierarchical Variational Autoencoders"
semi-supervised-pytorch
Implementations of different VAE-based semi-supervised and generative models in PyTorch
SpeechVAE
This repository contains the code to reproduce the core results from the paper "Learning Latent Representations for Speech Generation and Transformation".
tacotron2_dev
tensorflow-wavenet
A TensorFlow implementation of DeepMind's WaveNet paper
wavenet_vocoder
WaveNet vocoder
wnhsu's Repositories
wnhsu/FactorizedHierarchicalVAE
This repository contains the code to reproduce the core results from the paper "Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data"
wnhsu/ScalableFHVAE
This repository contains the code to reproduce the core results from the paper "Scalable Factorized Hierarchical Variational Autoencoders"
wnhsu/SpeechVAE
This repository contains the code to reproduce the core results from the paper "Learning Latent Representations for Speech Generation and Transformation".
wnhsu/ResDAVEnet-VQ
Official code for the paper "Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech"
wnhsu/ReVISE
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement
wnhsu/PGLSTM_ASR
This repo contains code to reproduce the core results of "A Prioritized Grid Long Short-Term Memory RNN for Speech Recognition"
wnhsu/semi-supervised-pytorch
Implementations of different VAE-based semi-supervised and generative models in PyTorch
wnhsu/tensorflow-wavenet
A TensorFlow implementation of DeepMind's WaveNet paper
wnhsu/tacotron2_dev
wnhsu/wavenet_vocoder
WaveNet vocoder
wnhsu/ZeroSpeech2019_RLE_eval
ZeroSpeech 2019 evaluation with run-length encoding (RLE), producing the metrics reported in ResDAVEnet-VQ.
wnhsu/a-PyTorch-Tutorial-to-Image-Captioning
Show, Attend, and Tell | a PyTorch Tutorial to Image Captioning
wnhsu/ABXpy
ABX discrimination task in Python
wnhsu/CNTK
Microsoft Cognitive Toolkit (CNTK), an open source deep-learning toolkit
wnhsu/einops
Deep learning operations reinvented (for PyTorch, TensorFlow, JAX, and others)
wnhsu/espnet_tts_frontend
Text frontend for ESPnet TTS recipes
wnhsu/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
wnhsu/image-to-speech-demo
wnhsu/kaldi
kaldi-asr/kaldi is the official location of the Kaldi project.
wnhsu/show-attend-and-tell
TensorFlow Implementation of "Show, Attend and Tell"
wnhsu/wav2letter
Facebook AI Research's Automatic Speech Recognition Toolkit
wnhsu/wnhsu.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes