yshnny
Homepage-A: https://vipl.ict.ac.cn/people/~syang Homepage-B: https://scholar.google.com/citations?user=8wizL74AAAAJ&hl=en
Institute of Computing Technology, Chinese Academy of Sciences
yshnny's Stars
tensorflow/models
Models and examples built with TensorFlow
tensorpack/tensorpack
A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility
nailperry-zd/The-Economist
The Economist magazine, continuously updated
torch/nn
mpc001/Visual_Speech_Recognition_for_Multiple_Languages
Visual Speech Recognition for Multiple Languages
VIPL-Audio-Visual-Speech-Understanding/LipNet-PyTorch
The state-of-the-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxiv.org/abs/1611.01599)
mpc001/end-to-end-lipreading
PyTorch code for End-to-End Audiovisual Speech Recognition
VIPL-Audio-Visual-Speech-Understanding/learn-an-effective-lip-reading-model-without-pains
The PyTorch code and model from "Learn an Effective Lip Reading Model without Pains" (https://arxiv.org/abs/2011.07557), which reaches state-of-the-art performance on the LRW-1000 dataset.
yikun2019/PENCIL
PyTorch implementation of Probabilistic End-to-end Noise Correction for Learning with Noisy Labels, CVPR 2019.
VIPL-Audio-Visual-Speech-Understanding/Lipreading-DenseNet3D
The DenseNet3D model from "LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild" (https://arxiv.org/abs/1810.06990)
YuanTingHsieh/TF_TCN
TensorFlow Temporal Convolutional Network
sailordiary/LipNet-PyTorch
"LipNet: End-to-End Sentence-level Lipreading" in PyTorch
VIPL-Audio-Visual-Speech-Understanding/AVSU-VIPL
Collection of works from VIPL-AVSU
NirHeaven/D3D
The proposed method from "LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild"
xing96/MIM-lipreading
Code and model for the paper "Mutual Information Maximization for Effective Lip Reading"
jingyunx/Deformation-Flow-Based-Two-stream-Network-for-Lip-Reading
NirHeaven/tensorflow-Toolkits
Some useful toolkits wrapped around the TensorFlow API, including common NN models (such as ResNet), NN layer operations (such as an attention decoder), tensor operations (such as sparse ops), and strategies for configuring the learning rate and optimizer. We also provide a pipeline to train NN models in parallel easily.
Metaverse-AI-Lab-THU/Deep-Personalized-Character-Dataset-DPCD
We contribute a multimodal dialogue dataset, the Deep Personalized Character Dataset (DPCD), collected from TV shows. It contains a large amount of character-specific text, audio, and video dialogue data, with ~10k utterances and ~6 hours of audio and video per character.