Pinned Repositories
acm-icpc
2015 to 2017, ACM-ICPC Training Codes, Team SpadeAce
auto_avsr
Auto-AVSR: Lip-Reading Sentences Project
av_hubert
A self-supervised learning framework for audio-visual speech
awesome-audio-visualization
A curated list about Audio Visualization.
FacePose_pytorch
🔥🔥 A PyTorch implementation of head pose estimation (yaw, roll, pitch) and emotion detection with SOTA performance in real time. Easy to deploy, easy to use, and highly accurate. Solves all face detection problems at once. (Our guiding principle: minimal, extremely fast, and efficient.)
Lipreading_using_Temporal_Convolutional_Networks
ICASSP'20 Lipreading using Temporal Convolutional Networks
learn-an-effective-lip-reading-model-without-pains
The PyTorch code and model for "Learn an Effective Lip Reading Model without Pains" (https://arxiv.org/abs/2011.07557), which reaches state-of-the-art performance on the LRW-1000 dataset.
LipNet-PyTorch
The state-of-the-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxiv.org/abs/1611.01599)
Lipreading-DenseNet3D
DenseNet3D model from "LRW-1000: A Naturally-Distributed Large-Scale Benchmark for Lip Reading in the Wild", https://arxiv.org/abs/1810.06990
Fengdalu's Repositories
Fengdalu/FacePose_pytorch
🔥🔥 A PyTorch implementation of head pose estimation (yaw, roll, pitch) and emotion detection with SOTA performance in real time. Easy to deploy, easy to use, and highly accurate. Solves all face detection problems at once. (Our guiding principle: minimal, extremely fast, and efficient.)
Fengdalu/Lipreading_using_Temporal_Convolutional_Networks
ICASSP'20 Lipreading using Temporal Convolutional Networks
Fengdalu/auto_avsr
Auto-AVSR: Lip-Reading Sentences Project
Fengdalu/av_hubert
A self-supervised learning framework for audio-visual speech
Fengdalu/awesome-audio-visualization
A curated list about Audio Visualization.
Fengdalu/Awesome-Video-Datasets
Video datasets
Fengdalu/bark
🔊 Text-Prompted Generative Audio Model
Fengdalu/chinese_text_normalization
Chinese text normalization for speech processing
Fengdalu/DeepFaceLab
DeepFaceLab is the leading software for creating deepfakes.
Fengdalu/examples
A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.
Fengdalu/fairscale
PyTorch extensions for high performance and large scale training.
Fengdalu/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Fengdalu/gpu-burn
Multi-GPU CUDA stress test
Fengdalu/lightning-bolts
Toolbox of models, callbacks, and datasets for AI/ML researchers.
Fengdalu/LRW_ID
Speaker labels for the LRW dataset, released as an outcome of the paper "Speaker-adaptive Lip Reading with User-dependent Padding" (ECCV 2022)
Fengdalu/mdistiller
The official implementation of [CVPR2022] Decoupled Knowledge Distillation https://arxiv.org/abs/2203.08679
Fengdalu/nvcodec-python
Fengdalu/nvjpeg-python
nvjpeg for python
Fengdalu/RGB_HSV_HSL
A pure PyTorch implementation of color space conversion, including rgb2hsl, rgb2hsv, hsv2rgb, hsl2rgb
Fengdalu/SCPapers
Must-read Papers on Sememe Computation
Fengdalu/Speech-Transformer
A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.
Fengdalu/stanfordacm
Stanford ACM-ICPC related materials
Fengdalu/stargan
StarGAN - Official PyTorch Implementation (CVPR 2018)
Fengdalu/torchnvjpeg
Decode JPEG image on GPU using PyTorch
Fengdalu/VITS-Paimon
Fengdalu/Wave-U-Net-for-Speech-Enhancement
A PyTorch implementation of Wave-U-Net, adapted to speech enhancement.
Fengdalu/wenet
Production First and Production Ready End-to-End Speech Recognition Toolkit
Fengdalu/whisper
Robust Speech Recognition via Large-Scale Weak Supervision
Fengdalu/whisperX
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Fengdalu/yolov5-face