LeeYongHyeok

Pinned Repositories

Whisper-Rocognition
Language: Python | 0 1 00
avsr-chain
Audio-visual speech recognition using the chain model in the Kaldi toolkit
Language: Shell | 2 1 10
DCM_vgg_transformer
Dual cross-modality attention audio-visual speech recognition model based on a VGG transformer with a hybrid CTC/attention architecture, built with fairseq
Language: Python | 12 1 20
dual_cross_modality-AVSR
An audio-visual speech recognition model with dual cross-modality attention, based on the sigmedia-AVSR code
Language: Python | 1 1 00
kaldi
This is the official location of the Kaldi project.
Language: Shell | 0 0 00
PerceptualAudio
Perceptual Metrics of Audio - perceptually relevant loss function. DPAM and CDPAM
Language: Python | 0 0 00
pytorch-tutorial
PyTorch Tutorial for Deep Learning Researchers
Language: Python | 0 0 00
PhonMatchNet
Official implementation of "PhonMatchNet: Phoneme-Guided Zero-Shot Keyword Spotting for User-Defined Keywords" (INTERSPEECH 2023)
Language: Python | 41 3 117

LeeYongHyeok's Repositories

LeeYongHyeok/DCM_vgg_transformer
Dual cross-modality attention audio-visual speech recognition model based on a VGG transformer with a hybrid CTC/attention architecture, built with fairseq
Language: Python | 12 1 20
LeeYongHyeok/avsr-chain
Audio-visual speech recognition using the chain model in the Kaldi toolkit
Language: Shell | 2 1 10
LeeYongHyeok/dual_cross_modality-AVSR
An audio-visual speech recognition model with dual cross-modality attention, based on the sigmedia-AVSR code
Language: Python | 1 1 00
LeeYongHyeok/kaldi
This is the official location of the Kaldi project.
Language: Shell | 0 0 00
LeeYongHyeok/PerceptualAudio
Perceptual Metrics of Audio - perceptually relevant loss function. DPAM and CDPAM
Language: Python | 0 0 00
LeeYongHyeok/pytorch-tutorial
PyTorch Tutorial for Deep Learning Researchers
Language: Python | 0 0 00