TrangDuLam/TEDIUM_AHG

A Speech recognition baseline model base on Kaldi toolkit and Tedlium dataset

ShellMIT

TED-LIUM AHG Speech Recognition Baseline

Introduction

A Kaldi baseline model adapted to personal/non-clustering machines.

Version Released

s5_r1
1. Revised from original egs/tedlium/s5_r1 script
2. Using updated Kaldi library
s5_r2
1. A tutorial example for undergraduate students
2. Based on AISHELL-1 framework
3. Using TED-LIUM 1 corpus

TODO

To enable GPU exclusive mode via command "sudo nvidia-smi -c 3"
Type "screen" in command prompt to enter background process
cd s5_r2
Configure the variable exactDataDir in path.sh (The desired dataset location)
bash run.sh

Learning Objectives

These are several hints for you to prepare your in-lab meeting presentation.

Basic
1. Which type of features are used?
2. Overall model building flow (To explain each step)
3. Arguments of each line of code (file I/O)
4. How to decode other wave files
5. How to know the overall performance
Intermediate
1. To visualize features (MFCC files, ivectors)
2. To decode graph, decision trees
3. Knowing the structure of neural networks