dchenam/sequence-labeling

Classifying TIMIT phonemes using LSTMs

Sequence Labeling

Description

Use the TIMIT dataset to predict phoneme sequences from the provided MFCC or fbank features.

Project Link

Requirements

keras
tensorflow
python3
h5py
sklearn

Dataset

TIMIT Dataset
Features: MFCC and fbank
Labels: 48 phone classes

Pre-Processing

Label Preprocessing

map the 48 phones to 39 classes
convert label sequences to one-hot encodings
pad sequences to a common length
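
The label steps above can be sketched roughly as follows. This is an illustration, not the project's code: the `FOLD` table is the commonly used 48-to-39 folding and is assumed here (the project's own map may differ), and `encode_labels`, `classes`, and `max_len` are hypothetical names.

```python
import numpy as np

# Assumed folding table for 48 -> 39 phones; the project's own map may differ.
FOLD = {"ao": "aa", "ax": "ah", "cl": "sil", "vcl": "sil", "epi": "sil",
        "el": "l", "en": "n", "ix": "ih", "zh": "sh"}


def encode_labels(sequences, classes, max_len):
    """Fold phones, one-hot encode them, and zero-pad each sequence to max_len."""
    index = {phone: i for i, phone in enumerate(classes)}
    out = np.zeros((len(sequences), max_len, len(classes)), dtype=np.float32)
    for n, seq in enumerate(sequences):
        for t, phone in enumerate(seq[:max_len]):
            # Phones without a folding entry map to themselves.
            out[n, t, index[FOLD.get(phone, phone)]] = 1.0
    return out


# Toy class list for illustration; the real task would use all 39 classes.
classes = ["aa", "ih", "sil"]
y = encode_labels([["sil", "ao", "ix"], ["cl", "aa"]], classes, max_len=4)
```

Padded time steps are left as all-zero rows here, so a model consuming `y` would likely need masking or sample weights to ignore them.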

Features Preprocessing

standardization
padding
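
A minimal sketch of these two feature steps, assuming each utterance is a NumPy array of shape (frames, feature_dim) and that statistics are computed per feature dimension over all frames. The function name `standardize_and_pad` and the zero-padding choice are assumptions, not taken from the project.

```python
import numpy as np


def standardize_and_pad(utterances, max_len):
    """Standardize each feature dimension, then zero-pad every utterance to max_len frames."""
    stacked = np.concatenate(utterances, axis=0)
    mean = stacked.mean(axis=0)
    std = stacked.std(axis=0) + 1e-8  # small epsilon avoids division by zero
    out = np.zeros((len(utterances), max_len, stacked.shape[1]), dtype=np.float32)
    for n, utt in enumerate(utterances):
        length = min(len(utt), max_len)
        out[n, :length] = (utt[:length] - mean) / std
    return out


# Two toy utterances with 2 features each (real MFCC/fbank vectors are wider).
utterances = [np.array([[1.0, 2.0], [3.0, 4.0]]), np.array([[5.0, 6.0]])]
x = standardize_and_pad(utterances, max_len=3)
```

In practice the mean and standard deviation would be computed on the training set only and reused for validation and test data.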

Post-Processing

convert each predicted phoneme to its alphabet character
remove consecutive duplicates using a threshold
trim the leading and trailing 'sil' character
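
One possible reading of these steps is sketched below. The `PHONE_TO_CHAR` table is hypothetical, and the threshold is interpreted as a minimum run length (`min_run`): runs of identical frame predictions shorter than it are dropped as noise. The project may define the threshold differently.

```python
from itertools import groupby

# Hypothetical phone-to-character table; the project's real table is not shown here.
PHONE_TO_CHAR = {"sil": "L", "aa": "a", "b": "b", "k": "k"}


def decode(frame_phones, min_run=2):
    """Turn per-frame phone predictions into a trimmed character string."""
    chars = [PHONE_TO_CHAR[p] for p in frame_phones]
    # Collapse each run of identical characters, keeping only runs of at least min_run frames.
    kept = [c for c, run in groupby(chars) if len(list(run)) >= min_run]
    # Dropping short runs can leave equal neighbours, so collapse once more.
    merged = [c for c, _ in groupby(kept)]
    # Trim the silence character from both ends.
    return "".join(merged).strip(PHONE_TO_CHAR["sil"])


frames = ["sil", "sil", "sil", "b", "b", "aa", "b", "b",
          "aa", "aa", "aa", "sil", "sil"]
result = decode(frames, min_run=2)
```

With `min_run=2` the single-frame `aa` in the middle is treated as noise, so `result` is `"ba"`; with `min_run=1` every run is kept and the same input decodes to `"baba"`.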

Results