Use the TIMIT dataset to predict phoneme sequences from the provided MFCC or fbank features.
Project Link
- keras
- tensorflow
- python3
- h5py
- sklearn
- TIMIT Dataset
- Features: MFCC and fbank
- Labels: 48 kinds of phones
Label Preprocessing
- phone mapping 48 -> 39
- converting sequences to one-hot encodings
- padding
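
The label steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `PHONE_48_TO_39` is only a partial excerpt of the usual 48 -> 39 folding, `encode_labels` is a hypothetical helper name, and padding with `sil` is an assumption.

```python
import numpy as np

# Partial, illustrative excerpt of the 48 -> 39 phone folding.
# The full table covers all 48 phones; phones not listed map to themselves.
PHONE_48_TO_39 = {
    "ao": "aa", "ax": "ah", "ix": "ih", "el": "l", "en": "n",
    "zh": "sh", "cl": "sil", "vcl": "sil", "epi": "sil",
}


def encode_labels(sequences, mapping, max_len=None, pad_phone="sil"):
    """Fold phones, pad every sequence to max_len, and one-hot encode.

    Assumption: padded time steps are labelled with `pad_phone`.
    Returns an array of shape (n_sequences, max_len, n_classes) and the
    sorted list of class names that defines the one-hot column order.
    """
    folded = [[mapping.get(p, p) for p in seq] for seq in sequences]
    classes = sorted({p for seq in folded for p in seq} | {pad_phone})
    index = {p: i for i, p in enumerate(classes)}
    max_len = max_len or max(len(seq) for seq in folded)

    out = np.zeros((len(folded), max_len, len(classes)), dtype=np.float32)
    for i, seq in enumerate(folded):
        padded = seq[:max_len] + [pad_phone] * (max_len - len(seq))
        for t, phone in enumerate(padded):
            out[i, t, index[phone]] = 1.0
    return out, classes
```

For example, `encode_labels([["sil", "ao", "cl"], ["ix"]], PHONE_48_TO_39)` folds `ao` to `aa` and `cl` to `sil`, then pads the second sequence with two `sil` steps.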
Features Preprocessing
- standardization
- padding
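
One possible way to standardize and pad the features is sketched below. It assumes each utterance is a 2-D array of shape `(n_frames, n_features)`, that statistics are computed per feature dimension over all frames, and that padding frames are zeros; `standardize_and_pad` is a hypothetical name, not a function from the project.

```python
import numpy as np


def standardize_and_pad(utterances, max_len=None, mean=None, std=None, eps=1e-8):
    """Standardize each feature dimension, then zero-pad to a common length.

    utterances: list of arrays, each of shape (n_frames, n_features).
    mean/std:   optional precomputed statistics (e.g. from the training set);
                if omitted they are estimated from `utterances`.
    """
    if mean is None or std is None:
        frames = np.concatenate(utterances, axis=0)
        mean = frames.mean(axis=0)
        std = frames.std(axis=0) + eps  # eps avoids division by zero

    max_len = max_len or max(len(u) for u in utterances)
    n_features = utterances[0].shape[1]

    out = np.zeros((len(utterances), max_len, n_features), dtype=np.float32)
    for i, u in enumerate(utterances):
        n = min(len(u), max_len)
        out[i, :n] = (u[:n] - mean) / std
    return out, mean, std
```

When a separate test set is used, the `mean` and `std` returned for the training data would normally be passed back in so both sets share the same scaling.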
- convert each predicted phoneme to its alphabet character
- remove consecutive duplicates using a threshold
- trim the 'sil' character
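
The three output steps above might look like the sketch below. The source does not say how the threshold is applied, so this version assumes it is a minimum run length: a phone must be predicted for at least `min_run` consecutive frames to be kept. `PHONE_TO_CHAR` is a made-up excerpt and `decode_frames` is a hypothetical name.

```python
from itertools import groupby

# Hypothetical phone -> single-character table (illustrative excerpt only).
PHONE_TO_CHAR = {"sil": "L", "aa": "a", "b": "b", "iy": "E"}


def decode_frames(frame_phones, phone_to_char, min_run=2, sil="sil"):
    """Turn frame-wise phone predictions into a compact character string.

    1. Map every frame's phone to its character.
    2. Collapse runs of identical characters, dropping runs shorter than
       `min_run` frames (assumed meaning of the threshold).
    3. Strip the silence character from both ends of the result.
    """
    chars = [phone_to_char[p] for p in frame_phones]

    collapsed = []
    for ch, run in groupby(chars):
        long_enough = len(list(run)) >= min_run
        # Skip a repeat that appears only because a short run was dropped.
        if long_enough and (not collapsed or collapsed[-1] != ch):
            collapsed.append(ch)

    return "".join(collapsed).strip(phone_to_char[sil])
```

With `min_run=2`, the frames `sil sil sil b b aa b b iy iy sil sil` decode to `"bE"`: the single `aa` frame is treated as noise, the two `b` runs merge, and the surrounding silence is trimmed.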