- Tokenization is based on the Magenta Team's work
- Can we perform piano transcription using the Listen, Attend and Spell (LAS) model?
- MAESTRO dataset V1.0
- 512 Mel bins with 10 ms resolution
- MIDI tokenization with absolute time shifts within each segment
- Vocabulary contains velocity, note, and timestamp tokens
- 4 layers of Bi-LSTM
- Additive attention with 1 layer of Uni-LSTM and a linear layer
- Requires code refactoring
- This repo is not the final version
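
The tokenization described in the list (velocity, note, and time tokens, with time measured from the start of each segment at 10 ms resolution) could look roughly like the sketch below. This is not the repo's actual code: the segment length, the vocabulary ordering, the use of velocity 0 as note-off, and all names (`NoteEvent`, `tokenize_segment`, the `*_OFFSET` constants) are assumptions made for illustration.

```python
from dataclasses import dataclass

TIME_RESOLUTION = 0.01   # 10 ms per time token, as in the notes above
SEGMENT_SECONDS = 4.0    # assumption: segment length is not stated in the notes
NUM_VELOCITIES = 128     # assumption: one token per MIDI velocity, 0 = note-off
NUM_PITCHES = 128        # full MIDI pitch range

# Number of distinct absolute time steps inside one segment (0..400).
NUM_TIME_TOKENS = int(round(SEGMENT_SECONDS / TIME_RESOLUTION)) + 1

# Assumed vocabulary layout: [time tokens | velocity tokens | note tokens]
TIME_OFFSET = 0
VELOCITY_OFFSET = TIME_OFFSET + NUM_TIME_TOKENS
NOTE_OFFSET = VELOCITY_OFFSET + NUM_VELOCITIES
VOCAB_SIZE = NOTE_OFFSET + NUM_PITCHES


@dataclass(frozen=True)
class NoteEvent:
    time: float     # seconds from the start of the recording
    pitch: int      # MIDI pitch, 0..127
    velocity: int   # MIDI velocity, 0 is treated as note-off here


def tokenize_segment(events, segment_start):
    """Encode the events inside one segment as a flat list of token ids.

    Time tokens are absolute within the segment: each one encodes the
    offset from segment_start, not the gap since the previous event.
    """
    tokens = []
    last_step = None
    last_velocity = None
    segment_end = segment_start + SEGMENT_SECONDS
    for ev in sorted(events, key=lambda e: (e.time, e.pitch)):
        if not (segment_start <= ev.time < segment_end):
            continue  # event belongs to another segment
        step = int(round((ev.time - segment_start) / TIME_RESOLUTION))
        if step != last_step:
            # Emit a time token only when the time step changes.
            tokens.append(TIME_OFFSET + step)
            last_step = step
        if ev.velocity != last_velocity:
            # Emit a velocity token only when the velocity changes.
            tokens.append(VELOCITY_OFFSET + ev.velocity)
            last_velocity = ev.velocity
        tokens.append(NOTE_OFFSET + ev.pitch)
    return tokens


if __name__ == "__main__":
    events = [
        NoteEvent(time=4.50, pitch=60, velocity=80),
        NoteEvent(time=4.50, pitch=64, velocity=80),
        NoteEvent(time=5.00, pitch=60, velocity=0),
    ]
    print(tokenize_segment(events, segment_start=4.0))
```

Because time tokens restart at every segment boundary, each segment can be decoded on its own without knowing where the previous one ended, at the cost of a larger time vocabulary than a relative time-shift scheme would need.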