Re-implementation of the paper https://arxiv.org/pdf/2010.14233

Tested with BertModel from the transformers library, using an ASR model from the NeMo toolkit as the encoder. Currently under development.

Supports PyTorch Lightning and uses the NeMo speech container from the NGC catalog.