salute-developers/GigaAM

Train from scratch on the datasets used for fine-tuning

Hi!

Curious: do you provide baselines or checkpoints trained from scratch on Golos+Sova+RCV+RLS, including for models such as FastConformer (hybrid CTC+RNNT)?

Reproducible baselines would be helpful, given that NVIDIA does not provide full training and data-prep scripts for its public FastConformer models. Such a baseline can probably be run without a ton of resources and would still be useful for experimenting with the model architecture (e.g. the positional encoding used).

Thanks!