/ScTD

Revisiting Token Dropping Strategy in Efficient BERT Pretraining

Primary Language: Python
