/grokking

An attempt to reproduce the experiments from the paper "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets".

Primary language: Jupyter Notebook
License: MIT
