Generic workflow for seq-to-seq data

Primary language: Jupyter Notebook. License: Apache-2.0.

Transformers-workflow

Templated code for training transformer models on new data. Pick a model, preprocess the data, and handle data loading and tokenization.
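
The tokenization and data-loading steps can be sketched roughly as follows. This is a minimal standard-library illustration, not the code from the notebooks: the character-level tokenizer, the helper names (`build_vocab`, `encode`, `pad_batch`, `batches`), and the toy derivative pairs are all assumptions made for the example. A real run would more likely use a library tokenizer and a framework data loader.

```python
from typing import Dict, Iterator, List, Tuple

# Special tokens (hypothetical names chosen for this sketch).
PAD, BOS, EOS = "<pad>", "<bos>", "<eos>"


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Map the special tokens and every character seen in `texts` to an integer id."""
    symbols = [PAD, BOS, EOS] + sorted({ch for text in texts for ch in text})
    return {symbol: index for index, symbol in enumerate(symbols)}


def encode(text: str, vocab: Dict[str, int]) -> List[int]:
    """Character-level tokenization wrapped in begin/end markers."""
    return [vocab[BOS]] + [vocab[ch] for ch in text] + [vocab[EOS]]


def pad_batch(seqs: List[List[int]], pad_id: int) -> List[List[int]]:
    """Right-pad every sequence to the length of the longest one in the batch."""
    width = max(len(seq) for seq in seqs)
    return [seq + [pad_id] * (width - len(seq)) for seq in seqs]


def batches(
    pairs: List[Tuple[str, str]], vocab: Dict[str, int], batch_size: int
) -> Iterator[Tuple[List[List[int]], List[List[int]]]]:
    """Yield (source_ids, target_ids) batches of padded token ids."""
    for start in range(0, len(pairs), batch_size):
        chunk = pairs[start:start + batch_size]
        src = pad_batch([encode(s, vocab) for s, _ in chunk], vocab[PAD])
        tgt = pad_batch([encode(t, vocab) for _, t in chunk], vocab[PAD])
        yield src, tgt


# Toy (expression, derivative) pairs invented for illustration only.
pairs = [("x^2", "2*x"), ("x^3+x", "3*x^2+1"), ("5*x", "5")]
vocab = build_vocab([text for pair in pairs for text in pair])
all_batches = list(batches(pairs, vocab, batch_size=2))
```

Padding per batch rather than to a global maximum keeps short batches small; swapping in a different tokenizer only requires replacing `build_vocab` and `encode`.
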
The code was tested on predicting the gradients of math equations. The data is private, so the final results cannot be shared, but exact-match accuracy reached up to 70% on very small models.
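
For reference, exact-match accuracy can be computed along these lines. This is a sketch under assumptions: the function name `exact_match_accuracy` is hypothetical, the example strings are invented, and the comparison is plain string equality after trimming whitespace, which may differ from how the notebooks score predictions.

```python
from typing import Sequence


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that equal their reference string exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # Surrounding whitespace is ignored; everything else must match character for character.
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


# Invented decoded outputs and targets, for illustration only.
preds = ["2*x", "3*x^2", "5"]
refs = ["2*x", "3*x^2+1", "5"]
score = exact_match_accuracy(preds, refs)  # 2 of 3 match
```

With string equality, a prediction that is mathematically equivalent to the target but written differently (for example `x*2` versus `2*x`) counts as a miss, so this metric is a strict lower bound on correctness.
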