The repository contains scripts and instructions for training TURNA, a Turkish Encoder-Decoder Language Model using the t5x
framework on the Google Cloud Platform.
The repository is organized as follows:
dataset_preparation/
: Scripts and instructions for preparing the datasets used in the project.training/
: Scripts and instructions for training.t5x_to_hf_conversion/
: Scripts and instructions for converting at5x
model to a Hugging Face model.
Each directory contains a README.md
file with detailed instructions.