/ilo-sona

Small language model for toki pona.

Primary language: Python. License: MIT.



ilo-pona: A Transformer for Toki Pona

![Banner Image](Banner image URL)

Overview

ilo-pona is an experimental project to build a transformer model that understands and generates text in Toki Pona, a minimalist constructed language. The motivation lies in two properties of Toki Pona: its simple semantics (one word per token) and its very small vocabulary (roughly 120 to 140 core words). These properties make it a good toy setting for studying the training dynamics of large language models (LLMs) and for mechanistic interpretability research.

The training corpus builds on the Toki Pona corpus compiled by davidar.
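To make the "one word per token" idea concrete, here is a minimal sketch of a word-level tokenizer. It is not the project's actual tokenizer: the `VOCAB` list is a small hypothetical subset of Toki Pona words chosen for illustration, and the names `tokenize` and `detokenize` are assumptions.

```python
import re

# Hypothetical mini-vocabulary for illustration only; the real project
# would presumably use the full Toki Pona word list.
VOCAB = ["<unk>", "mi", "sina", "ona", "li", "e", "pona", "toki", "moku", "ilo", "sona"]
WORD_TO_ID = {word: i for i, word in enumerate(VOCAB)}


def tokenize(text: str) -> list[int]:
    """Map every word to exactly one integer id (unknown words map to <unk>)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [WORD_TO_ID.get(word, WORD_TO_ID["<unk>"]) for word in words]


def detokenize(ids: list[int]) -> str:
    """Turn a list of ids back into a space-separated sentence."""
    return " ".join(VOCAB[i] for i in ids)


print(tokenize("toki pona li pona."))  # -> [7, 6, 4, 6]
```

Because the vocabulary is so small, the embedding table of such a model would have only a few hundred rows at most, which is part of what makes the setting attractive for interpretability work.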

Project Maintainers

Getting Started

Prerequisites

This project requires Python 3.9. Please ensure you have the correct version installed before proceeding.
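One quick way to confirm the interpreter version is a short script like the sketch below. It assumes that "requires Python 3.9" means any 3.9.x release; whether newer versions also work is not stated here. The helper name `is_supported` is hypothetical.

```python
import sys

# Assumption: the requirement refers to the 3.9.x release series.
REQUIRED = (3, 9)


def is_supported(version_info=sys.version_info, required=REQUIRED) -> bool:
    """Return True when the major and minor version match the required pair."""
    return tuple(version_info[:2]) == required


if is_supported():
    print("Python version OK")
else:
    print(f"Expected Python 3.9, found {sys.version.split()[0]}")
```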

Installation

Instructions on how to set up the project locally will be provided soon.

Useful Links

Contributing

Details on how to contribute to ilo-pona will be provided soon.

License

ilo-pona is released under the MIT License.