Distributed pretraining of large language models (LLMs) on cloud TPU slices, with Jax and Equinox.
Primary LanguagePython