/transformer_shmap

Tensor Parallelism with JAX + Shard Map

Primary Language: Python. License: MIT.
