danielfmsouza/JetStream
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future; PRs welcome).
Python · Apache-2.0
Stargazers
No one has starred this repository yet.