AI-Hypercomputer/JetStream
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in the future -- PRs welcome).
Python · Apache-2.0
Issues
Does Dataflow work with JetStream?
#146 opened by salaki - 0
Question: `prometheus_port` flag for pytorch server
#143 opened by JeffLuoo - 0
Support using models from HuggingFace directly
#140 opened by samos123 - 0
Support completions API
#135 opened by nstogner - 2
Clean up Model Conversion Script
#131 opened by yeandy - 1
when to support gpu?
#120 opened by Mddct - 0
Remove jax dependencies in JetStream
#88 opened by FanhaiLu1 - 1
Add np padding support
#55 opened by FanhaiLu1 - 2
Support I/O with text and token ids
#79 opened by JoeZijunZhou - 1
Refactor JetStream to allow different tokenizers
#45 opened by qihqi - 2
Detokenize error
#64 opened by yeandy - 1
float division by zero in benchmark
#61 opened by FanhaiLu1 - 2
Support on Huggingface transformers
#44 opened by ImKeTT - 1
Error with mutable list value in dataclass
#57 opened by yeandy - 1
CogVLM support
#46 opened by BitPhinix - 5
Feature request: improve documentation
#14 opened by OhadRubin