Issues
Support lorax for dynamic lora loading
#215 opened by kerthcet - 1
Support Modular Max as another serving backend
#212 opened by kerthcet - 1
:warning: Action Required: Replace Deprecated gcr.io/kubebuilder/kube-rbac-proxy
#211 opened by camilamacedo86 - 3
Add TensorRT-LLM support as another backend
#205 opened by kerthcet - 3
Accelerate model loading
#103 opened by kerthcet - 13
ollama support
#91 opened by kerthcet - 0
Support speculative decoding with llama.cpp
#197 opened by kerthcet - 7
Serverless support
#192 opened by kerthcet - 0
Support to serving Stable Diffusion models
#189 opened by kerthcet - 2
Release v0.0.8
#186 opened by kerthcet - 5
Support TGI as another alternative backend
#72 opened by kerthcet - 6
Loading model weights more efficiently
#119 opened by kerthcet - 8
Downsize the model-loader image
#93 opened by kerthcet - 2
[WebUI] Add support for webui
#160 opened by kerthcet - 2
Support install llmaz at any namespace
#141 opened by kerthcet - 5
Add validation to Playground as the backendConfig's resource requests should not be greater than limits
#135 opened by kerthcet - 3
Model aware scheduling
#96 opened by kerthcet - 2
Is there any early proposal or document about integrating with Gateway API ?
#165 opened by caozhuozi - 2
Bump LWS version to v0.4.0
#161 opened by kerthcet - 4
[Umbrella] Improve test coverages
#156 opened by kerthcet - 3
Helm uninstall will not delete the CRDs
#153 opened by kerthcet - 4
helm chart support for easy installation
#137 opened by kerthcet - 5
Add new API object BackendRuntime for expandability
#134 opened by kerthcet - 2
Customized flags for backendRuntimes
#140 opened by kerthcet - 3
Benchmark toolkit support
#66 opened by kerthcet - 2
Requests could be bigger than limits
#123 opened by kerthcet - 1
Support traditional models
#133 opened by kerthcet - 3
Report filename and file size in modelLoader
#122 opened by kerthcet - 3
Unify the API routes for different inference engines
#114 opened by kerthcet - 1
Add new status once models haven't been created
#117 opened by kerthcet - 2
Playground will not reconcile once model created
#92 opened by kerthcet - 1
Support scaling with Spot instances for cost saving
#106 opened by kerthcet - 2
Download models in prior
#99 opened by kerthcet - 1
Support filesystems
#100 opened by kerthcet - 1
Add more e2e tests
#69 opened by kerthcet - 3
Support llama.cpp as alternative backend
#65 opened by kerthcet - 1
Prompts managements
#90 opened by kerthcet - 1
Failover policy for various backends
#86 opened by kerthcet - 1
Parallel model serving
#85 opened by kerthcet - 4
Integrate with Kueue for fungibility capacity
#74 opened by kerthcet - 1
Mount /dev/shm for shared memory files
#73 opened by kerthcet - 1
Once name containers dot, failed to create Pods
#67 opened by kerthcet - 0
Milestone v0.1.0
#63 opened by kerthcet - 1
Support different GPU accelerators for fungibility
#62 opened by kerthcet