InftyAI/llmaz

☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!

Go · Apache-2.0

Pinned issues

Milestone v0.1.0

#63 opened 5 months ago by kerthcet

Open · 0

[WebUI] Add support for webui

#160 opened 3 months ago by kerthcet

Open · 2

Issues

Support lorax for dynamic lora loading
#215 opened 24 days ago by kerthcet
1
Support Modular Max as another serving backend
#212 opened a month ago by kerthcet
1
⚠️ Action Required: Replace Deprecated gcr.io/kubebuilder/kube-rbac-proxy
#211 opened a month ago by camilamacedo86
1
Install lws controller together with llmaz controller in the same namespace
#202 opened a month ago by kerthcet
3
Add TensorRT-LLM support as another backend
#205 opened a month ago by kerthcet
0
Accelerate model loading
#103 opened a month ago by kerthcet
3
ollama support
#91 opened 2 months ago by kerthcet
13
Support speculative decoding with llama.cpp
#197 opened 2 months ago by kerthcet
0
[ModelLoader] Some huggingface models may contain duplicated weights
#163 opened 2 months ago by kerthcet
7
Serverless support
#192 opened 2 months ago by kerthcet
0
Support serving Stable Diffusion models
#189 opened 2 months ago by kerthcet
0
Release v0.0.8
#186 opened 2 months ago by kerthcet
2
Support TGI as another alternative backend
#72 opened 3 months ago by kerthcet
5
Loading model weights more efficiently
#119 opened 4 months ago by kerthcet
6
Downsize the model-loader image
#93 opened 3 months ago by kerthcet
8
[WebUI] Add support for webui
#160 opened 3 months ago by kerthcet
2
Support installing llmaz in any namespace
#141 opened 3 months ago by kerthcet
2
Add validation to Playground as the backendConfig's resource requests should not be greater than limits
#135 opened 3 months ago by kerthcet
5
Model aware scheduling
#96 opened 4 months ago by kerthcet
3
Is there any early proposal or document about integrating with Gateway API?
#165 opened 3 months ago by caozhuozi
2
Bump LWS version to v0.4.0
#161 opened 3 months ago by kerthcet
2
[Umbrella] Improve test coverage
#156 opened 4 months ago by kerthcet
4
Bump vllm to 0.6.0 for the great performance improvement
#128 opened 4 months ago by kerthcet
3
Helm uninstall will not delete the CRDs
#153 opened 4 months ago by kerthcet
1
helm chart support for easy installation
#137 opened 4 months ago by kerthcet
4
Add new API object BackendRuntime for expandability
#134 opened 4 months ago by kerthcet
5
Customized flags for backendRuntimes
#140 opened 4 months ago by kerthcet
2
Benchmark toolkit support
#66 opened 5 months ago by kerthcet
3
Requests could be bigger than limits
#123 opened 4 months ago by kerthcet
2
Support traditional models
#133 opened 4 months ago by kerthcet
1
Report filename and file size in modelLoader
#122 opened 4 months ago by kerthcet
3
Add integration tests for Playground/Service status update
#113 opened 4 months ago by kerthcet
3
Unify the API routes for different inference engines
#114 opened 4 months ago by kerthcet
3
Add new status once models haven't been created
#117 opened 4 months ago by kerthcet
1
Playground will not reconcile once model is created
#92 opened 4 months ago by kerthcet
2
Support scaling with Spot instances for cost saving
#106 opened 4 months ago by kerthcet
1
Always download the model weights when pod starts
#88 opened 4 months ago by kerthcet
2
Download models in advance
#99 opened 4 months ago by kerthcet
3
Support filesystems
#100 opened 4 months ago by kerthcet
1
Add more e2e tests
#69 opened 4 months ago by kerthcet
1
Support llama.cpp as alternative backend
#65 opened 4 months ago by kerthcet
3
Prompt management
#90 opened 4 months ago by kerthcet
1
Failover policy for various backends
#86 opened 4 months ago by kerthcet
1
Parallel model serving
#85 opened 4 months ago by kerthcet
1
Lack the flexibility to express deploy primitives
#81 opened 5 months ago by kerthcet
4
Integrate with Kueue for capacity fungibility
#74 opened 5 months ago by kerthcet
1
Mount /dev/shm for shared memory files
#73 opened 5 months ago by kerthcet
1
Once name contains dot, failed to create Pods
#67 opened 5 months ago by kerthcet
1
Milestone v0.1.0
#63 opened 5 months ago by kerthcet
0
Support different GPU accelerators for fungibility
#62 opened 5 months ago by kerthcet
1