Issues
ClusterRole permissions are too broadly scoped?
#3724 opened by jssnirmal - 3
Support text embedding task in huggingface server
#3572 opened by kevinmingtarja - 0
Autoscaling with multiple metrics does not work
#3638 opened by shazinahmed - 3
Release 0.13 Tracking
#3648 opened by yuzisun - 1
Make MAX_GRPC_MESSAGE_LENGTH Configurable for Image Input Size Flexibility
#3717 opened by anencore94 - 7
Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
#3662 opened by serdarildercaglar - 0
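The error in this issue title is PyTorch refusing to initialize CUDA in a child created by `fork()`; the fix it suggests is the `spawn` start method. A minimal stdlib-only sketch of that pattern (the worker body here is a placeholder assumption — in the reported case it would be the torch inference worker):

```python
import multiprocessing as mp

def cuda_worker(q):
    # Placeholder for the real worker: in the reported issue this is where
    # torch would initialize CUDA, which raises
    # "Cannot re-initialize CUDA in forked subprocess" under fork().
    q.put("ready")

def main():
    # 'spawn' starts a fresh interpreter in each child instead of fork(),
    # which is the start method PyTorch requires for CUDA + multiprocessing.
    ctx = mp.get_context("spawn")
    q = ctx.Queue()
    p = ctx.Process(target=cuda_worker, args=(q,))
    p.start()
    result = q.get()
    p.join()
    return result

if __name__ == "__main__":
    print(main())  # prints "ready"
```

Because spawned children re-import the main module, the process launch must sit behind the `if __name__ == "__main__":` guard.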
Inference gRPC/Rest client to support FP16
#3643 opened by yuzisun - 12
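The v2 (open inference) protocol's JSON tensor encoding has no native FP16 representation, which is presumably why client support is being requested; one common workaround is to ship raw little-endian FP16 bytes. A stdlib-only sketch of the packing (function names are illustrative, not KServe API):

```python
import struct

def fp16_to_bytes(values):
    # Pack each value as IEEE 754 half precision ('e' format, 2 bytes each,
    # little-endian) so a client can send an FP16 tensor as raw bytes.
    return struct.pack(f"<{len(values)}e", *values)

def bytes_to_fp16(buf):
    # Inverse: recover Python floats from a little-endian FP16 byte buffer.
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))

# Values chosen to be exactly representable in half precision.
payload = fp16_to_bytes([1.0, 0.5, -2.0])
assert len(payload) == 6                       # 2 bytes per element
assert bytes_to_fp16(payload) == [1.0, 0.5, -2.0]
```

Values that are not exactly representable in half precision (e.g. 0.1) will round-trip with FP16 precision loss, which a real client would need to document.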
Unable to run InferenceService on a local cluster
#3689 opened by yurkoff-mv - 1
Is there a way to supply a token to the hugging face inference server run time?
#3693 opened by empath-nirvana - 0
InferenceService Model Transition in Pending/InProgress forever while inference service is operational
#3686 opened by CanmingCobble - 1
Make label and annotation propagation configurable
#3710 opened by cmaddalozzo - 0
Update to AWS Go SDK v2
#3709 opened by mattjohnsonpint - 2
Support ollama server
#3647 opened by skonto - 0
Discuss the future of models-webapp
#3625 opened by rimolive - 2
Merge responses from InferenceGraph Sequence node steps
#3639 opened by asd981256 - 12
Allow PVC Model Mount in ReadWrite Mode
#3687 opened by supertetelman - 0
Download files from Azure storage under virtual directory for Multi-model serving
#3691 opened by leduckhc - 1
Updated the container image but the associated predictor pod image is not changed
#3683 opened by kwjerrychan - 8
Scale pods based on a cron schedule
#3597 opened by vukor - 1
Add vLLM backend e2e test
#3644 opened by yuzisun - 2
Allow remote code execution on huggingfaceserver
#3580 opened by isaranto - 0
Allow re-running of failed workflows
#3631 opened by andyi2it - 1
Support suspending InferenceService
#3675 opened by tenzen-y - 0
protocolVersion used by the predictor
#3674 opened by Csehpi - 5
Update deprecated generate-groups.sh to kube_codegen
#3667 opened by spolti - 2
Add a provision in KServe repo to allow cherry pick of PRs to release branches
#3656 opened by andyi2it - 3
Kserve deployment certificate issue - tls: failed to verify certificate: x509: certificate signed by unknown authority; Error from server (InternalError)
#3649 opened by Subhankar-Adak - 0
Record ownerReferences on managed ingress
#3636 opened by backjo - 0
Add metadata to logger system
#3634 opened by gcemaj - 1
Custom ClusterServingRuntime not being selected based on modelFormat Name/Version without runtime specification
#3632 opened by supertetelman - 3
Parallel Model Inference with Ray Serving not working due to deprecated API
#3595 opened by ajstewart - 0
Update exclusion list for go lint and Fix Go lint errors
#3608 opened by andyi2it - 0
Fix new Golang security vulnerability - CVE-2024-24786
#3602 opened by andyi2it - 1
Old Revisions of Inference Service not Scaled Down
#3591 opened by ksgnextuple - 0
Add parameter in ModelMetadataResponse in v2 (aka open inference) protocol
#3574 opened by harshita-meena - 2
Go Coverage action is failing in CI
#3563 opened by sivanantha321 - 0
Use black to auto-format Python code
#3567 opened by cmaddalozzo