Issues
Demo GPU sharing for MPS does not start inferencing after downloading pytorch_model.bin
#56 opened by ltson4121994 - 1
Unable to pull
#44 opened by 1392273211 - 0
KubeFlow Integration
#54 opened by jreuben11 - 0
How to configure sharing.mps for individual nodes
#53 opened by amouu - 4
NOS MPS leaves GPUs on node in exclusive mode
#27 opened by Damowerko - 0
7g.79gb does not work as expected.
#51 opened by houms-sony - 0
nvidia-cuda-mps-server consistently hangs at the "creating worker thread" log
#49 opened by yangcheng-dev - 3
Nebuly k8s-device-plugin not starting on GKE
#36 opened by lmyslinski - 0
Multi-tenant Elastic Resource Quota
#48 opened by kaiohenricunha - 0
Cannot use entire GPU memory
#47 opened by ettelr - 0
Usage with Karpenter?
#46 opened by keeganmccallum - 1
Pod stuck Pending on resource overuse
#45 opened by selinnilesy - 1
Question about GPU memory occupied by the MPS server
#39 opened by Deancup - 1
Cluster autoscaling with nos
#43 opened by ktzsh - 0
Partitioner renders malformed device-plugin ConfigMap value which breaks GFD, causing Pods to be Pending forever
#41 opened by zerodayyy - 0
GPU RAM limit invalid
#38 opened by shadowcollecter - 0
doc: wrong make targets
#32 opened by WindowsXp-Beta - 1
wrong resource file name
#31 opened by WindowsXp-Beta - 2
typo: redundant YAML
#29 opened by WindowsXp-Beta - 19
mig-agent pod failure
#21 opened by likku123 - 0
Support mixed MIG+MPS dynamic partitioning
#28 opened by Telemaco019 - 0
Metrics-exporter setup: how to go about it?
#24 opened by suchisur - 3
resource request key format
#20 opened by 5cat