AliyunContainerService/gpushare-scheduler-extender
GPU Sharing Scheduler for Kubernetes Cluster
Go · Apache-2.0
Issues
- 0
After a pod that requested a GPU has been running for a while, the GPU can no longer be found inside the container
#230 opened by wanghaowish - 3
- 0
Can this solution only be used on Alibaba Cloud machines?
#222 opened by wolgod - 0
GPUs cannot be allocated correctly when a node has multiple GPUs
#228 opened by hotbaby - 0
After running for a year, creating a new pod fails with: failed bind with extender at URL http://127.0.0.1:32766/gpushare-scheduler/bind, code 500
#229 opened by klvchen - 1
[AKS] kube-scheduler static POD not running for Aliyun GPU Scheduler Extender
#224 opened by dsatizabal - 0
Replica issue
#227 opened by AndrewOYLK - 1
After installing the plugin on k8s, the cluster's GPU resources are not detected
#226 opened by ferris-cx - 0
- 0
kubelet version issue
#223 opened by longcheung123 - 0
ALIYUN_COM_GPU_MEM_IDX in the annotation is different than ALIYUN_COM_GPU_MEM_IDX inside the pod
#220 opened by wokalski - 1
- 0
Problems currently encountered when using this project
#219 opened by freelizhun - 0
Possible bug in the scheduling layer: a pod requesting 8G was created successfully even though the device has at most 7G
#218 opened by hiahia121 - 0
Changing the base unit for GPU memory requests to MiB has no effect
#217 opened by harrymore - 0
- 1
Support for Horizontal Pod Autoscaling (HPA) with GPU Pods?
#215 opened by tobq - 0
If a machine has two GPUs and the first GPU's memory is full, will subsequent tasks be scheduled onto the other GPU?
#213 opened by vicmeng - 1
How do I specify two GPUs for multi-GPU training?
#212 opened by vicmeng - 4
Does this GPU sharing plugin support monitoring with dcgm-exporter?
#211 opened by db-root - 7
gpushare-schd-extender in Pending State
#170 opened by m1nish1208 - 1
- 3
Problem using a custom image with kubeflow 1.6.1
#199 opened by 631068264 - 1
- 4
Wrong GPU ID
#191 opened by tintranvan - 0
Hello, the kubectl logs command does not work on GPU containers, but it works on regular containers
#209 opened by 140ai - 2
Not able to use gpushare-scheduler-extender on EKS cluster with Kubernetes v1.24
#205 opened by suchisur - 0
GPU cores scheduling
#208 opened by valafon - 1
- 0
After repeatedly deleting and creating Pods, newly created Pods get stuck in the Pending state
#201 opened by liufangpeng - 0
Question about the scheduler-policy-config.yaml file
#200 opened by liufangpeng - 0
- 1
k3s services not started scheduler exited: stat /etc/kubernetes/scheduler.conf: no such file or directory
#196 opened by RotemAmergi - 1
How to share the compute power of a GPU?
#190 opened by joeevon - 0
- 0
trivy image scan lists critical and high vulnerability against latest image k8s-gpushare-schd-extender:1.11-d170d8a
#189 opened by carlwang87 - 6
gpushare-device-plugin pod fails to start
#173 opened by southquist - 0
- 0
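After a pod finishes running, the plugin does not update the GPU pool promptly. When multiple pending pods are queued for resources, the last pod waits until flushUnschedulablePodsLeftover before resources are reallocated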
#187 opened by huiyangz - 2
gpushare detects two GPUs, but only one is used during allocation; the other cannot be used
#177 opened by 1003111014 - 1
On a single machine with two GPUs, the scheduler reports binding to different GPUs, but everything is actually scheduled onto one GPU
#184 opened by 1003111014 - 0
Any instruction/template to help define customized GPU scheduler policy?
#182 opened by blackjack2015 - 0
Device list strategy - mounts
#181 opened by xhejtman - 0
Microk8s installation instructions
#180 opened by agnoam - 0
How do I use ALIYUN_COM_GPU_SPECIAL_IDX to specify the host to run on? Using the hostname has no effect
#179 opened by 1003111014 - 0
gpushare detects two GPUs, but only one is used during allocation; the other cannot be used
#178 opened by 1003111014 - 1
- 1
not able to find dev with index
#175 opened by southquist - 3
how to share multiple gpus?
#174 opened by mengwanguc - 5
Pods FailedScheduling with Post "http://127.0.0.1:32766/gpushare-scheduler/filter": EOF
#171 opened by noranraskin