Issues
fail to get response from manager, error rpc error: code = Unknown desc = can't find kubepods-besteffort-pod6f1c7606_fb47_4c34_82aa_9b9966435a65.slice from docker
#184 opened by hyc-yuchen - 2
gaiaGPU compatibility issue with k8s 1.21
#196 opened by ranxuxin001 - 2
Is this project still under maintenance?
#186 opened by panpan0000 - 1
README example description is inconsistent with the code
#181 opened by lijihong0723 - 3
Allocate failed due to rpc error: code = Unknown desc = no free node, which is unexpected
#191 opened by mingkai-yang - 3
Everything is installed, including the test program, but running the test script reports bash: /tmp/cuda-control/src/loader.c: No such file or directory
#194 opened by 1hanpengfei - 4
A really impressive open-source project, but it seems nobody is maintaining it anymore.
#161 opened by ssslkj123 - 2
Pod ignores limits.
#180 opened by valafon - 5
make img failed
#174 opened by yeqiugt - 6
Unable to generate tensorrt in shared mode
#189 opened by vicmeng - 0
/tmp/cuda-control/src/hijack_call.c:471 cuInit error no CUDA-capable device is detected
#188 opened by vicmeng - 0
Model training fails with CUDNN_STATUS_NOT_INITIALIZED
#187 opened by vicmeng - 5
The program starts normally, but when running the test container, executing nvidia-smi inside the container reports the following error
#147 opened by khw934 - 0
Process list read from cgroup is empty, please help take a look
#185 opened by wangkeya - 1
Pod status is UnexpectedAdmissionError
#182 opened by ledrsnet - 1
Is there a Grafana dashboard available?
#156 opened by shenchucheng - 0
Question about hijacking with 11.4
#139 opened by M201972777 - 3
Python code is killed when it calls GPU resources
#172 opened by mikev4 - 0
How to determine primary GPU?
#171 opened by Ashark - 0
The initialization time is too long during mnist test
#170 opened by Natelu - 13
Segmentation fault with cuda 11.3
#169 opened by hzliangbin - 1
Does not seem to work on the newest A100 80G
#143 opened by clennpillo - 17
/tmp/cuda-control/src/loader.c:865 can't find library libcuda.so when using image thomassong/gpu-manager:master
#150 opened by syy532960478 - 1
Support status for newer CUDA versions
#153 opened by thirdcountry - 2
Using gpu-manager with CUDA driver 11.6: nvidia-smi in the container shows Function Not Found under Memory-Usage
#159 opened by WindyLQL - 1
gpu-manager run failed
#168 opened by leofang94 - 0
Does the "GPU topology allocation" feature prefer NVLINK?
#167 opened by wenlong92 - 8
Startup fails on Linux after installing the NVIDIA GPU driver from an rpm package
#165 opened by ly18047121713 - 0
nvidia-smi blocks when gpu-manager is unhealthy
#162 opened by WulixuanS - 1
The image tensorflow-gputest:0.2 can no longer be downloaded; could it be re-uploaded?
#155 opened by khw934 - 0
Compute power overselling
#154 opened by geekidentity - 0
Does this support virtualization of NVIDIA TESLA K20 and K40 GPUs?
#152 opened by khw934 - 1
Service fails to start and exits partway through
#146 opened by khw934 - 3
Is there any risk in removing the check that the nodes selected by the manager and the scheduler are consistent?
#142 opened by jdhxf - 1
kubelet_internal_checkpoint does not exist
#148 opened by xing0821 - 0
1.8.1: Pod creation fails, No such file or directory: unknown
#149 opened by FontTian - 0
Service startup error; the log shows: Unable to set Type=notify in systemd service file
#145 opened by khw934 - 0
GeForce RTX 3090 multiple pods scheduling problem
#141 opened by dwbxm - 5
GeForce RTX 3090 GPU Pod created, output error
#138 opened by dwbxm - 3
The GPU ID in the gpu-admission and gpu-manager sorting functions is inconsistent
#136 opened by fighterhit