gpu-operator: creating a CI with MIG fails with "Insufficient Resources"
asskss opened this issue · 3 comments
mig-config.yaml
mig-configs:
  custom-config:
    - devices: [0]
      mig-enabled: false
    - devices: [1]
      mig-enabled: true
      mig-devices:
        "7g.80gb": 1
    - devices: [2]
      mig-enabled: true
      mig-devices:
        "2g.20gb": 3
    - devices: [3]
      mig-enabled: true
      mig-devices:
        "3g.40gb": 1
        "4g.40gb": 1
    - devices: [4]
      mig-enabled: true
      mig-devices:
        "3g.40gb": 1
        "4g.40gb": 1
    - devices: [5]
      mig-enabled: true
      mig-devices:
        "3g.40gb": 1
        "4g.40gb": 1
    - devices: [6]
      mig-enabled: true
      mig-devices:
        "3g.40gb": 1
        "4g.40gb": 1
    - devices: [7]
      mig-enabled: true
      mig-devices:
        "1g.10gb": 1
        "2g.20gb": 1
        "4g.40gb": 1
nvidia-smi mig -lcip -gi 0
+--------------------------------------------------------------------------------------+
| Compute instance profiles:                                                           |
| GPU     GPU       Name             Profile  Instances   Exclusive       Shared       |
|       Instance                       ID     Free/Total     SM       DEC   ENC   OFA  |
|         ID                                                          CE    JPEG       |
|======================================================================================|
|   0      0       MIG 1c.7g.80gb       0      0/7           14        5     0     1   |
|                                                                      7     1         |
+--------------------------------------------------------------------------------------+
|   0      0       MIG 2c.7g.80gb       1      0/3           28        5     0     1   |
|                                                                      7     1         |
+--------------------------------------------------------------------------------------+
|   0      0       MIG 3c.7g.80gb       2      0/2           42        5     0     1   |
|                                                                      7     1         |
+--------------------------------------------------------------------------------------+
|   0      0       MIG 4c.7g.80gb       3      0/1           56        5     0     1   |
|                                                                      7     1         |
+--------------------------------------------------------------------------------------+
|   0      0       MIG 7g.80gb          4*     0/1           98        5     0     1   |
|                                                                      7     1         |
+--------------------------------------------------------------------------------------+
|   1      0       MIG 1c.7g.80gb       0      0/7           14        5     0     1   |
|                                                                      7     1         |
+--------------------------------------------------------------------------------------+
|   1      0       MIG 2c.7g.80gb       1      0/3           28        5     0     1   |
|                                                                      7     1         |
+--------------------------------------------------------------------------------------+
|   1      0       MIG 3c.7g.80gb       2      0/2           42        5     0     1   |
|                                                                      7     1         |
+--------------------------------------------------------------------------------------+
|   1      0       MIG 4c.7g.80gb       3      0/1           56        5     0     1   |
|                                                                      7     1         |
+--------------------------------------------------------------------------------------+
|   1      0       MIG 7g.80gb          4*     0/1           98        5     0     1   |
|                                                                      7     1         |
+--------------------------------------------------------------------------------------+
nvidia-smi mig -lci -gi 0
+--------------------------------------------------------------------+
| Compute instances:                                                 |
| GPU     GPU       Name             Profile   Instance   Placement  |
|       Instance                       ID        ID       Start:Size |
|         ID                                                         |
|====================================================================|
|   0      0       MIG 7g.80gb          4         0          0:7     |
+--------------------------------------------------------------------+
|   1      0       MIG 7g.80gb          4         0          0:7     |
+--------------------------------------------------------------------+
nvidia-smi mig -cci 2 -gi 0
Unable to create a compute instance on GPU 0 GPU instance ID 0 using profile 2: Insufficient Resources
Failed to create compute instances: Insufficient Resources
I want to create a MIG 3c.7g.80gb compute instance, but the command fails with "Insufficient Resources". How can I solve this?
Note that as shown, both GPUs already have [7c.]7g.80gb partitions created on them. This means that the ADDITIONAL 3c.7g.80gb partition cannot be created.
Note that since you mention the GPU Operator, the use of partitions where c != g is not currently supported there, but it should be in mig-parted.
Could you give more details on your use case?
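To illustrate the explanation above, here is a minimal Python sketch of the compute-slice bookkeeping. It is a simplified model, not how the driver actually works: the class `GpuInstance`, its methods, and the exception `InsufficientResources` are hypothetical names, and the model assumes a 7g.80gb GPU instance has 7 compute slices and ignores placement constraints. Under those assumptions it reproduces the "Free" column of 0 seen in the profile table while the 7-slice compute instance exists.

```python
from dataclasses import dataclass, field

# Assumption: a 7g.80gb GPU instance exposes 7 compute slices.
TOTAL_SLICES = 7


class InsufficientResources(Exception):
    """Raised when the GPU instance has too few free compute slices."""


@dataclass
class GpuInstance:
    # Hypothetical model: compute-instance ID -> number of slices it occupies.
    compute_instances: dict = field(default_factory=dict)

    def free_slices(self) -> int:
        return TOTAL_SLICES - sum(self.compute_instances.values())

    def free_count(self, size: int) -> int:
        """How many more compute instances of `size` slices would still fit."""
        return self.free_slices() // size

    def create_ci(self, size: int) -> int:
        if size > self.free_slices():
            raise InsufficientResources(f"cannot fit a {size}c compute instance")
        ci_id = 0
        while ci_id in self.compute_instances:  # pick the lowest unused ID
            ci_id += 1
        self.compute_instances[ci_id] = size
        return ci_id

    def destroy_ci(self, ci_id: int) -> None:
        del self.compute_instances[ci_id]


def demo() -> list:
    gi = GpuInstance()
    log = []
    full = gi.create_ci(7)            # the existing [7c.]7g.80gb instance
    log.append(gi.free_count(3))      # no room left for a 3c instance
    try:
        gi.create_ci(3)               # mirrors the failing `-cci 2` call
    except InsufficientResources:
        log.append("Insufficient Resources")
    gi.destroy_ci(full)               # free all 7 slices first
    gi.create_ci(4)                   # 4c.7g.80gb
    log.append(gi.free_count(3))      # exactly one 3c instance still fits
    gi.create_ci(3)                   # 3c.7g.80gb
    log.append(gi.free_slices())      # all slices are in use again
    return log


if __name__ == "__main__":
    print(demo())
```

On real hardware, the analogous step would presumably be to destroy the existing compute instance (for example with `nvidia-smi mig -dci`) before creating the 4c and 3c instances. Whether the GPU Operator tolerates that manual change is a separate question, as noted above.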
Thanks. My use case is Kubernetes (K8s).
- What I need: I would like to deploy two models side by side on one GPU, using 4c.7g.80gb and 3c.7g.80gb compute instances. Do compute instances in this mode share GPU memory?
- If two models are deployed on one card, are they scheduled fairly, or can they preempt each other freely when using CUDA?
https://forums.developer.nvidia.com/t/error-creating-cis-with-mig-on-nvidia-a30/241955
I also saw this forum thread, but I'm not sure whether it's relevant.