Rabbit DirectiveBreakdown AllocateSingleServer not working
matthew-richerson opened this issue · 0 comments
Creating a standalone MGS through Flux doesn't work: the "Servers" resource is filled in with an empty "storage" field.
A standalone MGS can be created with "standaloneMgtPoolName" set to the pool name in the NnfStorageProfile. When a "#dw create_persistent type=lustre ..." workflow is created, a DirectiveBreakdown is made with a single allocation set asking for space for the MGT.
NnfStorageProfile:
...
data:
  default: false
  lustreStorage:
    capacityMgt: 5GiB
    combinedMgtMdt: false
    exclusiveMdt: false
    standaloneMgtPoolName: test-pool
...
DirectiveBreakdown:
...
status:
  ready: true
  storage:
    allocationSets:
    - allocationStrategy: AllocateSingleServer
      constraints:
        colocation:
        - key: lustre-mgt
          type: exclusive
        count: 1
        labels:
        - dataworkflowservices.github.io/storage=Rabbit
      label: mgt
      minimumCapacity: 5368709120
Flux fills in the "Servers" resource as follows:
"spec": {
  "allocationSets": [
    {
      "allocationSize": 5368709120,
      "label": "mgt",
      "storage": []
    }
  ]
},
From looking at the flux-coral2 code, it looks like the AllocateSingleServer case is not handled here: https://github.com/flux-framework/flux-coral2/blob/master/src/python/flux_k8s/directivebreakdown.py#L34, resulting in the empty "storage" field.
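For illustration only, here is a rough sketch of what handling AllocateSingleServer could look like. It is not the flux-coral2 implementation: the function name `build_single_server_entry`, the rabbit name `rabbit-node-1`, and the choice of simply taking the first available rabbit are all assumptions. The `name`/`allocationCount` shape of each "storage" entry is also assumed and should be checked against the DWS Servers definition. The point is that the whole `minimumCapacity` goes on one server, and an unrecognized strategy raises an error instead of leaving "storage" empty.

```python
def build_single_server_entry(breakdown_alloc_set, rabbit_names):
    """Build one Servers allocation-set entry for AllocateSingleServer.

    Hypothetical helper: `breakdown_alloc_set` is one item from the
    DirectiveBreakdown's status.storage.allocationSets, and `rabbit_names`
    is the list of rabbits available to the job.
    """
    strategy = breakdown_alloc_set["allocationStrategy"]
    if strategy != "AllocateSingleServer":
        # Fail loudly rather than emitting an empty "storage" list.
        raise ValueError(f"unhandled allocationStrategy: {strategy}")
    if not rabbit_names:
        raise ValueError("no rabbit available for a single-server allocation")
    return {
        "allocationSize": breakdown_alloc_set["minimumCapacity"],
        "label": breakdown_alloc_set["label"],
        # Assumption: one allocation of the full size on the first rabbit.
        "storage": [{"name": rabbit_names[0], "allocationCount": 1}],
    }


# Allocation set taken from the DirectiveBreakdown shown above.
mgt_set = {
    "allocationStrategy": "AllocateSingleServer",
    "label": "mgt",
    "minimumCapacity": 5368709120,
}
entry = build_single_server_entry(mgt_set, ["rabbit-node-1"])
```

With the MGT allocation set from this issue, `entry["storage"]` would contain a single rabbit instead of being empty.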
I haven't tested it, but this would probably also cause a problem for a Lustre file system that allocated an MGT as part of the job (i.e., ExternalMgs not set).