Incorrect arguments passed to instance placement scriptlet on instance move
victoitor opened this issue · 2 comments
When `incus move` is used to move an instance between projects in a cluster, the arguments used to call the placement scriptlet are incorrect.
I have 3 projects with the following cluster group restrictions.
victoitor@bastion:~$ incus project get intel-12700 restricted.cluster.groups
intel-12700
victoitor@bastion:~$ incus project get amd-5700g restricted.cluster.groups
amd-5700g
victoitor@bastion:~$ incus project get auxiliar restricted.cluster.groups
amd-5700g
And the following cluster groups.
victoitor@bastion:~$ incus cluster group show amd-5700g
description: ""
members:
- amd01
- amd02
- amd03
- amd04
config: {}
name: amd-5700g
victoitor@bastion:~$ incus cluster group show intel-12700
description: ""
members:
- intel01
- intel02
- intel03
config: {}
name: intel-12700
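To make the expectation concrete, here is a minimal Python sketch of the candidate set one might expect for each project, given the restrictions and groups above. The data is copied from the command output; `expected_candidates` is a hypothetical helper written for illustration and is not how Incus computes candidates internally. It also assumes `restricted.cluster.groups` may hold a comma-separated list.

```python
# Cluster group membership, copied from `incus cluster group show` above.
CLUSTER_GROUPS = {
    "amd-5700g": ["amd01", "amd02", "amd03", "amd04"],
    "intel-12700": ["intel01", "intel02", "intel03"],
}

# Value of restricted.cluster.groups for each project, copied from above.
PROJECT_RESTRICTED_GROUPS = {
    "intel-12700": "intel-12700",
    "amd-5700g": "amd-5700g",
    "auxiliar": "amd-5700g",
}


def expected_candidates(project):
    """Hypothetical helper: members of every group the project may use."""
    members = []
    # Assumption: the setting may list several groups separated by commas.
    for group in PROJECT_RESTRICTED_GROUPS[project].split(","):
        for member in CLUSTER_GROUPS[group.strip()]:
            if member not in members:
                members.append(member)
    return members


for project in PROJECT_RESTRICTED_GROUPS:
    print(project, expected_candidates(project))
```

Under this reading, a move into auxiliar or amd-5700g would be expected to offer all four amd members, and a move into intel-12700 all three intel members.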
I have a scriptlet with the following part for logging the input.
def instance_placement(request, candidate_members):
    project = get_project(request.project)
    log_error("SCRIPTLET DEBUG Request: ", request, "\nSCRIPTLET DEBUG Candidade members: ", candidate_members, "\nSCRIPTLET DEBUG Project: ", project)
So I create an instance in project auxiliar and use `incus move` to move it between all possible pairs of projects, which gives the following sequence of commands and logs. The set of candidate members always includes just one node instead of the full cluster group. Furthermore, sometimes the reported project is incorrect: when moving from amd-5700g to auxiliar, the request still shows amd-5700g as the project, which is quite awkward.
victoitor@bastion:~$ incus move incus-test --project auxiliar --target-project amd-5700g
ERROR [2024-10-09T14:34:41-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "x86_64", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n groups: sudo, video, render\n name: pargo\n lock_passwd: true\n sudo: ALL=(ALL) NOPASSWD:ALL\n shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "0-5,8-13", "limits.memory": "24GB", "security.nesting": "true", "user.responsavel": "Incus Test", "volatile.apply_template": "create", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.eth0.hwaddr": "00:16:3e:c0:aa:18"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": ["default"], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "copy", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "incus-test", "live": False, "instance_only": False, "refresh": False, "project": "auxiliar", "allow_inconsistent": False}, "instance_type": "", "type": "container", "start": False, "reason": "new", "project": "amd-5700g"}
SCRIPTLET DEBUG Candidade members: [{"roles": [], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-5,8-13", "user.experimentos.limits.memory": "24GB"}, "groups": ["default", "amd-5700g"], "server_name": "amd02", "url": "https://10.11.16.12:8443", "database": False, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.cluster.groups": "amd-5700g", "restricted.cluster.target": "allow", "restricted.containers.nesting": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "0-5,8-13", "user.node.limits.cpu.unique": "true", "user.node.limits.memory": "24GB", "user.node.represented": "true", "user.node.represented.unique": "true"}, "description": "Experimentos - máquinas amd-5700g", "name": "amd-5700g", "used_by": []}
victoitor@bastion:~$ incus move incus-test --project amd-5700g --target-project auxiliar
ERROR [2024-10-09T14:35:53-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n groups: sudo, video, render\n name: pargo\n lock_passwd: true\n sudo: ALL=(ALL) NOPASSWD:ALL\n shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "0-5,8-13", "limits.memory": "24GB", "security.nesting": "true", "user.responsavel": "Incus Test", "volatile.apply_template": "create", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.cloud-init.instance-id": "58659f9e-ee14-44dc-9669-4f6857d581d3", "volatile.eth0.hwaddr": "00:16:3e:c0:aa:18", "volatile.idmap.base": "0", "volatile.idmap.next": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]", "volatile.last_state.idmap": "[]", "volatile.uuid": "3777865f-e0ec-4e0f-a4db-88fd54d93623", "volatile.uuid.generation": "3777865f-e0ec-4e0f-a4db-88fd54d93623"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": [], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "", "live": False, "instance_only": False, "refresh": False, "project": "", "allow_inconsistent": False}, 
"instance_type": "", "type": "", "start": False, "reason": "relocation", "project": "amd-5700g"}
SCRIPTLET DEBUG Candidade members: [{"roles": [], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-5,8-13", "user.experimentos.limits.memory": "24GB"}, "groups": ["default", "amd-5700g"], "server_name": "amd02", "url": "https://10.11.16.12:8443", "database": False, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.cluster.groups": "amd-5700g", "restricted.cluster.target": "allow", "restricted.containers.nesting": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "0-5,8-13", "user.node.limits.cpu.unique": "true", "user.node.limits.memory": "24GB", "user.node.represented": "true", "user.node.represented.unique": "true"}, "description": "Experimentos - máquinas amd-5700g", "name": "amd-5700g", "used_by": []}
victoitor@bastion:~$ incus move incus-test --project auxiliar --target-project intel-12700
ERROR [2024-10-09T14:37:47-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "x86_64", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n groups: sudo, video, render\n name: pargo\n lock_passwd: true\n sudo: ALL=(ALL) NOPASSWD:ALL\n shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "0-15", "limits.memory": "24GB", "security.nesting": "true", "user.responsavel": "Incus Test", "volatile.apply_template": "copy", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.eth0.hwaddr": "00:16:3e:dc:e3:a7"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": ["default"], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "copy", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "incus-test", "live": False, "instance_only": False, "refresh": False, "project": "auxiliar", "allow_inconsistent": False}, "instance_type": "", "type": "container", "start": False, "reason": "new", "project": "intel-12700"}
SCRIPTLET DEBUG Candidade members: [{"roles": ["database"], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-15", "user.experimentos.limits.memory": "24GB"}, "groups": ["intel-12700"], "server_name": "intel01", "url": "https://10.11.16.31:8443", "database": True, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.cluster.groups": "intel-12700", "restricted.cluster.target": "allow", "restricted.containers.nesting": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "0-15", "user.node.limits.cpu.unique": "true", "user.node.limits.memory": "24GB", "user.node.represented": "true", "user.node.represented.unique": "true"}, "description": "Experimentos - máquinas intel-12700", "name": "intel-12700", "used_by": []}
victoitor@bastion:~$ incus move incus-test --project intel-12700 --target-project amd-5700g
ERROR [2024-10-09T14:39:27-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "x86_64", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n groups: sudo, video, render\n name: pargo\n lock_passwd: true\n sudo: ALL=(ALL) NOPASSWD:ALL\n shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "0-5,8-13", "limits.memory": "24GB", "security.nesting": "true", "user.responsavel": "Incus Test", "volatile.apply_template": "copy", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.eth0.hwaddr": "00:16:3e:dc:e3:a7"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": ["default"], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "copy", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "incus-test", "live": False, "instance_only": False, "refresh": False, "project": "intel-12700", "allow_inconsistent": False}, "instance_type": "", "type": "container", "start": False, "reason": "new", "project": "amd-5700g"}
SCRIPTLET DEBUG Candidade members: [{"roles": ["database-leader", "database"], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-5,8-13", "user.experimentos.limits.memory": "24GB"}, "groups": ["default", "amd-5700g"], "server_name": "amd01", "url": "https://10.11.16.11:8443", "database": True, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.cluster.groups": "amd-5700g", "restricted.cluster.target": "allow", "restricted.containers.nesting": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "0-5,8-13", "user.node.limits.cpu.unique": "true", "user.node.limits.memory": "24GB", "user.node.represented": "true", "user.node.represented.unique": "true"}, "description": "Experimentos - máquinas amd-5700g", "name": "amd-5700g", "used_by": []}
victoitor@bastion:~$ incus move incus-test --project amd-5700g --target-project intel-12700
ERROR [2024-10-09T14:40:29-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "x86_64", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n groups: sudo, video, render\n name: pargo\n lock_passwd: true\n sudo: ALL=(ALL) NOPASSWD:ALL\n shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "0-15", "limits.memory": "24GB", "security.nesting": "true", "user.responsavel": "Incus Test", "volatile.apply_template": "copy", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.eth0.hwaddr": "00:16:3e:dc:e3:a7"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": ["default"], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "copy", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "incus-test", "live": False, "instance_only": False, "refresh": False, "project": "amd-5700g", "allow_inconsistent": False}, "instance_type": "", "type": "container", "start": False, "reason": "new", "project": "intel-12700"}
SCRIPTLET DEBUG Candidade members: [{"roles": ["database"], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-15", "user.experimentos.limits.memory": "24GB"}, "groups": ["intel-12700"], "server_name": "intel02", "url": "https://10.11.16.32:8443", "database": True, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.cluster.groups": "intel-12700", "restricted.cluster.target": "allow", "restricted.containers.nesting": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "0-15", "user.node.limits.cpu.unique": "true", "user.node.limits.memory": "24GB", "user.node.represented": "true", "user.node.represented.unique": "true"}, "description": "Experimentos - máquinas intel-12700", "name": "intel-12700", "used_by": []}
victoitor@bastion:~$ incus move incus-test --project intel-12700 --target-project auxiliar
ERROR [2024-10-09T14:42:03-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "x86_64", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n groups: sudo, video, render\n name: pargo\n lock_passwd: true\n sudo: ALL=(ALL) NOPASSWD:ALL\n shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "6-7,14-15", "limits.cpu.allowance": "100%", "limits.memory": "1GiB", "user.responsavel": "Incus Test", "volatile.apply_template": "copy", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.cloud-init.instance-id": "d6bc666a-e91e-48cd-a9aa-bc226762e2be", "volatile.eth0.hwaddr": "00:16:3e:df:b8:14", "volatile.idmap.base": "0", "volatile.idmap.next": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]", "volatile.last_state.idmap": "[]", "volatile.uuid": "539d2bf4-1264-4e8d-a7d6-9ff58d098646", "volatile.uuid.generation": "539d2bf4-1264-4e8d-a7d6-9ff58d098646"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": ["default"], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "copy", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "incus-test", "live": False, "instance_only": False, "refresh": False, "project": "intel-12700", 
"allow_inconsistent": False}, "instance_type": "", "type": "container", "start": False, "reason": "new", "project": "auxiliar"}
SCRIPTLET DEBUG Candidade members: [{"roles": ["database-leader", "database"], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-5,8-13", "user.experimentos.limits.memory": "24GB"}, "groups": ["default", "amd-5700g"], "server_name": "amd01", "url": "https://10.11.16.11:8443", "database": True, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.backups": "allow", "restricted.cluster.groups": "amd-5700g", "restricted.cluster.target": "allow", "restricted.containers.lowlevel": "allow", "restricted.containers.nesting": "allow", "restricted.devices.disk": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "6-7,14-15", "user.node.represented": "true"}, "description": "Montagem e estacionamento", "name": "auxiliar", "used_by": []}
Things actually seem consistent here, just not particularly ideal:
ERROR [2024-11-15T02:27:20Z] [server04] Instance placement scriptlet: [stgraber][relocation] project=restrict-s03, instance=test, candidates=["server01"]
ERROR [2024-11-15T02:27:20Z] [server04] Instance placement scriptlet: [stgraber][new] project=restrict-s01, instance=test, candidates=["server01"]
and then:
ERROR [2024-11-15T02:28:04Z] [server01] Instance placement scriptlet: [stgraber][relocation] project=restrict-s01, instance=test, candidates=["server04", "server03"]
ERROR [2024-11-15T02:28:04Z] [server01] Instance placement scriptlet: [stgraber][new] project=restrict-s03, instance=test, candidates=["server04"]
I don't know why in your case you're only seeing the `new` reason and not the `relocation` one.
Basically, during the move, Incus uses the `relocation` call to determine where the instance should go. At that point the instance still exists in the source project, which is why the scriptlet receives the source project. The candidates passed during `relocation` are the allowed candidates for the target project, and this is the point where the scriptlet can actually make a decision.
Once that decision is made, Incus internally handles the cross-project move, which is effectively a copy followed by a delete. That's why the scriptlet then gets a `new` call, this time with the new project as the target and with no flexibility on the target member, as it has already been decided.
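The two-call sequence described above can be sketched as a scriptlet that branches on `request.reason`. This is only an illustration, written so that it runs as plain Python: `log_info` and `set_target` are defined here as stand-ins for the built-ins that Incus provides to a real scriptlet, the requests and members are hypothetical `SimpleNamespace` objects, and "pick the first candidate" is a placeholder policy, not a recommendation.

```python
from types import SimpleNamespace

# Stand-ins so the sketch runs outside Incus. In a real scriptlet these
# built-ins come from Incus and would not be redefined.
LOGS = []
TARGETS = []


def log_info(*parts):
    LOGS.append("".join(str(p) for p in parts))


def set_target(member_name):
    TARGETS.append(member_name)


def instance_placement(request, candidate_members):
    names = [m.server_name for m in candidate_members]
    # Compact one-line summary instead of dumping the whole request.
    log_info("[", request.reason, "] project=", request.project,
             ", instance=", request.name, ", candidates=", names)

    if request.reason == "relocation":
        # Decision point: request.project is still the source project,
        # while the candidates are those allowed for the target project.
        set_target(names[0])  # placeholder policy
    elif request.reason == "new" and len(names) == 1:
        # Possibly the follow-up call of a cross-project move: the member
        # was already chosen, so there is nothing left to decide.
        set_target(names[0])
    else:
        # A genuinely new instance: same placeholder policy.
        set_target(names[0])
    return


def member(name):
    # Hypothetical helper building a minimal candidate member.
    return SimpleNamespace(server_name=name)


# Replays the second pair of calls from the log excerpt above.
instance_placement(
    SimpleNamespace(reason="relocation", project="restrict-s01", name="test"),
    [member("server04"), member("server03")],
)
instance_placement(
    SimpleNamespace(reason="new", project="restrict-s03", name="test"),
    [member("server04")],
)
print(TARGETS)
print("\n".join(LOGS))
```

A scriptlet cannot tell with certainty that a `new` call with a single candidate is the second half of a move, so the middle branch is a heuristic at best.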
Now ideally we'd be able to:
- Eliminate the following `new` call entirely in this scenario, finding a way to detect that this is an internal move and not a new instance being created
- Alter the call for `relocation` to indicate the target project rather than source
I'll take a look into this now. The project name part should be pretty trivial; eliminating the `new` event will likely be a bit trickier.