LINBIT/addon-linstor

Live Migration failing

ulide4 opened this issue · 11 comments

Hi again,

Live migration between two OpenNebula KVM hosts is failing.
Add-on version 0.92
linstor-controller version 0.7.3
OpenNebula 5.6.1
OS: Ubuntu 18.04
Datastore: Autoplace, 2 Copies, clonetype: copy

Error from OpenNebula Log:
Wed Jan 30 08:09:43 2019 [Z0][VM][I]: New LCM state is MIGRATE
Wed Jan 30 08:09:50 2019 [Z0][VMM][I]: Successfully execute transfer manager driver operation: tm_premigrate.
Wed Jan 30 08:09:52 2019 [Z0][VMM][I]: ExitCode: 0
Wed Jan 30 08:09:52 2019 [Z0][VMM][I]: Successfully execute network driver operation: pre.
Wed Jan 30 08:09:55 2019 [Z0][VMM][I]: Command execution fail: cat << EOT | /var/tmp/one/vmm/kvm/migrate 'one-255' 'one2' 'one1' 255 one1
Wed Jan 30 08:09:55 2019 [Z0][VMM][E]: migrate: Command "virsh --connect qemu:///system migrate --live one-255 qemu+ssh://one2/system" failed: error: Cannot access storage file '/var/lib/one//datastores/104/255/disk.1': No such file or directory
Wed Jan 30 08:09:55 2019 [Z0][VMM][E]: Could not migrate one-255 to one2
Wed Jan 30 08:09:55 2019 [Z0][VMM][I]: ExitCode: 1

At the same time the linstor-controller shows the following error:
LINSTOR ==> error-reports show 5C49A05B-00000-000022
ERROR REPORT 5C49A05B-00000-000022

============================================================

Application: LINBIT® LINSTOR
Module: Controller
Version: 0.7.5
Build ID: d74305b420fdc878182afa162378a317e6a4a3b9
Build time: 2018-12-21T09:05:14+00:00
Error time: 2019-01-30 08:09:47
Node: sun1
Peer: 0:0:0:0:0:0:0:1:55948

============================================================

Reported error:

Category: Exception
Class name: InvalidNameException
Class canonical name: com.linbit.InvalidNameException
Generated at: Method 'nameCheck', Source file 'Checks.java', Line #71

Error message: Invalid name: Name length 0 is less than minimum length 2

Error context:
The specified resource name '' is invalid.

Call backtrace:

Method                                   Native Class:Line number
nameCheck                                N      com.linbit.Checks:71
<init>                                   N      com.linbit.linstor.ResourceName:33
asRscName                                N      com.linbit.linstor.LinstorParsingUtils:127
createRscDfn                             N      com.linbit.linstor.core.apicallhandler.controller.CtrlRscDfnApiCallHandler:383
createResourceDefinition                 N      com.linbit.linstor.core.apicallhandler.controller.CtrlRscDfnApiCallHandler:136
createResourceDefinition                 N      com.linbit.linstor.core.apicallhandler.controller.CtrlApiCallHandler:248
execute                                  N      com.linbit.linstor.api.protobuf.controller.CreateResourceDefinition:54
executeNonReactive                       N      com.linbit.linstor.proto.CommonMessageProcessor:520
lambda$execute$13                        N      com.linbit.linstor.proto.CommonMessageProcessor:496
doInScope                                N      com.linbit.linstor.core.apicallhandler.ScopeRunner:141
lambda$fluxInScope$0                     N      com.linbit.linstor.core.apicallhandler.ScopeRunner:71
call                                     N      reactor.core.publisher.MonoCallable:92
trySubscribeScalarMap                    N      reactor.core.publisher.FluxFlatMap:126
subscribe                                N      reactor.core.publisher.MonoFlatMapMany:46
subscribe                                N      reactor.core.publisher.Flux:6877
onNext                                   N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:184
request                                  N      reactor.core.publisher.Operators$ScalarSubscription:1640
onSubscribe                              N      reactor.core.publisher.MonoFlatMapMany$FlatMapManyMain:131
subscribe                                N      reactor.core.publisher.MonoCurrentContext:35
subscribe                                N      reactor.core.publisher.MonoFlatMapMany:49
subscribe                                N      reactor.core.publisher.FluxOnAssembly:252
subscribe                                N      reactor.core.publisher.FluxOnAssembly:252
subscribe                                N      reactor.core.publisher.FluxOnErrorResume:47
subscribe                                N      reactor.core.publisher.FluxOnErrorResume:47
subscribe                                N      reactor.core.publisher.FluxOnErrorResume:47
subscribe                                N      reactor.core.publisher.FluxOnErrorResume:47
subscribe                                N      reactor.core.publisher.FluxContextStart:49
subscribe                                N      reactor.core.publisher.Flux:6877
onNext                                   N      reactor.core.publisher.FluxFlatMap$FlatMapMain:372
slowPath                                 N      reactor.core.publisher.FluxArray$ArraySubscription:126
request                                  N      reactor.core.publisher.FluxArray$ArraySubscription:99
onSubscribe                              N      reactor.core.publisher.FluxFlatMap$FlatMapMain:332
subscribe                                N      reactor.core.publisher.FluxMerge:70
subscribe                                N      reactor.core.publisher.FluxOnErrorResume:47
subscribe                                N      reactor.core.publisher.Flux:6877
onComplete                               N      reactor.core.publisher.FluxConcatArray$ConcatArraySubscriber:208
subscribe                                N      reactor.core.publisher.FluxConcatArray:81
subscribe                                N      reactor.core.publisher.FluxPeek:83
subscribe                                N      reactor.core.publisher.FluxPeek:83
subscribe                                N      reactor.core.publisher.FluxPeek:83
subscribe                                N      reactor.core.publisher.FluxOnErrorResume:47
subscribe                                N      reactor.core.publisher.FluxDefer:55
subscribe                                N      reactor.core.publisher.Flux:6877
onNext                                   N      reactor.core.publisher.FluxFlatMap$FlatMapMain:372
drainAsync                               N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:391
drain                                    N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:633
onNext                                   N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:238
drainFused                               N      reactor.core.publisher.UnicastProcessor:234
drain                                    N      reactor.core.publisher.UnicastProcessor:267
onNext                                   N      reactor.core.publisher.UnicastProcessor:343
next                                     N      reactor.core.publisher.FluxCreate$IgnoreSink:573
next                                     N      reactor.core.publisher.FluxCreate$SerializedSink:151
processInOrder                           N      com.linbit.linstor.netcom.TcpConnectorPeer:361
doProcessMessage                         N      com.linbit.linstor.proto.CommonMessageProcessor:215
lambda$processMessage$2                  N      com.linbit.linstor.proto.CommonMessageProcessor:161
onNext                                   N      reactor.core.publisher.FluxPeek$PeekSubscriber:177
runAsync                                 N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:396
run                                      N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:480
call                                     N      reactor.core.scheduler.WorkerTask:84
call                                     N      reactor.core.scheduler.WorkerTask:37
run                                      N      java.util.concurrent.FutureTask:264
run                                      N      java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask:304
runWorker                                N      java.util.concurrent.ThreadPoolExecutor:1135
run                                      N      java.util.concurrent.ThreadPoolExecutor$Worker:635
run                                      N      java.lang.Thread:844

END OF ERROR REPORT.

Live migration worked before, with an older version of the add-on.

Best Regards
Uli

rp- commented

Could you also provide the output of journalctl -u opennebula for the failing premigrate, or anything that looks suspicious around that time?
Thanks!

There you go:
Jan 30 08:09:44 sun1 premigrate[31685]: INFO Entering tm/premigrate from one1 to one2 in /var/lib/one//datastores/104/255
Jan 30 08:09:44 sun1 premigrate[31685]: INFO running shell command: bash -c source /var/lib/one/remotes//tm/tm_common.sh && arg_host one2
Jan 30 08:09:44 sun1 premigrate[31685]: INFO running shell command: bash -c source /var/lib/one/remotes//tm/tm_common.sh && arg_host one1
Jan 30 08:09:44 sun1 premigrate[31685]: INFO running shell command: bash -c source /var/lib/one/remotes//tm/tm_common.sh && arg_path /var/lib/one//datastores/
Jan 30 08:09:44 sun1 premigrate[31685]: INFO running shell command: onedatastore show --xml 104
Jan 30 08:09:49 sun1 premigrate[31685]: Traceback (most recent call last):
File "/var/lib/one/remotes/tm/ssh/../linstor/premigrate", line 137, in
main()
File "/var/lib/one/remotes/tm/ssh/../linstor/premigrate", line 52, in main
res.activate(DST_HOST)
File "/usr/lib/python2.7/dist-packages/linstor/resource.py", line 185, in wrapper
self._maybe_create_rd()
File "/usr/lib/python2.7/dist-packages/linstor/resource.py", line 227, in _maybe_create_rd
.format(self._name, rs[0]))
LinstorError: Error: Could not create resource definition : The specified resource name '' is invalid.
Jan 30 08:14:26 sun1 monitor[31858]: INFO Entering datastore monitor.

rp- commented

Seems the SOURCE isn't set for the disk in the template; I'm not sure how that happens.
Could you also show me the output of onevm show --xml 255?
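
For context, a minimal sketch of the failure path from the traceback above, assuming the LINSTOR resource name is taken from the disk's SOURCE attribute (this is not the addon's exact code; the empty SOURCE and DST_HOST values are placeholders):

```python
# Sketch only, not the addon's premigrate script.
# Assumption: the resource name is derived from the disk's SOURCE attribute.
import linstor

SOURCE = ""        # disk.1 apparently provides no SOURCE at all
DST_HOST = "one2"  # migration target from the log

res = linstor.Resource(SOURCE)  # the empty name is passed straight through
res.activate(DST_HOST)          # raises LinstorError: resource name '' is invalid
```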

SOURCE seems to be set to "OpenNebula-Image-104"...

<VM> <ID>255</ID> <UID>3</UID> <GID>102</GID> <UNAME>su_ad_prod</UNAME> <GNAME>manage_ad_prod</GNAME> <NAME>prom0</NAME> <PERMISSIONS> <OWNER_U>0</OWNER_U> <OWNER_M>0</OWNER_M> <OWNER_A>0</OWNER_A> <GROUP_U>1</GROUP_U> <GROUP_M>1</GROUP_M> <GROUP_A>0</GROUP_A> <OTHER_U>0</OTHER_U> <OTHER_M>0</OTHER_M> <OTHER_A>0</OTHER_A> </PERMISSIONS> <LAST_POLL>1548846264</LAST_POLL> <STATE>3</STATE> <LCM_STATE>3</LCM_STATE> <PREV_STATE>3</PREV_STATE> <PREV_LCM_STATE>3</PREV_LCM_STATE> <RESCHED>0</RESCHED> <STIME>1548831404</STIME> <ETIME>0</ETIME> <DEPLOY_ID>one-255</DEPLOY_ID> <MONITORING> <CPU><![CDATA[15.05]]></CPU> <DISKRDBYTES><![CDATA[535368496]]></DISKRDBYTES> <DISKRDIOPS><![CDATA[14582]]></DISKRDIOPS> <DISKWRBYTES><![CDATA[504931328]]></DISKWRBYTES> <DISKWRIOPS><![CDATA[5496]]></DISKWRIOPS> <MEMORY><![CDATA[1099652]]></MEMORY> <NETRX><![CDATA[43342499]]></NETRX> <NETTX><![CDATA[573370]]></NETTX> <STATE><![CDATA[a]]></STATE> </MONITORING> <TEMPLATE> <AUTOMATIC_DS_REQUIREMENTS><![CDATA[("CLUSTERS/ID" @> 100)]]></AUTOMATIC_DS_REQUIREMENTS> <AUTOMATIC_REQUIREMENTS><![CDATA[(CLUSTER_ID = 100) & !(PUBLIC_CLOUD = YES)]]></AUTOMATIC_REQUIREMENTS> <CONTEXT> <ANSIBLE_HOST><![CDATA[YES]]></ANSIBLE_HOST> <ANSIBLE_MANAGED><![CDATA[YES]]></ANSIBLE_MANAGED> <ANSIBLE_ONE_ALIAS><![CDATA[observeiaasfoss]]></ANSIBLE_ONE_ALIAS> <DESCRIPTION><![CDATA[]]></DESCRIPTION> <DISK_ID><![CDATA[2]]></DISK_ID> <DOMAIN><![CDATA[.on.ps.loc]]></DOMAIN> <ETH0_CONTEXT_FORCE_IPV4><![CDATA[]]></ETH0_CONTEXT_FORCE_IPV4> <ETH0_DNS><![CDATA[192.168.1.122 192.168.1.121]]></ETH0_DNS> <ETH0_GATEWAY><![CDATA[10.16.68.1]]></ETH0_GATEWAY> <ETH0_GATEWAY6><![CDATA[]]></ETH0_GATEWAY6> <ETH0_IP><![CDATA[10.16.68.14]]></ETH0_IP> <ETH0_IP6><![CDATA[]]></ETH0_IP6> <ETH0_IP6_PREFIX_LENGTH><![CDATA[]]></ETH0_IP6_PREFIX_LENGTH> <ETH0_IP6_ULA><![CDATA[]]></ETH0_IP6_ULA> <ETH0_MAC><![CDATA[02:00:0a:10:44:0e]]></ETH0_MAC> <ETH0_MASK><![CDATA[255.255.255.0]]></ETH0_MASK> <ETH0_MTU><![CDATA[1450]]></ETH0_MTU> <ETH0_NETWORK><![CDATA[10.16.68.0]]></ETH0_NETWORK> <ETH0_SEARCH_DOMAIN><![CDATA[]]></ETH0_SEARCH_DOMAIN> <ETH0_VLAN_ID><![CDATA[5]]></ETH0_VLAN_ID> <ETH0_VROUTER_IP><![CDATA[]]></ETH0_VROUTER_IP> <ETH0_VROUTER_IP6><![CDATA[]]></ETH0_VROUTER_IP6> <ETH0_VROUTER_MANAGEMENT><![CDATA[]]></ETH0_VROUTER_MANAGEMENT> <NETWORK><![CDATA[YES]]></NETWORK> <ONEGATE_ENDPOINT><![CDATA[http://192.168.1.3:5030]]></ONEGATE_ENDPOINT> <REPORT_READY><![CDATA[YES]]></REPORT_READY> <SET_HOSTNAME><![CDATA[srv255.on.ps.loc]]></SET_HOSTNAME> <SSH_PUBLIC_KEY><![CDATA[ssh-rsa AAAAB3NzaC1yc2EBBBBBQABAAABAQDo13LpJKiqpAEhQ9x9UVwWOwaXZca/YIZqZe1ciEeDCtx0n5dUDBbVhJ5c0B2DG92N0KOIxydIYyl4VlbNsUMcspbP6BR9OswFWzX3MbsxDhDSmNFTBeCAE29FkOvib7a8hyQj2DPApbkysqeCDTOT1wUdYmnniPfuoMZLGns1iP+kdsMP99dGsXElwrHPfnJEKwYxUB+xt37Yam9m9qhdqPd5VyYcRQa13A7dk8ZBbXkkIKA0+quJKXNWDypUtWNhUvTNw5AyGQWtjBXalwV+nageVaHFQDgfUx+kx+nnlKay7iJ8AzXFAMW2Ir74GucM8ZzXmFKlJkTyFSBjhiyh]]></SSH_PUBLIC_KEY> <TARGET><![CDATA[hda]]></TARGET> <TOKEN><![CDATA[YES]]></TOKEN> <USERNAME><![CDATA[ansible]]></USERNAME> <VMID><![CDATA[255]]></VMID> </CONTEXT> <CPU><![CDATA[0.2]]></CPU> <CREATED_BY><![CDATA[3]]></CREATED_BY> <DISK> <ALLOW_ORPHANS><![CDATA[YES]]></ALLOW_ORPHANS> <CLONE><![CDATA[YES]]></CLONE> <CLONE_TARGET><![CDATA[SELF]]></CLONE_TARGET> <CLUSTER_ID><![CDATA[100]]></CLUSTER_ID> <DATASTORE><![CDATA[local-image_ds_linstor_a2copy]]></DATASTORE> <DATASTORE_ID><![CDATA[113]]></DATASTORE_ID> <DEV_PREFIX><![CDATA[sd]]></DEV_PREFIX> <DISCARD><![CDATA[unmap]]></DISCARD> <DISK_ID><![CDATA[0]]></DISK_ID> 
<DISK_SNAPSHOT_TOTAL_SIZE><![CDATA[0]]></DISK_SNAPSHOT_TOTAL_SIZE> <DISK_TYPE><![CDATA[BLOCK]]></DISK_TYPE> <DRIVER><![CDATA[qcow2]]></DRIVER> <IMAGE><![CDATA[ubuntu18_u1901_linauto2copy]]></IMAGE> <IMAGE_ID><![CDATA[104]]></IMAGE_ID> <IMAGE_STATE><![CDATA[2]]></IMAGE_STATE> <IMAGE_UNAME><![CDATA[oneadmin]]></IMAGE_UNAME> <IO><![CDATA[threads]]></IO> <LN_TARGET><![CDATA[NONE]]></LN_TARGET> <ORDER><![CDATA[1]]></ORDER> <ORIGINAL_SIZE><![CDATA[20480]]></ORIGINAL_SIZE> <READONLY><![CDATA[NO]]></READONLY> <SAVE><![CDATA[NO]]></SAVE> <SIZE><![CDATA[20480]]></SIZE> <SOURCE><![CDATA[OpenNebula-Image-104]]></SOURCE> <TARGET><![CDATA[sda]]></TARGET> <TM_MAD><![CDATA[linstor]]></TM_MAD> <TYPE><![CDATA[BLOCK]]></TYPE> </DISK> <DISK> <ALLOW_ORPHANS><![CDATA[NO]]></ALLOW_ORPHANS> <CLUSTER_ID><![CDATA[100]]></CLUSTER_ID> <DATASTORE><![CDATA[local-system_ds_ssh_u]]></DATASTORE> <DATASTORE_ID><![CDATA[104]]></DATASTORE_ID> <DEV_PREFIX><![CDATA[hd]]></DEV_PREFIX> <DISK_ID><![CDATA[1]]></DISK_ID> <DISK_TYPE><![CDATA[FILE]]></DISK_TYPE> <FORMAT><![CDATA[raw]]></FORMAT> <SIZE><![CDATA[1024]]></SIZE> <TARGET><![CDATA[hdb]]></TARGET> <TM_MAD><![CDATA[ssh]]></TM_MAD> <TYPE><![CDATA[swap]]></TYPE> </DISK> <FEATURES> <GUEST_AGENT><![CDATA[yes]]></GUEST_AGENT> <VIRTIO_SCSI_QUEUES><![CDATA[1]]></VIRTIO_SCSI_QUEUES> </FEATURES> <GRAPHICS> <LISTEN><![CDATA[0.0.0.0]]></LISTEN> <PORT><![CDATA[6156]]></PORT> <TYPE><![CDATA[VNC]]></TYPE> </GRAPHICS> <INPUT> <BUS><![CDATA[usb]]></BUS> <TYPE><![CDATA[tablet]]></TYPE> </INPUT> <MEMORY><![CDATA[1024]]></MEMORY> <NIC> <AR_ID><![CDATA[0]]></AR_ID> <BRIDGE><![CDATA[onebr3]]></BRIDGE> <CLUSTER_ID><![CDATA[100]]></CLUSTER_ID> <IP><![CDATA[10.16.68.14]]></IP> <MAC><![CDATA[02:00:0a:10:44:0e]]></MAC> <MODEL><![CDATA[virtio]]></MODEL> <MTU><![CDATA[1500]]></MTU> <NETWORK><![CDATA[AD_srv]]></NETWORK> <NETWORK_ID><![CDATA[12]]></NETWORK_ID> <NETWORK_UNAME><![CDATA[oneadmin]]></NETWORK_UNAME> <NIC_ID><![CDATA[0]]></NIC_ID> <PARENT_NETWORK_ID><![CDATA[3]]></PARENT_NETWORK_ID> <PHYDEV><![CDATA[enp3s0]]></PHYDEV> <SECURITY_GROUPS><![CDATA[0]]></SECURITY_GROUPS> <TARGET><![CDATA[one-255-0]]></TARGET> <VLAN_ID><![CDATA[5]]></VLAN_ID> <VN_MAD><![CDATA[vxlan]]></VN_MAD> </NIC> <NIC_DEFAULT> <MODEL><![CDATA[virtio]]></MODEL> </NIC_DEFAULT> <OS> <ARCH><![CDATA[x86_64]]></ARCH> <BOOT><![CDATA[disk0]]></BOOT> </OS> <SECURITY_GROUP_RULE> <PROTOCOL><![CDATA[ALL]]></PROTOCOL> <RULE_TYPE><![CDATA[OUTBOUND]]></RULE_TYPE> <SECURITY_GROUP_ID><![CDATA[0]]></SECURITY_GROUP_ID> <SECURITY_GROUP_NAME><![CDATA[default]]></SECURITY_GROUP_NAME> </SECURITY_GROUP_RULE> <SECURITY_GROUP_RULE> <PROTOCOL><![CDATA[ALL]]></PROTOCOL> <RULE_TYPE><![CDATA[INBOUND]]></RULE_TYPE> <SECURITY_GROUP_ID><![CDATA[0]]></SECURITY_GROUP_ID> <SECURITY_GROUP_NAME><![CDATA[default]]></SECURITY_GROUP_NAME> </SECURITY_GROUP_RULE> <TEMPLATE_ID><![CDATA[21]]></TEMPLATE_ID> <VCPU><![CDATA[1]]></VCPU> <VMID><![CDATA[255]]></VMID> </TEMPLATE> <USER_TEMPLATE> <ANSIBLE_ONE_ALIAS><![CDATA[observeiaasfoss]]></ANSIBLE_ONE_ALIAS> <DOMAIN><![CDATA[.on.ps.loc]]></DOMAIN> <ERROR><![CDATA[Wed Jan 30 08:09:55 2019 : Error live migrating VM: Could not migrate one-255 to one2]]></ERROR> <HYPERVISOR><![CDATA[kvm]]></HYPERVISOR> <INPUTS_ORDER><![CDATA[DESCRIPTION,DOMAIN,ANSIBLE_ONE_ALIAS]]></INPUTS_ORDER> <LABELS><![CDATA[config/dns_service,config/os_basics,config/update_os,config/monitor_service]]></LABELS> <LOGO><![CDATA[images/logos/ubuntu.png]]></LOGO> <MEMORY_UNIT_COST><![CDATA[MB]]></MEMORY_UNIT_COST> <READY><![CDATA[YES]]></READY> 
<SCHED_DS_REQUIREMENTS><![CDATA[ID="104"]]></SCHED_DS_REQUIREMENTS> <SERVERPREFIX><![CDATA[srv]]></SERVERPREFIX> <SERVICE_NAME><![CDATA[prometheus]]></SERVICE_NAME> <USER_INPUTS> <ANSIBLE_ONE_ALIAS><![CDATA[O|text|DNS Alias]]></ANSIBLE_ONE_ALIAS> <CPU><![CDATA[O|fixed|| |1]]></CPU> <DESCRIPTION><![CDATA[O|text|Description]]></DESCRIPTION> <DOMAIN><![CDATA[M|list|DNS domain|.on.ps.loc|.on.ps.loc]]></DOMAIN> <MEMORY><![CDATA[M|range||128..4096|1024]]></MEMORY> <VCPU><![CDATA[O|range||1..4|1]]></VCPU> </USER_INPUTS> </USER_TEMPLATE> <HISTORY_RECORDS> <HISTORY> <OID>255</OID> <SEQ>0</SEQ> <HOSTNAME>one1</HOSTNAME> <HID>3</HID> <CID>100</CID> <STIME>1548831425</STIME> <ETIME>1548831600</ETIME> <VM_MAD><![CDATA[kvm]]></VM_MAD> <TM_MAD><![CDATA[ssh]]></TM_MAD> <DS_ID>104</DS_ID> <PSTIME>1548831425</PSTIME> <PETIME>1548831488</PETIME> <RSTIME>1548831488</RSTIME> <RETIME>1548831600</RETIME> <ESTIME>0</ESTIME> <EETIME>0</EETIME> <ACTION>2</ACTION> <UID>0</UID> <GID>0</GID> <REQUEST_ID>1792</REQUEST_ID> </HISTORY> <HISTORY> <OID>255</OID> <SEQ>1</SEQ> <HOSTNAME>one3</HOSTNAME> <HID>2</HID> <CID>100</CID> <STIME>1548831587</STIME> <ETIME>1548831600</ETIME> <VM_MAD><![CDATA[kvm]]></VM_MAD> <TM_MAD><![CDATA[ssh]]></TM_MAD> <DS_ID>104</DS_ID> <PSTIME>0</PSTIME> <PETIME>0</PETIME> <RSTIME>0</RSTIME> <RETIME>0</RETIME> <ESTIME>0</ESTIME> <EETIME>0</EETIME> <ACTION>2</ACTION> <UID>0</UID> <GID>0</GID> <REQUEST_ID>1792</REQUEST_ID> </HISTORY> <HISTORY> <OID>255</OID> <SEQ>2</SEQ> <HOSTNAME>one1</HOSTNAME> <HID>3</HID> <CID>100</CID> <STIME>1548831600</STIME> <ETIME>1548832195</ETIME> <VM_MAD><![CDATA[kvm]]></VM_MAD> <TM_MAD><![CDATA[ssh]]></TM_MAD> <DS_ID>104</DS_ID> <PSTIME>0</PSTIME> <PETIME>0</PETIME> <RSTIME>1548831600</RSTIME> <RETIME>1548832195</RETIME> <ESTIME>0</ESTIME> <EETIME>0</EETIME> <ACTION>2</ACTION> <UID>0</UID> <GID>0</GID> <REQUEST_ID>7072</REQUEST_ID> </HISTORY> <HISTORY> <OID>255</OID> <SEQ>3</SEQ> <HOSTNAME>one2</HOSTNAME> <HID>1</HID> <CID>100</CID> <STIME>1548832183</STIME> <ETIME>1548832195</ETIME> <VM_MAD><![CDATA[kvm]]></VM_MAD> <TM_MAD><![CDATA[ssh]]></TM_MAD> <DS_ID>104</DS_ID> <PSTIME>0</PSTIME> <PETIME>0</PETIME> <RSTIME>0</RSTIME> <RETIME>0</RETIME> <ESTIME>0</ESTIME> <EETIME>0</EETIME> <ACTION>2</ACTION> <UID>0</UID> <GID>0</GID> <REQUEST_ID>7072</REQUEST_ID> </HISTORY> <HISTORY> <OID>255</OID> <SEQ>4</SEQ> <HOSTNAME>one1</HOSTNAME> <HID>3</HID> <CID>100</CID> <STIME>1548832195</STIME> <ETIME>0</ETIME> <VM_MAD><![CDATA[kvm]]></VM_MAD> <TM_MAD><![CDATA[ssh]]></TM_MAD> <DS_ID>104</DS_ID> <PSTIME>0</PSTIME> <PETIME>0</PETIME> <RSTIME>1548832195</RSTIME> <RETIME>0</RETIME> <ESTIME>0</ESTIME> <EETIME>0</EETIME> <ACTION>0</ACTION> <UID>-1</UID> <GID>-1</GID> <REQUEST_ID>-1</REQUEST_ID> </HISTORY> </HISTORY_RECORDS> </VM>

rp- commented

I don't think the problem is the LINSTOR disk, but what kind of disk is the second one?

rp- commented

I think I fixed that issue with c397603.
Maybe you could try the current master and report back.

I used the master branch but migration is still not working. The second disk is a volatile swap disk.
You should be able to reproduce the failed migration easily by adding this disk to a template:

DISK = [
  FORMAT = "raw",
  SIZE = "1024",
  TYPE = "swap" ]

BTW, with my OS disks on shared NFS datastores the migration works fine (even with the swap as the second disk).
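
For reference, the volatile disk can be spotted in the VM XML above by the fact that it has no SOURCE and no IMAGE_ID. A hypothetical check (the helper name is made up; the attribute names come from the onevm output) might look like this:

```python
# Hypothetical helper, not part of the addon: decide whether a DISK element
# from "onevm show --xml" describes a volatile disk (no backing image).
import xml.etree.ElementTree as ET

def is_volatile(disk):
    """A volatile disk carries neither a SOURCE nor an IMAGE_ID element."""
    return disk.findtext("SOURCE") is None and disk.findtext("IMAGE_ID") is None

vm = ET.parse("vm255.xml").getroot()  # saved output of: onevm show --xml 255
for disk in vm.findall("./TEMPLATE/DISK"):
    print(disk.findtext("DISK_ID"), "volatile" if is_volatile(disk) else "image-backed")
```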

rp- commented

OK, thanks, I could reproduce the bug.
The thing is, we didn't support volatile disks yet; I have implemented it now and I'm currently testing.
But I could not get live migration with volatile disks working; I always get the KVM error:
Unsafe migration: Migration without shared storage is unsafe
Those volatile disks are somehow not detected as shared storage by KVM, and I have not found a way to set the right options in OpenNebula to make KVM happy about that.
Raw images seem to be fine for KVM.

I'm not quite sure why your NFS share just works, but there seems to be some detection within KVM that marks files on such mount points as safe.
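
For what it's worth, the original failure ("Cannot access storage file .../disk.1: No such file or directory") is libvirt not finding the volatile disk file on the destination host, so volatile-disk support in premigrate roughly has to ensure the VM directory exists there and that the non-LINSTOR disk files are copied over. A sketch under those assumptions (hosts and paths taken from the log; not necessarily how the addon implements it):

```python
# Illustrative sketch only, not the addon's actual implementation.
import subprocess

SRC_HOST = "one1"
DST_HOST = "one2"
VM_DIR = "/var/lib/one/datastores/104/255"

# make sure the system-datastore VM directory exists on the destination
subprocess.check_call(["ssh", DST_HOST, "mkdir", "-p", VM_DIR])

# copy the volatile (file-backed) disk from source to destination
subprocess.check_call([
    "ssh", SRC_HOST,
    "scp -p {d}/disk.1 {dst}:{d}/".format(d=VM_DIR, dst=DST_HOST),
])
```

Even with the file in place, libvirt may still refuse the live migration unless it considers the storage shared, which matches the "Unsafe migration" error above; virsh migrate does have an --unsafe flag to override that check.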

rp- commented

I have now pushed support for volatile disks and will probably create a new version tomorrow.

I was finally able to create a LINSTOR system datastore and can confirm that live migration of VMs with volatile disks works (in my case without the --unsafe option, Ubuntu 18.04). Well done!