dell/PyU4V

Getting error "The requested host resource already exists" for modify operation

Closed this issue · 8 comments

We are working on automating workflows for Hosts.

When we run the automation in the following sequence:
1. Try to create a host with incorrect initiators through an Ansible playbook. The playbook errors out but creates an empty host.
2. Try to modify the newly created empty host with valid initiators through an Ansible playbook. This operation fails with the message below:
"msg": "Create host host1_ansible_test_Host failed with error Bad or unexpected response from the storage volume backend API: Error POST symmetrix resource. The status code received is 409 and the message is {'message': 'The requested host resource already exists.'}."

Note:
The two steps above need to happen in quick succession to reproduce the issue. When there is a long enough pause (~20 secs) between steps 1 and 2, the operation succeeds.

Because the operation works as expected when there is a pause between the tasks, we suspect that the host is not yet fully created when we try to modify it. However, the error says "The requested host resource already exists".

Can you please help us understand the behavior in this scenario, in particular why we are getting this error in a modify scenario?

Details about the environment:
PyU4V: 9.2.1.4
U4P: 9.2.0.x

Hi Rajendra,
Just a guess at this stage, but if you are not setting _async=True, you may need to do so:
https://github.com/dell/PyU4V/blob/master/PyU4V/provisioning.py#L170
https://github.com/dell/PyU4V/blob/master/PyU4V/provisioning.py#L200

Based on what you are saying, the POST "create" error message is coming in after the modification (which by rights it should not do), but it might be worth checking the _async option to see if it changes the behavior. That said, with the default behavior (_async=False) the call should not return until the host is created successfully.

If that does not work, could you post what parameters you are sending to both the create_host and modify_host methods and we can try to reproduce.

Thank you
Helen

Hi @helenwalsh ,
Thanks for responding.

We are not passing any value for _async from our modules, so it defaults to _async=False.
As mentioned, the create operation with invalid initiators throws an error but still creates an empty host, which we verified in Unisphere.

We can test the behavior by passing _async=True, but we need to check whether it meets our design criteria. We will let you know once we have tested it this way.
Here are the inputs for the create and modify operations:

Create scenario:
host_name = 'abc'
initiators=[Invalid initiators],
host_flags={
'volume_set_addressing': {'enabled': False, 'override': True},
'environ_set': {'enabled': False, 'override': True},
'disable_q_reset_on_ua': {'enabled': False, 'override': True},
'openvms': {'enabled': False, 'override': True},
'avoid_reset_broadcast': {'enabled': False, 'override': True},
'scsi_3': {'enabled': False, 'override': True},
'spc2_protocol_version': {'enabled': True, 'override': True},
'scsi_support1': {'enabled': False, 'override': True},
'consistent_lun': True
}

Modify Scenario:

host_name = 'abc',
initiators = [Valid initiators],
host_flags:
{
volume_set_addressing: false,
disable_q_reset_on_ua: false,
environ_set: false,
openvms: false,
avoid_reset_broadcast: false,
scsi_3: false,
spc2_protocol_version: true,
scsi_support1: false,
consistent_lun: true
}

It would be great if you could help us understand the root cause of this behavior.
One more observation: it works fine on arrays that are a little slow, but it does not work on arrays that are fast.

Thanks
Rajendra

Hi Rajendra,
Thank you for all the details. It seems that if the initiators are invalid in the create scenario, an empty host should not be created and a modify should not be attempted. This is a cleanup issue and we will speak with the REST team.

However, I am a little concerned about why the error on the modify is a POST error when it should be a PUT. This suggests either a delayed response from the create host operation (which should not happen on a sync operation) or another error we need to raise with the REST team.

From above:
2. Try to modify the created empty host using ansible playbook with valid initiators through automation. This operation fails with below message:
"msg": "Create host host1_ansible_test_Host failed with error Bad or unexpected response from the storage volume backend API: Error POST symmetrix resource. The status code received is 409 and the message is {'message': 'The requested host resource already exists.'}."

A few questions:

  1. Do you check that the host does not already exist before attempting a create host operation?
  2. Do you check the status of the create before attempting the modify? How do you know the initiators are invalid, and why do you attempt the modify host to add valid initiators?
  3. Would it be possible to try _async on the create operation to see if you get the same behavior with both valid and invalid initiators?

Create scenario:
host_name = 'abc'
initiators=[Invalid initiators],
host_flags={
'volume_set_addressing': {'enabled': False, 'override': True},
'environ_set': {'enabled': False, 'override': True},
'disable_q_reset_on_ua': {'enabled': False, 'override': True},
'openvms': {'enabled': False, 'override': True},
'avoid_reset_broadcast': {'enabled': False, 'override': True},
'scsi_3': {'enabled': False, 'override': True},
'spc2_protocol_version': {'enabled': True, 'override': True},
'scsi_support1': {'enabled': False, 'override': True},
'consistent_lun': True
}
_async=True

In the interim I will try to reproduce the issue you are seeing to see if there is anything we can do in PyU4V92.
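
The existence check raised in question 1 above can be sketched as follows. This is a minimal illustration, not the Ansible module's actual logic: `ensure_host` is a hypothetical helper, and `provisioning` is assumed to be an object exposing `get_host_list`, `create_host` and `modify_host` as they are called elsewhere in this thread. The point is that a host left behind by a failed create is routed to a modify (PUT) rather than a second create (POST), which is the call that returns the 409.

```python
def ensure_host(provisioning, host_name, initiators, host_flags=None):
    """Create host_name if it is absent, otherwise add initiators to it.

    `provisioning` is assumed to expose get_host_list, create_host and
    modify_host as used in this thread; this helper is hypothetical.
    """
    if host_name in provisioning.get_host_list():
        # The host already exists (possibly empty after a failed create),
        # so send a modify instead of another create.
        provisioning.modify_host(host_name, add_init_list=initiators)
        return 'modified'
    # The host is not listed, so a create is the right call.
    provisioning.create_host(host_name, initiators, host_flags)
    return 'created'
```

If the host list is read before the array has finished registering the new host, this check can still take the create branch, which would be consistent with the timing dependence described in this issue.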

Hi Rajendra,
Apologies for the delay in getting back to you. I wrote the following to see if I could reproduce. I am using Unisphere for PowerMax V9.2.1.8.
I am going to ask the REST team whether an empty host should be created if the initiator(s) are invalid.
That said, I had no issue with modifying the empty host to add a valid initiator. I did not enable _async=True.

I would suggest upgrading to V9.2.1.8 to see if you can reproduce.
Other than that, I think doing a get on the host just after a create might slow things down. If it does exist, then you can modify it.
It doesn't look like there is any issue with modify_host in this version, and I did not get the POST error you mention above, which should not occur on a PUT (modify) operation.

def test_create_host_with_invalid_initiator(self):
    """Test create_host with initiator list."""

    invalid_initiator_list = ['invalid_initiator']
    host_name = None
    host_flags = {
        'volume_set_addressing': {'enabled': False, 'override': True},
        'environ_set': {'enabled': False, 'override': True},
        'disable_q_reset_on_ua': {'enabled': False, 'override': True},
        'openvms': {'enabled': False, 'override': True},
        'avoid_reset_broadcast': {'enabled': False, 'override': True},
        'scsi_3': {'enabled': False, 'override': True},
        'spc2_protocol_version': {'enabled': True, 'override': True},
        'scsi_support1': {'enabled': False, 'override': True},
        'consistent_lun': True
    }
    try:
        host_name = self.generate_name('host')
        host_details = self.provisioning.create_host(
            host_name, invalid_initiator_list, host_flags)
    except exception.VolumeBackendAPIException as ex:
        # This raises an exception; the exception message is:
        # Bad or unexpected response from the storage volume backend API:
        # Error POST symmetrix resource. The status code received is 500
        # and the message is {'message': 'A problem occurred creating the
        # host resource: Error for: 000297600448/PyU4V-host-dogipexavi:
        # One or more arguments are invalid'}.
        print(str(ex))

    host_list = self.provisioning.get_host_list()
    # An empty host has been created as you have highlighted
    self.assertIn(host_name, host_list)
    host_details = self.provisioning.get_host(host_name)
    # Proof that it is an empty host
    self.assertEqual(0, host_details[constants.NUM_OF_INITIATORS])

    # Get a valid initiator
    available_initiator = (
        self.provisioning.get_available_initiator_wwn_as_list())
    try:
        self.provisioning.modify_host(host_name,
                                      add_init_list=available_initiator)
    except exception.VolumeBackendAPIException as ex:
        print(str(ex))
    host_details = self.provisioning.get_host(host_name)
    # No exception here, the valid initiator has been successfully added
    self.assertEqual(1, host_details[constants.NUM_OF_INITIATORS])
    self.assertIn(host_name, host_list)
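
The suggestion above of doing a get on the host after a create, and modifying only once it exists, could look roughly like the sketch below. `wait_for_host` is a hypothetical helper, the timeout and interval values are arbitrary, and `provisioning` is assumed to expose `get_host_list` as in the test above. The `sleep` and `clock` parameters exist only so the loop can be exercised without real delays.

```python
import time


def wait_for_host(provisioning, host_name, timeout=30.0, interval=2.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_host_list until host_name appears or the timeout elapses.

    Returns True if the host became visible, False otherwise.
    Hypothetical helper; the default timings are illustrative only.
    """
    deadline = clock() + timeout
    while True:
        if host_name in provisioning.get_host_list():
            return True
        if clock() >= deadline:
            return False
        # Host not visible yet: wait before asking the array again.
        sleep(interval)
```

A caller would attempt the modify only when this returns True, replacing a fixed pause between tasks with a bounded wait on the actual condition.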

Hi @helenwalsh ,

Thanks for the response.
We will try replicating the same on Unisphere V9.2.1.8 as you suggested.

We would like to give a few more details about the reproduction. Please find our observations below:

  1. Running the create operation (with invalid initiators) and then, separately, the modify operation on the resulting empty host works fine. - No issue
  2. Running our automation on a slow array without any pause between the Ansible tasks works fine. - No issue
  3. Running the same automation on a fast array with a pause (~30 secs) between the Ansible tasks works fine. - No issue
  4. Running the same automation on a fast array without a pause between the Ansible tasks fails. - Issue

Info on the invalid initiators scenario:
This is a negative scenario in our test cases where we try creating a host with invalid initiators (a combination of FC and iSCSI initiators, which is not valid; we should pass either all FC or all iSCSI initiators).

Please let us know once the REST team gets back to you on whether an empty host should be created when we try creating a host with an invalid combination of initiators.

Thanks
Rajendra

Hi Rajendra,
I spoke with the REST team and there is no immediate plan to roll back hosts if the initiators are invalid, as apparently it is a big undertaking. I was not able to reproduce it on my system (without breaks), but that is probably because we have an environment similar to point 2 above:
2. It also works fine when we run our automation on an array which is slow without giving any pause between the Ansible tasks - No issue

I'm not sure if you tried with the _async=True option yet. Apologies if you already have, but I feel it might be worth a try.
The difference here is that a job id is returned straight away and is polled until it completes. Otherwise, the REST call does not return until it completes (at least that should be the case). Please let me know if you have tried this.
Kind regards,
Helen
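
The job polling described above for _async=True can be illustrated with a generic loop. This is only a sketch of the pattern and does not use PyU4V's own job handling: `wait_for_job`, the `get_status` callable and the status strings are all assumptions made for illustration.

```python
import time


def wait_for_job(get_status, job_id, interval=1.0, max_polls=60,
                 terminal=('SUCCEEDED', 'FAILED'), sleep=time.sleep):
    """Poll a job until it reaches a terminal state and return that state.

    get_status is a hypothetical callable that maps a job id to a status
    string; the terminal status names are assumptions, not a confirmed API.
    """
    for _ in range(max_polls):
        status = get_status(job_id)
        if status in terminal:
            return status
        # Job still in progress: wait before polling again.
        sleep(interval)
    raise TimeoutError(
        'job %s did not finish after %d polls' % (job_id, max_polls))
```

With a synchronous call the same waiting happens inside the REST request itself, so in both cases the caller should only move on to the modify once the create has reached a final state.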

@helenwalsh Thanks for your inputs. We tried with _async=True and it works when we test it independently. However, in our automation environment we run multiple tasks in a single playbook, including idempotency cases and dependent test cases (e.g. modifying a host group). In that setup, if we don't put a delay between the tasks, the automation runs still fail with the same message, even with the async call.

Thanks
Rajendra

Rajendra, I'm closing this out as the issue has been open since 2021 and the problem also appears to be outside of PyU4V.