contiv-experimental/cluster

While node is getting commissioned, we should not allow commissioning of the same again

rkharya opened this issue · 6 comments

The node commissioning command should not be accepted while the node is already being commissioned. Currently the user can issue the commissioning command without any error while commissioning is still in progress.

@rkharya thanks for reporting the issue. However, the behavior you are asking for already exists, i.e. commands that try to commission/decommission a node while a provisioning job is ongoing block until that job completes or errors. Also, there was a recent enhancement to the API behavior: the commands no longer block, and instead error right away stating that there is an active provisioning job ( #96 ).

Can you list the steps to reproduce?

Also, if it's easier for you, can you try with a recent release of clusterm (set contiv_cluster_version: "v0.0.0-04-23-2016.08-24-46.UTC")? That way you can check out the new API behavior. I will update the ansible defaults shortly.

Steps to repro-

  1. Commission a node
  2. While node commissioning is going on, repeat the same command again (same node commissioning)
  3. command does get accepted.

The ask was for the command to error out with an appropriate error message. Currently it goes through w/o any warning/error message, but we see a FATAL error message recorded in the clusterm sysctl logs, saying: cannot change state from 'Allocated' to 'Provisioning'

Sure, will try out the latest contiv_cluster to understand the API behavior changes.

Thanks.

  1. command does get accepted.

ok...one last confirmation, did the command block? OR did it return immediately with no error? I would expect it to block with old behavior and eventually fail with error.

clusterm sysctl logs get FATAL error message recorded, saying - cannot change state from 'Allocated' to 'Provisioning'

This makes me believe that the command indeed failed (and it should have reported the same on the screen). But the API behavior has changed now: it will fail immediately instead of blocking.

Let's keep this open, I think I need to add more tests in this area.

It did not block. It returned without any error. So it appeared to have been accepted, until we learned from the clusterm logs that it had failed with a FATAL error message.

Ideally, it should report the FATAL error message as the outcome of the command execution, so the user knows for sure.

@rkharya

Ideally, it should report the FATAL error message as the outcome of the command execution, so the user knows for sure.

Yes, it does, unless there is a bug. I have verified it manually myself, but I don't have tests for it right now. I will be adding more tests shortly as part of my upcoming changes, so let's keep this open till then.

@rkharya I have added a system-test for this behavior (i.e. the cli/api errors out immediately if there is an active job) and it is passing.

I will close this issue, feel free to reopen or submit another one if you still see the problem.