cunningham-lab/neurocaas

Error: Jobs fail to launch, with no warning (label-job-create-web)

Closed this issue · 1 comments

When running jobs with label-job-create-web, it appears that AWS has changed how it handles the BlockDurationMinutes parameter, which governs dedicated-duration spot instances. On the developer side, we see the following message:

  • 'Error': {'Code': 'InvalidParameterValue', 'Message': 'BlockDurationMinutes is not a valid parameter.'}...}
  • This issue could affect just one of our analyses, or multiple.
  • Hypothesis: This should only affect jobs where both of the following hold, in which case the command ec2_resource.create_instances fails with the above error code:
    • a duration parameter is not provided (thus defaulting to 20 mins)
    • spot instance acquisition is requested
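
To help test the hypothesis, a small check like the one below could flag this specific failure in logs or exception handlers. It is only a sketch: the function name is hypothetical, and it assumes the error arrives as a dict shaped like the message above (with boto3 this dict is typically available as the `response` attribute of the raised `ClientError`).

```python
def is_block_duration_rejection(response: dict) -> bool:
    """Return True if an AWS error response matches the failure reported above.

    `response` is assumed to look like:
    {'Error': {'Code': 'InvalidParameterValue',
               'Message': 'BlockDurationMinutes is not a valid parameter.'}}
    """
    error = response.get("Error", {})
    return (
        error.get("Code") == "InvalidParameterValue"
        and "BlockDurationMinutes" in error.get("Message", "")
    )
```

If this returns True only for spot jobs submitted without a duration parameter, that would support the hypothesis.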

Todo:

  • test this hypothesis
  • update the default spot job request length to be 0.
  • update all analyses to be safe.
  • consider replacements for "save" instances.

If the hypothesis is true, edit lines 182, 305, and 444 in protocols/utilsparam/ec2.py to reduce the default spot job request length, then relaunch.
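
One possible shape for the safer default is sketched below. It is not the code in ec2.py: the helper name is hypothetical, and it assumes the spot request is passed to ec2_resource.create_instances through the `InstanceMarketOptions` argument and that a duration of 0 (or None) should mean "send no BlockDurationMinutes at all".

```python
from typing import Optional


def build_spot_market_options(duration_minutes: Optional[int] = 0) -> dict:
    """Build an InstanceMarketOptions dict for a one-time spot request.

    Hypothetical helper: BlockDurationMinutes is included only when a
    positive duration is explicitly requested, so the default request
    never carries the parameter that AWS now rejects.
    """
    spot_options = {"SpotInstanceType": "one-time"}
    if duration_minutes:  # 0 and None both mean "no block duration"
        spot_options["BlockDurationMinutes"] = duration_minutes
    return {"MarketType": "spot", "SpotOptions": spot_options}
```

The result would be passed as `InstanceMarketOptions=build_spot_market_options()` alongside the other launch arguments. Note that an explicitly requested duration would still send BlockDurationMinutes and could still fail, which is why all analyses need checking.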