Spot instances block_duration_minutes is not supported in AWS anymore?
fzipi360 opened this issue · 4 comments
🗣️ Foreword
Thank you for taking the time to fill out this bug report fully. Without it, we may not be able to fix the bug, and the issue may be closed without resolution.
👻 Brief Description
Per this AWS documentation, block_duration_minutes is no longer available (I don't know whether this applies only to new accounts).
One course of action that doesn't alter too much is to do some date math and apply the same time delta from block_duration_minutes to valid_until. For example, compute valid_until = Time.now.advance(minutes: block_duration_minutes) and then pass that to the corresponding object.
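The date math suggested above can be sketched in plain Ruby, since Time.now.advance comes from ActiveSupport and may not be loaded in the driver. This is only a sketch: valid_until_from and spot_options_for are hypothetical helpers, not kitchen-ec2 code, and the :max_price / :valid_until keys are assumed names for the spot options hash. Whether valid_until actually bounds the instance's lifetime the way a block duration did is not confirmed here.

```ruby
# Hypothetical helper (not part of kitchen-ec2): turn a duration in minutes
# into an absolute UTC timestamp, using plain Ruby instead of ActiveSupport.
def valid_until_from(block_duration_minutes, now: Time.now)
  # Time#+ takes seconds; getutc returns a new Time and leaves `now` untouched.
  now.getutc + (Integer(block_duration_minutes) * 60)
end

# Hypothetical: build spot options from a driver-config-like hash, sending
# valid_until in place of block_duration_minutes. Key names are assumptions.
def spot_options_for(config, now: Time.now)
  options = {}
  options[:max_price] = config[:spot_price].to_s if config[:spot_price]
  if config[:block_duration_minutes]
    options[:valid_until] = valid_until_from(config[:block_duration_minutes], now: now)
  end
  options
end

config = { spot_price: 0.035, block_duration_minutes: 60 }
puts spot_options_for(config).inspect
```

Keeping the conversion in one small function means the existing block_duration_minutes setting could stay in users' configs while the request itself changes.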
Version
❯ ruby --version
ruby 2.7.6p219 (2022-04-12 revision c9c2245c0a) [x86_64-darwin22]
❯ bundle install
Using concurrent-ruby 1.1.9
Using i18n 1.8.10
Using minitest 5.14.4
Using tzinfo 2.0.4
Using zeitwerk 2.4.2
Using activesupport 6.1.4.1
Using public_suffix 3.1.1
Using addressable 2.5.2
Using ast 2.4.2
Using aws-eventstream 1.2.0
Using aws-partitions 1.701.0
Using aws-sigv4 1.5.2
Using jmespath 1.6.2
Using aws-sdk-core 3.170.0
Using aws-sdk-alexaforbusiness 1.50.0
Using aws-sdk-amplify 1.32.0
Using aws-sdk-apigateway 1.67.0
Using aws-sdk-apigatewayv2 1.36.0
Using aws-sdk-applicationautoscaling 1.51.0
Using aws-sdk-athena 1.41.0
Using aws-sdk-autoscaling 1.63.0
Using aws-sdk-batch 1.47.0
Using aws-sdk-budgets 1.41.0
Using aws-sdk-cloudformation 1.58.0
Using aws-sdk-cloudfront 1.56.0
Using aws-sdk-cloudhsm 1.33.0
Using aws-sdk-cloudhsmv2 1.36.0
Using aws-sdk-cloudtrail 1.38.0
Using aws-sdk-cloudwatch 1.55.0
Using aws-sdk-cloudwatchevents 1.46.0
Using aws-sdk-cloudwatchlogs 1.45.0
Using aws-sdk-codecommit 1.45.0
Using aws-sdk-codedeploy 1.43.0
Using aws-sdk-codepipeline 1.47.0
Using aws-sdk-cognitoidentity 1.31.0
Using aws-sdk-cognitoidentityprovider 1.53.0
Using aws-sdk-configservice 1.66.0
Using aws-sdk-costandusagereportservice 1.34.0
Using aws-sdk-databasemigrationservice 1.53.0
Using aws-sdk-dynamodb 1.63.0
Using aws-sdk-ec2 1.361.0
Using aws-sdk-ecr 1.47.0
Using aws-sdk-ecrpublic 1.6.0
Using aws-sdk-ecs 1.85.0
Using aws-sdk-efs 1.45.0
Using aws-sdk-eks 1.63.0
Using aws-sdk-elasticache 1.62.0
Using aws-sdk-elasticbeanstalk 1.45.0
Using aws-sdk-elasticloadbalancing 1.34.0
Using aws-sdk-elasticloadbalancingv2 1.68.0
Using aws-sdk-elasticsearchservice 1.56.0
Using aws-sdk-eventbridge 1.24.0
Using aws-sdk-firehose 1.42.0
Using aws-sdk-glue 1.88.0
Using aws-sdk-guardduty 1.48.0
Using aws-sdk-iam 1.61.0
Using aws-sdk-kafka 1.41.0
Using aws-sdk-kinesis 1.35.0
Using aws-sdk-kms 1.49.0
Using aws-sdk-lambda 1.69.0
Using aws-sdk-mq 1.40.0
Using aws-sdk-networkfirewall 1.8.0
Using aws-sdk-networkmanager 1.14.0
Using aws-sdk-organizations 1.59.0
Using aws-sdk-ram 1.26.0
Using aws-sdk-rds 1.127.0
Using aws-sdk-redshift 1.69.0
Using aws-sdk-route53 1.55.0
Using aws-sdk-route53domains 1.33.0
Using aws-sdk-route53resolver 1.30.0
Using aws-sdk-s3 1.103.0
Using aws-sdk-secretsmanager 1.46.0
Using aws-sdk-securityhub 1.53.0
Using aws-sdk-servicecatalog 1.60.0
Using aws-sdk-ses 1.41.0
Using aws-sdk-shield 1.41.0
Using aws-sdk-signer 1.32.0
Using aws-sigv2 1.1.0
Using aws-sdk-simpledb 1.29.0
Using aws-sdk-sms 1.32.0
Using aws-sdk-sns 1.45.0
Using aws-sdk-sqs 1.44.0
Using aws-sdk-ssm 1.119.0
Using aws-sdk-states 1.39.0
Using aws-sdk-transfer 1.34.0
Using multipart-post 2.1.1
Using faraday 0.17.4
Using unf_ext 0.0.8
Using unf 0.1.4
Using domain_name 0.5.20190701
Using http-cookie 1.0.4
Using faraday-cookie_jar 0.0.7
Using timeliness 0.3.10
Using ms_rest 0.7.6
Using ms_rest_azure 0.12.0
Using azure_graph_rbac 0.17.2
Using azure_mgmt_key_vault 0.17.7
Using azure_mgmt_resources 0.18.2
Using azure_mgmt_security 0.19.0
Using azure_mgmt_storage 0.23.0
Using bcrypt_pbkdf 1.0.0
Using bundler 2.1.4
Using fuzzyurl 0.9.0
Using tomlrb 1.3.0
Using mixlib-config 2.2.18
Using mixlib-shellout 2.4.4
Using chef-config 13.7.16
Using libyajl2 2.1.0
Using ffi-yajl 2.4.0
Using hashie 3.6.0
Using mixlib-log 1.7.1
Using rack 2.2.3
Using uuidtools 2.1.5
Using chef-zero 13.1.0
Using diff-lcs 1.4.4
Using erubis 2.7.0
Using highline 1.7.10
Using iniparse 1.5.0
Using iso8601 0.9.1
Using mixlib-archive 0.4.20
Using mixlib-authentication 1.4.2
Using mixlib-cli 1.7.0
Using net-ssh 4.2.0
Using net-sftp 2.1.2
Using net-ssh-gateway 2.0.0
Using net-ssh-multi 1.2.1
Using ffi 1.15.5
Using ipaddress 0.8.3
Using plist 3.6.0
Using systemu 2.6.5
Using wmi-lite 1.0.5
Using ohai 13.12.6
Using proxifier 1.0.3
Using rspec-support 3.10.2
Using rspec-core 3.10.1
Using rspec-expectations 3.10.1
Using rspec-mocks 3.10.2
Using builder 3.2.4
Using rspec_junit_formatter 0.2.3
Using multi_json 1.15.0
Using rspec 3.10.0
Using rspec-its 1.3.0
Using net-scp 2.0.0
Using net-telnet 0.1.1
Using sfl 2.3
Using specinfra 2.82.25
Using serverspec 2.41.8
Using syslog-logger 1.6.8
Using chef 13.7.16
Using cleanroom 1.0.0
Using minitar 0.9
Using sawyer 0.8.2
Using octokit 4.21.0
Using retryable 3.0.5
Using molinillo 0.8.0
Using semverse 3.0.0
Using solve 4.0.4
Using thor 0.20.3
Using berkshelf 7.0.8
Using chef-telemetry 1.1.1
Using coderay 1.1.3
Using parallel 1.21.0
Using parser 3.0.2.0
Using rainbow 3.0.0
Using regexp_parser 2.1.1
Using rexml 3.2.5
Using rubocop-ast 1.12.0
Using ruby-progressbar 1.11.0
Using unicode-display_width 2.4.2
Using rubocop 1.22.0
Using cookstyle 7.25.6
Using declarative 0.0.20
Using excon 0.87.0
Using docker-api 2.2.0
Using erubi 1.12.0
Using faraday_middleware 0.14.0
Using jwt 2.3.0
Using memoist 0.16.2
Using os 1.1.1
Using signet 0.15.0
Using googleauth 0.14.0
Using httpclient 2.8.3
Using mini_mime 1.1.2
Using trailblazer-option 0.1.1
Using uber 0.1.0
Using representable 3.1.1
Using retriable 3.1.2
Using google-api-client 0.52.0
Using gssapi 1.3.1
Using gyoku 1.4.0
Using htmlentities 4.3.4
Using inifile 3.0.0
Using json-schema 2.8.1
Using tty-color 0.6.0
Using pastel 0.8.0
Using strings-ansi 0.2.0
Using unicode_utils 1.4.0
Using strings 0.2.1
Using tty-cursor 0.7.1
Using tty-box 0.7.0
Using tty-screen 0.8.1
Using wisper 2.0.1
Using tty-reader 0.9.0
Using tty-prompt 0.23.1
Using license-acceptance 1.0.19
Using method_source 0.9.2
Using parslet 1.8.2
Using pry 0.12.2
Using rubyzip 1.3.0
Using sslshake 1.3.1
Using sync 0.5.0
Using tins 1.29.1
Using term-ansicolor 1.7.1
Using json 2.5.1
Using train-core 3.8.1
Using little-plugger 1.1.4
Using logging 2.3.1
Using nori 2.6.0
Using rubyntlm 0.6.3
Using winrm 2.3.6
Using winrm-fs 1.3.3
Using winrm-elevated 1.2.3
Using train-winrm 0.2.12
Using train 3.8.1
Using train-aws 0.2.20
Using train-habitat 0.2.22
Using tty-table 0.12.0
Using inspec 4.18.51
Using mixlib-versioning 1.2.12
Using mixlib-install 3.12.24
Using test-kitchen 1.25.0
Using kitchen-docker_cli 0.19.0
Using lockfile 2.1.3
Using kitchen-dokken 2.14.0
Using kitchen-ec2 3.15.0
Using kitchen-inspec 1.2.0
Using kitchen-syncgz 1.0.0
Using kitchen-vagrant 1.6.0
Environment
macOS Ventura.
Scenario
env KITCHEN_LOCAL_YML=../kitchen.yml kitchen test default-ec2-ubuntu-1404
Steps to Reproduce
Using this example config:
---
ec2:
  region: us-east-1
  associate_public_ip: true
  # kitchen values
  vpc_id: <my_vpc>
  security_group_ids: ["sg-xxxxxxxx"]
  # kitchen-public-us-east-1a
  subnet_id: "subnet-yyyyyy"
  interface: dns
  # ec2 instance config
  instance_type: t2.micro
  spot_price: 0.035
  spot_wait: 60
  block_duration_minutes: 60
chef:
  require_chef_omnibus: false
  name: chef
  version: 12.5.1
  log_level: auto
The AWS account was created recently, just for this.
Expected Result
kitchen test to finish properly.
Actual Result
❯ env KITCHEN_LOCAL_YML=../kitchen.yml kitchen test default-ec2-ubuntu-1404
-----> Starting Kitchen (v1.25.0)
$$$$$$ Deprecated configuration detected:
require_chef_omnibus
chef_omnibus_url
Run 'kitchen doctor' for details.
-----> Cleaning up any prior instances of <default-ec2-ubuntu-1404>
-----> Destroying <default-ec2-ubuntu-1404>...
Finished destroying <default-ec2-ubuntu-1404> (0m0.00s).
-----> Testing <default-ec2-ubuntu-1404>
-----> Creating <default-ec2-ubuntu-1404>...
Detected platform: ubuntu version 14.04 on x86_64. Instance Type: t2.micro. Default username: ubuntu (default).
If you are not using an account that qualifies under the AWS
free-tier, you may be charged to run these suites. The charge
should be minimal, but neither Test Kitchen nor its maintainers
are responsible for your incurred costs.
Created automatic key pair kitchen-defaultec2ubuntu1404-username-C02FL11FML85-2023-01-27T18:05:37Z-n8ivs95x
Waited 0/60s for spot request to become fulfilled.
Removing automatic key pair kitchen-defaultec2ubuntu1404-username-C02FL11FML85-2023-01-27T18:05:37Z-n8ivs95x
>>>>>> ------Exception-------
>>>>>> Class: Kitchen::ActionFailed
>>>>>> Message: 1 actions failed.
>>>>>> Failed to complete #create action: [Could not create a spot instance:
BlockDurationMinutes is not a valid parameter. in the specified region us-east-1. Please check this AMI is available in this region.] on default-ec2-ubuntu-1404
>>>>>> ----------------------
>>>>>> Please see .kitchen/logs/kitchen.log for more details
>>>>>> Also try running `kitchen diagnose --all` for configuration
@fzipi360 you're correct in that AWS broke this feature.
I ended up writing a lambda function to clean up test-kitchen instances that are >48 hours old, but it's not really designed properly. IMO it's kitchen-ec2's responsibility to own the lifecycle of the systems it creates.
To that end, I was thinking of doing something like this (draw.io link):
This will significantly expand the scope of the APIs that kitchen-ec2 has to have permission over, but I'm of the opinion that these requirements are reasonable.
Any thoughts?
Hey @RulerOf! We did more or less the same: a reaper process that runs periodically and terminates instances based on kitchen tags.
But I think we also need to clean it up to accept that block_duration_minutes might not be working for the account and react accordingly when creating new hosts. I mean, cleaning up is "easy", but failing to create the test instance breaks the setup for everyone with new AWS accounts.
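As a rough illustration of the reaper idea discussed in these comments, the selection step might look like the sketch below. It is not the lambda or the reaper process mentioned above. Instance is a stand-in struct rather than an SDK type, the "created-by" => "test-kitchen" tag is an assumption about how the instances are tagged, and the 48-hour default only mirrors the figure quoted earlier in the thread.

```ruby
# Stand-in for whatever the EC2 API returns; only the fields needed here.
Instance = Struct.new(:id, :tags, :launch_time, keyword_init: true)

# Hypothetical selection step for a reaper: pick instances that carry the
# kitchen tag and were launched more than max_age_hours ago.
def reapable(instances, now: Time.now, max_age_hours: 48,
             tag_key: "created-by", tag_value: "test-kitchen")
  cutoff = now - (max_age_hours * 3600)
  instances.select do |instance|
    instance.tags[tag_key] == tag_value && instance.launch_time < cutoff
  end
end

now = Time.now
fleet = [
  Instance.new(id: "i-old", tags: { "created-by" => "test-kitchen" }, launch_time: now - (72 * 3600)),
  Instance.new(id: "i-new", tags: { "created-by" => "test-kitchen" }, launch_time: now - 3600),
  Instance.new(id: "i-other", tags: {}, launch_time: now - (72 * 3600)),
]
puts reapable(fleet, now: now).map(&:id).inspect
```

Terminating the selected instances is deliberately left out; that part needs real credentials and the permissions discussed above.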
> But I think we also need to clean it up to accept that block_duration_minutes might not be working for the account and react accordingly when creating new hosts.
With the underlying functionality being entirely different, I'm not sure I would want to implement this new approach as a failover for accounts that don't support block_duration_minutes. I would rather force the user to declare a new key in their driver config, e.g. max_instance_lifetime or terminate_after_minutes. For accounts that don't support block_duration_minutes, we can rescue the exception from this specific API call and re-raise with a message that points users to the new driver parameter(s).
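The rescue-and-re-raise idea could look roughly like the sketch below. It matches on the message text seen in the log above because the exact SDK error class is not known here. SpotCreateError is a stand-in for the driver's own error type, with_block_duration_hint is a hypothetical wrapper, and terminate_after_minutes is only one of the candidate parameter names floated in this thread.

```ruby
# Stand-in for the driver's error type (the log above shows Kitchen::ActionFailed).
class SpotCreateError < StandardError; end

BLOCK_DURATION_MESSAGE = "BlockDurationMinutes is not a valid parameter"
BLOCK_DURATION_HINT =
  "This account does not appear to support block_duration_minutes; " \
  "remove it and use the proposed terminate_after_minutes setting instead."

# Hypothetical wrapper around the spot request call: re-raise the specific
# failure with a pointer to the new parameter, let everything else through.
def with_block_duration_hint
  yield
rescue StandardError => e
  # A bare `raise` re-raises the exception currently being handled.
  raise unless e.message.include?(BLOCK_DURATION_MESSAGE)

  raise SpotCreateError, "#{e.message} #{BLOCK_DURATION_HINT}"
end

begin
  with_block_duration_hint { raise StandardError, "#{BLOCK_DURATION_MESSAGE}." }
rescue SpotCreateError => e
  puts e.message
end
```

Matching on a message string is brittle; if the SDK exposes a dedicated error class for this case, rescuing that class would be the better choice.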
Thinking about this some more, I'd propose two mutually exclusive parameters:
terminate_after_creation_minutes: Use EventBridge to terminate the instance this many minutes after the instance's creation.
terminate_after_idle_minutes: Use EventBridge to terminate the instance this many minutes after the last time test-kitchen was run. Every time you run test-kitchen, if the instance is still alive, the EventBridge timer is updated to extend the instance's lifetime.
I suggest both because the former is really easy to understand, but the latter is honestly how I would prefer it to work.
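To make the difference between the two proposed parameters concrete, here is a minimal sketch of how a deadline might be derived from them. The parameter names come from the proposal above; termination_deadline and its arguments are hypothetical, and scheduling the termination itself (via EventBridge or otherwise) is left out entirely.

```ruby
# Hypothetical: compute when an instance should be terminated under the two
# proposed settings. Setting both is treated as a configuration error.
def termination_deadline(config, created_at:, last_run_at:)
  after_creation = config[:terminate_after_creation_minutes]
  after_idle = config[:terminate_after_idle_minutes]

  if after_creation && after_idle
    raise ArgumentError,
          "set only one of terminate_after_creation_minutes or terminate_after_idle_minutes"
  end

  # Fixed lifetime: counted once, from instance creation.
  return created_at + (after_creation * 60) if after_creation
  # Idle lifetime: counted from the most recent test-kitchen run, so each run
  # pushes the deadline further out.
  return last_run_at + (after_idle * 60) if after_idle

  nil # neither setting present: no automatic termination
end

created = Time.utc(2023, 1, 27, 18, 0, 0)
last_run = Time.utc(2023, 1, 27, 20, 0, 0)
puts termination_deadline({ terminate_after_creation_minutes: 60 }, created_at: created, last_run_at: last_run)
puts termination_deadline({ terminate_after_idle_minutes: 60 }, created_at: created, last_run_at: last_run)
```

With the same 60-minute value, the first setting gives a deadline one hour after creation, while the second gives one hour after the latest run and moves whenever last_run_at does.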