CiscoDevNet/cloud-cml

Error: CML2 Provider Error

audacious-lab opened this issue · 8 comments

terraform apply -auto-approve

....
module.ready.data.cml2_system.state: Still reading... [9m30s elapsed]
module.ready.data.cml2_system.state: Still reading... [9m40s elapsed]
module.ready.data.cml2_system.state: Still reading... [9m50s elapsed]
module.ready.data.cml2_system.state: Still reading... [10m0s elapsed]

Planning failed. Terraform encountered an error while generating this plan.

╷
│ Error: CML2 Provider Error
│
│   with module.ready.data.cml2_system.state,
│   on module-cml2-readyness/main.tf line 7, in data "cml2_system" "state":
│    7: data "cml2_system" "state" {
│
│ ran into timeout (max 10m)
╵

From the debug log:

> 2023-07-24T17:08:02.738-0400 [DEBUG] provider.terraform-provider-cml2_v0.6.2: Called provider defined DataSource Read: tf_provider_addr=registry.terraform.io/ciscodevnet/cml2 tf_rpc=ReadDataSource @module=sdk.framework tf_data_source_type=cml2_system tf_req_id=18214339-4656-48da-ccbb-04013ef49854 @caller=github.com/hashicorp/terraform-plugin-framework@v1.2.0/internal/fwserver/server_readdatasource.go:76 timestamp=2023-07-24T17:08:02.738-0400
> 2023-07-24T17:08:02.738-0400 [TRACE] provider.terraform-provider-cml2_v0.6.2: Received downstream response: diagnostic_error_count=1 diagnostic_warning_count=0 tf_data_source_type=cml2_system tf_req_duration_ms=602886 tf_req_id=18214339-4656-48da-ccbb-04013ef49854 @caller=github.com/hashicorp/terraform-plugin-go@v0.15.0/tfprotov6/internal/tf6serverlogging/downstream_request.go:37 @module=sdk.proto tf_proto_version=6.3 tf_provider_addr=registry.terraform.io/ciscodevnet/cml2 tf_rpc=ReadDataSource timestamp=2023-07-24T17:08:02.738-0400
> 2023-07-24T17:08:02.739-0400 [ERROR] provider.terraform-provider-cml2_v0.6.2: Response contains error diagnostic: tf_proto_version=6.3 tf_req_id=18214339-4656-48da-ccbb-04013ef49854 tf_rpc=ReadDataSource diagnostic_detail="ran into timeout (max 10m)" tf_data_source_type=cml2_system @module=sdk.proto diagnostic_severity=ERROR diagnostic_summary="CML2 Provider Error" tf_provider_addr=registry.terraform.io/ciscodevnet/cml2 @caller=github.com/hashicorp/terraform-plugin-go@v0.15.0/tfprotov6/internal/diag/diagnostics.go:55 timestamp=2023-07-24T17:08:02.738-0400
> 2023-07-24T17:08:02.739-0400 [TRACE] provider.terraform-provider-cml2_v0.6.2: Served request: tf_req_id=18214339-4656-48da-ccbb-04013ef49854 @module=sdk.proto tf_data_source_type=cml2_system tf_proto_version=6.3 tf_provider_addr=registry.terraform.io/ciscodevnet/cml2 tf_rpc=ReadDataSource @caller=github.com/hashicorp/terraform-plugin-go@v0.15.0/tfprotov6/tf6server/server.go:668 timestamp=2023-07-24T17:08:02.738-0400
> 2023-07-24T17:08:02.739-0400 [ERROR] vertex "module.ready.data.cml2_system.state" error: CML2 Provider Error
> 2023-07-24T17:08:02.739-0400 [TRACE] vertex "module.ready.data.cml2_system.state": visit complete, with errors
> 2023-07-24T17:08:02.740-0400 [TRACE] dag/walk: upstream of "root" errored, so skipping
> 2023-07-24T17:08:02.740-0400 [TRACE] vertex "module.ready.data.cml2_system.state (expand)": dynamic subgraph encountered errors: CML2 Provider Error
> 2023-07-24T17:08:02.740-0400 [ERROR] vertex "module.ready.data.cml2_system.state (expand)" error: CML2 Provider Error
> 2023-07-24T17:08:02.740-0400 [TRACE] vertex "module.ready.data.cml2_system.state (expand)": visit complete, with errors
> 2023-07-24T17:08:02.740-0400 [TRACE] dag/walk: upstream of "module.ready.output.state (expand)" errored, so skipping
> 2023-07-24T17:08:02.740-0400 [TRACE] dag/walk: upstream of "module.ready (close)" errored, so skipping
> 2023-07-24T17:08:02.740-0400 [TRACE] dag/walk: upstream of "provider[\"registry.terraform.io/ciscodevnet/cml2\"] (close)" errored, so skipping
> 2023-07-24T17:08:02.740-0400 [TRACE] dag/walk: upstream of "output.cml2info (expand)" errored, so skipping
> 2023-07-24T17:08:02.740-0400 [TRACE] dag/walk: upstream of "root" errored, so skipping
> 2023-07-24T17:08:02.740-0400 [TRACE] LoadSchemas: retrieving schema for provider type "registry.terraform.io/ciscodevnet/cml2"
> 2023-07-24T17:08:02.740-0400 [TRACE] LoadSchemas: retrieving schema for provider type "registry.terraform.io/hashicorp/aws"
> 2023-07-24T17:08:02.740-0400 [TRACE] LoadSchemas: retrieving schema for provider type "registry.terraform.io/hashicorp/random"
> 2023-07-24T17:08:02.745-0400 [TRACE] LoadSchemas: retrieving schema for provider type "registry.terraform.io/ciscodevnet/cml2"
> 2023-07-24T17:08:02.745-0400 [TRACE] LoadSchemas: retrieving schema for provider type "registry.terraform.io/hashicorp/aws"
> 2023-07-24T17:08:02.745-0400 [TRACE] LoadSchemas: retrieving schema for provider type "registry.terraform.io/hashicorp/random"
> 2023-07-24T17:08:02.745-0400 [TRACE] LoadSchemas: retrieving schema for provider type "registry.terraform.io/ciscodevnet/cml2"
> 2023-07-24T17:08:02.745-0400 [TRACE] LoadSchemas: retrieving schema for provider type "registry.terraform.io/hashicorp/aws"
> 2023-07-24T17:08:02.745-0400 [TRACE] LoadSchemas: retrieving schema for provider type "registry.terraform.io/hashicorp/random"
> 2023-07-24T17:08:02.813-0400 [TRACE] statemgr.Filesystem: removing lock metadata file .terraform.tfstate.lock.info
> 2023-07-24T17:08:02.813-0400 [TRACE] statemgr.Filesystem: unlocking terraform.tfstate using fcntl flock
> 2023-07-24T17:08:02.814-0400 [DEBUG] provider.stdio: received EOF, stopping recv loop: err="rpc error: code = Unavailable desc = error reading from server: EOF"
> 2023-07-24T17:08:02.815-0400 [DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/ciscodevnet/cml2/0.6.2/darwin_amd64/terraform-provider-cml2_v0.6.2 pid=74995
> 2023-07-24T17:08:02.815-0400 [DEBUG] provider: plugin exited

This typically happens when the .deb package is either not available or the package name in the configuration file doesn't match the actual filename. See the deb attribute below -- it must match the name of the Debian package that you put into the S3 bucket.

To further troubleshoot, you can go to the device console of the cloud server and see where it hangs. You can also check /var/log/cloud-init.log. I am not 100% certain about the exact file name -- but it's in /var/log.

app:
  user: admin
  pass: your-secret-password
  # need to escape special chars:
  # pass: '\"!@$%'
  deb: cml2_2.5.1-10_amd64.deb    <=== this filename must match the filename in the S3 bucket
  # list must have at least ONE element, this is what the dummy is for in case
  # 00- and 01- are commented out!
  customize:
    # - 00-patch_vmx.sh
    # - 01-patty.sh
    - 99-dummy.sh

Hi @rschmied,

See outputs below

config.yml

aws:
    bucket: my-s3-aws-bucket
app:
    deb: cml2_2.6.0-5_amd64.deb

aws s3 ls --recursive s3://my-s3-aws-bucket

2023-07-24 15:01:54   84457012 cml2_2.6.0-5_amd64.deb
2023-07-24 14:14:28        258 refplat/virl-base-images/iosv-159-3-m6/iosv-159-3-m6.yaml
2023-07-24 14:14:28   57309696 refplat/virl-base-images/iosv-159-3-m6/vios-adventerprisek9-m.spa.159-3.m6.qcow2
2023-07-24 14:14:41        267 refplat/virl-base-images/iosvl2-2020/iosvl2-2020.yaml
2023-07-24 14:14:41   90409984 refplat/virl-base-images/iosvl2-2020/vios_l2-adventerprisek9-m.ssa.high_iron_20200929.qcow2
2023-07-24 14:24:19        242 refplat/virl-base-images/server-tcl-13-1/server-tcl-13-1.yaml
2023-07-24 14:24:19   21495808 refplat/virl-base-images/server-tcl-13-1/tcl-13-1.qcow2
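
To rule out a filename mismatch, the deb value from config.yml can be checked against the bucket listing. Below is a minimal sketch in Python, assuming the text comes from `aws s3 ls --recursive` in the format shown above and that the package sits at the top level of the bucket (as it does in this listing). The function names are made up for illustration and are not part of cloud-cml.

```python
def s3_keys_from_listing(listing: str) -> set[str]:
    """Extract object keys from `aws s3 ls --recursive` text output."""
    keys = set()
    for line in listing.splitlines():
        # Each object line is: date, time, size, key (the key may contain spaces).
        parts = line.strip().split(None, 3)
        if len(parts) == 4 and parts[2].isdigit():
            keys.add(parts[3])
    return keys


def deb_in_bucket(deb_name: str, listing: str) -> bool:
    """True if deb_name exactly matches one of the keys in the listing."""
    return deb_name in s3_keys_from_listing(listing)


# Two lines copied from the listing in this thread.
LISTING = """\
2023-07-24 15:01:54   84457012 cml2_2.6.0-5_amd64.deb
2023-07-24 14:14:28        258 refplat/virl-base-images/iosv-159-3-m6/iosv-159-3-m6.yaml
"""

print(deb_in_bucket("cml2_2.6.0-5_amd64.deb", LISTING))   # True
print(deb_in_bucket("cml2_2.5.1-10_amd64.deb", LISTING))  # False
```

In this case the names match, so the filename is not the problem and the missing package on the instance has to have another cause.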

/var/log/cloud-init-output.log

Setting up awscli (1.18.69-1ubuntu0.20.04.1) ...
fatal error: Unable to locate credentials
...
fatal error: Unable to locate credentials
Reading package lists...
E: Unsupported file /provision/cml2_2.6.0-5_amd64.deb given on commandline
Failed to stop virl2.target: Unit virl2.target not loaded.
CML is not active, license can not be de-registered!
Failed to restart virl2.target: Unit virl2.target not found.
Waiting for controller to be ready...
...
Waiting for controller to be ready...
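
The log above contains three separate symptoms that all trace back to the package never reaching the instance. A small Python sketch that scans cloud-init-output.log for them; the patterns are taken only from the excerpt in this thread, and the hint texts are my own interpretation, not messages produced by cloud-cml or the provider.

```python
import re

# Failure signatures seen in this thread's log excerpt (not an exhaustive list).
SIGNATURES = [
    (re.compile(r"Unable to locate credentials"),
     "no usable AWS credentials on the instance (check IAM role / instance profile)"),
    (re.compile(r"Unsupported file (\S+\.deb) given on commandline"),
     "the .deb is missing on the instance or its name does not match"),
    (re.compile(r"Unit virl2\.target not (loaded|found)"),
     "CML was not installed, so its systemd target does not exist"),
]


def diagnose(log_text: str) -> list[str]:
    """Return one hint per signature that occurs in the log, in list order."""
    hints = []
    for pattern, hint in SIGNATURES:
        if pattern.search(log_text):
            hints.append(hint)
    return hints


LOG = """\
fatal error: Unable to locate credentials
E: Unsupported file /provision/cml2_2.6.0-5_amd64.deb given on commandline
Failed to stop virl2.target: Unit virl2.target not loaded.
Waiting for controller to be ready...
"""

for hint in diagnose(LOG):
    print("-", hint)
```

The first hint is the one to chase: without credentials the S3 copy fails, and the other two follow from that.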

ubuntu@cml-controller:~$ ls /provision/ -la

total 20
drwxr-xr-x  2 root     root     4096 Jul 25 12:59 .
drwxr-xr-x 21 root     root     4096 Jul 25 12:59 ..
-rw-r--r--  1 root     root      136 Jul 25 12:59 99-dummy.sh
-rwx------  1 sysadmin sysadmin 1225 Jul 25 12:59 del.sh
-rw-r--r--  1 root     root      131 Jul 25 12:59 refplat

ubuntu@cml-controller:~$ ls /provision/refplat -la
-rw-r--r-- 1 root root 131 Jul 25 12:59 /provision/refplat

Is it possible that your AWS client doesn't work and can't copy files from the S3 bucket to the CML instance? The "Unable to locate credentials" errors and the missing .deb package make me think that your S3 access policy, or the IAM role that references it (which gets passed to the EC2 instance), might be wrong or missing.

Hi @rschmied,

Could you please clarify how the AWS credentials (access_key and secret_key) are copied to and configured on the created Ubuntu instance?

See the screenshots of the user's permission configuration:

User cml_terraform

[two screenshots]

Role s3-access-for-ec2

[screenshot]

@audacious-lab looking into it -- you might want to redact your account number from the screenshots. At first glance, this looks OK to me. I'll dig into it.

What TZ are you in? I can't see anything wrong with your policies/roles in the screenshots provided. I compared them to mine and they look pretty much the same. Maybe your S3 bucket layout is different? Anyway, maybe we can get on a WebEx and compare live? If you agree, send me an email at my GH ID at Cisco.com.

Hi @rschmied,

I really appreciate you taking the time for a WebEx session with me. As we discussed, the AWS client on the created instance is not configured with the correct role.

ubuntu@cml-controller:~$ aws configure list
      Name                    Value             Type    Location
      ----                    -----             ----    --------
   profile                <not set>             None    None
access_key                <not set>             None    None
secret_key                <not set>             None    None
    region                <not set>             None    None
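
For anyone who wants to script this check, here is a rough Python sketch that parses `aws configure list` output like the above and reports which entries are unset. It only looks at the text format shown here; the function name is hypothetical. On an instance with a working instance profile, access_key and secret_key would normally show a value instead of `<not set>`.

```python
def unset_fields(configure_list_output: str) -> list[str]:
    """Return the names of all rows whose Value column reads '<not set>'."""
    missing = []
    for line in configure_list_output.splitlines():
        if "<not set>" in line:
            # The first whitespace-separated token on the row is the name.
            missing.append(line.split()[0])
    return missing


# Output copied from the instance in this thread.
OUTPUT = """\
      Name                    Value             Type    Location
      ----                    -----             ----    --------
   profile                <not set>             None    None
access_key                <not set>             None    None
secret_key                <not set>             None    None
    region                <not set>             None    None
"""

missing = unset_fields(OUTPUT)
print(missing)
print("credentials available:", "access_key" not in missing)
```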

Hi @rschmied,

I really appreciate the detailed video you shared. I was able to follow your steps and identify the issue.

When I created the role s3-access-for-ec2, I selected the S3 service instead of EC2. As a result, the role's trust relationship named S3 as the principal service rather than EC2.

It should be "Service": "ec2.amazonaws.com", but this is what I had:


"Statement": [
    {
        "Effect": "Allow",
        "Principal": {
            "Service": "s3.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
    }
]
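
In case it helps others who hit the same timeout: a small Python sketch that checks whether a role's trust policy lets the EC2 service assume the role. It assumes the policy JSON has already been saved or copied from the IAM console. The helper names are made up, the "Version" line is added only to make the fragment above a complete document, and the check ignores conditions and wildcard principals.

```python
import json


def _as_list(value):
    """IAM allows a single value or a list in most places; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def trusts_service(policy: dict, service: str = "ec2.amazonaws.com") -> bool:
    """True if some Allow statement lets `service` call sts:AssumeRole."""
    for stmt in _as_list(policy.get("Statement")):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # e.g. "Principal": "*", which this sketch does not handle
        services = _as_list(principal.get("Service"))
        actions = _as_list(stmt.get("Action"))
        if service in services and "sts:AssumeRole" in actions:
            return True
    return False


# The trust policy from this thread, wrapped into a complete document.
BROKEN = """{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "s3.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}"""
FIXED = BROKEN.replace("s3.amazonaws.com", "ec2.amazonaws.com")

print(trusts_service(json.loads(BROKEN)))  # False
print(trusts_service(json.loads(FIXED)))   # True
```

With the S3 principal, the EC2 instance cannot assume the role, so it gets no credentials, which matches the "Unable to locate credentials" errors earlier in the thread.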