hashicorp/terraform-provider-aws

Creating aws_elasticsearch_domain can't be done due to absence of AWSServiceRoleForAmazonElasticsearchService role

sarunask opened this issue · 18 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

Terraform v0.11.7

  • provider.archive v1.0.3
  • provider.aws v1.27.0
  • provider.null v1.0.0
  • provider.random v1.3.1

Affected Resource(s)

  • aws_elasticsearch_domain

Terraform Configuration Files

resource "aws_elasticsearch_domain" "es" {
  domain_name           = "${substr(random_pet.random_pet_name.id,0,28)}"
  elasticsearch_version = "6.2"

  # Anyone in
  access_policies = "${data.aws_iam_policy_document.es_policy.json}"

  cluster_config {
    instance_type  = "${var.es_instance_size}"
    instance_count = "${var.es_instance_count}"
  }

  ebs_options {
    ebs_enabled = true
    volume_type = "gp2"
    volume_size = "${var.es_eb_disk_size}"
  }

  vpc_options {
    subnet_ids         = ["${element(aws_db_subnet_group.elasticsearch_sb.subnet_ids, 0)}"]
    security_group_ids = ["${aws_security_group.allow_all.id}"]
  }

  snapshot_options {
    automated_snapshot_start_hour = 23
  }

  tags {
    Name        = "${random_pet.random_pet_name.id}"
    component   = "${var.component}"
    description = "${var.es_description}"
  }
}

Debug Output

https://gist.github.com/sarunask/69b7e612d92ee992d7a70f506623f35f

Panic Output

No panic

Expected Behavior

ES Cluster created

Actual Behavior

Terraform gave this error:

  • aws_elasticsearch_domain.es: Error reading IAM Role AWSServiceRoleForAmazonElasticsearchService: NoSuchEntity: The user with name AWSServiceRoleForAmazonElasticsearchService cannot be found.
    status code: 404, request id: 2fe1c895-89d7-11e8-8212-c38ddc7e67d2

Steps to Reproduce

  1. terraform apply

Important Factoids

The current AWS account has no previous Elasticsearch clusters.

References

  • #0000

Not even creating a microcluster from the AWS Console seemed to help.

Looking at https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/slr-es.html, I get the impression that this error ONLY (?) occurs if/when a VPC is specified for ES.

Yes, indeed: when I create my micro-cluster and choose "VPC access" instead, I get the IAM Role AWSServiceRoleForAmazonElasticsearchService line.

The IAM role was then created, and while the cluster I created in the AWS console is building (it will take ten to fifteen minutes), I'm running Terraform to create my "real" cluster.

Looking at the IAM role, it has the AmazonElasticsearchServiceRolePolicy policy attached, which looks like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1480452973134",
            "Action": [
                "ec2:CreateNetworkInterface",
                "ec2:DeleteNetworkInterface",
                "ec2:DescribeNetworkInterfaces",
                "ec2:ModifyNetworkInterfaceAttribute",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeVpcs"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}

... with the trust entity es.amazonaws.com:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "es.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

... in the path /aws-service-role/es.amazonaws.com/.

You can just add this resource before creating your domain:

resource "aws_iam_service_linked_role" "es" {
  aws_service_name = "es.amazonaws.com"
}

This will create the role that ES needs.

It seems that the aws_iam_service_linked_role resource fails if the role has already been created outside the current Terraform state, which makes it hard to couple with the creation of an ES cluster.
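
One possible way around this, as a minimal sketch rather than a confirmed fix: gate the role behind a flag so that it is only created in accounts that do not already have it. The variable name create_es_service_linked_role is hypothetical, and the syntax follows Terraform v0.11 as used elsewhere in this issue.

```hcl
# Hypothetical toggle (not part of the provider): set to "false" in
# accounts where AWSServiceRoleForAmazonElasticsearchService already exists.
variable "create_es_service_linked_role" {
  default = "true"
}

resource "aws_iam_service_linked_role" "es" {
  # With count = 0 Terraform does not manage the role at all, so it
  # never attempts to create a role that exists outside its state.
  count            = "${var.create_es_service_linked_role ? 1 : 0}"
  aws_service_name = "es.amazonaws.com"
}
```

Bringing the existing role into the state with terraform import may be another option, assuming the provider version in use supports importing this resource.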

@SylH You saved my day! Thank you!

You can just add this resource before creating your domain:

resource "aws_iam_service_linked_role" "es" {
  aws_service_name = "es.amazonaws.com"
}

You may also need to add a dependency on this role from the ES domain, like this:

resource "aws_elasticsearch_domain" "es" {
  depends_on = ["aws_iam_service_linked_role.es"]
  ...
}

This prevents errors on destroy such as:

Error: Error applying plan:

1 error(s) occurred:

* aws_iam_service_linked_role.es (destroy): 1 error(s) occurred:

* aws_iam_service_linked_role.es: Error waiting for role (arn:aws:iam::<...>:role/aws-service-role/es.amazonaws.com/AWSServiceRoleForAmazonElasticsearchService) to be deleted: unexpected state 'FAILED', wanted target 'SUCCEEDED'. last error: %!s(<nil>)

The error on destroy is raised because the role cannot be destroyed while an ES domain still exists in the VPC. The ES domain must be destroyed before the role.
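
To make the ordering concrete, here is a self-contained sketch of the role and a VPC domain with an explicit depends_on, so that Terraform creates the role first and destroys the domain first. The domain name, instance type, volume size, and both variables are placeholder assumptions, not values from the original report.

```hcl
# Hypothetical inputs: an existing subnet and security group for the domain.
variable "es_subnet_id" {}
variable "es_security_group_id" {}

# Account-wide service-linked role that ES uses for VPC access.
resource "aws_iam_service_linked_role" "es" {
  aws_service_name = "es.amazonaws.com"
}

resource "aws_elasticsearch_domain" "es" {
  domain_name           = "example-vpc-domain"
  elasticsearch_version = "6.2"

  cluster_config {
    instance_type = "m4.large.elasticsearch"
  }

  ebs_options {
    ebs_enabled = true
    volume_type = "gp2"
    volume_size = 10
  }

  vpc_options {
    subnet_ids         = ["${var.es_subnet_id}"]
    security_group_ids = ["${var.es_security_group_id}"]
  }

  # Create the role before the domain; on destroy the order is reversed,
  # so the domain is removed before Terraform tries to delete the role.
  depends_on = ["aws_iam_service_linked_role.es"]
}
```

Because the role is shared by every ES domain in the account, deleting it can still fail if domains managed elsewhere exist.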

I had the same issue and got past it by first creating a test ES domain through the GUI, which created the AWSServiceRoleForAmazonElasticsearchService role. After that I was able to apply my Terraform code successfully. It is kind of the obvious workaround, but I'm commenting anyway in case someone else gets stuck.

We just ran into this now and it looks like the createAwsElasticsearchIAMServiceRoleIfMissing is wrong because it's checking the error message for "Role not found" while the error message appears to contain "cannot be found" instead.

This is an easy fix, however I'm not sure of the benefit of doing so as we don't really automatically create service linked roles on the first run anywhere else and it's going to be an absolute pain to test that this works nicely in an automated way.

I'd recommend that we pull this out entirely and straight up recommend that users create the service linked role themselves either in a separate location (as this is an account wide thing and you may have multiple ES clusters in an AWS account) or use depends_on to set the dependency order as mentioned further up in the issue.

This approach then keeps us consistent with other resources with service linked roles available and doesn't add awkward, unexpected things (creating IAM things implicitly is a bit unusual, even if it is a service linked role) and means we don't have an untested block of code in the code base.

@bflad thoughts?

bflad commented

@tomelliff that's a great write up and I'd agree with you -- recommend we remove that code as you suggested (merging in 2.0 of the AWS provider just to be safe) and in its place we should certainly add information on the aws_elasticsearch_domain resource page about the aws_iam_service_linked_role resource, mentioning depends_on, and calling out it needs to happen only once per AWS account to close out this issue. Two separate pull requests (one for the code removal and one for the documentation) would be fantastic.

bflad commented

(I've assigned to myself for tracking, but anyone can feel free to make the above PRs)

@bflad sorry for the delay, I made these changes a few days back while on the train but didn't push them until now. Let me know on each of the PRs if there are any issues with them.

@eunomie I still get the same error:

aws_iam_service_linked_role.es: Error waiting for role (arn:aws:iam::REDACTED:role/aws-service-role/es.amazonaws.com/AWSServiceRoleForAmazonElasticsearchService) to be deleted: unexpected state 'FAILED', wanted target 'SUCCEEDED'. last error: %!s(<nil>)

despite having:

resource "aws_elasticsearch_domain" "es" {
  ...
  depends_on = [
    "aws_iam_service_linked_role.es",
  ]
}

Please fix the example:

resource "aws_iam_service_linked_role" "es" {
  aws_service_name = "**elasticsearch**.amazonaws.com"
}

...

bflad commented

The aws_elasticsearch_domain resource documentation page now includes information and an example of using the aws_iam_service_linked_role resource in this case. 👍

Please fix the example:

resource "aws_iam_service_linked_role" "es" {
  aws_service_name = "**elasticsearch**.amazonaws.com"
}

...

@jgutierrez-adl As per the docs the proper aws_service_name is es.amazonaws.com. Thus, the proper terraform stanza is:

resource "aws_iam_service_linked_role" "es" {
  aws_service_name = "es.amazonaws.com"
}

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thanks!