haidaraM/terraform-jenkins-aws-fargate

issue with target group not attaching to the alb

nbmustafa opened this issue · 20 comments

Running into two issues:

1- The ACM validation never goes through; it complains about unsupported attributes:
Error: Unsupported attribute

  on modules/jenkins_ecs_service/main.tf line 176, in resource "aws_route53_record" "certificate_validation_record":
 176:   name    = list(aws_acm_certificate.master_certificate.0.domain_validation_options)[0].resource_record_name
    |----------------
    | aws_acm_certificate.master_certificate[0].domain_validation_options is set of object with 1 element

This value does not have any attributes.


Error: Unsupported attribute

  on modules/jenkins_ecs_service/main.tf line 177, in resource "aws_route53_record" "certificate_validation_record":
 177:   type    = list(aws_acm_certificate.master_certificate.0.domain_validation_options)[0].resource_record_type
    |----------------
    | aws_acm_certificate.master_certificate[0].domain_validation_options is set of object with 1 element

This value does not have any attributes.

Error: Unsupported attribute

  on modules/jenkins_ecs_service/main.tf line 179, in resource "aws_route53_record" "certificate_validation_record":
 179:   records = [list(aws_acm_certificate.master_certificate.0.domain_validation_options)[0].resource_record_value]
    |----------------
    | aws_acm_certificate.master_certificate[0].domain_validation_options is set of object with 1 element

This value does not have any attributes.

2- To avoid the first error, I tried to skip creating the SSL certs and removed the HTTPS listeners on the ALB, but then got the following issue. Have you tried the master branch? Is it working for you?

Error: InvalidParameterException: The target group with targetGroupArn arn:aws:elasticloadbalancing:us-east-1:xxxxxxxxxxx:targetgroup/sample-develop-jnks-tg/552c41f56f0cb083 does not have an associated load balancer. "jenkins-master"

Hi,
Yes, it's working fine for me. Which versions of Terraform and the AWS provider are you using? ($ terraform -version)

Hi @haidaraM
I am using this version

terraform {
  required_version = ">= 0.12.6"

  required_providers {
    aws = "~> 3.0"
  }
}

Just for the sake of confirmation, I used the exact same version as yours just now, and it failed for the same reason: the ACM "unsupported attributes" error.

OK, I see. What is the output of terraform -version when you run it inside your Terraform workspace for this project?

Your errors are related to some changes introduced in version 3 of the AWS provider. I thought I had fixed these issues, but it looks like that's not the case.
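For readers hitting the same "Unsupported attribute" error: in version 3 of the AWS provider, domain_validation_options is a set rather than a list, and list(x) wraps that set in a one-element list instead of converting it, so [0] returns the set itself, which has no attributes. A minimal sketch of one possible workaround using tolist() follows. It is not necessarily the fix that was pushed to master; var.route53_zone_id and the ttl value are assumptions made for illustration, and it assumes the certificate resource is actually created (its count is 1).

```hcl
# Sketch only: var.route53_zone_id and ttl are assumed, not taken from the module.
locals {
  # tolist() converts the set into a list so that it can be indexed.
  # list(x) would instead build a one-element list containing the whole set.
  master_cert_validation = tolist(
    aws_acm_certificate.master_certificate[0].domain_validation_options
  )
}

resource "aws_route53_record" "certificate_validation_record" {
  zone_id = var.route53_zone_id # hypothetical variable
  name    = local.master_cert_validation[0].resource_record_name
  type    = local.master_cert_validation[0].resource_record_type
  ttl     = 60
  records = [local.master_cert_validation[0].resource_record_value]
}
```

Indexing with [0] is only safe here because the set holds a single element; for certificates covering several domain names, iterating over the set with for_each is the more robust pattern.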

Check the latest version on master. I just deployed a new fix :-)

I just tested it with the following versions:

$ terraform -version
Terraform v0.14.3
+ provider registry.terraform.io/hashicorp/aws v3.24.0
+ provider registry.terraform.io/hashicorp/random v3.0.1

@haidaraM thank you very much. I tried the exact same change you made earlier and it didn't work, but I had changed a few more things along with it, so that might have broken the result of tolist. I am giving it another go and will let you know the outcome.

By the way, I have tried a few more Jenkins scenarios; yours is a bit different, with the master calling agents via an NLB. Is there any specific benefit to having an NLB instead of relying purely on the "jenkins ecs plugin"?

@haidaraM unfortunately it is still not going through as expected. I am getting this issue:

Error: InvalidParameterException: The target group with targetGroupArn arn:aws:elasticloadbalancing:us-east-1:600908795746:targetgroup/krd-jenkins-develop-jnks-tg/552c41f56f0cb083 does not have an associated load balancer. "jenkins-master"



Error: error creating ELBv2 Listener: CertificateNotFound: Certificate 'arn:aws:acm:us-east-1:600908795746:certificate/5c515fcc-5ea4-4534-b75a-0c706a8bdb29' not found
        status code: 400, request id: 392e6d0e-6a5f-4b78-837a-70436ef71007

The ACM cert has been created and validated though, and the status of the cert is "issued".

Really strange. Did you make any other changes in the Terraform code? Can you start from scratch by deleting everything, running git reset --hard and applying Terraform again?

To answer your question:
How did you manage to establish the communication from the agents to the master? Do the agents communicate through the ALB to access the master, through the master's private IP, or something else? In my case, the ALB only handles HTTP and HTTPS.

We can technically do without the NLB by using the IP address of the Master as an access point for the agents. As this address changes each time a new container is started, a mechanism will have to be set up to automatically update the configuration of the Amazon ECS plugin with the new IP address of the Master.

The only change I have made is setting all my subnets to the public ones. I don't see any reason why that should prevent the TG from getting attached to the load balancer.
For my other Jenkins scenario, the same setting is working as expected.

I agree with you. Do you know which target group is affected: nlb_agents_to_master_jnlp, nlb_agents_to_master_http or jenkins_master_tg? Do you have the same error when you start from scratch?

I just deployed a new Jenkins from scratch in eu-west-1 with an ACM certificate, without any error.

@haidaraM apologies, that was my mistake. I forgot to uncomment #aws_lb_listener.master_https, as shown below. All fixed now.

resource "aws_ecs_service" "jenkins_master" {
  # ...
  depends_on = [
    aws_lb_listener.master_http,
    #aws_lb_listener.master_https,
    aws_lb_listener.agents_http_listener,
    aws_lb_listener.agents_jnlp_listener
  ]
  # ...
}

All fixed and deployed with success, but Jenkins is not running. I should be able to find out why:
503 Service Temporarily Unavailable
The task is up and running, and the R53 record is pointing to the correct ALB.

Ok perfect. Just wait a few seconds. Jenkins should be up and running.

I would still ask why the target group isn't getting attached to the ALB when we only have the HTTP listener.
Yeah, I waited more than 5 minutes and it is still not coming up. There is no error in the ECS task log either.

The HTTP listener does the target group attachment: this is where the TG is attached to the ALB. This listener should be created before trying to attach the TG to the ECS service, which is the reason for the depends_on on the ECS service.

If after 5 minutes you still don't have access to Jenkins, there is definitely something wrong. If everything looks fine in the task log, check the health of the target groups.

@haidaraM
You are right, the TG doesn't have any registered IP. I should be able to troubleshoot that.

I am with you about depends_on. My point is that when I had issues with ACM, I removed the HTTPS listeners and kept only the HTTP listener. The question is: why did having only the http_listener cause the TG not to get attached to the ALB, even though the depends_on was still there for the http_listener?

I see your point. There are two http listeners:

  • master_http: This one is used when you don't set the variable route53_zone_name. It also does the target group attachment.
  • master_http_redirect: This one is used when you set route53_zone_name (your case, I guess). It doesn't do the attachment of the target group; it only redirects HTTP requests to HTTPS. In this case, the attachment is managed by the master_https listener, but you removed it, if I understand correctly.
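The difference between the two HTTP listeners can be sketched as follows. This is an illustrative fragment, not the module's actual code: the load balancer name aws_lb.jenkins_master, the count conditions, and the assumption that jenkins_master_tg is an aws_lb_target_group resource are guesses made for the example. Only a "forward" action references the target group, which is what associates it with the load balancer; a "redirect" action does not.

```hcl
# Hypothetical sketch: aws_lb.jenkins_master and the count conditions are assumed.

# Used when route53_zone_name is not set: forwards to the target group,
# which is what associates the target group with the ALB.
resource "aws_lb_listener" "master_http" {
  count             = var.route53_zone_name == "" ? 1 : 0
  load_balancer_arn = aws_lb.jenkins_master.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.jenkins_master_tg.arn
  }
}

# Used when route53_zone_name is set: only redirects to HTTPS and never
# references the target group, so it creates no association on its own.
resource "aws_lb_listener" "master_http_redirect" {
  count             = var.route53_zone_name != "" ? 1 : 0
  load_balancer_arn = aws_lb.jenkins_master.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type = "redirect"

    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}
```

Under these assumptions, if route53_zone_name is set and the HTTPS listener is removed, no remaining listener forwards to the target group, which would be consistent with the "does not have an associated load balancer" error reported above.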

Well described @haidaraM
but I removed both

  • master_http_redirect
  • master_https

and only kept

  • master_http

I don't see any reason why it should cause issues for the target group to get attached to the ALB.

By the way, I found out why Jenkins isn't up. For some reason my ECS task isn't able to pull the image from my Docker Hub repo, nor from yours:

Stopped reason CannotPullContainerError: inspect image has been retried 5 time(s): failed to resolve ref "docker.io/elmhaidara/jenkins-aws-fargate:latest": failed to do request: Head https://registry-1.docker.io/v2/elmhaidara/jenkins-aws-fargate/manifests/latest: dia...

I had set assign_public_ip = false, which would be the reason why it wasn't able to pull the image.
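A Fargate task placed in a public subnet needs a public IP to reach Docker Hub through the internet gateway; without one it has no outbound route unless a NAT gateway or similar is available. A minimal sketch of the relevant setting follows. Only aws_ecs_service.jenkins_master and assign_public_ip come from this thread; the cluster, task definition, subnet variable and security group names are hypothetical placeholders.

```hcl
# Hypothetical sketch: every referenced resource and variable except
# aws_ecs_service.jenkins_master is a placeholder, not the module's code.
resource "aws_ecs_service" "jenkins_master" {
  name            = "jenkins-master"
  cluster         = aws_ecs_cluster.jenkins.id                 # assumed
  task_definition = aws_ecs_task_definition.jenkins_master.arn # assumed
  desired_count   = 1
  launch_type     = "FARGATE"

  network_configuration {
    subnets         = var.public_subnet_ids                  # assumed variable
    security_groups = [aws_security_group.jenkins_master.id] # assumed

    # In public subnets, the task needs a public IP to pull images from
    # Docker Hub. With private subnets, keep this false and route outbound
    # traffic through a NAT gateway instead.
    assign_public_ip = true
  }
}
```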

I was checking the wrong Jenkins cluster when I said the task was up and running and there was no error in the log. That was for a different Jenkins.

All worked as expected. Thank you very much for your help. I am going to compare your Jenkins architecture with another one I have. Yours is quite interesting though; I am keen to see the difference.

Perfect! Going to close the issue and thanks for reporting it :-)