aws/aws-cdk

ECS: Service should autoconfigure security groups when attached to NLB

rix0rrr opened this issue ยท 36 comments

Right now in ELBv2, security groups are configured for Application Load Balancers, but not for Network Load Balancers.

It seems to be that NLBs don't have security groups themselves (verify), but in that case the target should be autoconfiguring its own SGs correctly upon the LB being attached to them.

@karlpatr has sensible things to say about this in #3667, we should investigate and follow up.

@rix0rrr NLBs don't have security groups, no? Access is supposed to be controlled by the resource behind the NLB. Or am I missing something?

I thought as much, but apparently they do. We get this issue reported a lot these days. Maybe it changed.

Oh I see what's going on, it needs to configure that SGs of the Target

Just encountered with this problem this morning and took me quite a while to realize that it was the default security group created for ECS service didn't allow NLB to connect.
For now I have to create an SG separately for the ECS service and allow any ip to connect, would be great if it works out of the box.

I think the reason this comes up often is that ALB security is handled implicitly during attachment in BaseService (easier since there is a known security group to use) and people want NLB to be just as easy to use. I'm not sure people realize that the origin of the traffic also must be granted access explicitly since that's not true for ALBs, and I'm not sure if there is a one-size-fits-all case that can be inferred at template time since it's not necessary to register NLB callers.

In addition, the health check for an NLB currently needs to be explicitly granted access before it can work properly (see note under recommended rules here for scope: https://docs.aws.amazon.com/elasticloadbalancing/latest/network/target-group-register-targets.html). This tripped me up as I was building since I knew to link the caller SG but had to research where the health check was coming from.

I would not want to see CDK take a posture of weaker default security for convenience, but many people seem to arrive at the shotgun approach of adding any ip to the target ingress to bypass VPC security because it covers both the health check and intended audience cases.

I do not want the NLB to be accessible from the internet, but obviously the Healthcheck must work. The use case I have is that I want to access the NLB from an api gateway vpc link eg.

new apigateway.VpcLink(this, 'MyVPCLink', {
targets: [myFargateService.loadBalancer],
});

The solution should cater for this use case too.

Can someone bottom line this for me? Are the following statements true or false:

  • The Security Group of the target (let's say the ECS service) must accept traffic from the NLBs subnet (by CIDR?) in order for health checks to work.
  • The Security Group of the target (let's say the ECS service) must specify the traffic source it wants to accept requests from.

Is that an accurate summary?

Both statements are true -- opening the target security group to the VPN CIDR is the only way I've found to allow the first one in CDK without multiple creation passes, but the actual required scope is narrower.

I had thought about proposing a proxied security interface to "allow access to the NLB" and have definitions attach to the target instead as a late resolution step; I think that meets user expectations but obfuscates the reality and diverges from what a user would find after template creation.

As long as the documentation is clear - obfuscating templates from CDK code is fine in my opinion. A few examples of how you would set up various use cases would be all I as a user would be after. Eg. Setting up a NLB with Port 80 open, setting up a TLS listener, Setting up a NLB with a VPC link

pahud commented

Public-facing NLB fronted fargate tasks with assignPublicIp enabled should allow all ipv4 ingress traffic.

I am submitting a PR(#4825) with this implementation FYR.

Both statements are true -- opening the target security group to the VPN CIDR is the only way I've found to allow the first one in CDK without multiple creation passes, but the actual required scope is narrower.

Agree. The best approach to deal with the problem might be looking up the NLB subnet CIDRs and then setting up inbound rule for target SG allowing ingress traffic for those CIDRs. However, we can only know the CIDRs of the NLB subnets in deploy stage.

Can somebody give me a hint on this, how to work around this issue for the moment? I have a network load balanced fargate service that's supposed to run behind an API gateway (NLB not public). After some digging I found that the security group of the service does now allow traffic from the NLB (as described).

Can somebody give me a hint on this, how to work around this issue for the moment? I have a network load balanced fargate service that's supposed to run behind an API gateway (NLB not public). After some digging I found that the security group of the service does now allow traffic from the NLB (as described).

  1. Use this approach (opening the port for ingress traffic for every IP though the actual required scope of IPs is narrower).
  2. Manually looking for the NLB subnet IPv4 CIDRs and whitelist them when setting rule for ingress traffic for the target security group.

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

Is this not relevant anymore?

@sebastian-schlecht I believe it is. Reopening.

This issue is still ongoing.

To reproduce this issue I have created a simple Dockerfile for the Fargate service to host:
FROM nginx:alpine
COPY index.html /usr/share/nginx/html/index.html

I am then creating the ecs_patterns.NetworkLoadBalancedFargateService:

    new ecs_patterns.NetworkLoadBalancedFargateService( this, "Service-Name-Stuff", {
        assignPublicIp: props.isPublic,
        cluster: this.cluster,
        cpu: props.cpuUnits != undefined? props.cpuUnits : 256, // set it to the default 256 if not defined
        desiredCount: props.instanceCount != undefined? props.instanceCount : 1, // only create one instance if not defined
        domainName: props.domainName,
        domainZone: props.hostedZone,
        listenerPort: props.listenerPort,
        memoryLimitMiB: props.memoryLimitMB != undefined? props.memoryLimitMB : 512, // default to 512
        publicLoadBalancer: props.isPublic,
        serviceName: props.name,
        taskImageOptions: { 
            image:containerImage,
            enableLogging: true,
            containerPort: props.containerPort // specify the container port that traffic should be directed to
        },
    });

The Service that is created has a Load Balancer associated with it that I can see in the console. I can see that is is auto-assigning a public IP address. I can see my A record in the Route53 records and I can follow that to the Listener on TCP:80 routing to the FargateService which in the targets has an IP address with port 80. If I look at the SecurityGroup, however, there are no inbound rules so nothing can tell that it is actually running so I'm constantly draining the old instances and creating new ones when the initial healthcheck timeout expires.

If I move this same service over to using an ALB I get two SecurityGroups created. One that listens on 443 and another that listens on port 80. These securitygroups open to 0.0.0.0/0 which is a substantial difference from what happens on the NLB side of the fence.

The fix for this is similar to that contained within this push for the examples:
aws-samples/aws-cdk-examples#155

Essentially one can resolve this issue by performing an allow on the containerPort:

        nlbFgSvc.service.connections.allowFromAnyIpv4( ec2.Port.tcp(props.containerPort) );

In this scenario, I'm passing it into my construct and specifying that the service should allow anything to the containerPort since that is what the health check is trying to verify it is able to reach. This will allow the healthcheck to work and for the loadbalancer to not mark the instance as Unhealthy. I am calling this line directly the ecs_patterns.NetworkLoadBalancedFargateService(...) line.

The other suggestion is to configure security groups to allow from the CIDR block in the VPC... anyone got any example code to do this please?

It looks like there's another issue with some example code for achieving this:

...
let vpc = new EC2.VPC(...)
let service = new ECS.FargateService(...)

service.connections.allowFrom(
  Peer.ipv4(vpc.vpcCidrBlock),
  EC2.Port.allTraffic(), // Or whatever specific ports you want
  "Allow traffic from within the VPC to the service secure port"
)
...

Source: #9972

This seems like a pretty good solution already. To take it a step further, is it possible to limit the ingress IP addresses to just the load balancer?

NetworkLoadBalancedFargateService should behave in the same way that the ApplicationLoadBalancedFargateService does. I believe that if you're using these opinionated higher level pattern constructs, you expect that the service will behave in a specific and consistent way. i.e. Security Group on the Fargate Service should allow 0.0.0.0/0 traffic.

This wasn't obvious to me at first, since I normally use ALBs. At the very least, there should be a highlighted note in the documentation describing this behaviour for NetworkLoadBalancedEc2Service and NetworkLoadBalancedFargateService.

Also - where is assignPublicIp on NetworkLoadBalancedEc2Service?

Bump because this still doesn't work out of the box

Bump

I have the same problem, using the layer3 construct for NetworkLoadBalancedFargateService, any inbound security group is created, so for example fails health checks etc...

Hi guys!, Finally I get the security group configuring object service like this:

loadBalancedFargateService.service.connections.allowFromAnyIpv4(ec2.Port.tcp(3000))

In my case 3000 is the port of service, thats works, open the port to the ext and in my case begings works the healthchecks.

$ cdk --version
2.34.2 (build 7abcbc6)

It took so many hours to work out how to route two different container ports (80 & 443) via two different Network Load Balancer Target Groups with CDKv2 that I seriously considered reverting to YAML where I could just change the 3 characters to fix the port number ๐Ÿ˜„ Surely routing TCP port 80 and port 433 of an NLB to a container is not an uncommon use case? ๐Ÿค”
Anyway, here it is for the benefit of other coming here, and thank you to @pahud for the workaround idea!

@hardcode83 I tried using Service.Connections too but that approach doesn't support ECS/Fargate targets (only EC2 & IP), so I took @pahud's approach instead.

targetGroupPort80.AddTarget(new [] {
    service.LoadBalancerTarget(new LoadBalancerTargetOptions {
        ContainerName = myContainer.ContainerName,
        ContainerPort = 80,
        Protocol = Protocol.TCP
    }) 
});

service.Connections.SecurityGroups[0].AddIngressRule(
    Peer.Ipv4(myVpc.VpcCidrBlock),
    Port.Tcp(80),
    "Allow NLB traffic from VPC CIDR to port 80 Target Group"
);

targetGroupPort443.AddTarget(new [] {
    service.LoadBalancerTarget(new LoadBalancerTargetOptions {
        ContainerName = myContainer.ContainerName,
        ContainerPort = 443,
        Protocol = Protocol.TCP
    })
});

myService.Connections.SecurityGroups[0].AddIngressRule(
    Peer.Ipv4(myVpc.VpcCidrBlock),
    Port.Tcp(443),
    "Allow NLB traffic from VPC CIDR to port 443 Target Group"
);

ECS/Fargate services should automatically add the VPC or NLB subnet CIDR ingress rule to their Security Group when they made a target of an Network Load Balancer.

Also the default security group created by FargateService is missing a 'Name' tag with the stack path that other CDKv2-created objects get.

My organization's permissions boundaries prevents developers from performing ec2:CreateSecurityGroup either in the Management Console or via CDK, so I unfortunately don't have the ability to just ignore the default security group that gets created with NetworkLoadBalancedFargateService. :(

Does anyone know of a simple way to create a NetworkLoadBalancedFargateService that allows for 1) supplying an already existing security group and 2) preventing the default security group from being created?

@pygaissert if you supply a pre-made security group when you create the service, it suppresses the automatic creation of another security group.

var service = new FargateService(this, "MyFargateService", new FargateServiceProps {
    Cluster = myCluster,
    TaskDefinition = taskDef,
    SecurityGroups = new [] { myServiceSecurityGroup }
});
devnjw commented

AWS has announced that NLB now supports security groups. [announcement]

But it seems like CDK doesn't support it yet.

lb = elbv2.NetworkLoadBalancer(...)
sg = ec2.SecurityGroup(...)
clb = lb.node.default_child
clb.add_override("Properties.SecurityGroups", [sg.security_group_id])

This seems to add a security group to an NLB now.

^ How do you achieve the same thing in typescript?
What is the type of lb.node.default_child?

^ How do you achieve the same thing in typescript? What is the type of lb.node.default_child?

any :D

This one threw me for a loop...

Working around it with:

const nlbFargateService = new NetworkLoadBalancedFargateService(this, 'nlbservice', {
  vpc: this.vpc,
  <snipped>
});

nlbFargateService.loadBalancer.addSecurityGroup(
  new ec2.SecurityGroup(this, 'LoadBalancerSG', { vpc: this.vpc, allowAllOutbound: false }),
);

nlbFargateService.loadBalancer.connections.allowFrom(ec2.Peer.ipv4(this.vpc.vpcCidrBlock), ec2.Port.allTraffic());

nlbFargateService.service.connections.allowFrom(nlbFargateService.loadBalancer.connections, ec2.Port.allTraffic());