awslabs/ami-builder-packer

Build failing post build_phase

cloudpayload opened this issue · 7 comments

Hi,

I am trying to run the ami-builder-packer CloudFormation template (CFT) in the AWS us-west-2 region.

Kindly assist: the CodeBuild build fails at the POST_BUILD phase. I have tried multiple times and the error is consistent (always at the same stage).

Thanks,
Saurabh
[Container] 2018/02/02 22:24:57 Phase complete: BUILD Success: true
[Container] 2018/02/02 22:24:57 Phase context status code: Message:
[Container] 2018/02/02 22:24:57 Entering phase POST_BUILD
[Container] 2018/02/02 22:24:57 Running command egrep "${AWS_REGION}:\sami-" build.log | cut -d' ' -f2 > ami_id.txt

[Container] 2018/02/02 22:24:57 Running command test -s ami_id.txt || exit 1

[Container] 2018/02/02 22:24:57 Command did not exit successfully test -s ami_id.txt || exit 1 exit status 1
[Container] 2018/02/02 22:24:57 Phase complete: POST_BUILD Success: false
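
For anyone reading the log above: the two POST_BUILD commands only succeed if build.log contains a line of the form "<region>: ami-...". A minimal sketch of that check, runnable locally, is below. The helper name extract_ami_id and both sample logs are made up for illustration (they are not taken from the attached log files), and [[:space:]] stands in for the \s used in the original command.

```shell
#!/bin/sh
# Hypothetical helper: the same grep/cut pipeline as the POST_BUILD phase,
# with the region and the log path passed as arguments.
extract_ami_id() {
    region="$1"
    log_file="$2"
    grep -E "${region}:[[:space:]]ami-" "$log_file" | cut -d' ' -f2
}

# Made-up sample logs: one build that reported an AMI, one that errored.
workdir="$(mktemp -d)"
printf '%s\n' \
    "--> AWS AMI Builder - CIS: AMIs were created:" \
    "us-west-2: ami-0123456789abcdef0" > "$workdir/ok.log"
printf '%s\n' \
    "Build 'AWS AMI Builder - CIS' errored: Error executing Ansible" > "$workdir/failed.log"

extract_ami_id us-west-2 "$workdir/ok.log" > "$workdir/ami_ok.txt"
extract_ami_id us-west-2 "$workdir/failed.log" > "$workdir/ami_failed.txt"

# Same test as the buildspec: an empty file means no AMI ID was found.
test -s "$workdir/ami_ok.txt" && echo "ok.log: AMI ID found"
test -s "$workdir/ami_failed.txt" || echo "failed.log: ami_id.txt is empty, POST_BUILD would exit 1"
```

In other words, a POST_BUILD failure at test -s most likely means the Packer build itself never reported an AMI, so the real error is usually further up in build.log.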

Thanks for replying.
Please find the build output attached below.
One EC2 instance was launched by packer build; I have confirmed that from the console.

LogFile.txt

Thanks,
Saurabh

I have also tested in N. Virginia (AWS us-east-1) but still get the same error.
LogFile 2.txt

The instance gets launched, and the build fails at post-build after the command below:
"Command did not exit successfully test -s ami_id.txt || exit 1 exit status 1"

I am not sure, but does the EC2 instance size play a role here? I am using a t2.micro.

Thanks,
Saurabh

@cloudpayload Did you get this working? I saw hashicorp/packer#4623 in the Packer repository, which makes sense to me: Packer's ansible-local provisioner is unable to hold on to a trafficless SSH session for longer than a few seconds, as defined by the EC2 instance's sshd_config. I haven't tested the fix, but the logic passes muster in my mind.

My apologies on the long delay here - I set some time in my calendar to build one project from scratch.

This line does look suspicious to me, though, as it relates to the Packer issue that @ajlanghorn linked:

>>>>> AWS AMI Builder - CIS: TASK [anthcourtney.cis-amazon-linux : 3.6.2 - Ensure default deny firewall policy(DROP INPUT)] ***
AWS AMI Builder - CIS: changed: [127.0.0.1] => (item=INPUT)
AWS AMI Builder - CIS: changed: [127.0.0.1] => (item=FORWARD)
==> AWS AMI Builder - CIS: Terminating the source AWS instance...
==> AWS AMI Builder - CIS: Cleaning up any extra volumes...
==> AWS AMI Builder - CIS: No volumes to clean up, skipping
==> AWS AMI Builder - CIS: Deleting temporary security group...
==> AWS AMI Builder - CIS: Deleting temporary keypair...
Build 'AWS AMI Builder - CIS' errored: Error executing Ansible: Non-zero exit status: 2300218

This task seems to set a DROP policy on the INPUT chain, which could effectively close the established SSH connection (Packer on the CodeBuild container <==> SSH on the EC2 instance). That would explain the issue.
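
To check whether a default DROP policy is in place, one option is to look at the policy lines ("-P <chain> <target>") that iptables -S prints. The sketch below is only an illustration: chains_with_drop_policy is a hypothetical helper name, and the sample rules are made up rather than captured from the failing instance.

```shell
#!/bin/sh
# Hypothetical helper: reads `iptables -S` output on stdin and prints the
# names of the chains whose default policy is DROP.
chains_with_drop_policy() {
    awk '$1 == "-P" && $3 == "DROP" { print $2 }'
}

# Made-up sample resembling what the 3.6.2 task would leave behind.
sample_rules='-P INPUT DROP
-P FORWARD DROP
-P OUTPUT ACCEPT
-A INPUT -i lo -j ACCEPT'

dropped="$(printf '%s\n' "$sample_rules" | chains_with_drop_policy)"
echo "Chains with a default DROP policy:"
echo "$dropped"
```

On a live instance the equivalent would be to pipe sudo iptables -S into the helper. If INPUT shows up and there is no rule accepting established SSH traffic, the Packer session would be expected to drop as described above.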

If you can, @cloudpayload, here's what I'd do while I create a project from scratch:

  • Add 3.6.2 to the exclusions under vars in playbook.yaml; you can use that list to skip rules that may not apply here

Excerpt of how that playbook.yaml should look:

---
- hosts: localhost
  connection: local
  gather_facts: true    # gather OS info that is made available for tasks/roles
  become: yes           # majority of CIS tasks require root
  vars:
    # CIS Controls whitepaper:  http://bit.ly/2mGAmUc
    # AWS CIS Whitepaper:       http://bit.ly/2m2Ovrh
    cis_level_1_exclusions:
    # 3.4.2 and 3.4.3 effectively blocks access to all ports to the machine
    ## This can break automation; ignoring it as there are stronger mechanisms than that
    ## Based on issue #3, adding 3.6.2 as it adds a default INPUT DROP policy in ipt
      - 3.4.2 
      - 3.4.3
      - 3.6.2 
    # Cloudwatch Logs will be used instead of Rsyslog/Syslog-ng
    ## Same would be true if any other software that doesn't support Rsyslog/Syslog-ng mechanisms
      - 4.2.1.4
      - 4.2.2.4
      - 4.2.2.5
    # Autofs is no longer installed and we need to ignore it or else will fail
      - 1.1.19
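
Before kicking off another build, a quick sanity check that the exclusion actually made it into the playbook can save a round trip. This is a rough sketch under a few assumptions: has_exclusion is a hypothetical helper, the sample file is a trimmed copy of the excerpt above, and the check only looks for the rule ID as a YAML list item anywhere in the file (it does not verify that the item sits under cis_level_1_exclusions).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if rule ID $2 appears as a YAML list item
# ("- <id>") in file $1. Dots in the ID are escaped so they match literally.
has_exclusion() {
    pattern="$(printf '%s' "$2" | sed 's/\./\\./g')"
    grep -Eq "^[[:space:]]*-[[:space:]]+${pattern}[[:space:]]*$" "$1"
}

# Trimmed sample playbook, written to a temp file for the demonstration.
playbook="$(mktemp)"
cat > "$playbook" <<'EOF'
- hosts: localhost
  connection: local
  vars:
    cis_level_1_exclusions:
      - 3.4.2
      - 3.4.3
      - 3.6.2
      - 1.1.19
EOF

if has_exclusion "$playbook" 3.6.2; then
    echo "3.6.2 is excluded"
else
    echo "3.6.2 is NOT excluded; the DROP policy task will run"
fi
```

Against the real repository you would pass the path to playbook.yaml instead of the temp file.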

@ajlanghorn and @cloudpayload -- As suspected, it was due to the 3.6.2 task that was introduced in the latest version of that Ansible role (CIS), as well as another task that triggered an issue with Packer Ansible Local (5.3.3).

I've just submitted and merged a PR that fixes both of them, and builds are now succeeding consistently.

You can find more details on why the builds were failing at the link below:

Ansible Role we depend on has added additional CIS checks: anthcourtney/ansible-role-cis-amazon-linux@240c59f