haxorof/ansible-role-docker-ce

Fails on Fresh Run: haxorof.docker-ce : Restart auditd

Closed this issue · 7 comments

I'm not sure if I'll have time to dig into this right away so I'm adding this issue.

Version Information

ansible [core 2.17.1]
  config file = /Users/jjackson/projects/hudx/vm-provisioner/ansible/ansible.cfg
  configured module search path = ['/Users/jjackson/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /Users/jjackson/projects/hudx/vm-provisioner/ansible/venv/lib/python3.12/site-packages/ansible
  ansible collection location = /Users/jjackson/.ansible/collections:/usr/share/ansible/collections
  executable location = /Users/jjackson/projects/hudx/vm-provisioner/ansible/venv/bin/ansible
  python version = 3.12.4 (main, Jun  6 2024, 18:26:44) [Clang 15.0.0 (clang-1500.3.9.4)] (/Users/jjackson/projects/hudx/vm-provisioner/ansible/venv/bin/python3.12)
  jinja version = 3.1.4
  libyaml = True
roles:
  # this version is based on https://github.com/haxorof/ansible-role-docker-ce/pull/173
  #  once this is released to ansible galaxy, go back to `src: haxorof.docker-ce` and whatever
  #  the new version number becomes.
  - name: haxorof.docker-ce
    src: https://github.com/haxorof/ansible-role-docker-ce
    version: e890e75ed46c89dccad39b6ef3542889e2b92754

Steps to Reproduce

  • Run role on amazon linux 2023 (fails)
  • Run role again on amazon linux 2023 (succeeds)

Expected Behavior

Should run successfully on both first run and subsequent runs.

Actual Behavior

Fails on first run on a fresh instance.

RUNNING HANDLER [haxorof.docker-ce : Restart auditd] ****************************************************************************************
fatal: [hudx-loadtest]: FAILED! => {"changed": true, "cmd": ["service", "auditd", "restart"], "delta": "0:00:02.166902", "end": "2024-07-10 15:46:45.109659", "msg": "non-zero return code", "rc": 1, "start": "2024-07-10 15:46:42.942757", "stderr": "Job for auditd.service failed because the control process exited with error code.\nSee \"systemctl status auditd.service\" and \"journalctl -xeu auditd.service\" for details.", "stderr_lines": ["Job for auditd.service failed because the control process exited with error code.", "See \"systemctl status auditd.service\" and \"journalctl -xeu auditd.service\" for details."], "stdout": "Stopping logging: \u001b[60G[\u001b[1;32m  OK  \u001b[0;39m]\r\nRedirecting start to /bin/systemctl start auditd.service", "stdout_lines": ["Stopping logging: \u001b[60G[\u001b[1;32m  OK  \u001b[0;39m]", "Redirecting start to /bin/systemctl start auditd.service"]}
        to retry, use: --limit @/Users/jjackson/projects/hudx/vm-provisioner/ansible/playbook.retry

References

  • I'm leveraging @palyla's contribution, hot off the presses. (Great timing, @palyla, thanks.)

Thanks for reporting this. I have not yet added Amazon Linux to any regression test suite, and I have not run the regression tests for all distributions for some time now. Since then I have also been considering removing some parts, like device mapper, which is now dropped completely in later Docker Engine releases.

I might be able to do a run this upcoming weekend, since it takes several hours to run through the test matrix even though it is automated.

@jamiejackson what errors did you get related to auditd in your journal logs (journalctl -u auditd)? I ran through the regression test for the auditd configuration and it works with ansible-core 2.16.8 and 2.17.1. For this regression run I used an existing Vagrant box with Amazon Linux 2023.

# note: 1.27.0 introduced this problem: https://github.com/docker/compose/issues/7839
docker_compose_ver: 1.26.2
docker_enable_audit: true
docker_enable_ce_edge: false
docker_enable_mount_flag_fix: false
docker_enable_swarm: true
docker_daemon_config:
  # bip gets docker0 to have a consistent ip. need that to access host's
  # postfix from the container: 
  # https://redacted/browse/CPD-10025?focusedCommentId=394243
  bip: 172.17.0.1/16
  debug: false
  # https://github.com/moby/moby/issues/12886#issuecomment-545572503
  experimental: true
  features:
    buildkit: true
  icc: false
  init: true
  live-restore: false
  log-driver: json-file
  log-opts:
    max-size: "50m"
    max-file: "5" 
    compress: "true"
  storage-driver: overlay2
  userland-proxy: false
# to figure out available versions: `yum --showduplicates list docker-ce`
docker_version: '26.1.4-1.el9'

[ansible@ip-10-103-1-134 ~]$ sudo journalctl -u auditd
Jul 16 21:17:48 ip-10-103-1-134.ec2.internal auditd[2423]: The audit daemon is exiting.
Jul 16 21:17:48 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Deactivated successfully.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: Starting auditd.service - Security Auditing Service...
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal auditd[43207]: Could not open dir /var/log/audit (No such file or directory)
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal auditd[43207]: The audit daemon is exiting.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Control process exited, code=exited, status=6/NOTCONFIGURED
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Failed with result 'exit-code'.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: Failed to start auditd.service - Security Auditing Service.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Scheduled restart job, restart counter is at 1.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: Stopped auditd.service - Security Auditing Service.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: Starting auditd.service - Security Auditing Service...
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal auditd[43208]: Could not open dir /var/log/audit (No such file or directory)
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal auditd[43208]: The audit daemon is exiting.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Control process exited, code=exited, status=6/NOTCONFIGURED
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Failed with result 'exit-code'.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: Failed to start auditd.service - Security Auditing Service.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Scheduled restart job, restart counter is at 2.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: Stopped auditd.service - Security Auditing Service.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: Starting auditd.service - Security Auditing Service...
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal auditd[43226]: Could not open dir /var/log/audit (No such file or directory)
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal auditd[43226]: The audit daemon is exiting.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Control process exited, code=exited, status=6/NOTCONFIGURED
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Failed with result 'exit-code'.
Jul 16 21:17:50 ip-10-103-1-134.ec2.internal systemd[1]: Failed to start auditd.service - Security Auditing Service.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Scheduled restart job, restart counter is at 3.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: Stopped auditd.service - Security Auditing Service.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: Starting auditd.service - Security Auditing Service...
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal auditd[43227]: Could not open dir /var/log/audit (No such file or directory)
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal auditd[43227]: The audit daemon is exiting.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Control process exited, code=exited, status=6/NOTCONFIGURED
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Failed with result 'exit-code'.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: Failed to start auditd.service - Security Auditing Service.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Scheduled restart job, restart counter is at 4.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: Stopped auditd.service - Security Auditing Service.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: Starting auditd.service - Security Auditing Service...
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal auditd[43228]: Could not open dir /var/log/audit (No such file or directory)
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal auditd[43228]: The audit daemon is exiting.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Control process exited, code=exited, status=6/NOTCONFIGURED
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Failed with result 'exit-code'.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: Failed to start auditd.service - Security Auditing Service.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Scheduled restart job, restart counter is at 5.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: Stopped auditd.service - Security Auditing Service.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Start request repeated too quickly.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: auditd.service: Failed with result 'exit-code'.
Jul 16 21:17:51 ip-10-103-1-134.ec2.internal systemd[1]: Failed to start auditd.service - Security Auditing Service.

Have you tried starting the auditd service before even using my role? To me it looks like you have some issue related to /var/log, possibly a permissions issue. All the role does is start auditd and create a configuration file with some Docker-related rules.
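
One way to check for this before the role runs is a small pre-flight task pair. This is only a sketch, not part of the role: the task names and the `audit_log_dir` variable are made up for illustration, and the path is taken from the "Could not open dir /var/log/audit" lines in the journal output above.

```yaml
# Hypothetical pre-flight tasks; run them before the haxorof.docker-ce role.
- name: Check the auditd log directory
  ansible.builtin.stat:
    path: /var/log/audit
  register: audit_log_dir  # illustrative variable name

- name: Fail early if the auditd log directory is missing
  ansible.builtin.assert:
    that:
      # isdir is absent when the path does not exist, hence the default
      - audit_log_dir.stat.isdir | default(false)
    fail_msg: "/var/log/audit is missing or not a directory; auditd will not start"
```

If the assert fails, the play stops with a clear message instead of failing later in the Restart auditd handler.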

Sorry. False alarm.

There had been a preliminary set of tasks in my playbook that affected the /var/log directory in a way that caused an initial auditd issue.
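
For anyone who lands here with the same symptom, a guard like the following might help. It is a minimal sketch, assuming the directory simply needs to exist before auditd is restarted; the `hosts` pattern is a placeholder and the owner, group, and mode are assumptions that should be matched to your distribution's defaults. The better fix is still to correct whatever earlier task disturbs /var/log.

```yaml
# Hypothetical play layout; only docker_enable_audit comes from this thread.
- hosts: all  # placeholder host pattern
  become: true
  pre_tasks:
    - name: Ensure the auditd log directory exists
      ansible.builtin.file:
        path: /var/log/audit
        state: directory
        owner: root
        group: root
        mode: "0700"  # assumption; match your distribution's default
  roles:
    - role: haxorof.docker-ce
      vars:
        docker_enable_audit: true
```

`pre_tasks` run before the roles in a play, so the directory is in place by the time the role's handler restarts auditd. SELinux labeling is not covered by this sketch.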