awslabs/amazon-kinesis-agent

Unable to install and run kinesis agent in AL2023


I followed the instructions here to install the Kinesis agent, using public.ecr.aws/amazonlinux/amazonlinux:minimal as the base image. Here is my Dockerfile:

```dockerfile
FROM public.ecr.aws/amazonlinux/amazonlinux:minimal

RUN dnf update \
    && dnf --best install -y \
       shadow java-1.8.0-amazon-corretto telnet aws-kinesis-agent which findutils procps systemd \
    && dnf clean all

SHELL ["/bin/bash", "-c"]
RUN chkconfig aws-kinesis-agent on
```

However, I don't see any logs being produced at /var/log/aws-kinesis-agent, and I don't see an agent running in the container either. Any idea why this doesn't work?

g-dx commented

We faced the inability to run the agent on AL2023 and tracked it down to a missing group.

When starting the agent via systemctl we saw the following output:

Sep 20 09:48:12 application systemd[1]: Starting aws-kinesis-agent.service - LSB: Daemon for Amazon Kinesis Agent....
Sep 20 09:48:12 application aws-kinesis-agent[2526]: install: invalid group ‘aws-kinesis-agent-user’
Sep 20 09:48:12 application aws-kinesis-agent[2506]: /etc/rc.d/init.d/aws-kinesis-agent: line 195: /var/run/aws-kinesis-agent/mutex: >
Sep 20 09:48:12 application aws-kinesis-agent[2528]: flock: 200: Bad file descriptor
Sep 20 09:48:12 application systemd[1]: aws-kinesis-agent.service: Control process exited, code=exited, status=1/FAILURE
Sep 20 09:48:12 application systemd[1]: aws-kinesis-agent.service: Failed with result 'exit-code'.
Sep 20 09:48:12 application systemd[1]: Failed to start aws-kinesis-agent.service - LSB: Daemon for Amazon Kinesis Agent..
Sep 20 09:48:13 application systemd[1]: aws-kinesis-agent.service: Start request repeated too quickly.
Sep 20 09:48:13 application systemd[1]: aws-kinesis-agent.service: Failed with result 'exit-code'.
Sep 20 09:48:13 application systemd[1]: Failed to start aws-kinesis-agent.service - LSB: Daemon for Amazon Kinesis Agent..

The important line is this one:

install: invalid group ‘aws-kinesis-agent-user’
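A quick way to confirm that the group really is missing is to query the group database. This is a minimal sketch, assuming `getent` is available on the host; the helper name `group_exists` is made up for illustration.

```shell
#!/usr/bin/env bash
# Return success if the named group is known to the system's group database.
# getent consults the sources configured in nsswitch.conf, including /etc/group.
group_exists() {
    getent group "$1" > /dev/null 2>&1
}

# Illustrative check, using the group name from the log line above.
if group_exists aws-kinesis-agent-user; then
    echo "group aws-kinesis-agent-user exists"
else
    echo "group aws-kinesis-agent-user is missing"
fi
```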

User Private Groups are disabled in AL2023. This can be confirmed by checking /etc/login.defs:

# Enables userdel(8) to remove user groups if no members exist.
#
USERGROUPS_ENAB no

This prevents the automatic creation of a matching group when the aws-kinesis-agent-user user is created at RPM install time.
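The setting can also be read with a small script rather than by eye. The following is a rough sketch: the function name `usergroups_enab` and its optional path argument are assumptions for illustration, and it simply reports the first uncommented `USERGROUPS_ENAB` line it finds.

```shell
#!/usr/bin/env bash
# Print the value of the first uncommented USERGROUPS_ENAB line in a
# login.defs-style file, or "unset" if there is no such line.
# The path argument is illustrative and defaults to /etc/login.defs.
usergroups_enab() {
    local defs_file="${1:-/etc/login.defs}"
    awk '
        $1 == "USERGROUPS_ENAB" { print $2; found = 1; exit }
        END { if (!found) print "unset" }
    ' "$defs_file"
}
```

Calling `usergroups_enab` with no argument reads /etc/login.defs; on an image with the excerpt shown above it should report `no`.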

So either:

  1. Create a group of the same name aws-kinesis-agent-user or
  2. Set USERGROUPS_ENAB yes before installing the aws-kinesis-agent package.

We chose option 2, re-installed the agent and it now runs successfully.
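For anyone wanting to script this, the two options might look roughly like the sketch below. It is not the exact command we ran: the function name `enable_usergroups` and its path argument are made up for illustration, GNU `sed` is assumed for `-i`, and both options need root.

```shell
#!/usr/bin/env bash
# Option 1: create the missing group by hand (run as root):
#   groupadd aws-kinesis-agent-user
#
# Option 2: turn User Private Groups on before installing the package.
# The path argument is illustrative and defaults to /etc/login.defs.
enable_usergroups() {
    local defs_file="${1:-/etc/login.defs}"
    if grep -Eq '^[[:space:]]*USERGROUPS_ENAB[[:space:]]' "$defs_file"; then
        # Rewrite the existing uncommented setting in place (GNU sed assumed).
        sed -i -E 's/^[[:space:]]*USERGROUPS_ENAB[[:space:]].*/USERGROUPS_ENAB yes/' "$defs_file"
    else
        # No uncommented setting present, so append one.
        printf 'USERGROUPS_ENAB yes\n' >> "$defs_file"
    fi
}
```

In a Dockerfile, the equivalent edit would need to run in a step before the `dnf install` of aws-kinesis-agent, so that the user is created with its own group.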

g-dx commented

SIDE NOTE: Annoyingly, the install command that fails in the output above cannot be found in the shell script in this repo → https://github.com/awslabs/amazon-kinesis-agent/tree/master/bin

Clearly the files here on GitHub do not contain the latest changes from what is being used internally at AWS.