influxdata/influxdata-docker

Telegraf container - logtarget/logfile not working when using input.docker with special user/group ID

funky28 opened this issue · 5 comments

Hi, using the latest Telegraf image (1.27.1) on an Ubuntu 22.04 system with Docker 24.0.2, I can get Telegraf to connect and start using a very simple config file, no issues there. To enable the docker input, as explained here, one needs to pass the host group ID of the Docker socket into the container. With that, Telegraf can report on container stats, no issues either.

--user telegraf:$(stat -c '%g' /var/run/docker.sock)

However, when I pass the host group ID as shown above in my docker run command and set up the Telegraf config to write its logs to a file:

  ## Log target controls the destination for logs and can be one of "file",
  ## "stderr" or, on Windows, "eventlog".  When set to "file", the output file
  ## is determined by the "logfile" setting.
  logtarget = "file"

  ## Name of the file to be logged to when using the "file" logtarget.  If set to
  ## the empty string then logs are written to stderr.
  logfile = "/var/log/telegraf/telegraf.log"

The following error shows up:
2023-06-30T19:07:27Z E! Unable to open /var/log/telegraf/telegraf.log (open /var/log/telegraf/telegraf.log permission denied), using stderr

I tried adding different group IDs with the --group-add option on docker run, to try to match them to known groups and users on my host, and nothing worked.

The only way I could get local file logging to work was by not specifying the group ID needed to read /var/run/docker.sock, reducing my docker run command to just the image and the configuration (folder) mount:

docker run -d --name=telegraf \
      -v /var/run/docker.sock:/var/run/docker.sock \
      -v $PWD/telegraf_container:/etc/telegraf/ \
      telegraf:latest

I tried searching on Google and GitHub, but nothing shows up for this particular problem.

Has anyone had this issue before? Any ideas on how to fix it (possibly without having to create my own Dockerfile/container)?

Thanks!

  -v $PWD/telegraf_container:/etc/telegraf/ \

This is not what I would expect. By default, Telegraf will try reading from /etc/telegraf/telegraf.conf. Are you also setting something to change the folder?

If I run:

docker run -it --rm \
    --user telegraf:$(stat -c '%g' /var/run/docker.sock)  \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /tmp/test/telegraf.conf:/etc/telegraf/telegraf.conf \
    telegraf

Then if I jump into the container:

telegraf@f452c9284eb1:/$ ls -l /var/log/telegraf/
total 4
-rw-r--r-- 1 telegraf 961 564 Jun 30 19:31 telegraf.log

I see the file get created.

The folder and file are owned by the user telegraf and the group telegraf as well.

Hi, thanks for the super quick reply. I did not mount the config file directly, but rather the directory on the host in which I have the config (in my tests, either way works, pointing to the file, or the folder with the file). So, either doing:
--volume $PWD/telegraf_container/telegraf.conf:/etc/telegraf/telegraf.conf \
or
--volume $PWD/telegraf_container:/etc/telegraf/ \
works fine for my setup.

However, I just noticed I left a key piece of information out of my original post (sorry for that, I tried to copy and paste too quickly). The reason I am trying to write the logs to a file is so that I can retrieve them from the host, so I am mapping the /var/log/telegraf folder to a folder on my host to retrieve the log files and inspect them:
--volume $PWD/telegraf_container/log:/var/log/telegraf \

You are correct: the example you provided does create the log file, and I can see it. However, with my desired setup (mapping /var/log/telegraf to the host) it gives the same error:
2023-06-30T19:07:27Z E! Unable to open /var/log/telegraf/telegraf.log (open /var/log/telegraf/telegraf.log permission denied), using stderr
Furthermore, running ls -l on /var/log/telegraf shows nothing, as expected, but running the same command one level up (/var/log) shows:

telegraf@Docker_Telegraf_Latest:/$ ls -l /var/log/         
total 360
-rw-r--r-- 1 root root    326 Jun 13 03:29 alternatives.log
drwxr-xr-x 1 root root   4096 Jun 13 19:52 apt
-rw-rw---- 1 root utmp      0 Jun 12 00:00 btmp
-rw-r--r-- 1 root root  26444 Jun 21 22:20 dpkg.log
-rw-r--r-- 1 root root  32000 Jun 21 22:20 faillog
-rw-rw-r-- 1 root utmp 292000 Jun 21 22:20 lastlog
drwxrwxr-x 2 1000 1000   4096 Jun 30 19:51 telegraf
-rw-rw-r-- 1 root utmp      0 Jun 12 00:00 wtmp

Now, if I re-run the container without mapping the /var/log/telegraf folder to the host and run ls -l:

telegraf@Docker_Telegraf_Latest:/var/log/telegraf$ ls -l /var/log
total 360
-rw-r--r-- 1 root     root        326 Jun 13 03:29 alternatives.log
drwxr-xr-x 1 root     root       4096 Jun 13 19:52 apt
-rw-rw---- 1 root     utmp          0 Jun 12 00:00 btmp
-rw-r--r-- 1 root     root      26444 Jun 21 22:20 dpkg.log
-rw-r--r-- 1 root     root      32000 Jun 21 22:20 faillog
-rw-rw-r-- 1 root     utmp     292000 Jun 21 22:20 lastlog
drwxr-xr-x 1 telegraf telegraf   4096 Jun 30 20:00 telegraf
-rw-rw-r-- 1 root     utmp          0 Jun 12 00:00 wtmp

Now I can see what the issue is: the permissions are not the same!

telegraf@Docker_Telegraf_Latest:/var/log/telegraf$ id -u telegraf
999

So now I see that the problem is not with the config or the location of the log file, but rather that I want to expose /var/log/telegraf to the host and I have a permissions mismatch.

Any thoughts on how to get around that?
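The mismatch described above (host folder owned by 1000:1000, telegraf running as UID 999 inside the container) can be checked from the host before starting the container. A minimal sketch, assuming GNU stat as used earlier in this thread; owner_matches is a hypothetical helper name, not part of Telegraf or Docker:

```shell
# Hypothetical helper: succeed if the directory's numeric owner UID
# equals the UID the container process will run as.
owner_matches() {
    dir=$1
    want_uid=$2
    # stat -c '%u' prints the numeric owner UID (GNU coreutils).
    have_uid=$(stat -c '%u' "$dir") || return 2
    [ "$have_uid" = "$want_uid" ]
}

# Usage on the host; 999 is the UID reported by `id -u telegraf` above:
# owner_matches "$PWD/telegraf_container/log" 999 || echo "owner mismatch"
```

A bind mount shows the container the host's numeric IDs unchanged, so comparing numbers rather than user names is what matters here.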

However, I just noticed I left a key piece of information out of my original post (sorry for that, I tried to copy and paste too quickly). The reason I am trying to write the logs to a file is so that I can retrieve them from the host, so I am mapping the /var/log/telegraf folder to a folder on my host to retrieve the log files and inspect them

Ahhh ok that makes more sense now :)

Now I can see what the issue is: the permissions are not the same!

Right, the UIDs and GIDs are passed in, which is the same reason you have to set the group of the telegraf user to see the Docker socket.

Any thoughts on how to get around that?

Hmm, I am not sure I know enough Docker options. My first thought is to possibly use a custom Dockerfile to match the IDs.

After taking some time away from the computer and thinking more about this, I realized I have encountered this issue in other containers I have created. The solution I found almost everywhere was to match the owner of the host folder to the user passed to the container via the --user flag; in some instances I have passed the group as well.

I realize the issue is that, for the docker input to work, I have to pass the Docker socket GID, and then also my host folder owner's GID. I thought I could do that by adding the --group-add flag and matching the 999 GID that the telegraf user inside the container uses, but that did not work.
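One way to reason about why adding group 999 did not help is to model the standard Unix write check against the listing shown earlier (folder owned by 1000:1000, mode drwxrwxr-x). The sketch below is a simplified model that ignores root, ACLs, and capabilities; can_write is a hypothetical helper, and 998 stands in for the Docker socket GID, which is an assumption since the real value depends on the host:

```shell
# Simplified Unix permission model for writing into a directory.
# Args: proc_uid "proc_gids (space separated)" dir_uid dir_gid dir_mode
can_write() {
    proc_uid=$1; proc_gids=$2; dir_uid=$3; dir_gid=$4
    mode=$((0$5))                      # parse e.g. 775 as octal
    if [ "$proc_uid" = "$dir_uid" ]; then
        bit=$((0200))                  # owner class applies
    else
        bit=$((0002))                  # "other" class, unless a group matches
        for g in $proc_gids; do
            [ "$g" = "$dir_gid" ] && bit=$((0020))
        done
    fi
    [ $((mode & bit)) -ne 0 ]
}

# UID 999 with groups 998 (assumed socket GID) and 999, folder 1000:1000 mode 775:
can_write 999 "998 999" 1000 1000 775 || echo "no write access"
# Same process, but with the folder's group 1000 added instead of 999:
can_write 999 "998 1000" 1000 1000 775 && echo "write access via group"
```

Under this model, passing the host folder's group (1000 in the listing above) to --group-add, rather than 999, might give write access without changing ownership, since the folder is group-writable. That was not tested in this thread.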

So I decided to go the nuclear route: I changed the owner and group of the host folder I want to mount to match the telegraf user inside the container (that is, 999:999). On the host I ran:

sudo chown -R 999:999 "$PWD/telegraf_container/log"

And recreated the container. Now the log file is created inside the container in the correct location (/var/log/telegraf), and I can see it on my host. My user is not the owner, but at least I can open and inspect it.
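For reference, the pieces from this thread (socket GID via --user, config mount, log mount) can be assembled into one command. The sketch below only prints the command instead of running it; telegraf_run_cmd is a hypothetical helper, and it assumes the base path contains no spaces and that the log folder has already been chowned to 999:999 as above:

```shell
# Print (do not execute) a docker run command combining the options
# discussed in this thread.
# Args: base directory on the host, GID that owns /var/run/docker.sock
telegraf_run_cmd() {
    base=$1
    sock_gid=$2
    printf 'docker run -d --name=telegraf --user telegraf:%s' "$sock_gid"
    printf ' -v /var/run/docker.sock:/var/run/docker.sock'
    printf ' -v %s/telegraf.conf:/etc/telegraf/telegraf.conf' "$base"
    printf ' -v %s/log:/var/log/telegraf' "$base"
    printf ' telegraf\n'
}

# Usage on the host:
# telegraf_run_cmd "$PWD/telegraf_container" "$(stat -c '%g' /var/run/docker.sock)"
```

Printing first makes it easy to review the mounts and the resolved group ID before actually creating the container.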

Not the best solution in my opinion, but it works. Hopefully someone else has another perspective, otherwise we can close this one.

Thanks for all of the help!

hopefully someone else has another perspective, otherwise we can close this one.

Thanks for following up with what you ended up doing. Like you said, not the best, but given how UIDs and GIDs are handled between host and container, I'm not aware of other solutions.

I'll close this for now, but if others have better methods or paths, please feel free to add them.