SensorsIot/IOTstack

Telegraf docker error - just me?

Opened this issue · 4 comments

I saw this error in my logs:
2022-11-09T14:31:00Z E! [inputs.docker] Error in plugin: Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?filters=%7B%22status%22%3A%7B%22running%22%3Atrue%7D%7D&limit=0": dial unix /var/run/docker.sock: connect: permission denied

After some investigation I found this:
docker/compose#1532 (comment)

Is this something to do with file permissions, please?

I ended up with the compose definition below, which seems to work. The user clause is the pertinent part.

  telegraf:
    container_name: telegraf
    build: ./.templates/telegraf/.
    image: telegraf:latest
    restart: unless-stopped
    user: telegraf:998
    environment:
    - TZ=Etc/UTC
    - HOST_ETC=/hostfs/etc
    - HOST_PROC=/hostfs/proc
    - HOST_SYS=/hostfs/sys
    - HOST_VAR=/hostfs/var
    - HOST_RUN=/hostfs/run
    - HOST_MOUNT_PREFIX=/hostfs
    ports:
    - "8092:8092/udp"
    - "8094:8094/tcp"
    - "8125:8125/udp"
    volumes:
    - ./volumes/telegraf:/etc/telegraf
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - /:/hostfs:ro
    depends_on:
    - influxdb

Well, here's my service definition:

  telegraf:
    container_name: telegraf
    build: ./.templates/telegraf/.
    hostname: iotstack
    restart: unless-stopped
    environment:
      - TZ=Australia/Sydney
    ports:
      - "8092:8092/udp"
      - "8094:8094/tcp"
      - "8125:8125/udp"
    volumes:
      - ./volumes/telegraf:/etc/telegraf
      - /var/run/docker.sock:/var/run/docker.sock:ro
    depends_on:
      - influxdb
      - mosquitto

I just did a clean-slate install:

  • terminate the container
  • remove the image
  • erase the persistent store
  • up the container (force a rebuild and complete re-initialisation)
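
For anyone wanting to repeat those four steps, here is a rough sketch rather than a record of the exact commands I typed. It only prints the commands (a dry run) so nothing gets deleted by accident. The function name clean_slate_cmds and the image name iotstack_telegraf are assumptions; check `docker images` for the real image name on your system, and run the printed commands from the directory containing docker-compose.yml.

```shell
# Hypothetical helper: print (not run) the commands for a clean-slate
# rebuild of one service. $1 = service name, $2 = image name (assumed).
clean_slate_cmds() {
    service="$1"
    image="$2"
    echo "docker-compose rm --force --stop -v $service"   # terminate and remove the container
    echo "docker rmi $image"                              # remove the image
    echo "sudo rm -rf ./volumes/$service"                 # erase the persistent store
    echo "docker-compose up -d --build $service"          # rebuild and re-initialise
}

clean_slate_cmds telegraf iotstack_telegraf
```

Printing first and pasting afterwards is deliberate: the third command is destructive and is worth reading before it runs.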

The result in the log:

$ docker logs telegraf
2022-11-09T23:23:39Z I! Using config file: /etc/telegraf/telegraf.conf
2022-11-09T23:23:39Z I! Starting Telegraf 1.24.3
2022-11-09T23:23:39Z I! Available plugins: 221 inputs, 9 aggregators, 26 processors, 20 parsers, 57 outputs
2022-11-09T23:23:39Z I! Loaded inputs: cpu disk diskio docker file kernel mem processes swap system
2022-11-09T23:23:39Z I! Loaded aggregators: 
2022-11-09T23:23:39Z I! Loaded processors: 
2022-11-09T23:23:39Z I! Loaded outputs: influxdb
2022-11-09T23:23:39Z I! Tags enabled: host=iotstack
2022-11-09T23:23:39Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"iotstack", Flush Interval:10s

Metrics are also turning up in InfluxDB so that's working too.

I'm not saying this is the answer, but problems involving docker.sock usually turn out to be caused by an incomplete installation of Docker in the Raspbian environment and, in particular, by not having done:

$ sudo usermod -G docker -a $USER
$ sudo usermod -G bluetooth -a $USER
$ sudo reboot

You can also get away with just a logout and login rather than the reboot.
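
To see whether the usermod step is needed, or has been recorded, a membership check along these lines might help. in_group is a hypothetical helper name, not part of Docker or IOTstack. It relies on `id -nG USER`, which looks the user up in the group database, so it reflects a usermod change straight away; your current login session still only picks the change up after the logout/login or reboot.

```shell
# Hypothetical helper: succeed if user $1 is listed as a member of group $2.
in_group() {
    id -nG "$1" | tr ' ' '\n' | grep -qx "$2"
}

if in_group "$(id -un)" docker; then
    echo "already in group docker"
else
    echo "not in group docker: run the usermod command, then log out and in"
fi
```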

To the best of my recollection, the current user not being a member of group docker is something that shows up when you run docker-compose, rather than as a container that seems to come up OK but complains internally. You didn't mention anything like that happening so that's quite puzzling.

The mechanism by which the current user gains access to docker.sock by being a member of the docker group is (to my eye) fairly straightforward:

$ ls -al /var/run/docker.sock
srw-rw---- 1 root docker 0 Nov  3 12:09 /var/run/docker.sock

Members of group docker get rw access.

How your user statement solves the problem is something I can't explain. On my system, there is no telegraf in /etc/passwd and group 998 is i2c which pi (my $USER) is a member of. As I read the permissions above, you either need to be root or a member of docker:

$ grep docker /etc/group
docker:x:995:pi

So, on my system at least, pi is the only member of docker.

Beats me!

Anyway, perhaps try the usermod commands, comment-out the user clause, and see what happens.

Just as an experiment, I removed the current user from the docker group, logged out/in to let it take effect, and tried to bring up telegraf:

$ docker-compose up -d telegraf
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/json?all=1&filters=%7B%22label%22%3A%7B%22com.docker.compose.project%3Diotstack%22%3Atrue%7D%7D": dial unix /var/run/docker.sock: connect: permission denied

That's the same error but it lacks the "2022-11-09T14:31:00Z E! [inputs.docker] Error in plugin: Got " preamble which shows yours is coming from docker logs telegraf.
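
That distinction can be checked mechanically. The sketch below is only an illustration: error_source is a made-up helper name, and it assumes that errors raised by the Telegraf docker input always carry the "E! [inputs.docker]" marker seen in your log line.

```shell
# Hypothetical helper: guess where a "permission denied" line came from.
error_source() {
    case "$1" in
        *'E! [inputs.docker]'*) echo "telegraf container log" ;;
        *'permission denied'*)  echo "docker client on the host" ;;
        *)                      echo "unknown" ;;
    esac
}

error_source '2022-11-09T14:31:00Z E! [inputs.docker] Error in plugin: Got permission denied while trying to connect'
error_source 'permission denied while trying to connect to the Docker daemon socket'
```

The Telegraf pattern is tested first because a line from the container log contains both markers.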

Of course, not being a member of docker pretty much prevents anything from working unless I use sudo.

When I try:

$ sudo docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES

my container hasn't even come up. I suppose I could try to follow-through using sudo on everything but I don't want to do that because it just creates myriad other problems which I'll then have to unpick.

So, have you also been using sudo to run docker and docker-compose commands? That should never be needed. Maybe read this for some context.

If this combination (not being in docker group and using sudo to run docker commands) does actually turn out to explain your problem, please let me know. I'll add some extra words to that doco page to emphasise that needing to use sudo in that situation indicates a deeper problem.

I still can't explain how user: telegraf:998 solves this…
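
One thing that might be worth checking, and this is a guess rather than something confirmed in this thread: group numbers are assigned per host, so 998 may well be the docker group on your system even though it is i2c on mine. A bind-mounted socket keeps the host's numeric owner and group inside the container, the user name in a compose user clause is resolved inside the container rather than against the host's /etc/passwd, and the part after the colon sets the group the process runs as. A matching number would therefore give rw access. The helper names below are hypothetical.

```shell
# Hypothetical helper: print the numeric group ID that owns path $1.
# Uses GNU stat (as found on Raspberry Pi OS); BSD/macOS stat differs.
owner_gid() {
    stat -c '%g' "$1"
}

# Hypothetical helper: succeed if path $1 is owned by numeric group $2.
gid_matches() {
    [ "$(owner_gid "$1")" = "$2" ]
}

sock=/var/run/docker.sock
if [ -S "$sock" ]; then
    if gid_matches "$sock" 998; then
        echo "group 998 owns $sock"
    else
        echo "$sock is owned by group $(owner_gid "$sock"), not 998"
    fi
fi
```

If the first message is printed on your system, that would be consistent with the user clause working there; it would not prove it.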