Can't start Grafana on Kubernetes 1.7.14, 1.8.9, or 1.9.4
dghubble opened this issue · 27 comments
Kubernetes just shipped patch releases that enforce read-only mounts for volumes sourced from ConfigMaps (kubernetes/kubernetes#58720); deplorable that this happened in a patch release. Grafana attempts to chown its data directory, which might ordinarily be fine, but users are supposed to mount dashboard configs in there too. As a result, on these Kubernetes clusters, Grafana can't start:
chown: changing ownership of '/var/lib/grafana/dashboards/kubernetes-resource-requests-dashboard.json': Read-only file system
...
Hi @dghubble,
Thanks for reporting this issue. We have been planning on reworking the grafana docker image to not chown the directories as well as support configuring what user to run as. Hoping to get started on it soon.
We hit this issue a while ago and are building our own image. I realize the way we build it might not work for everyone, but I think it would take little work to get it there.
This is where we build our image from: https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus/grafana-image
@bergquist we've talked about this one before, let's make it happen and get this container not to do this.
For the time being, I rolled Grafana back to v4.6.3 since the set of mounts was different and this issue doesn't occur on Kubernetes v1.9.4. So that was the temporary fix. I'd like to stay on the official grafana image and get to v5.x.y whenever this is resolved.
Ran into the same problem. Worked around it by setting the Grafana container command to:
["gosu", "grafana", "/usr/sbin/grafana-server", "--homepath=/usr/share/grafana", "--config=/etc/grafana/grafana.ini", "cfg:default.log.mode=console", "cfg:default.paths.data=/var/lib/grafana", "cfg:default.paths.logs=/var/log/grafana", "cfg:default.paths.plugins=/var/lib/grafana/plugins", "cfg:default.paths.provisioning=/etc/grafana/provisioning"]
(taken from the run.sh script).
If you need a working run.sh, you can use or copy this gist:
https://gist.github.com/kavehmz/61419af3ddc685b18553c05299d78c9d
(Only if the simpler command that @wieslaw-gat mentioned didn't work for you.)
I'm working on a new image to solve the issues mentioned here as well as some others. The work is happening on this branch: https://github.com/grafana/grafana-docker/tree/image-improvements
@brancz Thanks for sharing your image. I'm using it as a base for re-doing the default image. Regarding the user to run Grafana as, do you feel it's better to use the nobody user or to create a grafana user with a high (and pinned) id instead? I have yet to try using the slim debian image as a base. Have you had any issues with it?
@wieslaw-gat, @kavehmz: thanks for sharing your workarounds.
I've just merged PR #142, which should act as a temporary fix for the chown issues while we continue working on the new image (chown errors are ignored). I don't have a Kubernetes cluster set up, so I would love to hear if this solves your issues. There is currently no published image on Docker Hub with this fix, but the next build of master should include it in grafana/grafana:master
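For readers wondering what "chown errors are ignored" can look like in a startup script, here is a minimal sketch. It is not the actual change from PR #142; `chown_tolerant` is a hypothetical helper written for illustration, and the owner and path in the usage comment are assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not the code from PR #142): try to chown a directory
# tree, but never abort startup if some files cannot be changed, for example
# because they sit on a read-only ConfigMap mount.
chown_tolerant() {
    owner="$1"
    dir="$2"
    if ! chown -R "$owner" "$dir" 2>/dev/null; then
        echo "warning: could not chown some files under $dir; continuing" >&2
    fi
    # Always report success so the caller keeps going.
    return 0
}

# Example usage (owner and path are assumptions):
#   chown_tolerant "grafana:grafana" "/var/lib/grafana"
```

The trade-off is that a real permission problem on a writable path would also be reduced to a warning, which is presumably why this is described as a temporary fix.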
@xlson - using grafana/grafana:master resolved the chown issues in my k8s cluster.
Thanks!
We've just released Grafana 5.0.4 with the fix; it's available from Docker Hub (grafana/grafana:5.0.4). In 5.1 we will remove chown completely.
Thanks!
@siwyd We haven't planned for it but if it's requested we might do it. I presume your team haven't upgraded to 5 yet then?
@xlson No, but it's good to be pushed to do exactly that ;) Thanks for the consideration, but no need to on our account.
@xlson, have you by chance changed your plans and decided to back-port the fix to 4.x? thank you!
@zanitete Not yet, no one else has requested it. Are you stuck not being able to update?
If we were to do that we would definitely want to use semver build metadata https://semver.org/#spec-item-10 to avoid modifying old tags in ways that might break existing deployments
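As a rough illustration of why build metadata fits here: under the semver spec linked above, anything after `+` is ignored when determining precedence, so a rebuilt container would still count as the same Grafana version. A minimal shell sketch, where `strip_build_metadata` and the `4.6.3+1` version string are hypothetical examples and not a naming scheme anyone in this thread has committed to:

```shell
#!/bin/sh
# Hypothetical helper: drop semver build metadata (everything from the first
# "+" onward), leaving the part that determines version precedence.
strip_build_metadata() {
    printf '%s\n' "${1%%+*}"
}

rebuilt="4.6.3+1"   # hypothetical rebuilt container of the same release
original="4.6.3"

if [ "$(strip_build_metadata "$rebuilt")" = "$(strip_build_metadata "$original")" ]; then
    echo "same precedence"
fi
```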
@xlson, let's say the update was not planned in the short term, but if nobody else requested it I can understand.
Thanks for the quick feedback! For the moment I built a custom image for 4.3.6 using the latest version of the Dockerfile, and it seems to work fine. Here are the small changes I made to the build script; if interested, I can create a PR. zanitete@70026bd
@zanitete interesting. Could you describe exactly what your use case is? I presumed that you wanted a Docker build of 4.x grafana with the container from 5.0.4 or 5.1. But it seems like what you want is the 5.1 container with the ability to choose id/gid of Grafana at build time?
What I needed was a Docker image with Grafana 4.3.6 that would include the fix for the failing chown command at startup, so that we could migrate to k8s 1.9 without having to migrate (now) to Grafana 5.1
@zanitete okay. How do the changes you have made in that commit play into that issue?
Because the volumes attached to existing Grafana 4.x deployments expect the grafana user/group id to be 104/107, I needed to override the build args.
The changes to the build script are not strictly needed (I could have built the image without them), but if you want to use it to build a 4.x backward compatible version of the image you need to override the default UID/GID, no?
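To make the override concrete, here is a sketch of what such a build invocation might look like. Only the 104/107 ids come from the discussion above; the build-arg names `GF_UID` and `GF_GID` and the image tag are assumptions for illustration, so check the Dockerfile you are building from for the actual names. How the Grafana version itself is selected is left out because it depends on the build script.

```shell
#!/bin/sh
# Sketch only: GF_UID/GF_GID are assumed build-arg names and the tag is a
# made-up example. The ids match what existing 4.x volumes expect.
GRAFANA_UID=104
GRAFANA_GID=107
IMAGE_TAG="example/grafana:4.3.6-custom"

build_cmd="docker build --build-arg GF_UID=${GRAFANA_UID} --build-arg GF_GID=${GRAFANA_GID} -t ${IMAGE_TAG} ."

# Print the command instead of running it, so it can be reviewed first.
echo "$build_cmd"
```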
@zanitete you're quite right. That's not something we want to break when fixing the chown problem. Please send in the PR :)
I see. I had the same doubt, but I tested the image built with the Dockerfile in master and it seems to work fine (at least for our limited use cases). Would you recommend checking out https://github.com/grafana/grafana-docker/tree/df72f7243afda7de0fc30d0d10dc00243e152706 and building it from there instead? In that case my PR is not relevant, since the build script in master would be used only to build 5.1 images, right?