Yolean/kubernetes-kafka

mkdir: cannot create directory '/opt/kafka/bin/../logs': Permission denied

novakov-alexey-zz opened this issue · 4 comments

Environment: AWS, OpenShift

What I did:

kubectl apply -k github.com/Yolean/kubernetes-kafka/variants/dev-small/?ref=v6.0.3

role.rbac.authorization.k8s.io/pod-labler unchanged
clusterrole.rbac.authorization.k8s.io/node-reader unchanged
rolebinding.rbac.authorization.k8s.io/kafka-pod-labler unchanged
clusterrolebinding.rbac.authorization.k8s.io/kafka-node-reader unchanged
configmap/broker-config created
configmap/zookeeper-config created
service/bootstrap created
service/broker created
service/pzoo created
service/zookeeper created
service/zoo created
statefulset.apps/kafka created
statefulset.apps/pzoo created
statefulset.apps/zoo created

Result:
Both the ZooKeeper and Kafka pods fail with this error and crash in a loop:

mkdir: cannot create directory '/opt/kafka/bin/../logs': Permission denied
Invalid -Xlog option '-Xlog:gc*:file=/opt/kafka/bin/../logs/zookeeper-gc.log:time,tags:filecount=10,filesize=102400', see error log for details.
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
[0.001s][error][logging] Error opening log file '/opt/kafka/bin/../logs/zookeeper-gc.log': No such file or directory
[0.001s][error][logging] Initialization of output 'file=/opt/kafka/bin/../logs/zookeeper-gc.log' using options 'filecount=10,filesize=102400' failed.

This is odd. Are you sure the Kafka pod has the same error? The specific filesystem path is quite interesting.

There can't be any volume mounts at /opt/kafka, because the entire Kafka distribution lives there. I therefore suspect your container is running as non-root. On a fresh local cluster I get:

$ kubectl -n kafka exec pzoo-0 -- ls -l /opt/kafka/logs
total 4
-rw-r--r-- 1 root root 1338 Jul 15 07:40 zookeeper-gc.log
$ kubectl -n kafka exec pzoo-0 -- id
uid=0(root) gid=0(root) groups=0(root)

This project has always run its containers as root. Running as non-root would be preferable, but that takes some work and would probably break existing installs on upgrade. I'm open to such PRs, however, and we'll figure out a migration path.
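
The non-root suspicion can be checked directly from a shell running in the same image under the same user. The following is a minimal sketch, not part of this project: `can_create_log_dir` is a hypothetical helper name, and it assumes a POSIX shell is available in the image. It attempts the same `mkdir` that fails in the error output and reports whether the directory ends up writable.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubernetes-kafka): report whether the
# current user can create and write to the given log directory.
can_create_log_dir() {
  dir="$1"
  # mkdir -p succeeds if the directory already exists or can be created.
  if mkdir -p "$dir" 2>/dev/null && [ -w "$dir" ]; then
    echo "writable"
  else
    echo "not writable"
  fi
}
```

Running `id -u` followed by `can_create_log_dir /opt/kafka/logs` would then show the effective user ID and whether that user can create the directory named in the error. Note that a successful check leaves the directory in place.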

It seems this is caused by my OpenShift security context configuration, which forces all containers to run under non-root user IDs. I believe it would work on a fresh cluster as you described.

One workaround I applied was to create the logs folder in the Dockerfile; after that, it starts.

Fosol commented

@novakov-alexey Can you share your solution? I'm running into this issue at the moment. I'm also using OpenShift, with the latest Confluent image, confluentinc/cp-zookeeper.

[0.001s][error][logging] Error opening log file '/var/log/kafka/zookeeper-gc.log': Permission denied

@Fosol that was 3 years ago, a very long time already :-) As far as I remember, I added mkdir <log-dir> to the Dockerfile for Kafka/ZooKeeper so that the directories are pre-created before any logs are written at runtime. For example:

....
RUN mkdir -p /opt/kafka/logs
....
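
On OpenShift, pre-creating the directory may not be enough on its own: a directory created as root at build time is not writable by the arbitrary non-root UID the container runs under. By default that UID is a member of the root group (GID 0), so a common approach is to give the group the same permissions as the owner. Below is a minimal sketch of the commands such a build step could run, assuming the build runs as root so the new directory's group is already 0. `prepare_log_dir` is a hypothetical name, and this has not been tested against this project's images or the Confluent ones.

```shell
#!/bin/sh
# Hypothetical helper: create a log directory and make it group-writable.
# Assumption: run as root at image build time, so the directory's group is 0.
prepare_log_dir() {
  dir="$1"
  mkdir -p "$dir"
  # g=u copies the owner's permission bits to the group.
  chmod -R g=u "$dir"
}
```

In a Dockerfile the equivalent would be a single step such as `RUN mkdir -p /opt/kafka/logs && chmod -R g=u /opt/kafka/logs`, with the path adjusted to whichever log directory the image uses.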