Permission error while accessing /tmp/datahub
upendrao opened this issue · 6 comments
Problem description:
The datahub-actions component fails to process incoming ingestion requests because it cannot create the ingestion recipe YAML file in the /tmp/datahub/ingest folder.
Environment:
- datahub helm chart v0.2.144
- Datahub 0.9.6.1
- datahub-actions v0.0.8
- Kubernetes v1.26.0
How to reproduce?
Run datahub-actions in k8s v1.26.0 with the following securityContext configuration:
acryl-datahub-actions:
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      drop:
        - ALL
    runAsNonRoot: true
    runAsUser: 1000
    seccompProfile:
      type: "RuntimeDefault"
Observations:
- The pod security context enforced in k8s v1.26.0 requires all pods to run with runAsNonRoot: true, which implies that one has to specify a user ID in securityContext.
- The /tmp/datahub folder is created and owned by a system user 'datahub', as per the Dockerfile.
- The datahub-actions startup script is started by the user with ID 1000, which overrides the intended USER datahub specified in the Dockerfile.
- So none of the datahub-actions scripts can access the /tmp/datahub folder.
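To illustrate the last observation, here is a minimal sketch of the classic POSIX owner/group/other check for creating a file in a directory. It is a simplified model, not the kernel's full logic (root, capabilities, and ACLs are ignored). The directory mode 0o755 and the group set of the UID 1000 process are assumptions, since neither is stated in this issue; the helper name can_create_in_dir is hypothetical. UID 102 and GID 103 for datahub are taken from the `id` output further down this thread.

```python
import stat


def can_create_in_dir(mode: int, owner_uid: int, owner_gid: int,
                      uid: int, gids: set) -> bool:
    """Return True if (uid, gids) may create a file in a directory.

    Creating an entry needs both write and search (execute) permission
    on the directory. Exactly one permission class applies: owner,
    else group, else other.
    """
    if uid == owner_uid:
        need = stat.S_IWUSR | stat.S_IXUSR
    elif owner_gid in gids:
        need = stat.S_IWGRP | stat.S_IXGRP
    else:
        need = stat.S_IWOTH | stat.S_IXOTH
    return (mode & need) == need


# Assumed mode 0o755, owned by datahub (uid 102, gid 103).
# The process runs as uid 1000; its group set {1000} is an assumption.
blocked = can_create_in_dir(0o755, 102, 103, uid=1000, gids={1000})
as_owner = can_create_in_dir(0o755, 102, 103, uid=102, gids={103})
```

Under these assumptions the UID 1000 process falls into the "other" class, which has no write bit, so the recipe file cannot be created, while the datahub user itself can write.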
Questions?
- What was the need for a special datahub system user?
- Why do you need to protect /tmp/datahub for the datahub user?
We store logs and venv setups in /tmp/datahub during ingestion execution.
We don't have any specific requirements around the datahub system user vs any other setup, but generally wanted to run the processes as non-root and still have that non-root user be able to access the necessary locations on disk.
Would it be possible for you to run as the user id of datahub?
It is not possible to specify a non-numeric user ID according to the securityContext spec.
Here is the error when no user ID is provided to the container:
Warning Failed 2s (x2 over 3s) kubelet Error: container has runAsNonRoot and image has non-numeric user (datahub), cannot verify user is non-root (pod: "datahub-acryl-datahub-actions-7b48fdf684-6bzc9_datahub(d73d1648-99d5-453f-a244-91ac1520db36)", container: acryl-datahub-actions)
And setting runAsNonRoot: true is strongly recommended, which implies that users are forced to provide a numeric user ID that may not match what the datahub user resolves to in the container.
$ id 100
uid=100(_apt) gid=65534(nogroup) groups=65534(nogroup)
$ id 101
uid=101(messagebus) gid=102(messagebus) groups=102(messagebus)
$ id 102
uid=102(datahub) gid=103(datahub) groups=103(datahub)
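The kubelet error quoted above can be modeled roughly as follows. This is a simplified sketch of the decision, not kubelet's actual code: the function name verify_run_as_non_root and its parameters are hypothetical, the image user is assumed to be a non-empty USER value from the image, and only the "non-numeric user" message is copied from the log in this thread (the other messages are illustrative).

```python
from typing import Optional


def verify_run_as_non_root(run_as_non_root: bool,
                           run_as_user: Optional[int],
                           image_user: str) -> Optional[str]:
    """Return an error string if the non-root policy cannot be satisfied.

    image_user is assumed to be the non-empty USER value of the image,
    either a numeric UID ("102") or a name ("datahub").
    """
    if not run_as_non_root:
        return None  # policy not requested, nothing to verify
    if run_as_user is not None:
        # An explicit runAsUser overrides the image's USER.
        if run_as_user == 0:
            return "runAsUser 0 conflicts with runAsNonRoot"  # illustrative text
        return None
    if image_user.isdigit():
        if int(image_user) == 0:
            return "image will run as root"  # illustrative text
        return None
    # A user name cannot be checked without resolving it inside the image.
    return (f"container has runAsNonRoot and image has non-numeric user "
            f"({image_user}), cannot verify user is non-root")


no_uid = verify_run_as_non_root(True, None, "datahub")
with_uid = verify_run_as_non_root(True, 1000, "datahub")
```

In this model, leaving runAsUser unset with USER datahub yields the error seen in the log, while any non-zero numeric runAsUser passes the check regardless of which account (if any) that UID maps to inside the image.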
Following your explanation, I see that there is no need to restrict access to the /tmp/datahub folder to the datahub user.
Allowing everyone to read/write to this location would resolve this issue.
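As a rough sketch of what "allowing everyone to read/write" could look like, the snippet below creates a directory and gives it the same mode that /tmp itself typically has (0o1777: world-writable plus the sticky bit, so users can only delete their own files). This is an illustration, not the project's actual fix; the helper name make_world_writable_dir is hypothetical, and in the image the equivalent step would presumably happen at build time.

```python
import os
import stat
import tempfile


def make_world_writable_dir(path: str) -> int:
    """Create path (if missing) and open it to all users, like /tmp.

    os.makedirs applies the process umask, so the mode is set
    explicitly afterwards with os.chmod. Returns the resulting
    permission bits of the leaf directory.
    """
    os.makedirs(path, exist_ok=True)
    os.chmod(path, 0o1777)
    return stat.S_IMODE(os.stat(path).st_mode)


# Demonstrated in a scratch location rather than the real /tmp/datahub.
with tempfile.TemporaryDirectory() as scratch:
    resulting_mode = make_world_writable_dir(
        os.path.join(scratch, "datahub", "ingest"))
```

With such a mode, any numeric runAsUser could write to the folder, so the setup would no longer depend on the UID that the datahub account happens to receive in a given image build.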
Given that datahub is uid 102, would it be possible to set runAsUser: 102?
Assuming the datahub user ID is just a workaround that I tested.
But I don't think that is a solution, as we cannot rely upon it: it was 100 in a previous release and is now 102.
This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io
This issue was closed because it has been inactive for 30 days since being marked as stale.