Bad permissions on a trusted container, but correct permissions locally
florentx opened this issue · 16 comments
I've hit a bug where the Unix permissions are wrongly set when the image is built automatically (trusted build) but when I run the build locally (using the same Dockerfile) everything is fine.
How to reproduce:
$ docker --version
Docker version 0.11.1, build fb99f99
$ docker run -it tinyerp/sandbox-postgresql
root@4871d1b57a8e:/# service postgresql start
* Starting PostgreSQL 9.3 database server [ OK ]
root@4871d1b57a8e:/# sudo -u postgres createdb testdb
createdb: database creation failed: ERROR: could not create directory "base/16385": Permission denied
root@4871d1b57a8e:/# ls -l /var/lib/postgresql/9.3/main/
total 64
-rw------- 1 postgres postgres 4 May 18 16:16 PG_VERSION
drwxr-xr-x 8 root root 4096 May 18 16:20 base
drwx------ 2 postgres postgres 4096 May 18 16:20 global
drwx------ 2 postgres postgres 4096 May 18 16:20 pg_clog
drwxr-xr-x 6 root root 4096 May 18 16:20 pg_multixact
drwx------ 2 postgres postgres 4096 May 18 16:20 pg_notify
drwx------ 2 postgres postgres 4096 May 18 16:16 pg_serial
drwx------ 2 postgres postgres 4096 May 18 16:16 pg_snapshots
drwx------ 2 postgres postgres 4096 May 18 16:20 pg_stat
drwx------ 2 postgres postgres 4096 May 18 16:26 pg_stat_tmp
drwx------ 2 postgres postgres 4096 May 18 16:20 pg_subtrans
drwx------ 2 postgres postgres 4096 May 18 16:16 pg_tblspc
drwx------ 2 postgres postgres 4096 May 18 16:16 pg_twophase
drwx------ 3 postgres postgres 4096 May 18 16:20 pg_xlog
-rw------- 1 postgres postgres 133 May 18 16:20 postmaster.opts
-rw------- 1 postgres postgres 98 May 18 16:20 postmaster.pid
root@4871d1b57a8e:/#
The directories base and pg_multixact are wrongly owned by root when they should be owned by postgres.
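A quick way to spot every affected path, rather than reading the listing by eye, might look like the sketch below. The helper name not_owned_by is hypothetical and not part of the original report; it relies only on the standard find test -user.

```shell
#!/bin/sh
# Hypothetical helper: list every path under DIR that is NOT owned by USER.
# Usage: not_owned_by USER DIR   (USER may be a name or a numeric uid)
not_owned_by() {
    find "$2" ! -user "$1" -print
}

# Inside the affected container one might run (paths taken from the report):
#   not_owned_by postgres /var/lib/postgresql/9.3/main
```

An empty result would mean the whole data directory belongs to the expected user; in the image described above, base and pg_multixact would presumably show up.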
This is the Dockerfile published (https://index.docker.io/u/tinyerp/sandbox-postgresql/):
# DOCKER-VERSION 0.11.1
FROM ubuntu:14.04
RUN mv /usr/bin/ischroot /usr/bin/chroot.orig \
&& ln -s /bin/true /usr/bin/ischroot \
&& export DEBIAN_FRONTEND=noninteractive LANG && apt-get update \
&& apt-get install -y --no-install-recommends language-pack-en \
&& update-locale LANG=en_US.UTF-8 && . /etc/default/locale \
&& apt-get install -y postgresql-9.3
RUN pg_ctlcluster 9.3 main start && pg_ctlcluster 9.3 main stop
CMD ["/bin/bash", "--login"]
The last RUN pg_ctlcluster instruction starts and stops the PostgreSQL server in order to create a DB user. (I removed the createuser -d openerp while troubleshooting the issue.)
I'm puzzled why it builds without this error locally, but the image is built wrong on the public registry.
According to Docker support, it could be related to issue #4068, which is fixed in 0.10.
However the trusted builders run an outdated version of Docker.
The team plans two improvements:
- upgrade the trusted builders to the latest Docker version
- print the version of Docker used to create each trusted build
After 6 weeks and many exchanges with support@docker.com, I can only confirm that the issue still exists with 1.0 (on trusted builds only): I retried this morning.
@florentx I've pulled your image and it looks like this problem has been solved.
Can you confirm, please?
I'm having the same issue on didrocks/docker-udtc; see the Dockerfile linked on GitHub. I have no issue on a local build, but with the automated build I can't even su - user (bad permissions on /lib, /usr and multiple subdirectories):
ls -l /
total 68
drwxr-xr-x 2 root root 4096 Jul 23 08:18 bin
drwxr-xr-x 2 root root 4096 Jul 23 08:16 boot
drwxr-xr-x 4 root root 4096 Jul 23 08:52 dev
drwxr-xr-x 197 root root 4096 Jul 23 08:52 etc
drwxr-xr-x 3 root root 4096 Jul 23 08:28 home
d--x--x--- 29 root root 4096 Jul 23 08:52 lib
drwxr-xr-x 2 root root 4096 Jul 17 03:34 lib64
drwxr-xr-x 2 root root 4096 Jul 17 03:34 media
drwxr-xr-x 2 root root 4096 Apr 10 22:12 mnt
drwxr-xr-x 2 root root 4096 Jul 17 03:34 opt
dr-xr-xr-x 327 root root 0 Jul 23 08:52 proc
drwx------ 2 root root 4096 Jul 17 03:38 root
drwxr-xr-x 12 root root 4096 Jul 23 08:26 run
drwxr-xr-x 2 root root 4096 Jul 23 08:18 sbin
drwxr-xr-x 2 root root 4096 Jul 17 03:34 srv
dr-xr-xr-x 13 root root 0 Jul 23 08:52 sys
drwxrwxrwt 2 root root 4096 Jul 23 08:26 tmp
d--x--x--- 35 root root 4096 Jul 23 08:52 usr
d--x--x--- 39 root root 4096 Jul 23 08:52 var
Dockerfile at: ubuntu/ubuntu-make@f7f7a4d
I believe I am also seeing this same issue. My Dockerfile creates a user using useradd and then switches to that user to install a few things in the home directory.
When I build the image locally, everything in /home/build is owned by the build user. The image pulled from the automated build, however, has incorrect permissions on the /home/build and /home/build/.composer directories, although the permissions are correct on the bash files that useradd created from /etc/skel.
@unclejack I've pulled the image, and I still see the same issue described in #5892 (comment)
Hello, amazing: I was testing another Postgres image yesterday and ran into the exact same issue, and Google brought me to #5892, so yes, I think there is a Docker issue.
I also have a permission issue with automated build and docker, and wonder if this could be related or not.
I'm building a Go image based on Fedora:20 that works perfectly locally (Docker 1.1) but suddenly fails during automated build with a "Permission denied" error:
---> d3336af7fc3f
Removing intermediate container 8e1511e200b3
Step 13 : RUN /bin/go-build
---> Running in 1c98a80d30a6
/bin/sh: /bin/go-build: Permission denied
The command [/bin/sh -c /bin/go-build] returned a non-zero code: 126
There's no manual change of ownership for any file or folder that I would know of (maybe during building Go itself).
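Exit status 126 is the shell's conventional code for "command found but could not be executed", which usually points at a missing execute bit (or an inaccessible parent directory) rather than a missing file. A minimal sketch for confirming that inside a container follows; the helper name explain_exec_failure is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run a command path through "sh -c" and, when the
# shell reports 126, print the file's mode so a missing x bit is visible.
explain_exec_failure() {
    sh -c "$1" 2>/dev/null
    rc=$?
    if [ "$rc" -eq 126 ]; then
        echo "126: found but not executable: $1"
        ls -ld "$1"
    fi
    return "$rc"
}

# In the failing build step one might try:
#   explain_exec_failure /bin/go-build
```

If the reported mode lacks x, or a parent directory shows the d--x--x--- mask seen elsewhere in this thread, that would be consistent with the same permissions bug.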
I am having this exact same problem now with the liftoff/gateone image. Specifically, build bkuy8gznu3xbuq3gqmkiguk. When trying to run Gate One like so:
docker run -t --name=gateone -p 10443:8000 liftoff/gateone
...I get the following exception:
[E 140822 00:13:49 websocket:372] Uncaught exception in /ws
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/tornado/websocket.py", line 369, in _run_callback
callback(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/gateone-1.2.0-py2.7.egg/gateone/core/server.py", line 1924, in on_message
self.actions[key](value)
File "/usr/local/lib/python2.7/dist-packages/gateone-1.2.0-py2.7.egg/gateone/core/server.py", line 2772, in get_theme
for app in os.listdir(applications_dir):
OSError: [Errno 13] Permission denied: '/usr/local/lib/python2.7/dist-packages/gateone-1.2.0-py2.7.egg/gateone/applications'
I had a look inside the container using nsenter and can see the problem is with the directory permissions inside the gateone module:
root@9f08da6bbf22:/usr/local/lib/python2.7/dist-packages/gateone-1.2.0-py2.7.egg# ls -l
total 20
drwxr-xr-x 2 root root 4096 Aug 21 23:21 EGG-INFO
d--x--x--- 14 root root 4096 Aug 21 23:30 gateone
drwxr-xr-x 2 root root 4096 Aug 21 23:21 onoff
drwxr-xr-x 2 root root 4096 Aug 21 23:21 terminal
drwxr-xr-x 2 root root 4096 Aug 21 23:21 termio
That's the same permissions mask (d--x--x---) reported by @didrocks (except in his case it was /usr and /var, which is much worse). This problem does not occur when I build the Dockerfile manually.
For reference, I tried setting umask 022 at the beginning of my RUN statement but that didn't do anything. The permissions problem starts at /usr and /lib and goes all the way through /etc, /usr and /var:
root@27076caf92b9:/# ls -l | grep d--x--x---
d--x--x--- 109 root root 4096 Aug 22 01:02 etc
d--x--x--- 6 root root 4096 Aug 22 01:00 gateone
d--x--x--- 27 root root 4096 Aug 22 01:00 usr
d--x--x--- 24 root root 4096 Aug 22 00:59 var
root@27076caf92b9:/# ls -lR /usr | grep d--x--x---
d--x--x--- 44 root root 4096 Aug 22 01:00 lib
d--x--x--- 17 root root 4096 Aug 22 01:00 local
d--x--x--- 4 root root 4096 Aug 22 00:59 gcc
d--x--x--- 5 root root 4096 Aug 22 00:59 x86_64-linux-gnu
d--x--x--- 7 root root 4096 Aug 22 01:00 lib
d--x--x--- 7 root root 4096 Aug 22 01:00 python2.7
d--x--x--- 20 root root 4096 Aug 22 01:00 dist-packages
d--x--x--- 8 root root 4096 Aug 22 01:00 gateone-1.2.0-py2.7.egg
d--x--x--- 14 root root 4096 Aug 22 01:00 gateone
d--x--x--- 5 root root 4096 Aug 22 01:00 applications
d--x--x--- 5 root root 4096 Aug 22 00:59 apport
d--x--x--- 4 root root 4096 Aug 22 00:59 bash-completion
d--x--x--- 4 root root 4096 Aug 22 00:59 debhelper
d--x--x--- 10 root root 4096 Aug 22 00:59 initramfs-tools
d--x--x--- 5 root root 4096 Aug 22 00:59 lintian
d--x--x--- 4 root root 4096 Aug 22 00:59 upstart
d--x--x--- 8 root root 4096 Aug 22 00:59 scripts
d--x--x--- 9 root root 4096 Aug 22 00:59 de
d--x--x--- 7 root root 4096 Aug 22 00:59 es
d--x--x--- 9 root root 4096 Aug 22 00:59 fr
d--x--x--- 7 root root 4096 Aug 22 00:59 it
d--x--x--- 7 root root 4096 Aug 22 00:59 ja
d--x--x--- 7 root root 4096 Aug 22 00:59 pl
d--x--x--- 6 root root 4096 Aug 22 00:59 pt_BR
d--x--x--- 7 root root 4096 Aug 22 00:59 sv
d--x--x--- 5 root root 4096 Aug 22 00:59 Debian
d--x--x--- 5 root root 4096 Aug 22 00:59 Debhelper
I'm tempted to just write a script that finds and fixes all these before Gate One starts but that seems like huge overkill for what is obviously a bug in Docker's build system.
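For anyone who does want such a stopgap, a minimal sketch is shown below. It assumes that every directory with mode exactly 0110 (d--x--x---) should be 0755, which is a blunt assumption: it would also overwrite any directory that legitimately needs a different mode. The function name fix_stripped_dirs is hypothetical.

```shell
#!/bin/sh
# Hypothetical workaround: reset every directory whose mode is exactly
# 0110 (d--x--x---) to 0755, staying on one filesystem (-xdev) so that
# /proc and /sys are not touched.
# "-exec ... \;" runs chmod on a directory before find descends into it,
# so nested affected directories become readable in the same pass.
fix_stripped_dirs() {
    find "$1" -xdev -type d -perm 0110 -exec chmod 0755 {} \;
}

# At container start, before launching the service, one might run:
#   fix_stripped_dirs /
```

This only masks the symptom in the running container; the image produced by the automated build would still carry the wrong modes.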
Ran into this problem with trusted builds (https://registry.hub.docker.com/u/tanmaykm/juliabox_dev/) and wasted a ton of effort trying to figure it out before I ran into this report.
My Dockerfile creates a user and switches to it before running further commands. The user's home directory ownership is incorrect and files created with some of the commands have incorrect ownership.
Added a chown to counter it, but that too is not reliable.
Investigating this a bit, I was wondering why there aren't more folks running into this issue. It seems that it only impacts images that run their services as non-root. Sadly, almost all the public Dockerfiles and images run everything as root which is the opposite of security best-practices.
Considering the impact of that, I believe this bug should be considered a major security problem. It doesn't bode well for the term "trusted" that gets attached to these images.
@unclejack are you looking into this one?
I am also running into this issue. I found that if I have a Dockerfile that just adds a user, permissions are fine whether I run locally or from a trusted build. But if after I have created the user, I modify the home directory of that user from the Dockerfile, it changes the permissions of the entire home directory to root:
Dockerfile1 snippet:
RUN (adduser --disabled-password --gecos "" guest && echo "guest:guest"|chpasswd)
Dockerfile2 snippet:
RUN (adduser --disabled-password --gecos "" guest && echo "guest:guest"|chpasswd)
RUN mkdir /home/guest/scripts
Fine when run locally. But after the automated build, in the container for Dockerfile1, /home/guest is owned by guest. But in the container for Dockerfile2, /home/guest is owned by root. chown in the Dockerfile didn’t work for me. For now I ended up making a different directory outside of /home/guest and then symbolic linking /home/guest/scripts to that, but ideally that wouldn’t be necessary.
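To compare the two images side by side, a small diagnostic along these lines could be run in a container from each build. The helper name perm_summary is hypothetical; it only wraps ls -ld and awk.

```shell
#!/bin/sh
# Hypothetical diagnostic: print "MODE OWNER:GROUP PATH" for each path,
# so the output from a local build and an automated build can be diffed.
perm_summary() {
    for p in "$@"; do
        ls -ld "$p" | awk -v path="$p" '{ print $1, $3 ":" $4, path }'
    done
}

# For the Dockerfile2 case described above one might run:
#   perm_summary /home/guest /home/guest/scripts
```

Based on the description above, the automated build would be expected to report root as the owner of /home/guest where the local build reports guest.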
I've tested today, and it seems that the Trusted Build machines do not have the fix yet.
Please keep us informed when it is updated.
Thank you