Why is there volume for data in the first place?
etki opened this issue · 24 comments
Hi.
My inquiry may seem strange, but I really don't get it.
Why does the MySQL Dockerfile contain a VOLUME directive? From my perspective, there are more cons than pros:
- (+) Users that did not specify host mount on startup have a chance to recover their data
- (-) Anonymous volume is terribly hard to search for when container is gone and you have literally hundreds of them
- (-) Volumes consume free space at dramatic speed, because there is no GC. They were designed not to be garbage collected, which means they should not be created automagically, because automatic creation would in turn require garbage collection.
- (-) Users that persist their data will certainly do a regular host mount that renders volume useless.
- (-) Users that do tons of CI builds a day and thought they can finally forget about data bloat when using containers, they, well, are highly annoyed when they discover the consequences.
- (-) You can add a volume later, but you literally can't cancel declared volume. And if you mount it to your host - well, if that's remote host, you have no chance at cleaning volume at the end of the build.
I know I'll cause a huge 'watch it, kid!' the next second, but shouldn't it be dropped? I really don't see any benefits that outweigh the drawbacks it brings.
Volumes will also be faster than using the container's internal storage, but I agree it tends to very quickly clutter up the disk with anonymous volumes, since starting the container without any volumes specified is something I at least mostly do for quick testing.
@tianon @yosifkit
What are your thoughts on this? While volume cleanup has become simpler with the «docker volume» commands, do we need to have /var/lib/mysql as a volume by default?
I just use a series of aliases and periodically clean up stopped containers, unused volumes, and dangling images.
alias dclean='docker ps -aq | xargs --no-run-if-empty docker rm'
alias dcleanvol="docker volume ls | awk '/^local/ { print \$2 }' | xargs --no-run-if-empty docker volume rm"
alias ddangling='docker images --filter dangling=true -q | sort -u | xargs --no-run-if-empty docker rmi'
With the upcoming Docker 1.13.0 there will be built-in commands for this: docker container prune, docker volume prune, docker image prune.
Edit: If you do docker rm -v mysql-container it will also clean up the volumes associated with the stopped container you are deleting. It is automatic on a docker run -it --rm.
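The dcleanvol alias above targets every local volume, named or not. As a rough sketch of a narrower cleanup, the filter below keeps only names that look like anonymous volumes, relying on the assumption that Docker names anonymous volumes with a 64-character hex ID. The function name `anonymous_volumes` is made up for illustration, and a named volume that happens to match the pattern would be caught as well.

```shell
# Keep only names that look like anonymous volumes: Docker gives those a
# 64-character lowercase hex name, while named volumes keep the name the
# user chose. Reads one volume name per line on stdin.
anonymous_volumes() {
    # grep exits non-zero when nothing matches; treat that as "no output"
    grep -E '^[0-9a-f]{64}$' || true
}

# Hypothetical usage (assumes GNU xargs, as in the aliases above):
#   docker volume ls -q --filter dangling=true \
#     | anonymous_volumes \
#     | xargs --no-run-if-empty docker volume rm
```

Filtering on `dangling=true` first means volumes still attached to a container are never passed to `docker volume rm`.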
@yosifkit this is fine workaround for local (and, probably, swarm - haven't worked with it) containers, but as soon as nomad / kubernetes / other orchestration hero is hit, things go bad, sometimes you don't have easy automation for the node itself at all.
Agree with etki. Biggest problem for us is we can't save data in the image itself. Why not leave it for the users to decide whether they want to use volumes or not. They can always add it later but as etki mentioned there's no way to remove it.
We have the same problem as shitalm: we would like to add data to the image itself so we can, e.g., prepare test/demo data or just deliver static/read-only data. It is tedious to have to copy Dockerfiles, remove the VOLUME instruction, and build it ourselves... I could also live with a separate image with a tag such as 5.7-no-volume.
@tianon it won't. The data will be stored in other place, but untracked directory will still be created on the host, consuming an inode. This is not as bad, but still something that has zero positive effect.
As far as I know it's now recommended by docker team not to declare volumes in base images (or in this case official images). It would be better in my opinion to document the usage of volume, but not declare it in the image as it depends on the end user case, how he/she is going to store the data (local vs production cases).
Actually, there is one other effect: https://docs.docker.com/engine/reference/builder/#notes-about-specifying-volumes
When installing the standard Debian packages (5.7 and older), /var/lib/mysql will be populated with a database as part of package installation. Since /var/lib/mysql is declared a volume that database will then be discarded.
If we just drop the VOLUME statement, the database will still be there, and no database initialization will be performed for basic testing of the image. I don't think this would require anything more than clearing out the directory after installing, though.
I feel like removing the volume would break many users that rely on the volume when using docker-compose. Compose tries hard to keep the volume between restarts of the container to persist the data and these users would suddenly see new deployments unable to survive a re-creation. My opinion is that if it is such a problem to have a volume defined, then docker needs to provide an unvolume/"don't use any defined volumes" (via Dockerfile and docker run) and users should set automatic volume and image deletion if space/inode usage is a problem.
I would think that most users would rather their database data preserved by default rather than discovering that their data has been automatically deleted when they did a docker-compose restart (or docker stack deploy) after bumping a database version number.
There is not a good alternative for telling the user where persistent data lives. Labels are not standardized and many users skip over the Docker Hub documentation.
Being able to build an image that ships with a database already initialized is still possible and the automatic volume would be left empty.
FROM mysql:5.7
CMD ["--datadir=/sql"]
# assuming ./sql-datadir contains an already initialized database
# (copy the directory itself: a `*` glob would flatten the per-schema subdirectories into /sql/)
COPY ./sql-datadir/ /sql/
# on startup the entrypoint script will detect the already initialized database and start right up
# leaving /var/lib/mysql empty
or.... without having to use a different data directory:
FROM mysql:5.7
# ./sql-datadir contains a database dump of *.sql files
COPY ./sql-datadir/* /docker-entrypoint-initdb.d/
# initdb logic will restore the database via the sql files in alphanumeric order on first container start
# users will have to `docker rm -vf sql-container` when a new image is pulled with a new database dump
@ltangvald, as for the automatic population of /var/lib/mysql/ by the apt package, that is already deleted as soon as it is created (since the volume is declared later).
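The first Dockerfile above relies on the entrypoint detecting an already initialized data directory. A minimal sketch of such a check, assuming the detection amounts to looking for the `mysql` system schema directory inside the data directory (this is an assumption about the image's entrypoint, not a quote from it), could be used to sanity-check ./sql-datadir before building. The function name `datadir_initialized` is hypothetical.

```shell
# Succeed if the given data directory appears to hold an initialized
# database, approximated here by the presence of the `mysql` system
# schema directory. This layout check is an assumption, not the
# entrypoint's actual code.
datadir_initialized() {
    [ -d "$1/mysql" ]
}

# Hypothetical usage before building the image:
#   datadir_initialized ./sql-datadir || echo "sql-datadir looks uninitialized" >&2
```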
Would it be simple to tag the image twice for both use cases? e.g. do everything the same sans VOLUME in the Dockerfile and tag it something like #-no-volume (naming is hard) then simply have another Dockerfile do the below and tag with the existing tags:
FROM mysql:#-no-volume
VOLUME ["/var/lib/mysql"]
Image behavior stays the same for existing tags while we allow the other use case for those who want it.
@yosifkit I hadn't considered the compose use case
I agree this would probably be too big a behavior change to the existing images.
@bflad In general I don't think we want more files to maintain (though it's simple enough), but when/if we get a template system in place (discussed in issue #289) this might be an option.
@yosifkit I agree with you regarding an "UNVOLUME" command, however I don't see Docker implementing that anytime in the near future.
Until that occurs we're basically stuck telling educated Docker users that they need to copy the Dockerfile from the MySQL image they want and create their own image with the VOLUME line commented out or deleted. That either prevents the user from automatically receiving potential security updates or means writing a script to automate the process (which makes me uneasy, but I have seriously considered it...).
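For what it's worth, the "script to automate the process" could be as small as the sketch below. It is only an illustration: the function name `strip_volume`, the output file name, and the `5.7-no-volume` tag are hypothetical, and it assumes the VOLUME instruction sits on a single line.

```shell
# Drop VOLUME instructions from a Dockerfile read on stdin.
# Instruction names are case-insensitive, hence -i.
# Limitation: a VOLUME instruction continued with a trailing backslash
# would leave its continuation lines behind.
strip_volume() {
    # grep exits non-zero when it selects no lines; ignore that
    grep -viE '^[[:space:]]*VOLUME([[:space:]]|$)' || true
}

# Hypothetical usage against a local checkout of the upstream Dockerfile:
#   strip_volume < Dockerfile > Dockerfile.no-volume
#   docker build -f Dockerfile.no-volume -t mysql:5.7-no-volume .
```

This still leaves the maintenance problem described above: the derived image has to be rebuilt whenever the upstream Dockerfile changes.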
I'm a heavy user of Compose (doing a lot of local-dev with Docker) and would have been perfectly fine seeing the documentation on Docker Hub stating that I need to define a volume in my run command or docker-compose service.
I know you stated that many users skip over the Docker Hub documentation, but the image is already relatively useless if you don't scroll down to read the section regarding environment variables. The Compose/Stack documentation appears before that section, which could certainly include a sample Volume definition with a comment above it, something like:
# Use root/example as user/password credentials
version: '3.1'

services:

  db:
    image: mysql
    restart: always
    environment:
      MYSQL_ROOT_PASSWORD: example
    # Use a volume to support persistent storage on container restart.
    volumes:
      - data-volume:/var/lib/mysql

  adminer:
    image: adminer
    restart: always
    ports:
      - 8080:8080

volumes:
  data-volume:
I'd be happy to write a suggestion for the "Where to Store Data" section as well, if that's a hangup.
however I don't see Docker implementing that anytime in the near future.
If someone wants to work on that, it may be implemented, see moby/moby#3465 (comment) and moby/moby#3465 (comment)
Nobody has offered to work on it so far, though.
The request for docker to support an "UNSET" feature is there only to help people to cope with bad images.
In addition, it is a workaround that will force everyone to create custom images to unset something that should have never been set.
Setting an anonymous volume is clearly a bad practice that is discouraged everywhere. In my company, we use lots of different database Docker images, and the MySQL ones are the only ones with this annoying problem.
About this sentence:
I feel like removing the volume would break many users that rely on the volume when using docker-compose.
it is completely wrong.
In every company and project I have ever worked, when it is desired to persist data between docker restarts, either you don't delete the container or you explicitly mount a volume. I have never seen someone relying on (or being in love with) anonymous volumes in the real world.
If you don't want to break the (frustrating) behavior of this image, you should really adopt another tag and offer both the alternatives.
Anyway, from my point of view, the default tags (e.g. "5.7") should offer the behavior that everybody expects, which is without the volume; then you can extend the default image adding the VOLUME option and offer another specific tag (e.g. "5.7-persistent" or whatever). Obviously, this should be clearly reported and highlighted in the documentation.
I would second the request to remove the volume definition from the Dockerfile.
Dockerfiles should merely define how an image is built (build-time configuration) and not how a container is run (runtime configuration), and I consider the definition of volumes and also ports to be runtime configuration.
As a user of docker-compose I see the build-time and runtime configuration nicely separated: the docker-compose.yml refers to the build environment (including the Dockerfile) for the underlying image, and it allows defining the runtime configuration of the actual container (including volumes and ports).
Just hit this issue as well. It took several hours of a junior dev's time before we found the underlying cause; we didn't think this would be included by default, and it was a big surprise. Having a no-volume tag would suffice for me as well, and I understand the compatibility concerns, but I guess we're stuck with a forked Dockerfile for now.
I'm confused about this part:
I feel like removing the volume would break many users that rely on the volume when using docker-compose. Compose tries hard to keep the volume between restarts of the container to persist the data and these users would suddenly see new deployments unable to survive a re-creation.
docker-compose up after docker-compose down creates a new anonymous volume. While the older volume is preserved on the host's disk, it is not reused by a re-created container unless explicitly specified.
How is docker-compose relevant to the problem? Could someone give me an example use case which would be affected by the removal of the VOLUME directive?
$ cat Dockerfile
FROM bash
VOLUME /foo
$ cat docker-compose.yml
version: '3.8'
services:
  bash:
    build: .
    tty: true
$ docker-compose build
...
$ docker-compose up -d
Starting tmp_bash_1 ... done
$ docker-compose exec bash touch /foo/bar
$ docker-compose exec bash ls /foo
bar
$ docker-compose up -d --force-recreate
Recreating tmp_bash_1 ...
$ docker-compose exec bash ls /foo
bar
(Docker Compose works extra hard to keep even anonymous unspecified volumes around and attached to the appropriate container.)
@tianon Couldn't you just add an explicit anonymous volume to the docker command or docker-compose file, like docker run -v /foo image, or in the docker-compose file:
volumes:
  - /foo
The above has the benefit of defining the volume explicitly, so people are not caught by surprise by lots of anonymous volumes that shouldn't have been created, and data that should not have been persisted in the first place is not reused. It also avoids the need for hacks like defining the MySQL data directory in another place, which doesn't stop the volume creation anyway.
Furthermore, as @ufoscout said:
In every company and project I have ever worked, when it is desired to persist data between docker restarts, either you don't delete the container or you explicitly mount a volume. I have never seen someone relying on (or being in love with) anonymous volumes in the real world.
Last year oracle removed volume from their official image, and I don't know about it impacting people negatively (although the mysql image is probably more used):
Please consider removing VOLUME declarations in Dockerfile. In my opinion it is an anti-pattern to use them, and they create more problems than they solve.
6 years later and it is still there, 6 years later and I will still have to build my very own mysql base image where I have just had to clone this repo to remove the volume command.
The stock behavior is so strange. I don't understand this extraordinary desire to cater to utter noobs who don't know about persistence in Docker while causing a real detriment to knowledgeable Docker users.
Snowflakes are bad and these images are snowflakes.