Adding Dockerfile to build and run
ibnesayeed opened this issue · 18 comments
With the introduction of the Multi-Stage Build feature in Docker, it should be very easy to write a Dockerfile that builds OpenWayback from source while still producing a light-weight image for production use.
Additionally, the newly introduced support for arguments in the FROM directive will make it even easier to build images for any combination of Java and Tomcat. This could be very handy for testing.
@ibnesayeed Is this something you want to assign yourself to?
Yes, I can take care of it.
@anjackson, what would be a good place to put the binaries and libraries (contents of the bin and lib folders of the built tar file) inside the Docker image?
Currently, I have the following in my Dockerfile, but some binaries such as cdx-indexer won't work because I did not copy the libraries into the final image.
# Tags of the base images and the test switch; override with --build-arg
ARG MAVEN_TAG=latest
ARG TOMCAT_TAG=latest
ARG SKIP_TEST=false

# Building stage
FROM maven:${MAVEN_TAG} AS builder
# An ARG declared before the first FROM is only visible to FROM lines;
# redeclare it inside the stage so the RUN below sees its value
ARG SKIP_TEST
COPY . /src
WORKDIR /src
RUN mvn package -Dmaven.test.skip=${SKIP_TEST}
# Unpack the distribution, then explode the WAR into a ROOT webapp directory
RUN tar xvzf dist/target/openwayback.tar.gz -C dist/target \
    && mkdir dist/target/openwayback/ROOT \
    && cd dist/target/openwayback/ROOT \
    && jar -xvf ../*.war

# Image creation stage
FROM tomcat:${TOMCAT_TAG}
LABEL maintainer="Sawood Alam <@ibnesayeed>"
# Clear the default webapps directory so OpenWayback is served as ROOT
RUN rm -rf /usr/local/tomcat/webapps/*
COPY --from=builder /src/dist/target/openwayback/ROOT /usr/local/tomcat/webapps/ROOT
COPY --from=builder /src/dist/target/openwayback/bin /usr/local/bin/
VOLUME /data
ENV WAYBACK_BASEDIR=/data \
    WAYBACK_URL_SCHEME=http \
    WAYBACK_URL_HOST=localhost \
    WAYBACK_URL_PORT=8080 \
    WAYBACK_URL_PREFIX=http://localhost:8080
The fact that you can use ARGs in the FROM directive is very cool. But it breaks the Dockerfile for most commonly-used Linux distros (the one I'm using included). Should we consider an alternate Dockerfile that's compatible with older versions?
@runderwood, both ARG in FROM and Multi-Stage Build are currently only available in Docker's pre-release and should be released in version 17.05 by next month. The ARG feature gives us the flexibility to rapidly build images with various combinations of Maven, JDK, Tomcat, and JRE versions. Additionally, the Multi-Stage Build feature allows us to generate light-weight final images, free from the build-time bloat. For example, the following command would build an image named openwayback with the tag minimal, where the code is built using Maven 3.5 with JDK 7 and the built artifacts are then packaged in a small Alpine Linux image with Tomcat 7 and JRE 7. If no --build-arg options are provided, the latest cached tags will be used for both base images.
$ docker build --build-arg=MAVEN_TAG=3.5-jdk-7 --build-arg=TOMCAT_TAG=7-jre7-alpine -t openwayback:minimal .
Achieving something like this with older versions of Docker would require multiple Dockerfiles and custom scripts. That said, in my opinion, the advantages of the two features used here outweigh backward compatibility.
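To illustrate the kind of rapid multi-version builds described above, here is a minimal dry-run sketch that only prints one docker build command per tag pairing. The first pairing comes from the command above; the second pairing and the choice to reuse the Tomcat tag as the image tag are assumptions for illustration, not part of the PR.

```shell
#!/bin/sh
# Print (dry run) one `docker build` command per Maven/Tomcat tag pairing.
# The second pairing is an assumed example; substitute tags that exist on Docker Hub.
build_commands() {
    # Each entry is "<maven tag>:<tomcat tag>".
    for pair in "3.5-jdk-7:7-jre7-alpine" "3.5-jdk-8:8-jre8-alpine"; do
        maven_tag=${pair%%:*}    # text before the colon
        tomcat_tag=${pair#*:}    # text after the colon
        echo "docker build --build-arg=MAVEN_TAG=${maven_tag}" \
             "--build-arg=TOMCAT_TAG=${tomcat_tag}" \
             "-t openwayback:${tomcat_tag} ."
    done
}

build_commands
```

Piping the output to sh would run the builds for real; printing first makes it easy to check the tag combinations before committing to a long build.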
I would also note that these magical features are only necessary at build time. Once the image is built, it can be pushed to a repository, and a container can then be run from it using an older Docker Engine.
@ibnesayeed OK. Works for me. Just thought it worth asking.
Any thoughts on where to put the jars from the lib folder of the build artifacts?
The jars needed for the cdx-indexer should already exist in the image you are building at /usr/local/tomcat/webapps/ROOT/WEB-INF/lib. The way the command line scripts in the bin directory are written, they put *.jar files on your classpath based on them being at $WAYBACK_HOME/lib. One way you could run cdx-indexer with what you currently seem to be doing would be to set the WAYBACK_HOME environment variable to /usr/local/tomcat/webapps/ROOT/WEB-INF. I don't have a recent enough version of Docker installed to run your Dockerfile to confirm that, though.
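As a rough illustration of why pointing WAYBACK_HOME at WEB-INF would work, here is a minimal sketch of assembling a classpath from every jar under $WAYBACK_HOME/lib. It is not the actual code of the OpenWayback bin scripts; the function name build_classpath is hypothetical, and the fallback path is the one suggested above.

```shell
#!/bin/sh
# Join every *.jar under "$1/lib" into a colon-separated classpath string.
# Hypothetical helper, not taken from the OpenWayback scripts.
build_classpath() {
    cp=""
    for jar in "$1"/lib/*.jar; do
        [ -e "$jar" ] || continue   # skip the unexpanded glob when no jars exist
        cp="${cp:+$cp:}$jar"        # add a ':' separator only after the first jar
    done
    printf '%s\n' "$cp"
}

# In the image from this thread the jars live under the webapp's WEB-INF directory.
WAYBACK_HOME=${WAYBACK_HOME:-/usr/local/tomcat/webapps/ROOT/WEB-INF}
build_classpath "$WAYBACK_HOME"
```

Because the lookup is relative to WAYBACK_HOME, any directory that contains a lib folder with the required jars would serve, which is why the exploded webapp's WEB-INF directory is a candidate.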
Thanks @ldko. Java is not a language I work with very often, so there are always certain pieces that I am not sure about. I was not sure whether the bin directory, as packed inside the built artifact, is part of the PATH by default; if not, I will have to set that in the container. Also, I thought the lib and bin directories were placed outside of the war file and not inside WEB-INF/lib. However, if placing them there does the trick, then it is the simplest way I can think of. I will experiment with this tonight and report my findings.
When we build it using mvn package and extract the dist/target/openwayback.tar.gz file, two lib directories are created: one outside the webapp and one inside the webapp under WEB-INF. The one inside WEB-INF has 79 files while the one outside has only 62. Here is the comm view with common files removed, where the first column shows files unique to the outside lib directory and the second column shows files unique to the inner lib directory.
antlr-2.7.5.jar
arq-2.2.jar
arq-extra-2.2.jar
commons-cli-1.0.jar
commons-cli-1.2.jar
concurrent-jena-1.3.2.jar
foresite-0.9.jar
hadoop-ant-0.20.2-cdh3u4.pom
icu4j-3.4.4.jar
iri-0.5.jar
jdom-1.0.jar
jena-2.5.5.jar
jenatest-2.5.5.jar
json-jena-1.0.jar
log4j-1.2.12.jar
log4j-1.2.17.jar
lucene-core-2.2.0.jar
rome-0.9.jar
stax-api-1.0.1.jar
stax-api-1.0.jar
wstx-asl-3.0.0.jar
xalan-2.7.0.jar
xercesImpl-2.7.1.jar
xml-apis-1.0.b2.jar
xmlParserAPIs-2.0.2.jar
This shows that the outside lib directory has newer versions of commons-cli, log4j, and stax-api. I think the packaging needs an update for consistency, unless there is a reason why it is the way it is.
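For anyone who wants to reproduce a comparison like this, here is a minimal sketch of one way to do it with comm. The function name compare_libs is hypothetical, and the two default paths are assumptions based on the layout used in the Dockerfile earlier in this thread.

```shell
#!/bin/sh
# List file names unique to each of two directories, hiding the common ones.
# Column 1: only in "$1"; column 2 (tab-indented): only in "$2".
compare_libs() {
    left=$(mktemp) && right=$(mktemp) || return 1
    # comm needs both inputs sorted in the same collation order.
    ls "$1" | LC_ALL=C sort > "$left"
    ls "$2" | LC_ALL=C sort > "$right"
    LC_ALL=C comm -3 "$left" "$right"   # -3 suppresses lines present in both
    rm -f "$left" "$right"
}

# Assumed locations after extracting the tarball and exploding the WAR.
OUTER=${OUTER:-dist/target/openwayback/lib}
INNER=${INNER:-dist/target/openwayback/ROOT/WEB-INF/lib}
if [ -d "$OUTER" ] && [ -d "$INNER" ]; then
    compare_libs "$OUTER" "$INNER"
fi
```

Entries printed flush left exist only in the first directory, while tab-indented entries exist only in the second.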
Thanks @ldko, setting the WAYBACK_HOME environment variable to /usr/local/tomcat/webapps/ROOT/WEB-INF did the trick. You are awesome.
If someone can install the pre-release of Docker and try building the image, that would be great. Any other reviews or comments are welcome on PR #344, which I think is safe to merge as it does not interfere with the existing system.
I will try to document its usage in the wiki later.
Docker v17.05.0-ce was released yesterday (no longer a release candidate). Hence it should be easy for anyone to upgrade their Docker Engine to the latest version and test PR #344. I personally think it is safe to merge now, but reviews are welcome.
Thanks for letting us know the compatible Docker has been released. I intend to test it soon--just haven't gotten to it yet.
I have added basic Docker documentation to the wiki. Please feel free to modify it for accuracy, clarity, or expansion.
Closing this as it is implemented in PR #344 and merged.