This Docker image is meant to isolate and stabilize that environment, and should be portable across multiple operating systems, as long as Docker is available.
To learn more about the use of containers for research reproducibility, see Carpentries' docker-introduction. For commercial services running containers, see codeocean.com, gigantum, or any of the cloud service providers. For an academic project using containers, see Whole Tale.
NOTE: The image created by these instructions contains binary code that is © Stata. Permission was granted by Stata to Lars Vilhuber to post these images, without the license. A valid license is necessary to build and use these images.
You need a Stata license to run the image. If you rebuild the image, you may also need a Stata license for the build.
The Dockerfile contains the build instructions. A few things of note:
Set the TAG and IMAGEID accordingly. VERSION should be the Stata version.
VERSION=17
TAG=$(date +%F)
MYHUBID=dataeditors
MYIMG=stata${VERSION}
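As a quick sanity check, the variables above can be combined into the full image reference before building. This is a minimal sketch; IMAGE_REF is a name introduced here for illustration and is not used elsewhere in these instructions.

```shell
# Same settings as above
VERSION=17
TAG=$(date +%F)            # today's date in YYYY-MM-DD form
MYHUBID=dataeditors
MYIMG=stata${VERSION}

# IMAGE_REF is a hypothetical helper variable, not part of the original instructions
IMAGE_REF="${MYHUBID}/${MYIMG}:${TAG}"
echo "Image will be tagged as: ${IMAGE_REF}"
```

The printed value is what the -t option of the build commands expands to.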
The Dockerfile relies on BuildKit syntax to pass the license information. Use the following if you just want to rebuild the Docker image (this will re-use key cached information):
DOCKER_BUILDKIT=1 docker build . \
--secret id=statalic,src=stata.lic.${VERSION} \
-t $MYHUBID/${MYIMG}:$TAG
or, if updating Stata, use the following, which will force an update through Stata:
DOCKER_BUILDKIT=1 docker build . \
--secret id=statalic,src=stata.lic.${VERSION} \
--build-arg CACHEBUST=$(date +%s) \
-t $MYHUBID/${MYIMG}:$TAG
NOTE: Updating Stata actually doesn't work.
...
Removing intermediate container cb12e70b0154
---> 52e8f83a14f8
Successfully built 52e8f83a14f8
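Both build commands above read the license from stata.lic.${VERSION} in the current directory via --secret. A missing file will typically make the build fail, so a small pre-flight check may help. This is only a sketch; the function name require_license is an assumption introduced for illustration.

```shell
# require_license FILE: succeed only if FILE is readable and not empty.
# (hypothetical helper, not part of the original instructions)
require_license() {
  if [ ! -r "$1" ] || [ ! -s "$1" ]; then
    echo "License file not found or empty: $1" >&2
    return 1
  fi
}

# Example: run this before either build command shown above
# VERSION=17
# require_license "stata.lic.${VERSION}" || exit 1
```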
List your images:
docker images
output:
REPOSITORY   TAG      IMAGE ID       CREATED          SIZE
<none>       <none>   52e8f83a14f8   25 seconds ago   665MB
<none>       <none>   fb095c3f9ade   31 minutes ago   670MB
<none>       <none>   a919483dbe22   34 minutes ago   107MB
The resulting Docker image can be uploaded to Docker Hub, if desired, or to any other container registry.
docker push $MYHUBID/${MYIMG}:$TAG
We can browse the provided images at https://hub.docker.com/u/dataeditors.
Using a pre-built image on Docker Hub to run a program.
NOTE: because Stata is proprietary software, we need to mount a license file.
NOTE: We are using a working directory of "/code" here - check the Dockerfile for the precise location.
For all the subsequent docker run commands, we will use similar environment variables:
VERSION=17
TAG=2021-11-17
MYHUBID=dataeditors
MYIMG=stata${VERSION}
STATALIC=$HOME/licenses/stata.lic.$VERSION
or
VERSION=17
TAG=2021-11-17
MYHUBID=dataeditors
MYIMG=stata${VERSION}
STATALIC=$(find $HOME/Dropbox/ -name stata.lic.$VERSION)
docker run -it --rm \
-v ${STATALIC}:/usr/local/stata/stata.lic \
-v $(pwd)/code:/code \
-v $(pwd)/data:/data \
-v $(pwd)/results:/results \
$MYHUBID/${MYIMG}:${TAG}
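Two details of the commands above can trip up a first run: find may return more than one path, and Docker creates missing host directories for -v mounts itself, typically owned by root on Linux. The following sketch addresses both; the function names first_license and prepare_dirs are assumptions introduced for illustration.

```shell
# first_license DIR VERSION: print the first stata.lic.VERSION found under DIR.
# Returns non-zero if nothing is found. (hypothetical helper)
first_license() {
  match=$(find "$1" -name "stata.lic.$2" -type f 2>/dev/null | head -n 1)
  [ -n "$match" ] && printf '%s\n' "$match"
}

# prepare_dirs: create the host directories that will be mounted into the
# container, so that they belong to the current user. (hypothetical helper)
prepare_dirs() {
  mkdir -p "$(pwd)/code" "$(pwd)/data" "$(pwd)/results"
}

# Example use before docker run:
# STATALIC=$(first_license "$HOME/Dropbox" "$VERSION") || echo "no license found" >&2
# prepare_dirs
```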
The above builds and runs the container using Docker. While there is a free Community Edition of Docker, some may prefer other container management software, such as Podman or Singularity. For instance, in Singularity, the following works:
singularity run \
-B ${STATALIC}:/usr/local/stata/stata.lic \
-B $(pwd)/code:/code \
-B $(pwd)/data:/data \
-B $(pwd)/results:/results \
docker://$MYHUBID/${MYIMG}:${TAG}
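Podman accepts the same run options used in the Docker command earlier (-it, --rm, -v), so a script can pick whichever of the two engines is installed. This is a minimal sketch; pick_engine is a hypothetical helper, and Singularity is left out because its command line differs, as the example above shows.

```shell
# pick_engine NAME...: print the first of the given commands found on PATH.
# (hypothetical helper, not part of the original instructions)
pick_engine() {
  for candidate in "$@"; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "None of these container engines were found: $*" >&2
  return 1
}

# Example:
# ENGINE=$(pick_engine docker podman) || exit 1
# "$ENGINE" run -it --rm -v "${STATALIC}":/usr/local/stata/stata.lic "$MYHUBID/${MYIMG}:${TAG}"
```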
The Docker image has an ENTRYPOINT defined, which means it will act as if you were running Stata:
docker run -it --rm \
-v ${STATALIC}:/usr/local/stata/stata.lic \
-v $(pwd)/code:/code \
-v $(pwd)/data:/data \
-v $(pwd)/results:/results \
$MYHUBID/${MYIMG}:${TAG} -b program.do
Your program, of course, should reference the /data and /results directories:
global basedir "/"
global data "${basedir}data"
global results "${basedir}results"
// use "${data}/mydata.dta"
// graph export "${results}/figure1.png"
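In batch mode (-b), Stata writes its output to a log file named after the do-file (program.log for program.do) in the working directory, and the process may exit with status 0 even when the do-file stops on an error. A common workaround is to scan the log for Stata's error marker, a line of the form r(NNN);. The sketch below assumes the log ends up in the mounted code directory; stata_log_ok is a hypothetical helper.

```shell
# stata_log_ok LOGFILE: succeed if LOGFILE exists and contains no Stata
# error-code line such as "r(198);". Returns 2 if the log is missing and
# 1 if an error line is found. (hypothetical helper)
stata_log_ok() {
  [ -f "$1" ] || { echo "Log not found: $1" >&2; return 2; }
  if grep -Eq '^r\([0-9]+\);' "$1"; then
    echo "Stata reported an error in $1:" >&2
    grep -En '^r\([0-9]+\);' "$1" >&2
    return 1
  fi
}

# Example, after the docker run above has finished:
# stata_log_ok code/program.log || exit 1
```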
- Adjust the setup.do file: list all packages you want installed permanently.
- Remember to have the stata.lic.17 file available.
- Start your Dockerfile with the following (adjust the tag):
# syntax=docker/dockerfile:1.2
FROM dataeditors/stata17:2021-11-17
# copy your code and data into the image
COPY code/* /code/
COPY data/* /data/
RUN --mount=type=secret,id=statalic,dst=/usr/local/stata/stata.lic /usr/local/stata/stata-mp do /code/setup.do
USER statauser:stata
# run the master file
ENTRYPOINT ["stata-mp","/code/master.do"]
Build and then run this Docker image with
docker run -v $(pwd)/stata.lic.${VERSION}:/usr/local/stata/stata.lic \
-v $(pwd)/results:/results \
larsvilhuber/greatpaper:2021-06-08
and the results of running the code (in code) on the data (in data) will show up in the results folder, which is local to your workstation.
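The build step for this derived image is not spelled out above. One plausible form, modeled on the build command used earlier for the base image, is sketched here. The function name build_paper_image is an assumption, and the secret id statalic is taken from the RUN --mount line in the Dockerfile.

```shell
# build_paper_image IMAGE LICENSE_FILE: build the Dockerfile in the current
# directory, passing the license as a BuildKit secret with id "statalic".
# (hypothetical helper, not part of the original instructions)
build_paper_image() {
  DOCKER_BUILDKIT=1 docker build . \
    --secret id=statalic,src="$2" \
    -t "$1"
}

# Example, using the image name from the run command:
# build_paper_image larsvilhuber/greatpaper:2021-06-08 "stata.lic.${VERSION}"
```

The secret is only available during the build, so the license file is not stored in the resulting image and still has to be supplied when the container runs.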
This entire process could be automated using Travis-CI or GitHub Actions. This has not been done yet.