/teipublisher-docker-compose

docker compose configuration for running TEI Publisher with additional services


Docker Compose Configuration

This repository contains a docker compose configuration for running TEI Publisher and associated services. Docker compose orchestrates and coordinates the various services while keeping each service in its own isolated environment. Setting up a server via docker compose is fast because everything comes preconfigured and you don't need to install the dependencies (Java, eXist-db, Python etc.) by hand. On the downside, it introduces some overhead and may never be as fast as a properly maintained server set up by hand. For smaller, low-traffic projects, though, docker is a viable and cheap alternative.

For security reasons, it is recommended to not expose TEI Publisher and eXist-db directly, but instead protect them behind a proxy. The docker-compose file therefore sets up an nginx reverse proxy.

The following services are configured by the docker-compose:

  • publisher: main TEI Publisher application
  • ner: TEI Publisher named entity recognition service
  • frontend: nginx reverse proxy which forwards requests to TEI Publisher
  • certbot: letsencrypt certbot required to register an SSL certificate

Clone this repository to either your local machine or the server you are setting up. By default it will build and deploy the main TEI Publisher application from the master branch. The named entity recognition service is pulled as an image from the corresponding GitHub package repository. If you do not need or want the named entity recognition service, comment out the corresponding section in docker-compose.yml, including the depends_on: ner above it. TEI Publisher will still work.

By default, the compose configuration will launch the proxy on port 80 of the local host, serving only http, not https. This configuration is intended for testing, not for deployment on a public-facing server.

Running

To build all services, call

docker compose build --build-arg ADMIN_PASS=my_pass

where my_pass sets the password for the eXist admin user (recommended). You can remove the --build-arg parameter entirely to keep an empty password.
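If you would rather not invent a password yourself, one option is to generate a random one before building. The following is a minimal sketch, assuming a shell with od, tr and /dev/urandom available; the function name gen_admin_pass is made up for this example and is not part of the repository:

```shell
#!/bin/sh
# Hypothetical helper, not part of this repository.
# Prints a random password as lowercase hex, two characters per random byte
# (default: 16 bytes, i.e. 32 characters).
gen_admin_pass() {
    bytes="${1:-16}"
    # od dumps the random bytes as hex; tr strips the separating whitespace.
    od -An -N "$bytes" -tx1 /dev/urandom | tr -d ' \t\n'
    echo
}

# Usage (commented out so that sourcing this file has no side effects):
# ADMIN_PASS="$(gen_admin_pass)"
# docker compose build --build-arg ADMIN_PASS="$ADMIN_PASS"
# echo "eXist admin password: $ADMIN_PASS"
```

Remember to note the generated value somewhere safe, since you will need it to log in as the eXist admin user.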

To start, simply call

docker compose up -d

Afterwards you should be able to access TEI Publisher using http://localhost. Additionally eXide can be accessed via http://localhost/apps/eXide (on a production system you want to disable that).
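The services may need a moment after docker compose up -d before the proxy answers. A small polling loop can be used to wait for that. This is only a sketch, assuming curl is installed on the host; wait_for_url and its parameters are hypothetical names chosen for this example:

```shell
#!/bin/sh
# Hypothetical helper, not part of this repository.
# Polls a URL until it answers with a successful HTTP status or the
# attempts run out. Returns 0 on success, 1 otherwise.
wait_for_url() {
    url="$1"
    attempts="${2:-30}"   # how often to try
    delay="${3:-2}"       # seconds to wait between tries
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -f: treat HTTP errors as failure, -s: quiet, -L: follow redirects
        if curl -fsL -o /dev/null --max-time 5 "$url" 2>/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage (commented out so that sourcing this file has no side effects):
# wait_for_url http://localhost 30 2 && echo "TEI Publisher is up"
```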

Customize the Configuration

The default configuration exposes the TEI Publisher application itself. Instead you may want to build and deploy a custom application generated by TEI Publisher. The necessary steps are:

  1. fork this repository and modify it to build your application instead of TEI Publisher
  2. clone your customized repository to a server of your choice

1. Fork this repo and customize it

Fork tei-publisher-docker-compose to your own git account and clone it to apply some modifications:

1.1. create a Dockerfile

By default our configuration uses the included Dockerfile to build the application. If you already have a Dockerfile in your app repository, ignore this section and change the build context in docker-compose.yml to point to the git repository in which your Dockerfile resides.

Rename the included template Dockerfile.tmpl to Dockerfile (overwriting the existing one) and open it in an editor. Going through the file, it should be easy to see which parts need to be changed. The relevant lines you should pay attention to are:

# replace with name of your edition repository and choose branch to build
ARG MY_EDITION_VERSION=master
...
# Build my-edition
RUN  git clone https://github.com/my-github-user/my-edition.git \
    && cd my-edition \
    && echo Checking out ${MY_EDITION_VERSION} \
    && git checkout ${MY_EDITION_VERSION} \
    && ant
...
COPY --from=tei /tmp/my-edition/build/*.xar /exist/autodeploy/

By default we assume your app is compatible with the libraries used by TEI Publisher 8. If not, change the specified versions accordingly:

ARG TEMPLATING_VERSION=1.1.0
ARG PUBLISHER_LIB_VERSION=3.0.0
ARG ROUTER_VERSION=1.8.0

1.2. modify conf/default.conf

This is the default proxy configuration used for local testing. Replace the two lines referring to /apps/tei-publisher so that they point to your own application instead:

proxy_pass http://docker-publisher/exist/apps/my-edition$request_uri;
proxy_redirect http://$host/exist/apps/my-edition/ /;

For deployment on a public server you would need to apply the same to your copy of conf/example.com.tmpl (see next section).
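The replacement can also be scripted. The sketch below assumes that the lines to change contain the path /exist/apps/tei-publisher, as in the examples in this document, and that the app name is a plain identifier without characters that are special to sed; retarget_app is a hypothetical helper, not something shipped with the repository:

```shell
#!/bin/sh
# Hypothetical helper, not part of this repository.
# Rewrites every occurrence of /exist/apps/tei-publisher in an nginx
# config file so that it points at another app name instead.
retarget_app() {
    conf_file="$1"
    app_name="$2"
    tmp_file="$(mktemp)" || return 1
    # '#' as sed delimiter avoids having to escape the slashes in the path.
    # Writing back with cat keeps the permissions of the original file.
    sed "s#/exist/apps/tei-publisher#/exist/apps/${app_name}#g" "$conf_file" > "$tmp_file" \
        && cat "$tmp_file" > "$conf_file"
    status=$?
    rm -f "$tmp_file"
    return "$status"
}

# Usage (commented out so that sourcing this file has no side effects):
# retarget_app conf/default.conf my-edition
```

Review the result in an editor afterwards, since the script changes every matching line, not just the two mentioned above.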

You could now start testing your configuration on your local machine.

2. Deploy to a Public Server

Rent a cloud server which has docker enabled. There are various offers on the market. A good specification would include 4 GB of RAM and 2 vCPUs, which you can get for less than 10 Euro per month.

Once you have root access to your server, ssh into it and clone your customized docker compose configuration repository. Build and start the services as described. Depending on the configuration of your server, you may or may not have docker compose installed already. If it is not available (but docker itself is), follow the instructions in the docker documentation.

Acquire an SSL certificate

You may now want to make your new server available under a custom domain. For this you should acquire an SSL certificate to enable users to securely connect via https. The compose configuration is already prepared to make this as easy as possible.

  1. Copy the nginx configuration file conf/example.com.tmpl to e.g. conf/my.domain.com.conf, where my.domain.com would correspond to the domain name of the webserver you are configuring the service for

  2. Open the copied file in an editor and replace all occurrences of example.com with your domain name. Important: this also applies to the commented out SSL section, which you will enable later below.

  3. Change the name of the upstream entry to a unique name (otherwise it will collide with the default config):

     upstream docker-publisher.example.com {
         server publisher:8080 fail_timeout=0;
     }
    

    Change the two references to the docker-publisher upstream server below accordingly (including the commented out SSL section):

    proxy_pass http://docker-publisher.example.com/exist/apps/tei-publisher$request_uri;
    ...
    proxy_pass http://docker-publisher.example.com/exist$request_uri;
    
  4. Start the services with docker compose up -d so that the certificate can be requested in the next step

  5. Run the following command to request an SSL certificate for your domain, again replacing the final example.com with your domain name:

    docker compose run --rm certbot certonly --webroot --webroot-path /var/www/certbot/ -d example.com

    This will ask you for an email address, verify your server and store the certificate files in certbot/conf/.

  6. In the nginx configuration file, uncomment the SSL section by removing the leading #

  7. Stop and restart the services:

    docker compose restart
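Steps 1 and 2 of the list above (copying the template and replacing example.com) can be combined into one small helper. This is a sketch under the assumption that a plain text substitution is all the template needs; make_domain_conf and its optional arguments are hypothetical and not part of the repository:

```shell
#!/bin/sh
# Hypothetical helper, not part of this repository.
# Copies an nginx template to <conf_dir>/<domain>.conf and replaces every
# occurrence of example.com (including commented-out lines) with the domain.
make_domain_conf() {
    domain="$1"
    template="${2:-conf/example.com.tmpl}"
    conf_dir="${3:-conf}"
    # Refuse anything that is not a plain host name, since the value is
    # interpolated into the sed expression below.
    case "$domain" in
        ''|*[!A-Za-z0-9.-]*) echo "invalid domain: $domain" >&2; return 1 ;;
    esac
    sed "s/example\.com/${domain}/g" "$template" > "${conf_dir}/${domain}.conf"
}

# Usage (commented out so that sourcing this file has no side effects):
# make_domain_conf my.domain.com
```

Because the substitution also touches the commented-out SSL section, that part is ready once you uncomment it. Step 3 (giving the upstream a unique name) may still need a manual edit, depending on how the upstream is named in your copy of the template.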

Backing up eXist-db

If you would like to create regular backups of the data in your eXist-db:

  1. edit docker-compose.yml and enable the volume mapping for /exist/backup:
    # uncomment to map eXist-db backups to local directory
    - ./backup:/exist/backup
    
  2. retrieve the eXist-db configuration file from the running docker container with
    docker compose cp publisher:/exist/etc/conf.xml .
    
  3. edit conf.xml and find the section referring to consistency checks and backups. Uncomment the system job, specify the backup directory and a time (cron syntax) to trigger the backup:
    <job type="system" name="check1" 
       class="org.exist.storage.ConsistencyCheckTask"
       cron-trigger="0 0 4 * * ?">
       <parameter name="output" value="/exist/backup"/>
       <parameter name="backup" value="yes"/>
       <parameter name="incremental" value="no"/>
       <parameter name="incremental-check" value="no"/>
       <parameter name="max" value="2"/>
    </job>
  4. copy the conf.xml back to the docker container:
    docker compose cp conf.xml publisher:/exist/etc
    
  5. restart the container:
    docker compose restart publisher
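Once the job is enabled, it is worth checking from time to time that backups actually arrive in the mapped directory. A minimal sketch, assuming the volume mapping ./backup from step 1 and a find implementation that supports -mmin (GNU and BSD find do); has_recent_backup is a hypothetical helper:

```shell
#!/bin/sh
# Hypothetical helper, not part of this repository.
# Succeeds if the directory contains at least one regular file modified
# within the last max_age_hours hours (default 25: a daily run plus slack).
has_recent_backup() {
    backup_dir="${1:-./backup}"
    max_age_hours="${2:-25}"
    [ -d "$backup_dir" ] || return 1
    recent="$(find "$backup_dir" -type f -mmin "-$((max_age_hours * 60))")"
    [ -n "$recent" ]
}

# Usage (commented out so that sourcing this file has no side effects):
# has_recent_backup ./backup 25 || echo "no recent eXist-db backup found" >&2
```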
    

Certificate Renewal

The Let's Encrypt SSL certificate issued above is only valid for a limited period and needs to be renewed from time to time. We'll thus install a cron job that calls the script certbot-renew.sh once a day to check whether the certificate needs to be renewed.

  1. edit certbot-renew.sh and replace example.com with the hostname for which you acquired a certificate:
    CERTFILE=./certbot/conf/live/example.com/cert.pem
  2. register a cron job to call this script once a day. Call crontab -e and add a line:
    59 18 * * * /root/my-edition-docker/certbot-renew.sh
    
    replacing /root/my-edition-docker with the correct path to wherever you cloned the configuration.
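To illustrate the kind of check such a script can perform, here is a sketch of an expiry test based on openssl. It is not the certbot-renew.sh shipped with this repository; cert_needs_renewal, the 30-day threshold and the renewal call in the usage comment are assumptions made for this example:

```shell
#!/bin/sh
# Hypothetical sketch, not the certbot-renew.sh shipped with this repository.
# Succeeds (returns 0) if the certificate expires within the given number
# of days or cannot be read at all; returns 1 if it stays valid for longer.
cert_needs_renewal() {
    cert_file="$1"
    days="${2:-30}"
    seconds=$((days * 86400))
    # openssl's -checkend exits 0 when the certificate is still valid
    # after that many seconds, so the result is inverted here.
    if openssl x509 -checkend "$seconds" -noout -in "$cert_file" >/dev/null 2>&1; then
        return 1
    fi
    return 0
}

# Usage (commented out; the renewal command is an assumption):
# CERTFILE=./certbot/conf/live/example.com/cert.pem
# if cert_needs_renewal "$CERTFILE" 30; then
#     docker compose run --rm certbot renew
# fi
```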

Expose the Full eXist-db for Development

The default configuration will hide most of eXist-db behind the proxy, providing access only to the chosen app. During development, you may instead want to expose everything, i.e. including dashboard, eXide etc. To do so, change the nginx configuration:

  1. Remove the section referring to eXide
  2. Change the / section to read as follows, replacing docker-publisher with whatever name you chose for the upstream:
location / {
   # change upstream server placeholder 'docker-publisher' below to what you configured above for upstream
   proxy_pass http://docker-publisher$request_uri;
   proxy_redirect http://$host/ /;
   proxy_set_header   Host $host;
   proxy_set_header   X-Real-IP $remote_addr;
   proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
   proxy_set_header   X-Forwarded-Host $server_name;
   # proxy_cookie_path /exist /;
   client_max_body_size  512m;
}

Finally, change the CONTEXT_PATH environment variable in docker-compose.yml to read auto:

environment:
   NER_ENDPOINT: http://ner:8001
   CONTEXT_PATH: auto

After restarting docker compose you should be redirected to the dashboard as the main entry point.