OHDSI/Arachne

Make sure execution engine is online before starting datanode in docker compose file

ablack3 opened this issue · 3 comments

In the docker compose file for deploying Arachne we tell docker that datanode depends on postgres and execution-engine. However, this only guarantees that the execution-engine and postgres containers are started before datanode, not that they have completed their startup process.

I have an issue where the application sometimes starts in docker mode and sometimes in tarball mode, even though I have not changed anything. I do have docker mode enabled in execution engine, but @konstjar and I suspect that when execution engine is not fully started before datanode, the app falls back to tarball mode. The result is non-deterministic behavior across otherwise identical startups.

We can possibly fix it using the wait-for-it script as the entry point to datanode. Perhaps there are other solutions as well.

https://github.com/vishnubob/wait-for-it

version: '3'
services:

  # Application Postgres Database
  arachne-datanode-postgres:
    image: postgres:15.5-alpine
    container_name: arachne-datanode-postgres
    restart: always
    logging:
      options:
        max-size: 100m
    shm_size: "4g"
    networks:
      - arachne-network
    ports:
      - "127.0.0.1:5434:5432" # Port mapping (host:container)
    volumes:
      - arachne-pg-data:/var/lib/postgresql/data # Volume mount for Arachne PG data
    environment:
      POSTGRES_USER: ohdsi-user
      POSTGRES_PASSWORD: ohdsi-password
      POSTGRES_DB: arachne_datanode

  # Execution Engine
  arachne-execution-engine:
    image: odysseusinc/execution_engine:2.2.1
    platform: linux/amd64
    container_name: arachne-execution-engine
    restart: always
    networks:
      - arachne-network
    ports:
      - "127.0.0.1:8888:8888"  # Port mapping (host:container)
    volumes:
      - /tmp:/tmp
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp/executions:/etc/executions
      - /Users/ablack/Desktop/ArachneInstall:/runtimes
    environment:
      - applyRuntimeDependenciesComparisonLogic=true
      - libraries.location.strategus=strategus
      - DOCKER_IMAGE_DEFAULT=executionengine.azurecr.io/darwin-base:v0.3
      - ANALYSIS_MOUNT=/tmp/executions
      - DOCKER_ENABLE=true
      - RUNTIMESERVICE_DIST_DEFAULTDESCRIPTORFILE=descriptor_base.json
      - DOCKER_REGISTRY_USERNAME=
      - DOCKER_REGISTRY_PASSWORD=
      - DOCKER_REGISTRY_URL=

  # Arachne Datanode Service
  arachne-datanode:
    image: odysseusinc/arachne-datanode-ce:2.0.2
    platform: linux/amd64
    container_name: arachne-datanode
    restart: always
    networks:
      - arachne-network
    ports:
      - "127.0.0.1:8082:8080" # Port mapping (host:container)
    volumes:
      - arachne-datanode-files:/var/arachne/files  # Volume mount for Arachne data
    env_file:
      - ~/Desktop/ArachneInstall/datanode.env  # Environment variables file
    depends_on:
      - arachne-datanode-postgres
      - arachne-execution-engine

# Volumes for the services
volumes:
  arachne-pg-data:
  arachne-datanode-files:

# Network definition
networks:
  arachne-network:

# Proposed change to the arachne-datanode service
    depends_on:
      - arachne-datanode-postgres
      - arachne-execution-engine
    entrypoint: ["./wait-for-it.sh", "arachne-execution-engine:8888", "--", "command to start datanode"]

If we use wait-for-it.sh then we need to add this script to the datanode docker image.
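
Another option that would avoid adding a script to the image might be Compose healthchecks combined with the long form of depends_on. The fragment below is only a sketch under a few assumptions: the execution-engine image ships bash (the /dev/tcp port probe is a bash feature), the interval/timeout/retries values are illustrative guesses, and the Compose version in use accepts `condition: service_healthy` (the Compose Specification used by Docker Compose v2 does; the legacy version '3' file format did not). Only the keys that would change are shown.

```yaml
services:

  arachne-datanode-postgres:
    healthcheck:
      # pg_isready ships with the official postgres image
      test: ["CMD", "pg_isready", "-U", "ohdsi-user", "-d", "arachne_datanode"]
      interval: 5s    # illustrative value
      timeout: 3s     # illustrative value
      retries: 20     # illustrative value

  arachne-execution-engine:
    healthcheck:
      # ASSUMPTION: bash is available in this image; the probe only
      # checks that something is listening on port 8888
      test: ["CMD", "bash", "-c", "echo > /dev/tcp/127.0.0.1/8888"]
      interval: 5s    # illustrative value
      timeout: 3s     # illustrative value
      retries: 20     # illustrative value

  arachne-datanode:
    depends_on:
      # Long form: wait until the dependency reports healthy,
      # not merely until its container has been started
      arachne-datanode-postgres:
        condition: service_healthy
      arachne-execution-engine:
        condition: service_healthy
```

Like wait-for-it, the port probe only shows that something is listening on 8888, not that execution engine has finished initializing, so this may narrow the race without fully removing it.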

Discussed with @konstjar. We don't want datanode to depend on execution engine being online when it starts. Datanode should start up and keep looking for execution engine, and connect to it once it is online. That way the order of startup should not matter.

One idea: use https://github.com/vishnubob/wait-for-it in the docker compose file to make sure execution engine is online before starting the front end. A better solution might be for the front end to have its own environment variable for docker mode.

Addressed in #71