OHDSI/Perseus

Building DQ Dashboard takes over 24 hours

Closed this issue · 8 comments

natb1 commented

I'm not sure if this is a bug or normal behavior. Building the latest (master) commit of SoftwareCountry/DataQualityDashboard ran for over 24 hours before I accidentally killed it. After restarting, it has been running for several hours. It appears to be stuck on step 15/15 of the build, ./mvnw package. I am not on powerful hardware, but this seems excessive. I have already added platform: linux/amd64 to the docker-compose file, so I don't think it's a platform issue.

What is the normal experience for building the dashboard? Does it normally take a long time?

This varies greatly depending on the size of the data source, but a complete DQD execution can take many hours. We've made improvements to the build in the develop branch, but it still requires optimization.

natb1 commented

Thanks. That's helpful to keep in mind. Just to clarify, I'm referring to the build of the Docker image in the SoftwareCountry fork of DataQualityDashboard. That image is used in the documented process for setting up Perseus, which is then used to run the DQ analytics after loading data. In theory, no data has been loaded or analyzed at the time the image is built. I suspect something is hanging in the Dockerfile.

I will investigate more. I was just wondering if anybody was having success building that Dockerfile.

@natb1 as far as I know, the build should not take this long. @kostiushenko and @chMatvey please have a look at the concerns from @natb1 about the Dockerfile for DQD.

natb1 commented

It does appear to be hanging. I'm still trying to pin down exactly why. The build completes if I split the Dockerfile into its two build steps:

1st: comment out the second half (see below) and build, so the mvn package step is cached,
2nd: restore the second half and build the whole thing using the cached mvn package step.

But if I build the whole thing from scratch (docker-compose build --no-cache data-quality-dashboard), it hangs. So apparently some sort of file locking is happening.

Here is the Dockerfile with the second build step commented out. If I cache this first, then the full build will work.

# 1st Build Step
#FROM openjdk:17-alpine as build
FROM openjdk:17-alpine
WORKDIR /workspace/app

# Source
COPY src src
COPY inst/shinyApps/www/css src/main/resources/static/css
COPY inst/shinyApps/www/htmlwidgets src/main/resources/static/htmlwidgets
COPY inst/shinyApps/www/img src/main/resources/static/img
COPY inst/shinyApps/www/js src/main/resources/static/js
COPY inst/shinyApps/www/vendor src/main/resources/static/vendor
COPY inst/shinyApps/www/favicon.ico src/main/resources/static/favicon.ico
COPY inst/shinyApps/www/index.html src/main/resources/static/index.html

# Maven
COPY mvnw .
COPY .mvn .mvn
COPY pom.xml .

RUN tr -d '\015' <./mvnw >./mvnw.sh && mv ./mvnw.sh ./mvnw && chmod 770 mvnw

RUN ./mvnw -X package

# 2nd Run Step
# FROM openjdk:17-alpine

# RUN apk update \
#     && apk add openssh-server \
#     && export ROOTPASS=$(head -c 12 /dev/urandom |base64 -) && echo "root:$ROOTPASS" | chpasswd

# COPY sshd_config /etc/ssh/

# VOLUME /tmp

# ARG JAR_FILE=/workspace/app/target/*.jar
# COPY --from=build ${JAR_FILE} app.jar

# EXPOSE 8001

# ENTRYPOINT ["sh", "-c", "java ${JAVA_OPTS} -jar /app.jar ${0} ${@}"]
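
The two-pass workaround described above might also be reproducible without editing the Dockerfile, by building only the first stage and then the full image. This is a minimal sketch and was not tested in this thread. It assumes the first stage keeps its name (as build), that the commands run from the directory containing the Dockerfile, and that the second pass reuses the cached mvn package layer. The function name build_in_two_passes, the DOCKER override variable, and the dqd-build-stage tag are hypothetical.

```shell
# Hypothetical helper: build the Maven stage first, then the full image.
# DOCKER can be overridden (for example, to wrap or dry-run the command).
build_in_two_passes() {
  # Pass 1: build only the stage named "build" so ./mvnw package gets cached.
  "${DOCKER:-docker}" build --target build -t dqd-build-stage . &&
  # Pass 2: build the whole Dockerfile, expected to reuse the cached stage.
  "${DOCKER:-docker}" build -t data-quality-dashboard .
}

# Usage (from the DataQualityDashboard checkout):
#   build_in_two_passes
```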

I have not been able to reproduce this.
My average build time is about 4.5 minutes.

# 1st Build Step
FROM --platform=linux/amd64 openjdk:17-alpine as build

WORKDIR /workspace/app

# Source
COPY src src
COPY inst/shinyApps/www/css src/main/resources/static/css
COPY inst/shinyApps/www/htmlwidgets src/main/resources/static/htmlwidgets
COPY inst/shinyApps/www/img src/main/resources/static/img
COPY inst/shinyApps/www/js src/main/resources/static/js
COPY inst/shinyApps/www/vendor src/main/resources/static/vendor
COPY inst/shinyApps/www/favicon.ico src/main/resources/static/favicon.ico
COPY inst/shinyApps/www/index.html src/main/resources/static/index.html

# Maven
COPY mvnw .
COPY .mvn .mvn
COPY pom.xml .

RUN tr -d '\015' <./mvnw >./mvnw.sh && mv ./mvnw.sh ./mvnw && chmod 770 mvnw

RUN ./mvnw package

# 2nd Run Step
FROM --platform=linux/amd64 openjdk:17-alpine

RUN apk update \
    && apk add openssh-server \
    && export ROOTPASS=$(head -c 12 /dev/urandom |base64 -) && echo "root:$ROOTPASS" | chpasswd

COPY sshd_config /etc/ssh/

VOLUME /tmp

ARG JAR_FILE=/workspace/app/target/*.jar
COPY --from=build ${JAR_FILE} app.jar

EXPOSE 8001

ENTRYPOINT ["sh", "-c", "java ${JAVA_OPTS} -jar /app.jar ${0} ${@}"]

natb1 commented

It may have something to do with the version of docker being used. I am using Docker version 20.10.20, build 9fdeb9c. So, this may affect people who are newly installing docker.

I think it's related to docker/buildx#484. If I follow the guidance in that issue and disable Docker BuildKit, the build completes successfully. To mitigate the issue, maybe something can be done about the volume of output from mvn package?
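
One way to reduce the output volume, sketched here as an untested idea, is to send the Maven log to a file and print only its tail on failure. Whether output volume is really what triggers the hang is only a guess in this thread. The function name quiet_package and the MVNW and MVN_LOG variables are hypothetical. --batch-mode and --no-transfer-progress are standard Maven options, but the latter needs Maven 3.6.1 or newer, which the wrapper in this repository may or may not provide.

```shell
# Hypothetical helper: run mvn package with its output redirected to a log
# file, so the Docker build step itself prints very little.
quiet_package() {
  log="${MVN_LOG:-/tmp/mvn-package.log}"
  if "${MVNW:-./mvnw}" --batch-mode --no-transfer-progress package >"$log" 2>&1; then
    echo "mvn package succeeded; full log in $log"
  else
    status=$?
    # On failure, show only the last part of the log.
    tail -n 200 "$log"
    return "$status"
  fi
}

# Usage, for example from a script invoked by the Dockerfile's RUN step:
#   quiet_package
```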

From what I see, they recommend something like this:

DOCKER_BUILDKIT=0 docker build -t data-quality-dashboard .

As an alternative:

.docker/daemon.json

Change the value of "buildkit" to false so it looks like this:

{
  "registry-mirrors": [],
  "insecure-registries": [],
  "debug": true,
  "experimental": false,
  "features": {
    "buildkit": false
  }
}

Another alternative:

echo "
export DOCKER_BUILDKIT=0
export COMPOSE_DOCKER_CLI_BUILD=0" >> $HOME/.bashrc

But I will not be able to check it all on my side.

natb1 commented

Right, if buildkit is disabled then it seems to work. So, I expect the build will fail for anybody with a newer version of docker that uses buildkit.
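
For a one-off build, the same two environment variables can be set for a single command instead of being appended to .bashrc. This is a minimal sketch that combines the variables and the docker-compose command already mentioned in this thread. The function name build_without_buildkit and the COMPOSE override variable are hypothetical.

```shell
# Hypothetical helper: run one compose build with BuildKit disabled,
# without changing ~/.bashrc or daemon.json.
build_without_buildkit() {
  DOCKER_BUILDKIT=0 COMPOSE_DOCKER_CLI_BUILD=0 \
    "${COMPOSE:-docker-compose}" build --no-cache "$@"
}

# Usage:
#   build_without_buildkit data-quality-dashboard
```

The variables apply only to that invocation, so other builds on the same machine keep using BuildKit.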