OCR-D/ocrd_all

Docker: interference with older versions of core


We build the Docker image with ocrd/core:$CORE_VERSION as the base stage, where CORE_VERSION is the ref currently checked in as a submodule. That's not ideal: we then run make all, which installs core again, this time as an editable install with /build/core as the source location. The two installations end up mixed, so derived stages cannot simply update core again.
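
To see which kind of installation is actually active inside a given image, a small check along the following lines might help. It is only a sketch: it relies on the direct_url.json record that pip writes for direct-URL installs (PEP 610), and older setup.py develop style installs may not have that file at all. The helper names describe_direct_url and describe_installed are made up for illustration.

```python
import json
from importlib import metadata


def describe_direct_url(direct_url_text):
    """Return (is_editable, source_url) for a PEP 610 direct_url.json payload.

    A missing file (None or empty text) is reported as (False, None),
    which is what a plain install from PyPI looks like.
    """
    if not direct_url_text:
        return (False, None)
    data = json.loads(direct_url_text)
    editable = bool(data.get("dir_info", {}).get("editable", False))
    return (editable, data.get("url"))


def describe_installed(name="ocrd"):
    """Return (version, is_editable, source_url), or None if not installed."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    # read_text() returns None when the distribution has no direct_url.json
    return (dist.version,) + describe_direct_url(dist.read_text("direct_url.json"))


if __name__ == "__main__":
    print(describe_installed("ocrd"))
```

Running this in the base stage and again after make all would show whether the active ocrd distribution switched from a regular install to an editable one pointing at /build/core.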

In core 2.61.0, the distribution changed from ocrd + ocrd_utils + ocrd_models etc. to just ocrd. As a workaround to avoid breaking things, it was decided to keep shipping the other packages as aliases. As a result, all of those other packages now have to be uninstalled before proceeding.
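
As a rough illustration of that uninstall step, the sketch below lists which of the alias distributions are still present in the current environment and prints a matching pip uninstall command. The set of names is taken from the alias packages mentioned in this issue; the function names are hypothetical, and nothing here is part of ocrd_all itself.

```python
import re
from importlib import metadata

# Alias distributions named in this issue (shipped separately before core 2.61.0).
LEGACY_DISTRIBUTIONS = {
    "ocrd-utils",
    "ocrd-models",
    "ocrd-modelfactory",
    "ocrd-validators",
    "ocrd-network",
}


def normalize(name):
    """Normalize a distribution name as in PEP 503 (ocrd_utils -> ocrd-utils)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_legacy(installed_names):
    """Return the sorted legacy alias distributions among installed_names."""
    return sorted({normalize(n) for n in installed_names} & LEGACY_DISTRIBUTIONS)


def installed_distribution_names():
    """Names of all distributions visible to the running interpreter."""
    names = (dist.metadata["Name"] for dist in metadata.distributions())
    return [name for name in names if name]


if __name__ == "__main__":
    leftovers = find_legacy(installed_distribution_names())
    if leftovers:
        # Only print the command; actually running it is left to the caller.
        print("pip uninstall -y " + " ".join(leftovers))
    else:
        print("no legacy ocrd* distributions found")
```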

(This is why #412 is still broken in Docker despite #415 installing the fixes for OCR-D/core#1195.)

Also, some sub-submodules (e.g. ocrd_fileformat → ocr-fileformat → textract2page) will pull in ocrd from PyPI again:

  • because they may depend on ocrd in a version range that is incompatible with the version in the base image, or
  • because they may depend on one of the aliased ocrd* packages (ocrd-utils, ocrd-models, ocrd-modelfactory, ocrd-validators, ocrd-network), which since v2.61 are only available via PyPI, or
  • because they may have a make install rule or similar that uses pip install --force-reinstall or pip install -U.
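
To find candidates for the last case, one could scan a submodule's Makefile (or install script) for pip install lines that force a reinstall or upgrade. The following is a simple text-matching sketch, not a Makefile parser: it splits lines on whitespace, treats any token containing "pip" as a pip invocation (so $(PIP) matches too), and the function name risky_pip_lines is made up here.

```python
# Flags that make pip replace an already installed distribution.
RISKY_FLAGS = {"--force-reinstall", "-U", "--upgrade"}


def risky_pip_lines(text):
    """Return (line_number, stripped_line) for pip install lines with risky flags."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        tokens = line.split()
        calls_pip = any("pip" in token.lower() for token in tokens)
        if calls_pip and "install" in tokens and RISKY_FLAGS.intersection(tokens):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python scan_pip.py path/to/Makefile [more files...]
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for number, line in risky_pip_lines(handle.read()):
                print(f"{path}:{number}: {line}")
```

Any hit only marks a line worth reviewing; whether it really reinstalls ocrd depends on the dependencies of the package being installed.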

My suggested fix would be to use the same strategy as in #417: assume core (via target bin/ocrd) is already installed right from the start, i.e. from the FROM stage.

So in the Makefile, change DOCKER_MODULES to exclude core in all variants. (There is still a secondary rule which would install bin/ocrd via PyPI, but it does not fire, because that target should already be up to date.) To still ensure that we get an editable installation and a git clone for core, the Dockerfile for core should be changed as well.