pwncollege/dojo

Improve "continuous upgrading" story

Closed this issue · 3 comments

Occasionally we make changes to the dojo that require additional configuration values in config.env, or in other files such as /etc/docker/daemon.json, to be updated within the dojo. Our current approach is to make the changes manually on prod and implement a working default in container-setup.sh. This works for "fresh installs", but it hard-breaks the dojo for community members who are following master.

Perhaps we should consider adding some kind of "sanity check" script, similar to container-setup.sh, that runs at dojo launch and outputs a user-friendly message, as opposed to relying on whatever error these breaking changes happen to produce?
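To make the idea concrete, here is a minimal sketch of what such a check could look like. It is not taken from the dojo codebase: the function name check_config_keys and the message wording are hypothetical, and the caller would have to supply the list of keys that config.env actually requires.

```shell
#!/bin/sh
# Sketch of a pre-launch sanity check (hypothetical, not part of the dojo repo).

# check_config_keys FILE KEY...
# Prints one line per missing key and returns non-zero if the file is absent
# or any key is missing.
check_config_keys() {
    file=$1
    shift
    if [ ! -f "$file" ]; then
        echo "sanity check: $file not found"
        return 1
    fi
    missing=0
    for key in "$@"; do
        # A key counts as present if some line starts with KEY=
        if ! grep -q "^${key}=" "$file"; then
            echo "sanity check: $file is missing ${key}"
            missing=1
        fi
    done
    return "$missing"
}
```

A launch script could then call something like `check_config_keys /path/to/config.env SOME_KEY OTHER_KEY || exit 1` (path and key names are placeholders) so that a stale config stops the launch with a readable message instead of an unrelated error later on.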

Can you give me an example failure case?

On my end, I am assuming that any end users who want to upgrade their dojo to a newer commit are going to bring down their dojo instance, git pull, docker build, and then docker run. This should be a relatively fast operation. I'm fine with our ability to hot patch the dojo requiring "arcane" knowledge (e.g. dojo update is chill if it's just ctfd changes, not so chill if you're making infra-level changes like new config.env values or daemon.json changes). If you don't have such "arcane" knowledge, just bring it down, rebuild, and bring it back up; this operation should take less than a minute in most cases (and if it doesn't, this is a situation where you definitely can't hot patch, and we would also be doing that long operation with downtime as well).
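Roughly, that cycle could be scripted as below. This is a sketch under assumptions, not the documented procedure: upgrade_dojo, run, DOJO_CONTAINER, DOJO_IMAGE, and DOJO_RUN_ARGS are hypothetical names, the defaults are placeholders, and the docker run flags a given deployment needs (volumes, ports, privileges) have to come from its own setup. Setting DRY_RUN=1 only prints the commands.

```shell
#!/bin/sh
# Sketch of the down / pull / build / up cycle (hypothetical helper names).

# run CMD...: print the command, and execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# upgrade_dojo REPO_DIR: stop and remove the container, update the checkout,
# rebuild the image, and start a new container from it.
upgrade_dojo() {
    repo_dir=$1
    container=${DOJO_CONTAINER:-dojo}   # placeholder default
    image=${DOJO_IMAGE:-dojo}           # placeholder default

    run docker stop "$container" || return 1
    run docker rm "$container" || return 1
    run git -C "$repo_dir" pull || return 1
    run docker build -t "$image" "$repo_dir" || return 1
    # DOJO_RUN_ARGS is left unquoted on purpose so it splits into separate flags.
    run docker run -d --name "$container" ${DOJO_RUN_ARGS:-} "$image" || return 1
}
```

Because the old container is removed, this only makes sense if persistent data lives outside the container, for example in a mounted volume passed through DOJO_RUN_ARGS.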

I agree with @ConnorNelson .

Maybe the real fix is to update the README with a note on the upgrade process that Connor mentioned. If people don't follow that in production, then WONTFIX.

Example 1

As part of the Windows VM support, the dockerd config must be edited. The necessary dockerd config is added by rebuilding the dojo image itself, not by dojo update. So someone may "update" their dojo instance from the master branch and encounter errors.

Example 2

The recent DB changes added new config.env values. Again, doing a dojo update fails.

You are correct that rebuilding the outer dojo container would resolve these issues, but that isn't clear currently. A README update works for me in that case.