RocketChat/server-snap

Fix the amd64 arch "mongo anomaly" problem

Sing-Li opened this issue · 1 comments

In our haste to get the snap into the hands of as many people as fast as possible, we've hardcoded the mongo download URL into our now long-in-production snapcraft.yaml:

https://github.com/RocketChat/Rocket.Chat/blob/develop/.snapcraft/stable/snapcraft.yaml#L69

and

https://github.com/RocketChat/Rocket.Chat/blob/develop/.snapcraft/edge/snapcraft.yaml#L69

This, of course, renders that particular snapcraft.yaml useless for architectures other than amd64.

The multi-arch version in this repository uses the distribution-specific mongo, which is tested by the Ubuntu maintainers and should now work across architectures.

In theory, we can just switch over to this multi-arch snapcraft.yaml.

In practice, there is a major problem: a snap upgrade will bring down all currently working amd64 snap instances, with no way to roll back. This will impact thousands.

Because the hard-coded mongo is version 3.2 and most of the distribution-specific ones are still on 2.x, even exporting and importing data may not ensure compatibility. And we really don't want to put users with running instances through such pain.

Given the above, what we need to do now to get one unified, maintainable snapcraft.yaml is:

  • use the $SNAP_ARCH variable to detect that we're on amd64 and, for that architecture only, pull the hardcoded mongo 3.2 link as before
  • for all other architectures, use the distribution-specific mongo

Until recently, this would have required the creation of a custom snap part.

But the new scriptlets capability may make this possible with significantly less effort. The scriptlet can have access to the $SNAP_ARCH env var and then override the step accordingly.
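As a rough illustration of the branching described above, here is a minimal shell sketch of the decision such a script would make. The function name `select_mongo_source` and the labels `pinned-3.2` and `distro` are hypothetical, made up for this example. It also assumes, as this issue does, that $SNAP_ARCH is available to the script during the build; that has not been verified here.

```shell
#!/bin/sh
# Hypothetical helper (not part of snapcraft): decide which mongo source
# a build step should use for a given architecture.
# Prints "pinned-3.2" for amd64 and "distro" for every other architecture.
select_mongo_source() {
    # Take the architecture from the first argument, falling back to
    # $SNAP_ARCH (assumed to be set by the build environment).
    arch="${1:-${SNAP_ARCH:-}}"
    case "$arch" in
        amd64)
            # Keep the existing production behaviour: hardcoded mongo 3.2.
            echo "pinned-3.2"
            ;;
        "")
            # No architecture known: fail loudly instead of guessing.
            echo "error: architecture not set" >&2
            return 1
            ;;
        *)
            # Every other architecture uses the distribution-specific mongo.
            echo "distro"
            ;;
    esac
}
```

A build step could then call `select_mongo_source` and, based on the printed label, either download the existing amd64 mongo link or rely on the distribution package. Failing when no architecture is set is a deliberate choice here, so that a misconfigured build cannot silently swap mongo versions under existing amd64 users.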

Scriptlet info (thx aaron): https://insights.ubuntu.com/2017/02/02/run-scripts-during-snapcraft-builds-with-scriptlets/

A lot of work has been going on behind the scenes to solve this problem. Thanks to @elespike.

We will be updating this snapcraft.yaml and evolving it into the one and only snapcraft.yaml used to build snaps for all the architectures.

This "single snapcraft.yaml" will build snaps for all architectures. The resulting binary snaps will all INCLUDE the distro-specific mongo, except in the case of amd64, where the snap will include mongo 3.2 in exactly the same way as our production amd64 snap does today.

This approach will leave the current large base of amd64 users undisturbed, until the mainstream distribution on amd64 catches up to mongo 3.2 (we will make further changes at that time - likely at least a year away).

Before this issue can be closed, we must still:

  • safely migrate the current population of amd64 users from a non-replicaset database to a replicaset database (for oplog support) without downtime or data corruption

We also need to test thoroughly after the following is fixed:

#6