GoogleCloudPlatform/cloud-foundation-fabric

fast: allow existing users to migrate when incompatible changes are included in new versions

Closed this issue · 12 comments

Describe the bug
Some users would like to keep their changes in sync with this upstream, but the complexity of doing so can outweigh the benefits (e.g. time savings) when migrating takes longer than building a full FAST structure from scratch.

To Reproduce
After completing a production implementation using v29, try to update from upstream and use v33.

This requires going through a few months of changes and hundreds of lines of code to identify each required mv command, and in some cases even that might not be enough, as new logic or structures would require further analysis.

Expected behavior
Clear instructions on how to migrate, or use of the moved block syntax so that Terraform can automatically identify the changes and perform the migrations.
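As a minimal sketch of what the moved block syntax looks like: the block below tells Terraform that an object already in state under one address now lives under another, so the next plan shows a move rather than a destroy and create. The resource addresses are hypothetical and are not taken from any FAST stage.

```hcl
# Hypothetical example: a folder resource renamed between two releases.
# Both addresses are illustrative and do not come from the FAST codebase.
moved {
  from = google_folder.networking
  to   = google_folder.network
}
```

With such a block in place, no manual state command should be needed for that address; `terraform plan` reports the object as moved.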

This could be complemented with migration instructions in the release changelog.

Additional context
FAST is meant to be used in organizations for production use cases, and part of this production lifecycle should be keeping the FAST implementation up to date with the best practices implemented in this repo.

Related PRs to this issue:

Gustavo, this will never happen and it was discussed many times before: FAST is a toolkit not a product, and its primary goal is to allow our team to quickly spin up standards compliant LZs.

Once an LZ is deployed, by definition it matches requirements and changes should just follow its natural evolution as requirements change.

Dealing with version upgrades is irrelevant to our primary goal, and would put an impossible strain on our limited resources. We share this code as we hope it's useful to the wider community, but this can only be maintained if it does not become a product but remains a toolkit. Our main job is working with customers, not maintaining code.

I understand, but maybe this should be explicitly stated (?) here: https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/blob/master/fast/README.md or here: https://github.com/GoogleCloudPlatform/cloud-foundation-fabric/blob/master/FABRIC-AND-CFT.md

From your statement, it seems like the target user for Fabric (or FAST, specifically) shouldn't be Organizations, as it's not a product for Organizations to use directly; this is also confusing when compared with the other Toolkits available at https://github.com/terraform-google-modules

You're citing Fabric, we're discussing FAST. 😀 We're not preventing you from using other toolkits of course, if those are a better match for your needs.

I don’t want to make this a lengthy discussion, but we're talking about Fabric FAST. If FAST isn't aligned with Fabric's technical and philosophical foundations, then perhaps it should be moved to its own repository to make this distinction clearer.

That said, I’m just a user trying to make things easier for others who are arriving at these wonderful Modules/Toolkits, helping them avoid the same hurdles. By no means do I want to make your life harder—just clearer for everyone who isn’t working on these great tools daily.

Fabric is modules. The scope is wildly different from FAST, and those are very easy to keep up to date.

As much as we would like to operate as a product, this is just a side activity and not our main, or even part-time, job, so I don't see much changing in the future. The main goal is to capture patterns we see in the field and make them repeatable for new deployments, and that is hard enough to do as is. There are only so many hours you can steal from nights and weekends. 😀

So, one thing we could maybe do to focus the effort is stashing away one of the test organizations and using it to upgrade FAST by one full version when we release, documenting the steps.

This might give users a rough path forward and some insight into how changes impact an actual org. It would still be limited, as it would happen only after a major release and would ignore the impact of local configurations, but it would still provide some data.

If this sounds like something that would fit in with the requests above, we might start discussing how to do it going forward. It's pretty easy to do even for an end user TBH, so someone really set on upgrading their FAST install (which they should do only for development purposes) could create an org and do the same at any time without waiting for us to do it.

> stashing away one of the test organizations and using it to upgrade FAST by one full version when we release, documenting the steps.
>
> This might give users a rough path forward and introspection into how changes impact an actual org

This sounds like a great idea!

I didn't want to make further assumptions, as I’m not familiar with the lifecycle of FAST behind the scenes, but I imagined the team working on this had a minimal FAST organization deployed where new changes are tested, allowing them to easily identify the required upgrade path. However, if the refactors are done starting from a blank state, it would be harder to identify the upgrade path.

By the way, in some cases, simply adding the moved blocks might be easier than documenting the steps. The tradeoff, of course, is the accumulated noise over time, given how quickly FAST evolves. :)

Yep, a moved-v00.00.00.tf block could be added for each version, then it would be easy to add/remove them as needed.
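To illustrate the idea, a per-version file could look roughly like the sketch below. The file name follows the pattern mentioned above; the module addresses are hypothetical placeholders and do not reflect actual FAST refactors.

```hcl
# moved-v00.00.00.tf
# Sketch only: one moved block per address that changed in this release.
# All module names below are hypothetical.

# A module call that gained a count argument.
moved {
  from = module.organization
  to   = module.organization[0]
}

# A module call that was renamed.
moved {
  from = module.branch-network-folder
  to   = module.net-folder
}
```

Keeping each release's moves in a separate file makes them easy to drop later, with the caveat that removing a moved block is a breaking change for anyone who has not yet applied it.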

Let's reopen so we keep track of this, sorry for being hard-nosed initially but I think this might be actually doable.

The next version, which we will release soon, will contain moved blocks for key FAST stages (bootstrap/resource management/network a design), and notes to simplify migration from the previous version.

Doing this retroactively will unfortunately be too time consuming, but we plan to do this from now on at every release, using a dedicated organization to verify version migrations and to provide high-level guidance and moved blocks for critical stages.

PR with the moved blocks and notes is #2541

Thanks a lot for being persistent and getting us to a point where we try to make this a bit simpler for users. It is still not the primary goal of this effort, but if we can lessen the pain a bit we're all for it. :)

PS - The process to define migrations is time consuming but not terribly complicated, once you have an org dedicated to this. And if Terraform is used for a production org, having a test org (or more than one) on the side for tasks like this should be the default. Like we often tell customers, orgs are free and a vanilla FAST install will cost cents per month. :)