bird-house/birdhouse-deploy

:question: [Question]: Can we change PAVICS to Birdhouse everywhere

Closed this issue · 8 comments

Questions

As we're writing documentation for Marble (which uses the birdhouse software), it seems strange to describe configuration and deployment instructions using language that refers specifically to PAVICS.

I would really like it if we changed references to "PAVICS" in the documentation, code, etc. to "Birdhouse".

Unfortunately, to do this fully, this will require some major changes for existing deployments. This is because PAVICS is currently in variable names and file names which means that we'll need to update env.local files and CI scripts.

For example:

  • references to PAVICS_FQDN will need to be changed to BIRDHOUSE_FQDN
  • scripts that call pavics-compose.sh will need be changed to call birdhouse-compose.sh

I understand that this is a going to be a big pain to change for existing deployments so I propose we do one of the following:

  1. change PAVICS to Birdhouse everywhere and do the hard work of updating the current deployments where needed
  2. change PAVICS to Birdhouse everywhere and leave in a bunch of redundant environment variables and symlinked files so that existing deployments will work with the old setup. Then we can take our time changing things over and officially drop support for the old variable and file names in version 3.
  3. add a disclaimer to the documentation saying that for legacy reasons, some of the files use the word PAVICS (but don't worry about it).
  4. some other option

Obviously, option 3 would be the easiest to implement but options 1 and 2 would improve the code base (in my opinion).

I'm interested to hear what people think of these options.

@huard @fmigneault @tlvu @eyvorchuk

References

Environment

Information Value
Server/Platform URL
Version Tag/Commit
Related issues/PR
Related components

Conceptually, I agree it should be renamed to birdhouse.

I was actually in the process of adjusting/adding some override variables to replace items that were explicitly enforced to Ouranos/PAVICS details for all instances, specifically :

'institution': 'Ouranos',
'researchSubject': 'Climatology',
'supportEmail': '${SUPPORT_EMAIL}',
'category': 'Resource/Cloud Management',
'tags': ['Climatology']
},
'stats': {
'method': '.*',
'route': '(?!)' # this will be set by CANARIE_STATS_ROUTES (see below)
},
'redirect': {
'doc': 'https://pavics-sdi.readthedocs.io/en/latest/arch/backend.html',
'releasenotes': 'https://github.com/bird-house/birdhouse-deploy/blob/master/CHANGES.md',
'support': 'https://github.com/bird-house/birdhouse-deploy/issues',
'source': 'https://github.com/bird-house/birdhouse-deploy',
'tryme': 'https://${PAVICS_FQDN_PUBLIC}',
'licence': 'https://pavics-sdi.readthedocs.io/en/latest/license.html',
'provenance': 'https://pavics-sdi.readthedocs.io/en/latest/provenance/index.html'
},

Preferably, those would be adjusted at the same time to avoid opening the configs multiple times.
I was trying to figure out a way to make these values defaults with "STRONG suggestion/warning" to override them, but if the choice of this discussion is to introduce breaking changes with birdhouse renames, I would prefer to update these simultaneously to required VARS.

I would like to propose a derived approach in between 1-2 regarding the variables.
IMO, we should explicitly replace all PAVICS_ variables to BIRDHOUSE_ in the code.
However, the core default.env would define them like so:

export BIRDHOUSE_FQDN="${PAVICS_FQDN}"

Since (current) pavics-compose.sh checks these required VARS to be non-empty, it does not matter if they are overridden directly or indirectly by PAVICS_... names. There would not be duplicate variables in this code base except these default expansions. Only deployments that explicitly set PAVICS_... in their env.local would see duplicate variables for equivalent concepts.

I am also OK with option 1 (for variables) if everyone agrees. Setting 2-3 extra vars in our env.local is not much extra work.

The biggest issue IMO is the pavics-compose.sh file. Within the code, a BIRDHOUSE_COMPOSE could be provided to reference/override the name as needed and avoid future issues with hard coded pavics-compose.sh references. The actual file name is not that important in that case. The main pain point is external scripts such as the CI pointing directly at pavics-compose.sh. Doing a search-replace of pavics-compose.sh or generating the symlink during cloud-init is equivalent (in terms of work/validation time) for me.

I was trying to figure out a way to make these values defaults with "STRONG suggestion/warning" to override them, but if the choice of this discussion is to introduce breaking changes with birdhouse renames, I would prefer to update these simultaneously to required VARS.

Do you mean that you'd like to include these changes in #415 ? That's fine with me

I am also OK with option 1 (for variables) if everyone agrees. Setting 2-3 extra vars in our env.local is not much extra work.

I would prefer this since it seems like minimal one time effort to update the env.local file. But, I'm also ok with this strategy if @tlvu @eyvorchuk would prefer:

I would like to propose a derived approach in between 1-2 regarding the variables.
IMO, we should explicitly replace all PAVICS_ variables to BIRDHOUSE_ in the code.

Ok, now about pavics-compose.sh:

The biggest issue IMO is the pavics-compose.sh file. Within the code, a BIRDHOUSE_COMPOSE could be provided to reference/override the name as needed and avoid future issues with hard coded pavics-compose.sh references.

I like this idea I lot.

Another thing we can do as well is to fix and extend the make start and make stop commands with something like make compose (that would then pass on any additional argument to the actual compose file).

That was we can recommend any external scripts in the future to just use the make commands.

Doing a search-replace of pavics-compose.sh or generating the symlink during cloud-init is equivalent (in terms of work/validation time) for me.

I agree, I'm willing to make this update once to external scripts.

Another thing we can do as well is to fix and extend the make start and make stop commands with something like make compose (that would then pass on any additional argument to the actual compose file).

From experience, I have found that passing "additional arguments" via make is sometimes more troublesome or unnatural for inexperienced users. For example, to replicate up -d, you have to do something like:

make COMPOSE_XARGS="up -d" compose
# or 
COMPOSE_XARGS="up -d" make compose

With the target that does:

compose:
    @$(SHELL) $(SCRIPT) $(COMPOSE_XARGS)

For simple uses like up -d, it is acceptable. For more complicated values in COMPOSE_XARGS that could require various quotes, it can become an escape nightmare. I'm not against having it available, though.

tlvu commented

I have no problem with renaming variables in env.local since the new names can be easily added ahead of time to env.local and the transitions will be completely seamless.

The problems I see is with external repos that hooks into the platform and that use those variables. Ex: the default.env or docker-compose-extra.yml of those external repos use PAVICS_FQDN for example.

All the config variables in the various default.env and env.local are like the stable interface between the platform and external components.

To ease the transition, I suggest to set all the old names even if they are not used in the platform anymore. So at the end of read-configs.include.sh we can create a new function, something like allow_backward_compat(), and in there we restore export PAVICS_FQDN="$BIRDHOUSE_FQDN" and so on for all the old PAVICS_ ... names.

As for the script pavics-compose.sh, how about creating a symlink pavics-compose.sh -> birdhouse-compose.sh to keep backward-compat with external repos that we have no control over? Creating BIRDHOUSE_COMPOSE var to ease future rename is a good as well. Is this the only script having pavics in the name? How about we avoid using platform name specific in scripts and just call it manager.sh or compose-wrapper.sh or something generic that designate its role only?

PAVICS was the name of a previous funding project (currently DACCS). We learned our lesson and did not use DACCS anywhere internally. Should we also learn the lesson as well and limit using platform names (Birdhouse) in the "public API" (config var and entrypoint scripts) to protect us from future branding rename? It's the same rename effort, might as well choose new names that will prevent the need for future renames.

I agree with @fmigneault that using make is awkward when invoking scripts. However, for workflow that involves calling multiples scripts in a sequence, make can be useful to capture the various steps. So no objection to add make support but it should not be the only way to interact with the platform.

To ease the transition, I suggest to set all the old names even if they are not used in the platform anymore.

Yes I think this is a good idea

As for the script pavics-compose.sh, how about creating a symlink pavics-compose.sh -> birdhouse-compose.sh

I'm happy to do this but you'll have to make sure that your scripts can handle a symlinked file as expected. It might be just as much effort to change the references in your scripts.

Is this the only script having pavics in the name?

There is also:

  • PAVICS-deploy.logrotate (this one is used by the scheduler component and should be easy to update)
  • trigger-pavicscrawler (this one is deprecated so I'm not worried about changing it)
  • configure-pavics.sh (this is called by provision.sh but maybe it's called by external scripts as well? ... I don't know because I'm not using vagrant)

Should we also learn the lesson as well and limit using platform names (Birdhouse) in the "public API" (config var and entrypoint scripts) to protect us from future branding rename?

I don't think this in necessary. The name of the software in this repo is "Birdhouse", that is different from a project that uses this software (e.g. PAVICS, DACCS, Marble, etc.). So I'm ok if variables and files in this project refer to "Birdhouse" as long as it doesn't refer to any specific project that uses the software.

So no objection to add make support but it should not be the only way to interact with the platform.

Ok I'm happy with this

Thank you @tlvu and @fmigneault for your feedback! Based on the discussion above I'm going to propose the following plan. Let me what you think:

  1. Change all PAVICS_ variables to BIRDHOUSE_
  2. Rename all files with PAVICS in the name to birdhouse
  3. recreate pavics-compose.sh that displays a deprecation warning and then calls birdhouse-compose.sh
    • calling pavics-compose.sh will also call the allow_backward_compat() function that redefines PAVICS_ variables for any external component that uses them.
  4. Create a migration guide that describes how to update env.local and any external scripts to support the new naming conventions
  5. make sure that the make start and make stop commands work with the new conventions (but don't add any more make commands for now)

For point 3, I suggest reusing the warning strategy introduced in #415. A slightly different messages could be provided when detecting PAVICS_ names.
@tlvu pointed out that other variables such as default passwords could be added in those warnings. I think this is a good idea and we should combine these warning mechanisms as much as possible to avoid duplicating logic.

Sounds good to me for the rest of the proposal.

we should combine these warning mechanisms as much as possible to avoid duplicating logic.

I agree.

I can wait until #415 is finished and then add another PR using the strategies you introduce there