google/blockly

Clean up case-sensitivity in registry

Opened this issue · 0 comments

Check for duplicates

  • I have searched for similar issues before opening a new one.

Problem

Registry

The registry is the central object that allows developers to replace various parts of the blockly ui or api. It's how we managed to do dependency injection sort-of when the library wasn't originally built for you to be able to swap Toolbox classes, for example. You can register replacement or additional classes for the toolbox, toolbox items, flyout, metrics manager, serializers, and more.

Case sensitivity

Every item in the registry has a type and a name. The type is used by blockly to know what sort of item you are registering (e.g. toolbox vs flyout). We'll always use this in a case-insensitive manner. If you say you are registering a tOoLbOx type, we'll get whatever you registered when we look it up under toolbox.

The name is treated a bit differently. Usually, we'll treat the name as case-insensitive, but we will also remember the case of the name you give us. So you can't have two different toolboxes named foo and FOO. We'll overwrite the first one with the second one you give us. If you ask us for a toolbox named FoO, we'll give you the toolbox you registered regardless of the case of the name you registered it under.

When the registry is asked for all items of a certain type, a parameter can be passed to determine if what is returned is the list of registry items in all lowercase, or with the case as most recently registered under. Users outside of Blockly are unlikely to call this function. The only internal to Blockly use of this function is currently in the serialization system. All registered serializers are requested so they can all be used during serialization and deserialization. When the serialization system calls this function, it does pass the opt_cased parameter as true so that the serializer uses the last-registered cased name of the item. If you registered a custom serializer under the name Foo then the serialized output of the workspace will use the name Foo in the same case you registered it in.

The net result of this is that the registry is case-insensitive when determining what items exist and when getting a singular registry item. But you can ask for the list of all items registered, retaining their original case.

Request

The current state of case-insensitivity in the registry may be confusing. Neil proposed that we either make the registry fully case-insensitive or fully case-sensitive.

Option 1: Fully case-insensitive

  • In this option, we would remove the parts of the registry (i.e. nameMap) that keep track of the actual cased name an item was registered under. We'll always return the normalized name, i.e. lowercased.
  • This could be a severe breaking change because (depending on the implementation) it would mean that saved serialized workspaces would not load. Workspaces serialized under the old system could have output like customSerializer: {...} but the new system would expect to find the output under the name customserializer instead.
    • We should never break developers' serialized output.
    • To mitigate this, the serialization system can become case-insensitive, and when it encounters output like customSerializer it will always lowercase the name before looking for the serializer.
    • This is still a breaking change for two reasons:
      • The output of serializing a workspace will change. In workspaces serialized under the new system, the output would now include customserializer where the old system would say customSerializer. Developers may have custom code that reads the JSON that they will need to adjust to account for this.
      • Output serialized under the new system may not be able to be deserialized under the old system. That would make it impossible for a developer to downgrade their version number in case a critical incompatibility is found.

In my opinion, messing with developers' serialized output and save data is always risky and we should avoid doing so if possible. That makes this option untenable.

Option 2: Fully case-sensitive

  • In this option, we would remove the parts of the registry (i.e. nameMap) that keep track of the actual cased name an item was registered under, and we won't need to lowercase the name stored in typeMap. Instead, the typeMap can just directly use the name passed in during registration.
  • This is a breaking change because people using the registry may be relying on the fact that the registry was not previously case-sensitive. e.g. they may be registering something under the name Foo and trying to retrieve it under the name foo.
  • In order to deal with this breaking change, developers would have to update their code to also be case sensitive.
  • Since it is already the case that you cannot have two different items of the same type registered under the names Foo and foo, we will either need to keep that requirement and throw an error if such a thing is attempted, or relax this restriction. If we keep the requirement, then we still need to keep track of the all-lowercased version of the name separately from the name the user passed in, so the code is not really all that much simpler than the code we have today.

In my opinion, the version of this currently done in #7924 is not that much simpler than the pre-existing code due to the fact that it still has both the typeMap and the noCaseNameMap.

If we decide it is worth doing breaking changes here, I would prefer this option. But I would prefer that we simplify things further than in #7924 and just store the name in the case the developer gave it to us, and eliminate the two-map system.

Option 3: No breaking changes, minor cleanup only

Alternatively, we could decide the way the registry is currently built is fine, or at least not worth introducing breaking changes. There may still be some cleanup we could do to make the situation easier to understand, like moving the beginning of this issue to the documentation site as explanation, introducing a normalize method that can be called instead of currently we call toLowerCase in most places but in one place we also trim the string, etc.

Alternatives considered

No response

Additional context

See some discussion in #7924 , #7982, and #7983