guybedford/import-maps-extensions

Idempotency requirement of the HostResolveImportedModule does not seem to be preserved?

domenic opened this issue · 13 comments

https://github.com/guybedford/import-maps-extensions#proposal-3 says

It is important that the extension process does not break the idempotency requirement of the HostResolveImportedModule hook defined in the ECMA-262 specification.

but its procedure for allowing "imports" extensions, viz.

For "imports", the extension is straightforward - if a lazy-loaded map attempts to redefine an existing property of the import map, we throw a validation error.

does not seem to respect that.

For example:

import.meta.resolve("./x"); // https://example.com/x

await loadSecondImportMap();

import.meta.resolve("./x"); // https://example.com/y

seems perfectly possible with this algorithm, if the first import map contains nothing for ./x and the second contains "./x": "./y".

(It's also possible to run into this for bare specifiers, if e.g. the first contains a mapping for "dep/" and the second contains a mapping for "dep/x". Or vice-versa. But that is probably patchable with a more complicated rule restricting such keys in the map.)

Perhaps it may make sense to only permit URL mappings during the first import map loading phase, banning them subsequently for lazy loaded import mappings?

For mapping things like node:fs we could possibly special case supported v unsupported schemes in this process.

Yeah @hiroshige-g and I were discussing this and came up with a spectrum of options:

  1. Statically disallow URL-like specifiers and bare specifiers that overlap with previously-declared bare specifiers, after initial "acquiring import maps" phase. Pro: very predictable; con: unclear whether the remaining dynamism is actually useful for developers.

  2. Disallow loading an import map that redefines a previously-resolved import specifier. Pro: allows more flexibility; con: introduces possibly-nondeterministic races between any code that resolves/imports modules and code that loads import maps, where a bad race causes failure.

  3. Globally cache already-resolved import specifiers, so that if they are overriden by subsequent import map loads then the import maps is ignored. Example: first import map does dep/ -> 1/, import.meta.resolve("dep/x") -> 1/x, second import map does dep/ -> 2/, then import.meta.resolve("dep/x") -> 1/x still but import.meta.resolve("dep/y") -> 2/y. Pros/cons similar to (2) but losing the race will more often work out fine (but sometimes give subtly confusing contradictions).

  4. Same as (3) but the cache is per-referring-script. This seems most "do what I mean" in that it allows the meanings of specifiers to basically change over time, as long as they are imported/resolved from a script which hasn't previously tried to import them. It's just a bit counterintuitive.

One use case scenario that might be worth considering here is the scenario where there is effectively a large import map backing the entire multi-page application, but that that larger import map was just pruned for optimized delivery on a page load, such that new parts of the import map are loaded dynamically only as needed.

The map in this case would be self-consistent in that new mappings wouldn't redefine old ones, but we likely do need the ability to override previously mapped specifiers with new scopings, so despite the simplicity of (1) I agree this is likely too restricting to be practical.

With (2) and (3) tracking things at the specifier level seems a little too intricate and non-deterministic and it does seem unfortunate that the resolver is no longer just a function of the import map, but a function of the entire history of the resolver as well.

The suggestion I put together in https://github.com/guybedford/import-maps-extensions#1-defining-immutable-import-map-extension was to permit new import map scopes, only when no modules in that scope have been previously loaded at all, which can be done with relatively simple checking of the URLs already in the module registry, and no new data being stored short of optimizing that check.

This approach doesn't work well with extending nested subscoping though or adding new mappings to a scope:

{
  "scopes": {
    "./node_modules/": {
      "node:fs": "/node-fs.js"
    }
  }
}

being extended later on by:

{
  "scopes": {
    "./node_modules/": {
      "node:zlib": "/node-zlib.js"
    }
  }

or even by:

{
  "scopes": {
    "./node_modules/pkg/": {
      "local": "/map.js"
    }
  }
}

Ideally both of the above would be supported since they fall under what might be considered a single consistent mapping, that is just being dynmically extended.

Perhaps one way to achieve the above without tracking would be to say that the bare specifiers can be overridden because they aren't defined at a higher level in the map?

That is, perhaps a simple consistent rule is: any new bare specifier can be added, but if a bare specifier overrides an existing bare specifier, then it must be defined in a scope for which no modules have been loaded at all.

Finally what about this case:

{
  "imports": {
    "x/": "./y/"
  }
}

being extended by:

{
  "imports": {
    "x/z": "./z"
  }
}

Your tracking of individual specifiers is maybe better at dealing with the above, but perhaps it should just be banned so that you either define the mappings all together at the same time or not.

That is, perhaps a simple consistent rule is: any new bare specifier can be added, but if a bare specifier overrides an existing bare specifier, then it must be defined in a scope for which no modules have been loaded at all.

I am unsure how to think about this. It has similar restrictions to my (1), but loosening them inside scopes. But that loosening brings with it the raciness of (2), right? Scenario:

<!-- /index.html -->
<script type="module">
import("/scope/bar.mjs");
loadImportMap("./import-map.json");
</script>
<script type="importmap">
{
  "imports": {
    "bare": "/1.mjs"
  }
}
</script>
// /scope/bar.mjs
import("bar");
{
  "scopes": {
    "/scope/": {
      "bare": "/2.mjs"
    }
  }
}

If the import() finishes first, /scope/bar.mjs executes with "bare" pointing to /1.mjs. If loadImportMap() finishes first, then /scope/bar.mjs executes with "bare" pointing to /2.mjs.

So I think that rule also meets the definition of "non-deterministic and [...] the resolver is no longer just a function of the import map". Or am I missing something?

Strictly speaking I don't think your example is actually racey, since the sync resolver immediately blocks off the "/scope/" scope, but if instead of having import('/scope/bar.mjs') you had import('/app.js') which in turn had import '/scope/bar.mjs' or you had a time delay before the dynamic import you can get the race.

So I think that rule also meets the definition of "non-deterministic and [...] the resolver is no longer just a function of the import map". Or am I missing something?

Right, it's a function of the import map and the set of modules in the module map which determine which scopes are locked, so it's less restrictive than (1), (2) and (3) in allowing scoped versioning over existing bare specifiers, without needing to track every imported specifier using the module map itself as the source of truth of which scopes are blocked.

It's basically the same as (4) actually I think? Perhaps (4) is more flexible though, although would (4) permit a delayed dynamic import in a scope to be different, depending on the timing of that dynamic import? Ie import('bare') could change even if it is made in the same /some/pkg.js depending on when it is run if the scope had changed?

I tend to like the minimal determinism that the parent module being loaded already implies the mappings are locked within that module scope regardless of future timing if that makes sense?

Strictly speaking I don't think your example is actually racey, since

Well I was interpreting your "...for which no modules have been loaded at all" as locking it in at load time, not at resolution time. But yeah resolution time is more natural.

It's basically the same as (4) actually I think?

(4) is extremely flexible. It allows the following:

<!-- /index.html -->
<script type="importmap">
{
  "imports": {
    "bare": "/1.mjs"
  }
}
</script>

<script type="module">

// Notice the awaits

await import("bare"); // imports 1.mjs

await loadImportMap("./import-map-2.json");
await import("./doimport.mjs?1"); // indirectly imports 2.mjs

await loadImportMap("./import-map-3.json");
await import("./doimport.mjs?2"); // indirectly imports 3.mjs
</script>
// doimport.mjs
import("bare");
{
  "//": "import-map-2.json"
  "imports": {
    "bare": "/2.mjs"
  }
}
{
  "//": "import-map-3.json"
  "imports": {
    "bare": "/3.mjs"
  }
}

i.e. bare means completely different things per-referring-module. Scopes aren't involved.

I tend to like the minimal determinism that the parent module being loaded already implies the mappings are locked within that module scope regardless of future timing if that makes sense?

Maybe. I'm unsure how much it buys us, since it's an intermediate where you still have races (based on whether a "parent" module gets loaded faster or slower than an import map). I'm slightly attracted to the extremes, of (1) or (4). They seem relatively easy to explain? But, I'm not firmly in any camp...

As mentioned, the concern with (1) is that it doesn't solve the use case of a deferred partial import map being able to support alternative multi-version branching over the same specifier for another scoping. This is an important property to have for extensibility of import maps in extending that version deduping ability without blocking off package names just because they have been used before (local aliasing will always be a thing).

The example for (4) looks great, I'd be very happy with that personally.

The question I specifically had for (4) was the following (with the same import maps from your example):

await loadImportMap("./import-map-1.json");
await import('bare');
await loadImportMap("./import-map-2.json");
await import('bare');

Would the above load a different resolution for bare the second time?

Alternatively, what if instead it used:

await loadImportMap("./import-map-1.json");
await import('bare-1');
await loadImportMap("./import-map-2.json");
await import('bare-2');

Where import-map-1 maps bare-1 and import-map-2 maps bare-2?

Finally, what if import-map-1 mapped both bare-1 and bare-2 in the above?

Would the above load a different resolution for bare the second time?

No. The same module always sees the same result. (Note that in my example, doimport.mjs?1 and doimport.mjs?2 are different modules.)

Alternatively, what if instead it used:

This would work.

Finally, what if import-map-1 mapped both bare-1 and bare-2 in the above?

This would definitely work, and already works today? Not sure I'm understanding the question.

In the final case, import-map-1 maps bare-1 and bare-2, and import-map-2 maps only bare-2 but to a different resolution than what bare-2 was mapped to in import-map-1. So the question is which resolution would bare-2 get, the first or the second? I take it it would get the second resolution for bare-2 since it hasn't resolved it before?

Would import.meta.resolve also participate in populating this cache?

In the final case, import-map-1 maps bare-1 and bare-2, and import-map-2 maps only bare-2 but to a different resolution than what bare-2 was mapped to in import-map-1. So the question is which resolution would bare-2 get, the first or the second? I take it it would get the second resolution for bare-2 since it hasn't resolved it before?

Ah, I see. Yes, it would get the second resolution for bare-2.

(Although, I guess this is all a good reminder that any progress here is predicated on first defining an import map merge algorithm. We'd probably first want to spec and finish that with the existing "acquiring import maps" lockdown semantics, since it's separable.)

Would import.meta.resolve also participate in populating this cache?

Yes, I think it would have to.

Right, so what I was originally proposing were some restrictions on the merge algorithm, which effectively add some restrictions to (4):

  • Any new imports or scopes can be extended in.
  • Existing mappings in "imports" cannot be replaced - only new mappings added.
  • When overriding a specific scope, an existing mapping also cannot be replaced and only new mappings added.
  • Both of the above to also apply to sub-mappings - ie if there is a mapping for pkg/, you cannot add a mapping for pkg/x/ in either "imports" or a specific scope.
  • And of course, combined with URL mappings not being permitted at extension time.

I'm open to completely unrestricted merging, or having some merging rules. My original intuition was that stronger merge restrictions could help ensure better determinism.

The final rule about scope restrictions was then effectively another merge restriction for locking down inter-scope determinism:

  • A new scope cannot be added if any existing successfully linked modules in the registry exist within that scope already