How should the loader impact <script type="module">?

Question

domenic opened this issue 9 years ago · 10 comments

Forking from whatwg/html#443 (comment) and the next two comments

Can you say more? Why?

src has an established meaning for web developers, having to do with URLs, not module specifiers. Breaking that mental model is not really acceptable. For example, <script src="jquery"></script> fetches http://example.com/base/url/jquery; <img src="jquery"> fetches http://example.com/base/url/jquery; and so <script type="module" src="jquery"> must also. But see below, maybe I didn't quite understand what you meant...
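The URL behaviour described here can be checked with the standard URL constructor. A minimal sketch, assuming the document lives at http://example.com/base/url/index.html (the page name is made up for illustration; only the directory part matters):

```javascript
// Hypothetical base URL of the document; index.html is an assumption.
const baseURL = "http://example.com/base/url/index.html";

// src attributes are parsed as URLs relative to the base URL,
// so a bare "jquery" is a relative URL, not a module specifier.
const resolved = new URL("jquery", baseURL).href;

console.log(resolved); // "http://example.com/base/url/jquery"
```

The same resolution applies whether the attribute sits on a script or an img element, which is the mental model being defended.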

More concretely, an inline module (no src attribute) would go through the loader starting at the translate hook; an out-of-band module (with src attribute) would go through the loader starting at the fetch hook. IOW the point of my PS above was that both inline and out-of-band <script type="module">s should go through the loader.
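The two entry points described above can be sketched roughly as follows. SketchLoader and its methods are hypothetical stand-ins that borrow the hook names from this discussion; this is not the Loader API from any specification, and the hooks are synchronous only to keep the sketch short.

```javascript
// Hypothetical stand-in for a loader; only the hook names come from the thread.
class SketchLoader {
  constructor() {
    this.trace = []; // records which hooks ran, in order
    // In-memory stand-in for the network (real hooks would be asynchronous).
    this.network = new Map([
      ["http://example.com/app.js", "export default 1;"],
    ]);
  }

  resolve(specifier) { this.trace.push("resolve"); return specifier; }
  fetch(url) { this.trace.push("fetch"); return this.network.get(url); }
  translate(source) { this.trace.push("translate"); return source; }
  instantiate(source) { this.trace.push("instantiate"); return { source }; }

  // <script type="module">...</script>: the source is already in the page,
  // so the pipeline starts at translate.
  loadInline(source) {
    return this.instantiate(this.translate(source));
  }

  // <script type="module" src="...">: src is a URL, so resolve is skipped
  // and the pipeline starts at fetch, then continues like an inline module.
  loadFromSrc(url) {
    return this.loadInline(this.fetch(url));
  }
}

const inlineLoader = new SketchLoader();
inlineLoader.loadInline("export default 2;");
console.log(inlineLoader.trace); // ["translate", "instantiate"]

const srcLoader = new SketchLoader();
srcLoader.loadFromSrc("http://example.com/app.js");
console.log(srcLoader.trace); // ["fetch", "translate", "instantiate"]
```

Note that resolve never appears in either trace, which matches the point that src always means a URL.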

HMMM. Overall this seems reasonable. A few things:

Did you intentionally omit resolve? So resolve does not apply to src? That is good. That means src at least uses URLs, not module specifiers, solving my top complaint. (If you want to intercept resolve, then my suggestion of a second attribute comes in. Or just use <script>System.loader.import(...)</script>; it seems about the same number of characters.)
The way of overriding the default fetch behavior in the browser is to use service worker. Adding a second way of intercepting fetches to the web platform---which runs on the main thread, no less---is going to need a lot more discussion. Maybe that is a separate issue for the browser loader spec to work out, but it's a big one that I haven't seen an issue for yet. Heck, maybe you set up the browser loader inside your service worker, using specialized APIs on the FetchEvent to pass it to a loader instance, or something.
Running translate and instantiate on all <script type="module">s makes sense. It's basically building a framework into the browser for allowing custom execution of <script type="text/x-transcode-me">...</script> to be built by supplying two functions (translate/instantiate). This framework is the high-level API that would normally have to be expressed using mutation observers + shadow DOM + some probably-complicated dependency management logic, as-is done today. If there's implementer interest in building a framework for this use case, then translate/instantiate on <script type="module"> seems like a reasonable path (although again, maybe service workers would fit more with the platform, since that's where people will be transcoding multimedia and other response bodies).

Let me try to say it a little differently: my principal concern with not allowing the loader to participate in the hooking of script/module tags is that <script type="module"> becomes an incomplete story for how to kick off an app. Devs would need to learn that if they want loader integration, they have to write a top-level wrapper script. And if that's the case, then <script type="module"> is less universally reliable than just always kicking off apps with dynamic APIs, à la:

Yeah, I get that, at least for apps which need to globally customize translate/instantiate behavior.

On the other hand, I'm not sure <script type="module"> is really aimed at developers who want custom loaders. I would think such developers would, well, use their custom loader. That is, you seem to be proposing that such developers will do:

<script>
System.loader = class JSXLoader extends System.Loader { ... };
</script>

<script type="module" src="foo.jsx"></script>
<!-- or is it loadfromspecifier="./foo.jsx"? -->

whereas it seems more likely to me that they will do:

<script type="module">
const jsxLoader = new class JSXLoader extends System.Loader { ... };
jsxLoader.import("./foo.jsx");
</script>

or perhaps

<script type="module" src="https://cdn.example.com/jsx-loader.js"
          data-start="./foo.jsx"></script>

which is just generic sugar for the above, given a sufficiently well-written jsx-loader.js. This seems more compositional, messing with less global state, and more transparent as to what's going on; it only affects the module trees you explicitly import that way. It seems more likely to work in a world of third-party scripts.

What do you think?

Answer 1 · 2016-01-06T01:45:35.000Z

Did you intentionally omit resolve? So resolve does not apply to src?

Oh, absolutely -- that's right, there's no resolve since there's no module name, and src always means URL.

The way of overriding the default fetch behavior in the browser is to use service worker. Adding a second way of intercepting fetches to the web platform---which runs on the main thread, no less---is going to need a lot more discussion.

Sure, although that's inherent to having a fetch hook in the loader API at all. I agree that there's overlap, and SW is mostly more general. It may well be the case that the fetch hook is less critical than the others, esp. the translate hook, since you can hook fetches with SW. (I'd frankly even be up for considering eliminating the fetch hook from the entire loader API.) The important thing is for <script type="module"> to be a full participant in the language semantics, which is hooked by the loader.

This seems more compositional, messing with less global state, and more transparent as to what's going on; it only affects the module trees you explicitly import that way.

Yeah, but examples like polyfills actually depend on global state that everyone needs to share (for example, shared mutations to builtins, or a runtime shared between all compiled code). And these modifications want to be coalesced in one place so that other code can be written under the assumption that they are operating in the proper environment (for example, an ES7 environment, whether provided by the browser directly or by a polyfill). Such global mutations are for sure less compositional, which is why an app needs to install them in one single place at the top level. All the other code just operates under the assumption that the global heap is in the appropriate state.

or perhaps <script type="module" src="https://cdn.example.com/jsx-loader.js" data-start="./foo.jsx"></script>

These examples are definitely plausible. The thing I'm aiming for is to try to get as close as possible to a programming model where you can pretend <script> without type="module" doesn't exist, or at least eschew it within individual apps or "house styles." (This is part of why I really would like us to work on a way to get to <module> some day as an alias for <script type="module">, so the ergonomics are as sweet as <script>. But I know there are major parser/security challenges with that.)

Now, I'll grant that installing a polyfill with <script src="babel-polyfill.js"> already violates that goal. :-/ I'm definitely open to alternatives. Your suggestions are interesting, especially because they don't use <script> at all, but the part that bothers me is that I don't want to disallow having multiple scripts in a page, or require putting all relevant script in separate files. Keep in mind that skipping the loader means you can never have, say, inline ES8 transpiled via Babel, or inline transpiled WebAssembly.

Answer 2 · 2016-01-06T01:46:26.000Z

(Brief aside: @wycats and I have talked about this general space as "the Web's staging challenge" but had a hard time articulating it crisply. Systems like SW and module loaders are about reflectively modifying the semantics of a web app from within the app itself, so there needs to be a way to indicate that the modifications happen in a stage prior to the processing of the code that depends on those modifications. Throw in the performance pitfalls of blocking and it's a pretty subtle space. The nice thing about the module system and the loader is that they're designed to be thoroughly asynchronous. But the place where the rubber meets the road is exactly what we're talking about here: how, where, and when you get to insert your modifications into HTML.)

Answer 3 · 2016-01-06T02:02:18.000Z

We might really want to think about SW integration, BTW. Given that it's made more headway on the "first boot" phenomenon than anything, maybe loader/SW integration could be a more fruitful tack.

Answer 4 · 2016-01-06T02:04:38.000Z

@dherman @domenic I agree. I think that having the SW be responsible for early-stage extensible-web extensions (like new packaging formats, the loader, etc.) is the most likely way to let applications "set up the universe" before their app boots.

SW's first-boot story isn't amazing yet, but it's eminently abstractable, and once you abstract it, the same solution will work for all kinds of extensions.

(in this case, you can imagine an SW hook for "give me the loader configuration")

Answer 5 · 2016-01-06T02:26:06.000Z

whereas it seems more likely to me that they will do:

<script type="module">
const jsxLoader = new class JSXLoader extends System.Loader { ... };
jsxLoader.import("./foo.jsx");
</script>

You think people are going to inline a loader into a script tag rather than use a prebuilt one? I see this scenario happening very rarely, for page-specific needs only. Anything generic like a JSX loader will surely be distributed.

Which brings up a concern with regard to the defer semantics. Consider this:

<script type="module" src="jsx-loader.js"></script>
<script type="module" src="app.js"></script>

If app.js depends on loader hooks installed by jsx-loader.js, this means the complete tree of jsx-loader.js must be executed before app.js can do anything at all.

This stinks, so I'm ok with saying that loader hooks must be installed before any type=module scripts, meaning that they must be added via a sloppy script tag.

Answer 6 · 2016-01-06T02:31:43.000Z

Oh, absolutely -- that's right, there's no resolve since there's no module name, and src always means URL.

Note that there are no module names in the WHATWG Loader spec, period, only module identifiers and urls (keys).

How are these "anonymous modules" registered in the Loader registry?

<script type="module">
  import $ from 'jquery';

  $(function() { ...
</script>

Or is this module not in the registry? If so, what is the value of referrer in the resolve hook for when jquery is imported?

Maybe they could be assigned a Symbol as the key so that document.querySelector('[type=module]').key allows you to get the module key or something to that effect. Although I think currently the Loader expects keys to be strings.

Answer 7 · 2016-01-06T17:47:22.000Z

@matthewp:

Oh, absolutely -- that's right, there's no resolve since there's no module name, and src always means URL.
Note that there are no module names in the WHATWG Loader spec, period, only module identifiers and urls (keys).
How are these "anonymous modules" registered in the Loader registry?

A module doesn't need to be part of the registry, in which case, the key is irrelevant (could be anything). In the case of inline modules, we can simply create a new source text module record bound to a loader instance (a loader back-pointer is required to load dependencies but the module will not be added to the corresponding registry automatically), and then evaluate it.
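A rough sketch of that idea, using hypothetical names throughout (RegistrySketchLoader, loadDependency, and evaluateInline are not from the Loader spec, and the dependency key is an assumed resolve result): the inline module's record points back at the loader, but only its dependencies end up in the registry.

```javascript
// Hypothetical sketch: names and record shapes are illustrative only.
class RegistrySketchLoader {
  constructor() {
    this.registry = new Map(); // key -> module record
  }

  // Dependencies loaded through the loader get a registry entry.
  loadDependency(key) {
    if (!this.registry.has(key)) {
      this.registry.set(key, { key, loader: this });
    }
    return this.registry.get(key);
  }

  // An inline module gets a record with a back-pointer to the loader
  // (needed to load its dependencies) but no registry entry of its own.
  evaluateInline(source, dependencyKeys) {
    const record = { key: undefined, source, loader: this };
    record.dependencies = dependencyKeys.map((k) =>
      record.loader.loadDependency(k)
    );
    return record;
  }
}

const registryLoader = new RegistrySketchLoader();
const inlineRecord = registryLoader.evaluateInline(
  'import $ from "jquery";',
  ["http://example.com/base/url/jquery.js"] // assumed result of resolving "jquery"
);

console.log(registryLoader.registry.size); // 1 (only the dependency)
console.log([...registryLoader.registry.values()].includes(inlineRecord)); // false
```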

Or is this module not in the registry? If so, what is the value of referrer in the resolve hook for when jquery is imported?

This is TBD. Last time we spoke about this, the page base URL was suggested, but since that value will not be an entry in the registry, undefined might be a better option. Again, this is TBD.

Although I think currently the Loader expects keys to be strings.

No, it is just Let keyString be ? ToString(key).
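One consequence worth spelling out: the ToString abstract operation coerces most values to strings but throws a TypeError for Symbols, so the Symbol-as-key idea above would not pass that step as written. A small sketch, using a template literal as a stand-in for ToString (toKeyString is a made-up helper, not a spec or Loader function):

```javascript
// Template literals apply the ToString abstract operation to their
// substitutions, so this approximates "Let keyString be ? ToString(key)".
function toKeyString(key) {
  return `${key}`;
}

console.log(toKeyString("jquery")); // "jquery"
console.log(toKeyString(42));       // "42"

// ToString on a Symbol throws a TypeError instead of returning a string.
let symbolError = null;
try {
  toKeyString(Symbol("inline-module"));
} catch (err) {
  symbolError = err;
}
console.log(symbolError instanceof TypeError); // true
```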

Answer 8 · 2016-01-07T23:23:05.000Z

@dherman

And these modifications want to be coalesced in one place so that other code can be written under the assumption that they are operating in the proper environment (for example, an ES7 environment, whether provided by the browser directly or by a polyfill). Such global mutations are for sure less compositional, which is why an app needs to install them in one single place at the top level. All the other code just operates under the assumption that the global heap is in the appropriate state.

I don't think this is necessarily true, at least for transpilation. There it seems better to explicitly annotate which code runs in the transpiled environment. I can't imagine a case in which this is not under your control; if you are including the script, you can include it within a given loader scope.

I agree that global modifications to the built-ins must be shared, but those don't have much to do with the loader---any old <script type="module">, without a loader at all, can perform those modifications.

Another idea I liked was

<script type="module" loader="path/to/loader.js">
import "./goes-through-loader.jsx";
import "./more/files/through-loader";
</script>

These examples are definitely plausible. The thing I'm aiming for is to try to get as close as possible to a programming model where you can pretend <script> without type="module" doesn't exist, or at least eschew it within individual apps or "house styles."

Definitely agreed. I don't think there's anything preventing that though.

but the part that bothers me is that I don't want to disallow having multiple scripts in a page, or require putting all relevant script in separate files. Keep in mind that skipping the loader means you can never have, say, inline ES8 transpiled via Babel, or inline transpiled WebAssembly.

I guess this is addressed by my <script type="module" loader="path/to/loader.js"> idea.

@matthewp

You think people are going to inline a loader into a script tag rather than use a prebuilt one? I see this scenario happening very rarely, for page-specific needs only. Anything generic like a JSX loader will surely be distributed.

No, of course not; that's why I gave a second example. Or do import JSXLoader from "https://cdn.example.com/jsx-loader.js".

Which brings up a concern in regards to the defer semantics, consider this:

Yes, that seems clear. See whatwg/html#443 (comment) and whatwg/html#443 (comment). I am thinking you can opt in to allowing out-of-order nondeterministic execution by adding the async attribute.

Note that there are no module names in the WHATWG Loader spec, period, only module identifiers and urls (keys).

It keeps changing. At one point it was identifiers; these days I think it's specifiers. I think we can excuse @dherman for not using the exact right noun.

Answer 9 · 2016-01-11T13:02:15.000Z

Just to comment on Service Worker integration: in browsers that support Service Worker (with a good "first boot" story), surely the fetch and translate hooks are effectively already provided by the service worker hooks themselves?

That is, if the <script type="module"> tag is not going to respect fetch and translate from the loader, then surely there would be no need for them to be included in the loader spec?

Answer 10 · 2016-01-12T19:35:13.000Z

Service Workers provide a way to globally override a page's fetching. In multiple-loader scenarios, you need to know which loader is performing the fetch in order to apply the correct overrides (for example, our JSXLoader might use more aggressive caching than the default System.loader would).
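To make the multiple-loader point concrete, here is a rough sketch with hypothetical classes (nothing below is a real Loader or Service Worker API, and the network is an in-memory stand-in). Each loader instance carries its own fetch policy, which a single global interception point could only reproduce if it were told which loader initiated the request.

```javascript
// Hypothetical in-memory "network" that counts requests per URL.
const requestCounts = new Map();
function networkFetch(url) {
  requestCounts.set(url, (requestCounts.get(url) || 0) + 1);
  return `source of ${url}`;
}

// Default policy: always go to the network.
class DefaultSketchLoader {
  fetch(url) {
    return networkFetch(url);
  }
}

// A loader with more aggressive caching: each URL is fetched at most once.
class CachingSketchLoader extends DefaultSketchLoader {
  constructor() {
    super();
    this.cache = new Map();
  }
  fetch(url) {
    if (!this.cache.has(url)) {
      this.cache.set(url, super.fetch(url));
    }
    return this.cache.get(url);
  }
}

const defaultLoader = new DefaultSketchLoader();
const jsxLoader = new CachingSketchLoader();

defaultLoader.fetch("/a.js");
defaultLoader.fetch("/a.js");
jsxLoader.fetch("/foo.jsx");
jsxLoader.fetch("/foo.jsx");

console.log(requestCounts.get("/a.js"));    // 2
console.log(requestCounts.get("/foo.jsx")); // 1
```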

Aside from that, it would be good to keep this specification portable to runtimes other than the web browser.