tc39/proposal-import-attributes

Define what a type is

xtuc opened this issue · 29 comments

xtuc commented

Currently the proposal seems to imply that the type strictly matches a mimetype: json will check for the application/json mimetype, for instance.

However, there are cases where it might be too specific:

  1. BinAST has the mimetype application/javascript-binast while JavaScript has application/javascript. Some browsers can support both, depending on what the server sends.
    For instance:

    import u from a with type: "javascript";

    Should allow both a BinAST-encoded file and a plain JavaScript file to be imported.

  2. Similarly for images: it's common for a web server to send a different image format or quality depending on the client's request (based on the Accept header or similar).
    For instance:

    import u from a with type: "image";

    Should allow image/jpeg or image/webp in the response mimetype.

The interpretation of type should be something like:

  • If type is image, then
    1. assert image/jpeg, image/webp
  • If type is javascript, then
    1. assert application/javascript, application/javascript-binast
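To make the interpretation above concrete, here is a minimal sketch of such a lookup. The table name `ACCEPTED_MIME_TYPES` and the function `isAcceptedFor` are hypothetical and only illustrate the idea; nothing like this is defined by the proposal, and the MIME lists are just the ones from the bullets above.

```javascript
// Hypothetical table: each module type maps to the set of MIME types it accepts.
const ACCEPTED_MIME_TYPES = new Map([
  ["image", new Set(["image/jpeg", "image/webp"])],
  ["javascript", new Set(["application/javascript", "application/javascript-binast"])],
]);

// Returns true when the response MIME type is acceptable for the declared type.
// Throws for a type that has no entry in the table.
function isAcceptedFor(type, responseMimeType) {
  const accepted = ACCEPTED_MIME_TYPES.get(type);
  if (accepted === undefined) {
    throw new TypeError(`Unknown module type: ${type}`);
  }
  return accepted.has(responseMimeType.trim().toLowerCase());
}

// Usage: a host would run this check before handing the resource to a parser.
const okWebp = isAcceptedFor("image", "image/webp");
const okBinast = isAcceptedFor("javascript", "application/javascript-binast");
const okMismatch = isAcceptedFor("javascript", "image/jpeg");
```

Here `okWebp` and `okBinast` are true, while `okMismatch` is false, so the import would be rejected.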

Note, the definition of a JSON MIME type is also complex, see whatwg/mimesniff#112 for current discussion.

Yeah, seems like each module type will have to be defined such that it maps to a set of accepted MIME types, which may overlap with the set of accepted types for a different module type.
The overlapping part is important to note for future-proofing. Today we have JSON modules, and say that in the future someone wants to introduce (as a random example) "image" modules where an import results in a canvas or something. Say also that there is some image format that is also valid JSON, which has a MIME type like image/foo+json. Then, we would want both of the following to work for a given resource of type image/foo+json:

import img from "./my_resource" with type: "image";
import img from "./my_resource" with type: "json";

where the former yields a canvas/bitmap/whatever and the latter yields the corresponding JSON object.

XML/SVG are a current example where a resource can be interpreted as multiple types.

xtuc commented

We could specify sensible defaults, like image being image/foo (not json), and if the browser finds the resolution ambiguous, it emits a warning. The developer could clarify with a new attribute:

import img from "./my_resource" with type: "image", as: "json";

Moreover, bundlers usually have multiple loaders that can include an image; some will include it as a JSON string, some as a blob, and some as a URL. The developer could clarify this as well with the attribute:

import img from "./my_resource" with type: "image", as: "raw";

There would need to be a separate mapping of all of the types to their associated MIME types. That seems like a lot, since this would need to live somewhere, and who would be responsible for updating it? Wouldn't that cause us to have to version it so it doesn't break sites? Although it seems like a rare use case, is there any reason the author couldn't just list the multiple MIME types they would accept in the import statement?

Some of these semantics get web-specific. I wrote up my thoughts on the interaction with the web at #24 .

Is there value in making the type more generic? The user just says that they don't expect execution of code from this resource, making it something like this:

import img from "./my_resource" with type: "data";
import img from data "./my_resource";

Otherwise it feels like it devolves into a very verbose duplication of information already provided elsewhere that could encourage people to not use this pattern, instead opting for shipping the data embedded in JS because it's "easier to use that way".

It definitely seems to me like the only thing motivating this proposal is "does the user expect an import to execute in the JS env, or not" which sounds to me like a boolean condition instead of a silly complex type, or "any data".

xtuc commented

We don't need a complex type, but there are cases more complex than "does the user expect an import to execute in the JS env, or not".
For instance, if you import a CSS module and the server mistakenly sends it with a text/html content type. The developer needs a way to tell the host to interpret it as a CSS module or fail.

Edit: we have data about Content-Type vs. file extension mismatches: https://github.com/littledan/proposal-module-attributes/blob/master/content-type-vs-file-extension.md.

For instance, if you import a CSS module and the server mistakenly sends it with a text/html content type. The developer needs a way to tell the host to interpret it as a CSS module or fail.

I would argue that it is way more likely that the server accidentally makes a schema change in the CSS or JSON data that breaks the importing module. And I wouldn't expect the module syntax to protect from that. If the server sends unexpected responses, the program will break in unexpected ways. I don't think any amount of syntax will prevent that.

@xtuc I'm not sure why - if it works, it works; if it doesn't, it doesn't. Why should the consuming code pay a tax that tightly couples it to implementation details of a specifier?

@jkrems That's definitely a possibility. In WICG/webcomponents#839 (comment) , @rniwa suggested that separating types further to avoid parser misuse would be preferable, but earlier in the thread, there are simpler noexecute ideas presented. The TC39 proposal can work to provide the syntactic basis for either option, as you showed.

Are custom types possible, and if not, would it require a breaking change to add support (future-proofing)? For example:

import Foo from "foo" with type: "Mustache";
import Bar from "bar" with type: "ReactComponent";

Background

This proposal looks similar to "plugins" in RequireJS which allowed you to write loader plugins that can load different types of resources as dependencies.

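In RequireJS, the syntax used "!" to separate the plugin name from the resource name, and you could for example load JSON like this (assuming a json loader plugin is available):

require(["json!foo", "json!bar"], function (foo, bar) {
  // foo and bar are json objects
});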

Here is the list of the plugins that were at one point supported:
https://requirejs.org/docs/plugins.html

The plugin's role was to parse the source text and return a type. A type could be a template (Jade, Handlebars, Mustache, etc.) or even HTML.
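As a rough illustration of that role, here is a sketch of what a JSON loader plugin could look like. The `load(name, parentRequire, onload, config)` shape and `onload.error` follow the RequireJS loader plugin API; the `makeJsonPlugin` factory and its `fetchText` callback are hypothetical stand-ins for however the source text is obtained. In a real RequireJS setup the returned object would be handed to `define()`.

```javascript
// Hypothetical factory. `fetchText(name, callback)` is an assumed helper that
// calls back with (error, text); it is not part of RequireJS.
function makeJsonPlugin(fetchText) {
  return {
    // RequireJS calls load() once for each "json!<name>" dependency.
    load(name, parentRequire, onload, config) {
      fetchText(name, (err, text) => {
        if (err) {
          onload.error(err);
          return;
        }
        let value;
        try {
          // The plugin's job: turn the source text into the value the module exports.
          value = JSON.parse(text);
        } catch (parseError) {
          onload.error(parseError);
          return;
        }
        onload(value);
      });
    },
  };
}
```

The design point is that the dependency string only names the plugin; the plugin alone decides how the text is interpreted, which is close to what a host does when it interprets a type.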

xtuc commented

The host could allow custom types to be added, because types are interpreted by the host. As you mentioned in another issue, though, if we want types to be future-proof, I guess it's not recommended.

Is the ability to polyfill a concern? If the host doesn't allow custom types, an application would be unable to handle the absence of mustache support gracefully in a browser that doesn't ship it yet (using mustache as a placeholder for "future format").

I think, within build tools, it should be possible to define a custom type. On the web, as a native feature, this would require that some JavaScript code runs to set up the interpretation of these types before the modules load. This becomes more of a research problem, similar to ServiceWorker on first load.

"ServiceWorker on first load" seems to be planning to ban TLA in SWs; i'm not convinced that direction will be a feasible solution in a general sense.

@ljharb I don't understand the connection between these two things. Those are just two times ServiceWorker came up in TC39.

In #3 (comment) , there is a suggestion that we could ask people to write MIME types rather than these more abstract types. This was my first intuition as well, but I think the current direction of higher-level types is good because:

  • MIME types are more complicated than just checking that strings are equal (there's a parameter syntax, several JS MIME types, JSON is defined by a sort of suffix of the MIME type, etc)
  • It makes more sense across environments (can be interpreted as either file extension or MIME type, depending on environment)
  • Higher-level types are just shorter and easier to type and remember!
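To illustrate the first bullet, here is a small sketch of why a plain string comparison is not enough. The helper names are hypothetical. The JSON rule (essence is application/json or text/json, or the subtype ends in +json) follows the definition in the WHATWG MIME Sniffing Standard as I understand it, and the JavaScript list is deliberately partial.

```javascript
// Extract the "essence" (type/subtype) of a MIME type string, dropping any
// parameters such as "; charset=utf-8" and normalizing case.
function mimeEssence(mimeType) {
  return mimeType.split(";")[0].trim().toLowerCase();
}

// JSON is identified by two exact essences or by a "+json" suffix.
function isJsonMimeType(mimeType) {
  const essence = mimeEssence(mimeType);
  return (
    essence === "application/json" ||
    essence === "text/json" ||
    essence.endsWith("+json")
  );
}

// Partial list for illustration only; the standard recognizes more legacy
// JavaScript MIME types than these.
const JAVASCRIPT_ESSENCES = new Set([
  "application/javascript",
  "text/javascript",
  "application/ecmascript",
  "text/ecmascript",
  "application/x-javascript",
]);

function isJavaScriptMimeType(mimeType) {
  return JAVASCRIPT_ESSENCES.has(mimeEssence(mimeType));
}
```

With these helpers, both `application/json; charset=utf-8` and `image/foo+json` count as JSON, which a naive `=== "application/json"` comparison would miss.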

Higher-level types could be based on the MIME type groups from the WHATWG MIME Sniffing Standard, which defines the following groups:

  • image
  • audio or video
  • font
  • ZIP-based
  • archive
  • XML
  • HTML
  • scriptable
  • JavaScript
  • JSON

Perhaps new MIME type groups would be needed for CSS or stylesheet, and for WASM.

bmeck commented

One thing that is valuable is for personal/vendor-specific types to be clearly defined. Using short names like the MIME groupings from WHATWG would give us a central registry to coordinate across environments, but doesn't really say how to specify non-standard extensions. MIME types themselves have prs., vnd., and x. for non-standard types, as well as a means to register new types on a standards track. I think it may require some more typing, but if we use MIME there is certainly more flexibility for non-standard usages, as well as a clear means for non-web usage to expand a standard set of known types.

An idea could be to define two properties (e.g. type and group).

import foo from "./foo.odt" with {type: "application/vnd.oasis.opendocument.text"};
import foo from "./foo.sid" with {type: "audio/prs.sid"};
import foo from "./foo.bar" with {type: "video/x.bar"};
import foo from "./foo.json" with {group: "JSON"};
import foo from "./foo.js"; // the default could be with {group: "JavaScript"}

A drawback to having multiple properties is that this may be considered over-engineered, confusing, or 'silly complex'.

xtuc commented

If we want the as "json" syntax to be consistent with the with {type: "json"} syntax, then it would need to be only one value.

Why should we specify the meaning of attributes instead of letting the host (Web, Node, etc.) decide it?

The moduleSpecifier is a normal string and the platform can decide its module resolution, the module attribute should work in the same way.

+1 to what @Jack-Works said. My impression was that TC39 would specify the mechanism, but similar to specifier strings it would be up to the host to decide semantics.

My impression was that tc39 would specify the mechanism, but similar to specifier strings it would be up to the host to decide semantics

Given the years of confusion around specifier strings, to me that sounds like an argument for TC39 to have stronger opinions on the value of the attribute.

You can see the current draft spec for a definition of type:

  • It must be a check (e.g., for the file extension or MIME type), and not part of the cache key; use a different attribute key if you want to reinterpret the same resource
  • type: "json" must be a single default export, which is the output of JSON.parse for some host-provided string
  • In the case of JSON modules in HTML, the check is checking that the subresource has a JSON MIME type (which is not just one type but many possibilities, see whatwg/mimesniff#112)
  • Hosts may provide interpretations of their own types, or their own attribute keys
    • The ECMA-262 spec continues to leave interpretation of specifier strings up to hosts, in general. Some hosts may try to align more or less with each other here (e.g., there are some ways that Node and the Web are similar in terms of treating module specifiers as URL-like things, where Moddable XS differs). This lack of definition extends to other attribute keys or type values that ECMA-262 does not define.
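The first three bullets can be sketched as a toy host loader. Everything here is hypothetical: `createLoader`, the `fetchResource` callback (assumed to return `{ contentType, text }` synchronously), and the record shape are illustrations of the described behavior, not the spec's algorithm.

```javascript
// Toy host loader. `fetchResource(specifier)` is an assumed callback that
// returns { contentType, text } for a resource.
function createLoader(fetchResource) {
  // The cache is keyed by specifier only: `type` is a check, not part of the key.
  const moduleMap = new Map();

  return function load(specifier, attributes = {}) {
    let record = moduleMap.get(specifier);
    if (record === undefined) {
      record = fetchResource(specifier);
      moduleMap.set(specifier, record);
    }

    if (attributes.type !== "json") {
      throw new TypeError("This sketch only handles type: \"json\"");
    }

    // The check: on the web this would be a JSON MIME type test.
    const essence = record.contentType.split(";")[0].trim().toLowerCase();
    const isJson =
      essence === "application/json" ||
      essence === "text/json" ||
      essence.endsWith("+json");
    if (!isJson) {
      throw new TypeError(`${specifier} is not served with a JSON MIME type`);
    }

    // A JSON module has a single default export: the result of JSON.parse.
    if (record.namespace === undefined) {
      record.namespace = { default: JSON.parse(record.text) };
    }
    return record.namespace;
  };
}
```

Loading the same specifier twice returns the same namespace object and fetches only once, which is the practical consequence of keeping `type` out of the cache key.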

Does this resolve the issue? Do folks have concerns with these answers? The decisions here are pretty fundamental, so I'm tagging this issue as "Stage 2".

I think that's a pretty complete definition that should answer the important questions. So for the purposes of this issue, this feels enough to close it.

  • Hosts may provide interpretations of their own types, or their own attribute keys
    • This lack of definition extends to other attribute keys or type values that ECMA-262 does not define.

Does this resolve the issue? Do folks have concerns with these answers? The decisions here are pretty fundamental, so I'm tagging this issue as "Stage 2".

I think it is beneficial to have common semantics and interpretation across conforming host platforms; just like the current draft spec defines for type: "json". My concern is a host platform may use a good type value that ECMA-262 has not yet defined, which could potentially prevent use of the good type value in a future specification. I consider the following to be good type values that might be defined in future specifications: image, audio, video, font, zip, archive, xml, html, css, stylesheet, wasm. I propose the spec reserves good type values, such that conforming hosts may not use them.
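As a sketch of what such a reservation could mean for a host, assuming a hypothetical registry API (`createTypeRegistry`, `register`, and `has` are made up for illustration, and the reserved list is simply the one suggested above):

```javascript
// Type values proposed above as reserved for future specifications.
const RESERVED_TYPES = new Set([
  "image", "audio", "video", "font", "zip", "archive",
  "xml", "html", "css", "stylesheet", "wasm",
]);

// Hypothetical host-side registry for custom type values.
function createTypeRegistry() {
  const handlers = new Map();
  return {
    // Registers a host-defined type unless the name is reserved.
    register(type, handler) {
      if (RESERVED_TYPES.has(type)) {
        throw new TypeError(`"${type}" is reserved for future specifications`);
      }
      handlers.set(type, handler);
    },
    has(type) {
      return handlers.has(type);
    },
  };
}
```

A host could then register something like "mustache" freely, while an attempt to claim "css" would fail early instead of colliding with a later standard definition.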

I have no idea if this is a stage 2 concern or not; please accept my apologies if this suggestion is inappropriate at this stage.


Note, this concern has been addressed, at least in my mind, by #63 (comment)

Although TC39 might not enforce interpretation, it might be a good idea to write up a table of the types expected in each host environment, along with their expected semantics, just to get a better idea of the scope of things and to get everyone on the same page. This might also help spot where the proposal is currently lacking something we have not thought about.