wasm support
sorry if this is the wrong place but i literally could not find a single place online mentioning epub and wasm together
will future epub specifications include wasm support for scripting?
correction, it is mentioned here https://www.w3.org/2020/12/epub3-wg-charter.html
- Guides on improving alignment with the Open Web Platform, such as:
- Which Web APIs should be available in EPUB?
- Usage of [WASM](https://www.w3.org/TR/wasm-core-1/) in EPUB content.
- Relationship of CORS requirement, externally hosted resources, and EPUB content.
As far as I can see, the current standards talk about scripting in very general terms (https://www.w3.org/TR/epub-rs-33/#sec-scripted-content). They are pretty silent on the language of scripts. Of course, today it is JavaScript, but who knows whether something else comes up tomorrow.
What this boils down to, I believe, is that this is entirely a question of individual EPUB Reading System implementation. Some of them do not accept any scripting, some of them do it partially, and I would expect some may accept WASM already today. This does not require any update on the EPUB standards themselves.
> What this boils down to, I believe, is that this is entirely a question of individual EPUB Reading System implementation.
Not entirely, I wouldn't think. To the best of my knowledge, you'll always have an externally compiled .wasm file to import and it's not currently a core media type. Only data blocks are exempt. That'll make it a nuisance to author absent an update to the CMT list.
I was just double-checking and we do have an exemption for script data files, but that's not quite what this is. The files might sneak through epubcheck that way since it doesn't check code fetches or their purpose, but it wouldn't be in the spirit of the specification.
Ah... I did not think of that. I wonder whether it is worth adding this to the discussions at TPAC. @wareid w.d.y.t.?
Should we raise a separate issue proposing to add .wasm to the list of core media types, or does this one suffice? I tend to believe it is better to have it explicitly, but it may be overkill.
Alternatively, we can raise an explicit PR that does that, referring to this issue.
FWIW: the official media type is application/wasm, the registration is https://www.iana.org/assignments/media-types/application/wasm, see also https://www.w3.org/TR/wasm-core-1/#conventions
I think that case is covered in https://www.w3.org/TR/epub-33/#sec-exempt-resources
That's what has me a bit confused. If you fetched another js script and wrote a script tag, that exception wouldn't apply. But in this case the WebAssembly api would be called on the fetched compiled code, so nothing is written into the html. The idea for that exemption was data sets being fed into a script, but maybe compiled code fits in, too.
It might help to clarify that the primary exemption is for any kind of resource used by scripts, and cite both data sets and web assembly code as examples, rather than focus so much on data sets.
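A minimal sketch of the pattern described above, where a script obtains compiled code and hands it to the WebAssembly API without writing anything into the html. The inline byte array, the `AddExports` type, and the function names `instantiateAddSync` and `loadAddFromContainer` are assumptions made for illustration; a real publication would presumably ship a separate .wasm file in the container rather than inline bytes.

```typescript
// Hand-assembled module exporting add(a: i32, b: i32) -> i32.
// Inlined only so the sketch is self-contained (assumption for illustration).
const ADD_MODULE_BYTES = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // local.get 0, local.get 1, i32.add
]);

// Hypothetical shape of the module's exports.
type AddExports = { add: (a: number, b: number) => number };

// Compile and instantiate synchronously; reasonable only for tiny modules.
function instantiateAddSync(): AddExports {
  const compiled = new WebAssembly.Module(ADD_MODULE_BYTES);
  const instance = new WebAssembly.Instance(compiled, {});
  return instance.exports as unknown as AddExports;
}

// Hypothetical loader: fetches compiled code that travels inside the container.
// Nothing is injected into the document; the bytes go straight to the API.
async function loadAddFromContainer(url: string): Promise<AddExports> {
  const response = await fetch(url);
  const bytes = await response.arrayBuffer();
  const { instance } = await WebAssembly.instantiate(bytes);
  return instance.exports as unknown as AddExports;
}
```

In a content document, a call such as `loadAddFromContainer("module.wasm")` (a hypothetical path) would fetch the file shipped with the publication; the synchronous variant is there only so the sketch can be exercised without a server.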
@bduga, that section includes
> This exemption allows EPUB creators to include resources in the EPUB container that are not for use by EPUB reading systems.
One reading of this is that wasm files are not exempt resources, because they are used, albeit indirectly, by the reading system. At the minimum, this ambiguity must be clarified.
Don't we have the same issue with javascript modules? They are only referred to indirectly from the script referred to (or contained by) the <script> element, but you may still want them to travel with your epub.
The answer is probably that javascript modules are covered by the fact that they are "protected" by the fact of being core media types, but it would be more "natural" to consider wasm files the same way.
> @bduga, that section includes
>> This exemption allows EPUB creators to include resources in the EPUB container that are not for use by EPUB reading systems.
> One reading of this is that wasm files are not exempt resources, because they are used, albeit indirectly, by the reading system. At the minimum, this ambiguity must be clarified.
But the next sentence is, emphasis mine:
> The primary case for this exemption is to allow data files to travel with an EPUB publication, whether for scripts to use in their constituent EPUB content documents or for external applications to use
That is the exact use here: wasm files are data files used by javascript in EPUB content documents, which seems to confirm that files used by JS are not considered used by the Reading System. Though, I agree this wording could use some tweaking.
> The answer is probably that javascript modules are covered by the fact that they are "protected" by the fact of being core media types, but it would be more "natural" to consider wasm files the same way.
I agree that we should (probably) treat them the same way, but perhaps the correct thing to do is stop dictating what script types are allowed and just make anything that is referenced by the script tag as exempt resources. Do we really want to be in the business of adding media types every time browsers start adopting new scripting languages?
> I agree that we should (probably) treat them the same way, but perhaps the correct thing to do is stop dictating what script types are allowed and just make anything that is referenced by the script tag as exempt resources. Do we really want to be in the business of adding media types every time browsers start adopting new scripting languages?
I fully agree. Who knows what the community may come up with in the coming years? Direct usage of TypeScript instead of javascript? Or Go/Swift/Rust/whatever?
> just make anything that is referenced by the script tag
referenced directly or indirectly right? Which may open the floodgates...
Yep. But the floodgates I was referring to are different. As you said, the reason for the exemption was "passive" data that scripts may use. Adding wasm to the exemption means adding what is, fundamentally, executable content.
But anyway, I do not think we really contradict one another. Clearly, we may have to think through again how scripts and data should be/could be handled. I vaguely remember that we had some discussion back then in the WG about handling exemption data, and I believe there were voices who did not want to put any type of fence around those data. The arrival of wasm shows that whatever explicit fences we draw up may eventually become too restrictive...
Or some type of portable document format that could be displayed in a viewer inside of an EPUB content document with no fallback required.
I think you're going too far with this interpretation. The input can be a non-CMT, but the result of using it has to be a supported format. Per 6.3.2.5:
> EPUB creators MUST ensure that scripts only generate core media type resources or fragments thereof.
I tend to agree with Ivan that wasm and js modules weren't what we were considering with the exemption, but probably should have been. Why does the script code that calls the module have to be a CMT because it's referenced from a script element but the additional scripting resources it relies on don't have to be? It's a bit arbitrary.
If we make it explicit that resources referenced from a script tag are exempt, as well as any resources that a script can import/fetch/whatever, then the spec will be a lot better for it.
> I think you're going too far with this interpretation. The input can be a non-CMT, but the result of using it has to be a supported format. Per 6.3.2.5:
>> EPUB creators MUST ensure that scripts only generate core media type resources or fragments thereof.
I don't think I am. There is nothing in that statement to say I can't have a script that acts as a renderer of any content type, it just can't produce them (and presumably directly load). But any type of file can be loaded and displayed via JS with no fallback required, so long as it is only referenced by the script (or scripts).
> If we make it explicit that resources referenced from a script tag are exempt, as well as any resources that a script can import/fetch/whatever, then the spec will be a lot better for it.
I think we are all in agreement here. I don't know how big an issue this is today, though. The original question to us was when we will support wasm, and the answer is "now", so long as it is loaded via javascript. It would be nice to allow direct reference of non-JS resources via the script tag, but I don't know how common that is (I think it requires a polyfill for wasm).
> But any type of file can be loaded and displayed via JS
Could you give a specific example of how this would work? A script presumably has to write some html to render the content and that has to be a core media type per the definition, or provide a fallback. Even if you were to use canvas for display you're still converting whatever your source was into the image to paint on it. You're not rendering the original foreign resource.
For example, PDF.js doesn't take PDF and turn it into an HTML document and load it, it directly renders PDF into a canvas element.
Right, that's what I mentioned above. That's not what I'd call rendering the foreign resource without any fallback, though. That's turning it into something native, which is what the scripting requirement enforces. You draw to the canvas element which as a part of html is a CMT.
What it sounded to me like you were saying is that so long as you used some scripting and called what you made a viewer then you bypassed the CMT requirements. So inject an iframe with some buttons and call it a PDF viewer and you can load the PDF straight into the iframe and bypass otherwise needing a fallback.
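To make the canvas distinction concrete, here is a rough sketch of a script that reads a foreign resource and turns it into drawing calls, which is the "turning it into something native" case rather than loading the foreign file directly. The line-based "x,y,w,h,color" format, the `Rect` type, and the names `parseRects` and `renderForeign` are all invented for illustration; this is not how pdf.js works internally.

```typescript
// One rectangle decoded from the hypothetical foreign format.
interface Rect {
  x: number;
  y: number;
  w: number;
  h: number;
  color: string;
}

// Parse the made-up format: one "x,y,w,h,color" record per non-empty line.
function parseRects(source: string): Rect[] {
  return source
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [x, y, w, h, color] = line.split(",");
      return { x: Number(x), y: Number(y), w: Number(w), h: Number(h), color };
    });
}

// Convert the foreign resource into native drawing operations.
// The drawRect callback stands in for calls on a canvas 2D context, so the
// only thing that ever reaches the document is canvas output.
function renderForeign(source: string, drawRect: (r: Rect) => void): number {
  const rects = parseRects(source);
  rects.forEach((r) => drawRect(r));
  return rects.length;
}
```

In a content document the callback could be something like `(r) => { ctx.fillStyle = r.color; ctx.fillRect(r.x, r.y, r.w, r.h); }`, with `ctx` obtained from `canvas.getContext("2d")`. The foreign file is only ever input to the script; what the reading system renders is the canvas element.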
What would this do to the Accessibility DOM the screen readers access?
I don't know the inner working of pdf.js, but from what I've read they do preserve a tree and any aria roles they can if the pdf is tagged. No idea how that turns out in screen readers, though.