Transform HTML entities to unicode
johnridesabike opened this issue · 9 comments
The Babel JSX compiler converts HTML entities into the Unicode characters they represent. In ReScript, we have to use the actual Unicode characters directly. For some characters, like &nbsp;, this isn't always convenient.
Would it be possible for the React PPX to do the same transformation that Babel does? See their entity map for reference: https://github.com/babel/babel/blob/b3e2bcda73dea7d68b4c82bfabb92acb11b1ed90/packages/babel-parser/src/plugins/jsx/xhtml.js
Babel example
Input
let f = () => <div>&nbsp;</div>;
Output
let f = () => /*#__PURE__*/React.createElement("div", null, "\xA0");
ReScript example
Input
let f = () => <div> {React.string("&nbsp;")} </div>
Output
function f(param) {
  return React.createElement("div", undefined, "&nbsp;");
}
Note that "&nbsp;" will render as-is, not as an actual nonbreaking space character.
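To make the requested transformation concrete, here is a minimal JavaScript sketch of entity decoding. It is not Babel's implementation or anything the React PPX does today. The names ENTITIES and decodeEntities are hypothetical, and the table is a tiny hand-picked subset rather than the full entity map linked above.

```javascript
// Hypothetical subset of an entity table; the real map is far larger.
const ENTITIES = {
  nbsp: "\u00A0",
  amp: "&",
  lt: "<",
  gt: ">",
  quot: '"',
  copy: "\u00A9",
};

// Convert a numeric reference to a character, or return null when the
// value is not a valid Unicode code point.
function fromNumeric(digits, radix) {
  const codePoint = parseInt(digits, radix);
  if (Number.isNaN(codePoint) || codePoint > 0x10ffff) {
    return null;
  }
  return String.fromCodePoint(codePoint);
}

// Replace named (&nbsp;), decimal (&#160;), and hex (&#xA0;) references.
// Anything unrecognized is left exactly as written.
function decodeEntities(text) {
  const pattern = /&(#x[0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);/g;
  return text.replace(pattern, (match, body) => {
    if (body.startsWith("#x")) {
      const decoded = fromNumeric(body.slice(2), 16);
      return decoded === null ? match : decoded;
    }
    if (body.startsWith("#")) {
      const decoded = fromNumeric(body.slice(1), 10);
      return decoded === null ? match : decoded;
    }
    return Object.prototype.hasOwnProperty.call(ENTITIES, body)
      ? ENTITIES[body]
      : match;
  });
}
```

Under these assumptions, decodeEntities("foo&nbsp;bar") yields "foo", a U+00A0 character, then "bar", while an unknown name such as "&bogus;" passes through unchanged.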
Will need to investigate what's possible!
Even if it isn't possible to transform strings via PPX, another possibility would be to include the entities as values like let nbsp = `\xa0`, which could be used like <div> {React.string(`foo${React.nbsp}bar`)} </div>.
Still a bit clunky, but it seems preferable to having to write the unicode by hand.
@johnridesabike I just asked Ricky and Maxim if it makes sense to first-class this into the syntax / ppx, and it's hard to say if this particular feature would be justifiable from a complexity POV... e.g. whether it mixes well with the compiler's internal escaping, or whether it makes the syntax logic way more complicated than necessary.
So, would it be an option to create some independent HtmlEntity module (maybe within an example/ directory) that implements all the unicode characters mentioned above, and add an extra section to the rescript-react docs that points to that particular file for copy / paste, so ppl can quickly use it in the manner you just described with `foo${HtmlEntity.nbsp}bar`? With @inline, this would even be zero-cost.
I personally created my own entity bindings in user-space, but having more guidance on the topic in the docs would be great.
That seems like a reasonable compromise. IMO the fact that there is no documentation on the topic, combined with people being used to HTML entities "just working" in Babel JSX, is the main issue. Even a unicode table in the docs that people can copy/paste from would be useful.
For the hypothetical HtmlEntity module, are you thinking of putting that in an example/ directory within the rescript-react repository, or making a separate repo (or even just a gist) that people can use to copy/paste from?
For anyone interested in copying entities into their own project, I just pushed a commit with an HtmlEntity module here: https://github.com/johnridesabike/coronate/blob/e42688a04f34eccf0003b428e0f13054dd80a9b2/src/HtmlEntities.res
Thanks Jon!
We could put your file in an "extra" folder in this project and link to it?
Yes! It was very easy to adapt from the Babel file, so feel free to link, copy, or do whatever with it.
@ryyppy Actually why put it into an "extra folder" and not make it a direct part of @rescript/react? Or maybe even the compiler (Dom.HtmlEntity or whatever)?
/cc @cristianoc
I don't see a problem with including it in Dom or Js or something similar. There's nothing React-specific in the file I made, even though it's based on the Babel-React source. It could be used with any kind of JavaScript output.