mathjax/MathJax-demos-node

SVG output not showing in XHTML

netw0rkf10w opened this issue · 21 comments

I tried using direct/tex2svg-page to output XHTML and found that the images are not showing.

To ensure syntax correctness, I replaced the last line:

console.log(adaptor.outerHTML(adaptor.root(html.document)));

with

console.log(adaptor.serializeXML(adaptor.root(html.document)));

An example of the output SVG is as follows:

<mjx-container class="MathJax" jax="SVG">
        <svg style="vertical-align: 0;" xmlns="http://www.w3.org/2000/svg" width="2.653ex" height="1.932ex" role="img" focusable="false" viewBox="0 -853.7 1172.7 853.7" xmlns:xlink="http://www.w3.org/1999/xlink">
            <g stroke="currentColor" fill="currentColor" stroke-width="0" transform="scale(1,-1)">
                <g data-mml-node="math">
                    <g data-mml-node="msup">
                        <g data-mml-node="TeXAtom" data-mjx-texclass="ORD">
                            <g data-mml-node="TeXAtom" data-mjx-texclass="ORD">
                                <g data-mml-node="mi">
                                    <use data-c="211D" xlink:href="#MJX-TEX-D-211D"/>
                                </g>
                            </g>
                        </g>
                        <g data-mml-node="TeXAtom" transform="translate(755,363) scale(0.707)" data-mjx-texclass="ORD">
                            <g data-mml-node="mi">
                                <use data-c="1D451" xlink:href="#MJX-TEX-I-1D451"/>
                            </g>
                        </g>
                    </g>
                </g>
            </g>
        </svg>
</mjx-container>

Could you please help? Thank you very much in advance!

dpvc commented

I suspect that the xlink:href references aren't working properly in XHTML. Try adding fontCache: 'local' to the new SVG() options. If that doesn't work, try fontCache: 'none'. These href attributes are sometimes finicky.

dpvc commented

Actually, you don't need to set it in the file; you can use the --fontCache option to specify which to use. So try --fontCache local and see if that works.

@dpvc Indeed, I had also realized that disabling the global cache makes it work, but I forgot to mention this in the question. The problem with not using the global cache is that it makes the file significantly larger (about 5 times for one of my use cases).

I think I found the issue: https://stackoverflow.com/a/77891891/2131200. It looks like adaptor.serializeXML "forgot" to add the namespace to some of the SVGs. Can this be considered a bug?

dpvc commented

Thanks for the link to stackoverflow that points out the issue. Yes, this is a bug, and I have made a PR to fix it. In the meantime, you can add

Object.assign(svg, {
  _pageElements: svg.pageElements,
  pageElements(html) {
    const cache = this._pageElements(html);
    if (cache) {
      this.adaptor.setAttribute(cache, 'xmlns', 'http://www.w3.org/2000/svg');
    }
    return cache;
  }
});

to the tex2svg-page script just after the line

const html = mathjax.document(htmlfile, {InputJax: tex, OutputJax: svg});

to patch the SVG output.
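
One way to confirm that the patch took effect is to scan the serialized output for svg start tags that still lack an xmlns attribute. The following is only a minimal sketch: svgTagsMissingNamespace is a hypothetical helper, not part of MathJax, and it relies on a simple regular-expression scan rather than a real XML parser.

```javascript
// Hypothetical helper (not part of MathJax): return the <svg ...> start tags
// in a serialized string that have no xmlns attribute.
function svgTagsMissingNamespace(xml) {
  // Match each <svg ...> start tag, skipping over quoted attribute values
  // so that a '>' inside a value does not end the match early.
  const startTags = xml.match(/<svg\b(?:"[^"]*"|'[^']*'|[^>"'])*>/g) || [];
  // xmlns:xlink does not count: only a plain xmlns="..." attribute does.
  return startTags.filter((tag) => !/\sxmlns\s*=/.test(tag));
}

// Example input: one namespaced svg and one svg without a namespace.
const sample =
  '<svg xmlns="http://www.w3.org/2000/svg" width="1ex"><g/></svg>' +
  '<svg style="display: none;"><defs/></svg>';

console.log(svgTagsMissingNamespace(sample));
```

An empty result for the output of adaptor.serializeXML() would suggest that every svg element, including the global font cache, now carries its namespace.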

@dpvc Works like a charm. Thank you very much!

There are still two issues with adaptor.serializeXML:

  1. It doesn't add the namespace to <html>.
  2. As MathJax v4 stores the raw LaTeX data in the SVG (e.g., data-latex="a < b + c"), when there are > or < in the formulas, then the output XML has syntax errors: Unescaped '<' not allowed in attributes values.
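
To illustrate the second point, here is a minimal sketch of the escaping an attribute value needs before it can go into XML. The escapeXmlAttribute function is a hypothetical helper written for this example, not MathJax's own code.

```javascript
// Illustrative helper (not MathJax's implementation): escape a string for use
// inside a double-quoted XML attribute value.
function escapeXmlAttribute(value) {
  return value
    .replace(/&/g, '&amp;')   // must come first so later entities are not double-escaped
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

const attr = 'data-latex="' + escapeXmlAttribute('a < b + c') + '"';
console.log(attr); // data-latex="a &lt; b + c"
```

Strictly speaking, only < and & are forbidden in XML attribute values (along with the enclosing quote character), but escaping > as well is harmless.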

For your convenience, here's an HTML file for testing:

<html>
 <head>
  <meta content="text/html; charset=UTF-8" http-equiv="Content-Type"/>
  <title>Test</title>
 </head>
 <body>
    <span>This is some text followed by an equation: $a < b + c$.</span>
 </body>
</html>

dpvc commented

OK, I've made a PR to resolve these two issues. Here is a patch that you can use for now:

First, add

import {LiteParser} from 'mathjax-full/js/adaptors/lite/Parser.js';

to the imports section. Then add

Object.assign(LiteParser.prototype, {
  serialize(adaptor, node, xml = false) {
    const SELF_CLOSING = this.constructor.SELF_CLOSING;
    const tag = adaptor.kind(node);
    const attributes = this.allAttributes(adaptor, node, xml).map(
      (x) => x.name + '="' + this.protectAttribute(x.value, xml) + '"'
    ).join(' ');
    const content = this.serializeInner(adaptor, node, xml);
    const html =
      '<' + tag + (attributes ? ' ' + attributes : '')
          + ((!xml || content) && !SELF_CLOSING[tag] ? `>${content}</${tag}>` : xml ? '/>' : '>');
    return html;
  },

  allAttributes(adaptor, node, xml) {
    let attributes = adaptor.allAttributes(node);
    const kind = adaptor.kind(node);
    if (!xml || (kind !== 'svg' && kind !== 'math' && kind !== 'html')) {
      return attributes;
    }
    for (const {name} of attributes) {
      if (name === 'xmlns') {
        return attributes;
      }
    }
    attributes.push({
      name: 'xmlns',
      value: ({
        svg:  'http://www.w3.org/2000/svg',
        math: 'http://www.w3.org/1998/Math/MathML',
        html: 'http://www.w3.org/1999/xhtml'
      })[kind]
    });
    return attributes;
  },

  _protectAttribute: LiteParser.prototype.protectAttribute,
  protectAttribute(text, xml) {
    text = this._protectAttribute(text, xml);
    if (xml) {
      text = text.replace(/</g, '&lt;').replace(/>/g, '&gt;');
    }
    return text;
  }
});

where you instantiate the LiteAdaptor (const adaptor = liteAdaptor({fontSize: argv.em});).

This is a bit long, but the allAttributes method is new, and there was no easy way to call the original serialize() method and still make the needed fixes.

dpvc commented

PS, it also adds the missing xmlns attribute to svg nodes, so the previous patch should no longer be needed if you use this one, though it is OK to use both.

@dpvc It worked! Thanks a lot!
On some files, though, I got Error: Cannot find module 'mathjax-modern-font/js/output/fonts/mathjax-modern/svg/dynamic/calligraphic', but this is probably an unrelated issue. I do not see any js folder in my mathjax-modern-font (4.0.0-beta.4); is this normal? Thanks.

dpvc commented

Is this with mathjax-full@4.0.0-beta.4? It looks like a reference from an earlier version. But it might be that there is a path that hasn't been updated somewhere in the font configuration code in beta.4. In any case, if you are at beta.4 with mathjax-full, there is a configuration parameter that can be set for the path to the dynamic directory.

Yes that was with mathjax-full@4.0.0-beta.4. Here's my package.json:

{
  "name": "MathJax-demos-node",
  "version": "4.0.0",
  "description": "Demos using MathJax v4 in node",
  "dependencies": {
    "esm": "^3.2.25",
    "mathjax-full": "4.0.0-beta.4",
    "mathjax-modern-font": "^4.0.0-beta.4",
    "yargs": "^17.7.2"
  },
  "devDependencies": {
    "@babel/core": "^7.14.6",
    "@babel/preset-env": "^7.14.5",
    "babel-loader": "^8.2.2",
    "terser-webpack-plugin": "5.3.0",
    "webpack": "5.88.2",
    "webpack-cli": "^5.1.1"
  },
  "repository": {
    "type": "git",
    "url": "https://github.com/mathjax/MathJax-demos-node/"
  },
  "keywords": [
    "MathJax",
    "examples",
    "nodejs"
  ],
  "license": "Apache-2.0"
}

I tried removing node_modules and running npm install again, but that didn't help.

dpvc commented

Can you provide your MathJax configuration? I'm not able to find a reference that should lead to that URL, so I'm wondering if there is something in the configuration leading to that.

Also, can you give the traceback from the error message?

@dpvc As you suggested earlier, manually setting the path to the dynamic directory did help:

const svg = new SVG({
  fontCache: argv.fontCache,
  exFactor: argv.ex / argv.em,
  dynamicPrefix: 'mathjax-modern-font/cjs/svg/dynamic'
});

Thanks!

Removing the line dynamicPrefix: ..., I obtained the following traceback:

Error: Cannot find module '/Users/name/projects/epubs/mathjax/v4/node_modules/mathjax-full/cjs/output/svg/fonts/svg/dynamic/calligraphic'
Require stack:
- /Users/name/projects/epubs/mathjax/v4/node_modules/mathjax-full/cjs/util/asyncLoad/node.js
- /Users/name/projects/epubs/mathjax/v4/direct/tex2svg-page-epub
    at Module._resolveFilename (node:internal/modules/cjs/loader:1149:15)
    at Module._load (node:internal/modules/cjs/loader:990:27)
    at Module.require (node:internal/modules/cjs/loader:1237:19)
    at require (node:internal/modules/helpers:176:18)
    at mathjax_js_1.mathjax.asyncLoad (/Users/name/projects/epubs/mathjax/v4/node_modules/mathjax-full/cjs/util/asyncLoad/node.js:31:16)
    at /Users/name/projects/epubs/mathjax/v4/node_modules/mathjax-full/cjs/util/AsyncLoad.js:10:43
    at new Promise (<anonymous>)
    at asyncLoad (/Users/name/projects/epubs/mathjax/v4/node_modules/mathjax-full/cjs/util/AsyncLoad.js:9:12)
    at MathJaxModernFont.<anonymous> (/Users/name/projects/epubs/mathjax/v4/node_modules/mathjax-full/cjs/output/common/FontData.js:477:68)
    at step (/Users/name/projects/epubs/mathjax/v4/node_modules/mathjax-full/cjs/output/common/FontData.js:44:23) {
  code: 'MODULE_NOT_FOUND',
  requireStack: [
    '/Users/name/projects/epubs/mathjax/v4/node_modules/mathjax-full/cjs/util/asyncLoad/node.js',
    '/Users/name/projects/epubs/mathjax/v4/direct/tex2svg-page-epub'
  ]
}

Sorry, please hold on. I think there is an issue with the header of my script: all the import paths start with 'mathjax-full/js/...', which doesn't look right. I'm checking again...

Hmm... Changing all 'mathjax-full/js/...' to 'mathjax-full/cjs/...' didn't make a difference, so the above traceback was good.
What I don't understand is that my script even worked with 'mathjax-full/js/...', which isn't the correct path...

dpvc commented

The MathJax package.json is set up to map js to cjs when you use require() to load it, and to mjs when you use import. That way you don't have to worry about the cjs/mjs difference.
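
As a rough illustration of that mapping, the sketch below models a conditional "exports" entry and a toy resolver. Both the exportsMap contents and the resolveSubpath helper are assumptions made for this example; the actual mathjax-full package.json may be laid out differently, and Node's real resolution algorithm handles many more cases.

```javascript
// Assumed shape of an "exports" entry; the real mathjax-full package.json may differ.
const exportsMap = {
  './js/*': { import: './mjs/*', require: './cjs/*' }
};

// Toy resolver (hypothetical): pick the target for a subpath depending on
// whether the module is loaded with import or with require().
function resolveSubpath(subpath, condition) {
  for (const [pattern, targets] of Object.entries(exportsMap)) {
    const prefix = pattern.slice(0, -1); // drop the trailing '*'
    if (subpath.startsWith(prefix)) {
      const rest = subpath.slice(prefix.length);
      // Substitute the matched remainder for '*' in the chosen target.
      return targets[condition].replace('*', () => rest);
    }
  }
  return null; // no pattern matched
}

console.log(resolveSubpath('./js/adaptors/liteAdaptor.js', 'require'));
// ./cjs/adaptors/liteAdaptor.js
console.log(resolveSubpath('./js/adaptors/liteAdaptor.js', 'import'));
// ./mjs/adaptors/liteAdaptor.js
```

Under this kind of mapping, a script that uses 'mathjax-full/js/...' paths works with either module system, which would explain why the js paths resolved even though no js directory exists on disk.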

Thanks for the stack trace. The dynamicPrefix was the setting I was going to suggest as well. I have things worked out pretty well for use with components, but haven't done as much checking with the direct modules. The default dynamicPrefix probably needs to be adjusted. I'll put it on the list of things to do.

dpvc commented

PS, I notice that the path in the error in your message here is different than the one you gave earlier. Did you change anything else that might account for that?

Sorry, the path in that previous message appeared because I had set dynamicPrefix to that wrong path without realizing it (I had copied the code from one of your posts in this repo without changing the path). The last traceback was obtained with no dynamicPrefix specified.

dpvc commented

Was that other path an alpha.1 path? Just curious, as that would correspond to the wrong path in the error message.

I don't know which version exactly, but definitely a previous one. I took it from your answer here: #55 (comment)
Thank you!

dpvc commented

OK, that explains where the js/output/fonts path was coming from, which was what was confusing me. The directory structure in the beta versions has changed, and that URL is no longer the correct one. Sorry for the confusion there; it was correct at the time the comment was made, but is not correct for beta.4. Your current path is the correct one for that.

Thanks for the clarification! And sorry for the confusion as well, I should have specified that I had copied your code :p