mathjax/MathJax-demos-node

[Question] Utility API for replacing latex expressions with generated content?

fireflysemantics opened this issue · 6 comments

Hi,

Per this question we obtain an array of ProtoItem instances.

I take the LaTeX in each instance and generate the corresponding SVG, so I end up with a second array in which each SVG rendering matches the corresponding ProtoItem instance.

I was wondering if MathJax has an API for merging the generated results into the original source string?

I wrote a quick one. I can contribute this / rewrite in Typescript if need be:

async function replaceLatexWithSvg(s) {
  const protoItems = find.findMath([s]);

  // Promise.all preserves input order, so svgRenderings[i] always
  // belongs to protoItems[i], whichever rendering finishes first.
  const svgRenderings = await Promise.all(
    protoItems.map((protoItem) => tex2svg(protoItem.math))
  );

  // Rebuild the string: the text before each expression, then its SVG.
  let cursorIndex = 0;
  let result = '';
  protoItems.forEach((protoItem, i) => {
    result += s.substring(cursorIndex, protoItem.start.n) + svgRenderings[i];
    cursorIndex = protoItem.end.n;
  });

  // Append whatever follows the last expression.
  return result + s.substring(cursorIndex);
}
exports.replaceLatexWithSvg = replaceLatexWithSvg;
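
The merge step itself does not depend on MathJax, so it can be sketched and checked in isolation. This is only a minimal sketch, assuming each found item exposes `start.n` and `end.n` offsets into the source string as in the function above. The name `spliceRenderings` is hypothetical, and the rendered strings are stand-ins for real SVG output.

```javascript
// Hypothetical helper: splice rendered strings into the source at the
// positions given by proto-item-like objects ({ start: {n}, end: {n} }).
// Assumes the items are sorted by position and do not overlap.
function spliceRenderings(source, items, renderings) {
  let cursor = 0;
  let output = '';
  items.forEach((item, i) => {
    // Text between the previous expression and this one
    output += source.substring(cursor, item.start.n);
    // The rendering that belongs to this item (same index)
    output += renderings[i];
    cursor = item.end.n;
  });
  // Whatever follows the last expression
  return output + source.substring(cursor);
}

// Stand-in data: two display-math spans and placeholder renderings
const source = 'a $$x$$ b $$y$$ c';
const items = [
  { start: { n: 2 }, end: { n: 7 } },
  { start: { n: 10 }, end: { n: 15 } },
];
const merged = spliceRenderings(source, items, ['<svg>x</svg>', '<svg>y</svg>']);
```

Keeping the renderings indexed by item, rather than pushing them as they complete, is what guarantees that each SVG lands at the position of its own expression.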

dpvc commented

I'm glad you were able to work out your needed functions. MathJax doesn't currently have such a function. Currently, it only has functions for handling an actual HTML DOM tree.

The MathJax design does allow for other document types (we envisioned, for example, handling a Markdown document). The mechanism for this would be MathJax's Handler object, which would allow you to maintain a document that is in another format (say just a Markdown string). The document handler registers with MathJax, and then mathjax.document() could create a handler instance for that type of document. The handler would probably have format-specific subclasses of the MathDocument (and perhaps MathItem as well), and the MathDocument would specify the renderActions that are needed for that document class. The HTML document handler has actions to find the math in the strings obtained from the DOM (that is the FindTeX step from the earlier question), compile the TeX, get the metric information about the surrounding fonts, typeset into the output format, insert the result into the DOM, update the document stylesheets, add the menu event handlers, and so on.

A Markdown handler would have similar calls, but their actions would be different. For example, your functions above would replace the DOM-based updateDocument() action, and since you don't know the metric data from just the Markdown string, the getMetrics() action would be replaced by a function that just returned generic metric information. You would not really be able to add the menu event handlers in this scenario (they rely on event listeners), so you would either have to forgo the menu or add it later when the page is turned into an HTML DOM.
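
To make the idea concrete, here is a toy model of a prioritized list of render actions running over a Markdown string. It is not the MathJax Handler or MathDocument API: `makeMarkdownPipeline`, the action tuples, the finder, and the typesetter are all hypothetical stand-ins, and the generic metric values are simply the em, ex, and width defaults used by the command-line demos.

```javascript
// Toy model only: this is NOT the MathJax Handler/MathDocument API.
// It mimics a prioritized list of render actions that operate on a
// Markdown string instead of a DOM.
const GENERIC_METRICS = { em: 16, ex: 8, containerWidth: 80 * 16 };

function makeMarkdownPipeline({ findMath, typeset }) {
  // Each action is [priority, name, function(doc)]
  const actions = [
    [10, 'find', (doc) => { doc.items = findMath(doc.source); }],
    [30, 'metrics', (doc) => {
      // No DOM to measure, so every item gets the same generic metrics
      doc.items.forEach((item) => { item.metrics = GENERIC_METRICS; });
    }],
    [40, 'typeset', (doc) => {
      doc.items.forEach((item) => {
        item.output = typeset(item.math, item.metrics);
      });
    }],
    [50, 'update', (doc) => {
      // String-based replacement for the DOM-based update step
      let cursor = 0;
      let result = '';
      for (const item of doc.items) {
        result += doc.source.substring(cursor, item.start) + item.output;
        cursor = item.end;
      }
      doc.result = result + doc.source.substring(cursor);
    }],
  ];
  actions.sort((a, b) => a[0] - b[0]);
  return (source) => {
    const doc = { source, items: [], result: source };
    for (const [, , run] of actions) run(doc);
    return doc.result;
  };
}

// Stand-in finder ($$...$$ only) and stand-in typesetter
const render = makeMarkdownPipeline({
  findMath: (s) => {
    const found = [];
    const re = /\$\$([\s\S]+?)\$\$/g;
    let m;
    while ((m = re.exec(s)) !== null) {
      found.push({ math: m[1], start: m.index, end: m.index + m[0].length });
    }
    return found;
  },
  typeset: (math, metrics) => `<svg data-em="${metrics.em}">${math}</svg>`,
});
const rendered = render('Before $$x^2$$ after');
```

A real handler would plug FindTeX and the SVG output jax into the find and typeset steps; the point here is only that the metrics and update steps are the ones that change when there is no DOM.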

An alternative would be to have the Markdown handler be a subclass of the HTML handler that starts with the document as a Markdown string, has the find-math function locate the math in the Markdown and either remove it or isolate it so that Markdown won't modify the math, then has a renderAction that processes the Markdown into HTML, and then uses the rest of the HTML render actions on that DOM.
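
The "isolate the math" step could look roughly like the following sketch. The helper names `protectMath` and `restoreMath` and the `@@MATHn@@` placeholder format are hypothetical, the items are assumed to carry `start.n` and `end.n` offsets like the proto items from FindTeX, and the one-line converter is a stand-in for a real Markdown processor.

```javascript
// Hypothetical helpers: swap each math span for a placeholder so a
// Markdown converter cannot alter it, then put the math back afterwards.
function protectMath(source, items) {
  let cursor = 0;
  let text = '';
  const saved = [];
  items.forEach((item, i) => {
    text += source.substring(cursor, item.start.n) + `@@MATH${i}@@`;
    saved.push(source.substring(item.start.n, item.end.n));
    cursor = item.end.n;
  });
  return { text: text + source.substring(cursor), saved };
}

function restoreMath(html, saved) {
  // A function replacement is used so "$" in the math is inserted literally
  return html.replace(/@@MATH(\d+)@@/g, (_, i) => saved[Number(i)]);
}

// Stand-in converter that only understands *emphasis*
const toyMarkdown = (md) => md.replace(/\*(.+?)\*/g, '<em>$1</em>');

const doc = 'The *value* $$a * b * c$$ here';
const mathItems = [
  { start: { n: doc.indexOf('$$') }, end: { n: doc.lastIndexOf('$$') + 2 } },
];

const { text, saved } = protectMath(doc, mathItems);
const html = restoreMath(toyMarkdown(text), saved);
// Without protection the converter treats the asterisks in the math as emphasis
const mangled = toyMarkdown(doc);
```

After the restore step the math is intact inside the HTML, so the usual HTML render actions could find and typeset it.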

In any case, that is the vision for how to do the kind of thing you are suggesting.

(we envisioned, for example, handling a Markdown document).

That's exactly what I'm doing :).

For example I may have a document like this:

---
title: MAD Forecast Error Measure Metric
summary: Mean Absolute Deviation forecast error metric.
author: Ole Ersoy
date: 4/25/2021
type: formula
tags: ["D", "MAD", "E"]
headerImage: raumgleiter-2651592_1920.png
---

The Mean Absolute Deviation of a data set is the average of the absolute deviations from the mean.

$$MAD = \\frac{\\sum_{i=1}^n | x_i - \\bar{x} |} n$$

And I'll read that in and split off the markdown so I end up with it in a variable like md.
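
Splitting off the Markdown can be done with a small helper along these lines. This is a sketch rather than the code I actually use: `splitFrontMatter` is a hypothetical name, it assumes the front matter is a leading block delimited by `---` lines, and it does not parse the YAML itself.

```javascript
// Hypothetical helper: separates a leading "---" delimited front-matter
// block from the Markdown body. The YAML is returned as raw text.
function splitFrontMatter(text) {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(text);
  if (!match) {
    // No front matter: the whole text is Markdown
    return { frontMatter: '', md: text };
  }
  return { frontMatter: match[1], md: text.slice(match[0].length) };
}

// A shortened stand-in for the document shown above
const file = [
  '---',
  'title: MAD Forecast Error Measure Metric',
  'type: formula',
  '---',
  '',
  'Body text with $$x$$',
].join('\n');
const { frontMatter, md } = splitFrontMatter(file);
```
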

Then I run that string through the function, which replaces the latex with SVG, per your brilliant suggestion (Thanks again ... I'm just so happy about it).

If you think this would help others with the same Markdown scenarios, I'll gladly contribute back the package utility I (we) built.

This is the entire utility:

/*************************************************************************
 *
 *  Uses MathJax v3 to convert a TeX string to an SVG string.
 *
 * ----------------------------------------------------------------------
 *
 *  Copyright (c) 2020 The MathJax Consortium
 *
 *  Licensed under the Apache License, Version 2.0 (the "License");
 *  you may not use this file except in compliance with the License.
 *  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 *  Unless required by applicable law or agreed to in writing, software
 *  distributed under the License is distributed on an "AS IS" BASIS,
 *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 *  See the License for the specific language governing permissions and
 *  limitations under the License.
 */

//========================================================
//  The default TeX packages to use
//========================================================
const PACKAGES = "base, autoload, require, ams, newcommand";
//const latex = `MAD = \\frac{\\sum_{i=1}^n | x_i - \\bar{x} |} n`;

//
//  Minimal CSS needed for stand-alone image
//
const CSS = [
  "svg a{fill:blue;stroke:blue}",
  '[data-mml-node="merror"]>g{fill:red;stroke:red}',
  '[data-mml-node="merror"]>rect[data-background]{fill:yellow;stroke:none}',
  "[data-frame],[data-line]{stroke-width:70px;fill:none}",
  ".mjx-dashed{stroke-dasharray:140}",
  ".mjx-dotted{stroke-linecap:round;stroke-dasharray:0,140}",
  "use[data-c]{stroke-width:3px}",
].join("");

//
//  Get the command-line arguments
//
var argv = require("yargs")
  .demand(0)
  .strict()
  .usage('$0 [options] "math" > file.svg')
  .options({
    inline: {
      boolean: true,
      describe: "process as inline math",
    },
    em: {
      default: 16,
      describe: "em-size in pixels",
    },
    ex: {
      default: 8,
      describe: "ex-size in pixels",
    },
    width: {
      default: 80 * 16,
      describe: "width of container in pixels",
    },
    packages: {
      default: PACKAGES,
      describe:
        'the packages to use, e.g. "base, ams"; use "*" to represent the default packages, e.g, "*, bbox"',
    },
    styles: {
      boolean: true,
      default: true,
      describe: "include css styles for stand-alone image",
    },
    container: {
      boolean: true,
      describe: "include <mjx-container> element",
    },
    css: {
      boolean: true,
      describe: "output the required CSS rather than the HTML itself",
    },
    fontCache: {
      boolean: true,
      default: true,
      describe: "whether to use a local font cache or not",
    },
    assistiveMml: {
      boolean: true,
      default: false,
      describe: "whether to include assistive MathML output",
    },
    dist: {
      boolean: true,
      default: true,
      describe:
        "true to use webpacked version, false to use MathJax source files",
    },
  }).argv;

//
// Configure MathJax
//
MathJax = {
  options: {
    enableAssistiveMml: argv.assistiveMml,
  },
  loader: {
    paths: { mathjax: "mathjax-full/es5" },
    source: argv.dist
      ? {}
      : require("mathjax-full/components/src/source.js").source,
    require: require,
    load: ["adaptors/liteDOM"],
  },
  tex: {
    packages: argv.packages.replace("*", PACKAGES).split(/\s*,\s*/),
  },
  svg: {
    fontCache: argv.fontCache ? "local" : "none",
  },
  startup: {
    typeset: false,
  },
};

//
//  Load the MathJax startup module
//
require("mathjax-full/" +
  (argv.dist ? "es5" : "components/src/tex-svg") +
  "/tex-svg.js");

const { protoItem } = require("mathjax-full/js/core/MathItem");
const { FindTeX } = require("mathjax-full/js/input/tex/FindTeX.js");
exports.FindTex = FindTeX;

const find = new FindTeX({
  //
  // These are the default options, so not really needed, but are used as an example
  //
  inlineMath: [["\\(", "\\)"]],
  displayMath: [
    ["$$", "$$"],
    ["\\[", "\\]"],
  ],
  processEscapes: true,
  processEnvironments: true,
  processRefs: true,
});

exports.find = find;

function tex2svg(latex) {
  //
  //  Wait for MathJax to start up, and then typeset the math.
  //  Returning the chain means a failure in either step rejects the result.
  //
  return MathJax.startup.promise
    .then(() =>
      MathJax.tex2svgPromise(latex, {
        display: !argv.inline,
        em: argv.em,
        ex: argv.ex,
        containerWidth: argv.width,
      })
    )
    .then((node) => {
      const adaptor = MathJax.startup.adaptor;
      const svg = adaptor.innerHTML(node);
      return svg.replace(/<defs>/, `<defs><style>${CSS}</style>`);
    });
}

exports.tex2svg = tex2svg;

async function replaceLatexWithSvg(s) {
  const protoItems = find.findMath([s]);

  // Promise.all preserves input order, so svgRenderings[i] always
  // belongs to protoItems[i], whichever rendering finishes first.
  const svgRenderings = await Promise.all(
    protoItems.map((protoItem) => tex2svg(protoItem.math))
  );

  // Rebuild the string: the text before each expression, then its SVG.
  let cursorIndex = 0;
  let result = '';
  protoItems.forEach((protoItem, i) => {
    result += s.substring(cursorIndex, protoItem.start.n) + svgRenderings[i];
    cursorIndex = protoItem.end.n;
  });

  // Append whatever follows the last expression.
  return result + s.substring(cursorIndex);
}
exports.replaceLatexWithSvg = replaceLatexWithSvg;

function replace(origin, startIndex, endIndex, svg) {
  return origin.substring(0, startIndex) + svg + origin.substring(endIndex);
}

The package depends on:

  "dependencies": {
    "esm": "^3.2.25",
    "mathjax-full": "^3.1.4",
    "yargs": "^16.2.0"
  },

I have that as a separate package locally. That way I can just npm i @fireflysemantics/tex2svg and use the functions in various markdown projects. If it were part of MathJax though, then someone could just import replaceLatexWithSvg directly and have support for converting markdown latex expressions.

Just a note ... One idea that could have also made this work is to use a Web Component / Custom Element to render the math. The reason I needed either this or that is because the content is selected dynamically. This is the app for browsing content:

http://forecasting-help.fireflysemantics.com/

So since Angular renders the content inside the component, MathJax's web script does not get access to it, as with static web pages.

So I still think prerendering for this is best, but a Custom Element / Web Component could have worked as well. The fs-link-preview custom element is an example of this approach:

https://www.npmjs.com/package/@fireflysemantics/fs-link-preview

It's built using LitElement.