goessner/markdown-it-texmath

Pandoc support

abchugh opened this issue · 8 comments

Firstly, thank you so much for this awesome plugin. It powers large parts of my new non-profit.

This is a feature request to make this plugin compatible with pandoc (the best/most widely known open-source document converter).

A large part of our project is converting LaTeX documents into markdown so that they can be used on our website (e.g. see bit.ly/sphz-art). To do this, we use pandoc to convert LaTeX to markdown and markdown-it-texmath for rendering.

However, there are a couple of major issues when rendering pandoc output:

  1. Inline block LaTeX: pandoc allows block ($$-quoted) math inline (it doesn't need to be in a new paragraph) and renders it correctly if requested
  2. TeX overflowing onto multiple lines

For example, both of these issues show up in the following text generated by pandoc:

The upper bound is $${\ell({x^n})} \leq |n| \, {\ell({x})}$$ for all $n \in 
{\mathbb{Z}}$. 

I would be really glad if you could provide support for this.

@abhishekchugh: Your request seems to be quite similar to kramdown syntax, which markdown-it-texmath already supports.

So I added pandoc syntax as described by you as a (beta) feature to version 0.6.5.

Please have a closer look into that pandoc usage and test it.

Thanks

Closing ... please reopen in case of pandoc usage problems.

@goessner: Thank you so much for working on this request. The new code identified '$$-quoted' LaTeX strings correctly. There are a couple of minor issues that can be easily resolved:

  1. The new inline block (with '$$' as tag ) should be 'math_block' instead of 'math_inline'
  2. I didn't see support for TeX overflowing onto multiple lines. As a hack, I removed the '\r\n' from the rex of 'math_inline', like so

'rex: /\$(\S[^$]*?[^\s\\]{1}?)\$/gy,'

I don't know if this introduces any other bugs, but I didn't find any.

Also, you may want to fix this for all delimiter types (not just 'pandoc') because this makes the plugin inconsistent with pandoc's general philosophy of ignoring single-line breaks.
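To make the effect of that change concrete, here is a minimal sketch that runs both variants of the 'math_inline' regex from this thread against the second half of the pandoc example above. The helper name `matchAt` is hypothetical and is not part of the plugin; it only mimics trying a sticky regex at a given position.

```javascript
// Both regexes are copied from this thread; only the [^$\r\n] vs [^$] class differs.
const singleLine = /\$(\S[^$\r\n]*?[^\s\\]{1}?)\$/gy; // line breaks not allowed inside $...$
const multiLine = /\$(\S[^$]*?[^\s\\]{1}?)\$/gy;      // line breaks allowed inside $...$

// Hypothetical helper (not plugin API): try a sticky regex at position `pos`
// and return the captured TeX source, or null if there is no match there.
function matchAt(rex, str, pos) {
  rex.lastIndex = pos;
  const m = rex.exec(str);
  return m ? m[1] : null;
}

// Inline formula broken across two lines, as in the pandoc output above.
const text = 'for all $n \\in \n{\\mathbb{Z}}$.';
const pos = text.indexOf('$');

console.log(matchAt(singleLine, text, pos)); // no match at the opening $
console.log(matchAt(multiLine, text, pos));  // captures the TeX across the line break
```

This only exercises the regex in isolation; whether the plugin then renders the formula also depends on its pre/post checks and on how the matched text reaches the inline rule.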

Expected input and output example:
[Image: Screenshot from 2020-04-06 10-16-56]

Unfortunately it only works by accident. A closer look at the pandoc inline rules shows that only the $...$ rule is activated. So ...

  • I need to investigate why two $s are matched.
  • I also need to investigate if and when those \n\r's are needed.
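One thing that can be checked in isolation is whether the single-dollar regex is able to match at the start of a $$...$$ formula. The sketch below uses the 'math_inline' regex posted in this thread; the helper name `matchAt` is hypothetical. It says nothing about rule order or the pre/post checks inside the plugin, so it is a hint for the investigation rather than an explanation.

```javascript
// 'math_inline' regex as posted in this thread (variant without \r\n).
const inlineRex = /\$(\S[^$]*?[^\s\\]{1}?)\$/gy;

// Hypothetical helper (not plugin API): run a sticky regex at position `pos`.
function matchAt(rex, str, pos) {
  rex.lastIndex = pos;
  return rex.exec(str);
}

const src = 'a $$x^2$$ b';
const m = matchAt(inlineRex, src, src.indexOf('$'));

// \S accepts the second '$', so the single-dollar regex consumes '$$x^2$'
// and leaves one trailing '$' behind.
console.log(m && m[0]); // full match
console.log(m && m[1]); // captured group, which starts with a '$'
```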

I forgot what exactly prevents $...$ and $$...$$ from coexisting in inline mode.
Investigating this effect needs much more time, which I do not have at present.

BTW: Forcing an inline $$...$$ rule to produce a display mode formula would then be no problem.

I would keep the pandoc delimiters in the list as a beta implementation. You only need to decide whether to ...

  • keep $$...$$ in inline mode, which would result in kramdown mode.
  • keep $...$ in inline mode, which would result in dollar mode.

I would then like to eliminate those \n\r's, as you suggest, for testing.

Taking the second alternative, combined with simple preprocessing on your side that changes $$...$$ to \n\n$$...$$\n\n, would be an interim solution and should be no problem.
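A minimal sketch of such a preprocessing step, assuming the markdown source is available as a string before it is handed to markdown-it. The function name `isolateDisplayMath` is hypothetical, and the sketch deliberately ignores escaped dollars and code spans.

```javascript
// Hypothetical preprocessing helper (not part of the plugin): wrap every
// $$...$$ formula in blank lines so that it starts its own paragraph and can
// be picked up by the block rules.
function isolateDisplayMath(src) {
  // A function replacer is used so that '$$' in the replacement is not
  // interpreted as a special replacement pattern.
  return src.replace(/\${2}([^$]+?)\${2}/g, (formula) => '\n\n' + formula + '\n\n');
}

const input = 'The bound is $$a \\leq b$$ for all $n$.';
console.log(isolateDisplayMath(input));
```

Single-dollar formulas such as $n$ are left untouched, since the pattern requires two consecutive dollars on each side.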

I need to publish Version 0.6.6 in time, so ... what do you think?

Stefan, I copy the plugin code in this repository and modify it slightly (because I need to change the rendering in order to create a custom Angular plugin). So I have created the custom delimiters I needed (pasted at the end of this comment). If I find any issues, I will use your preprocessing suggestion.

Having said that, pandoc support will make this plugin more widely used, since pandoc is extremely popular, and multi-line support feels quite important when you are working with large formulas.

In any case, I am grateful for this work - I just can't wrap my head around how markdown-it plugins work and the authors refuse to write better documentation. Feel free to come back to this if you think this is important and you have time to spend on this.

custom_pandoc: {
        inline: [
            {   name: 'math_block',
                rex: /\${2}([^$\r\n]*?)\${2}/gy,
                tmpl: '<eq>$1</eq>',
                tag: '$$'
            },
            {   name: 'math_inline',
                rex: /\$(\S[^$]*?[^\s\\]{1}?)\$/gy,
                // rex: /\$(\S[^$\r\n]*?[^\s\\]{1}?)\$/gy,
                tmpl: '<eq>$1</eq>',
                tag: '$',
                pre: texmath.$_pre,
                post: texmath.$_post
            },
            {   name: 'math_single',
                rex: /\$([^$\s\\]{1}?)\$/gy,
                tmpl: '<eq>$1</eq>',
                tag: '$',
                pre: texmath.$_pre,
                post: texmath.$_post
            }
        ],
        block: [
            {   name: 'math_block_eqno',
                rex: /\${2}([^$]*?)\${2}\s*?\(([^)$\r\n]+?)\)/gmy,
                tmpl: '<section class="eqno"><eqn>$1</eqn><span>($2)</span></section>',
                tag: '$$'
            },
            {   name: 'math_block',
                rex: /\${2}([^$]*?)\${2}/gmy,
                tmpl: '<section><eqn>$1</eqn></section>',
                tag: '$$'
            }
        ]
    },

ok ... simply change the first inline rule from

            {   name: 'math_block',
                rex: /\${2}([^$\r\n]*?)\${2}/gy,
                tmpl: '<eq>$1</eq>',
                tag: '$$'
            },

in order to get a display mode formula to

            {   name: 'math_block',
                rex: /\${2}([^$\r\n]*?)\${2}/gy,
                tmpl: '<section><eqn>$1</eqn></section>',
                tag: '$$'
            },

and you will see that it will not be invoked, due to the match of the second ...

good luck

@abchugh: I finally found some time and successfully implemented your requested features ... at least I think so. Can you please test it in your environment?

At the same time I removed the texmath.rules['pandoc'] entry. I also like that behavior a lot, so I enhanced the texmath.rules['dollars'] rules accordingly. If there are reasons to keep the pandoc namespace, you might simply specify

texmath.rules['pandoc'] = texmath.rules['dollars'];

somewhere.

thanks

@goessner I finally got some time to update my code with the latest changes. The inline block mode worked quite well. Did you add support for inline TeX overflowing onto multiple lines? I couldn't get that working. Could be a mistake at my end too.