Inline TeX inside image annotation

Question

Closed this issue 5 years ago · 4 comments

I have some TeX that I'd like to put inside an image annotation, like so:

![Hello world $3x + 2$](foo.jpg)

I dug in a bit, and it seems that handle_match_inline correctly substitutes the script tag, but the tag is stripped at a later stage. Here is the function with a debug print added:

def handle_match_inline(m):
    node = etree.Element('script')
    node.set('type', self._get_content_type())
    node.text = AtomicString(m.group(3))
    result = _wrap_node(node, ''.join(m.group(2, 3, 4)), 'span')
    print(etree.tostring(result))
    return result

In [22]: md.convert('![$3x + 2$](a.jpg)')
b'<script type="math/tex">3x + 2</script>'
Out[22]: '<p><img alt="3x + 2" src="a.jpg" /></p>'

Answer 1 · 2019-07-25T09:07:32.000Z

How do you expect it to work? The alt attribute is a plain string, and obviously you cannot have HTML markup in it.

Answer 2 · 2019-07-25T14:40:14.000Z

I should have given more detail. I wrote a TreeProcessor extension (markdown-captions) which puts the markdown image text inside a <figcaption> where HTML markup is valid:

In [15]: md = markdown.Markdown(
    ...:     extensions=['mdx_math', 'markdown_captions'],
    ...:     extension_configs={
    ...:         'mdx_math': {
    ...:             'enable_dollar_delimiter': True
    ...:         }
    ...:     }
    ...: )

In [16]: md.convert('![$3x + 2$](a.jpg)')
Out[16]: '<p><figure><img src="a.jpg" /><figcaption>3x + 2</figcaption></figure></p>'

So my question is why/where is the <script> tag getting stripped and how might I fix this?

Answer 3 · 2019-07-25T17:57:56.000Z

Thanks, it makes more sense now.

The ImageInlineProcessor calls unescape() here:
https://github.com/Python-Markdown/markdown/blob/3.1.1/markdown/inlinepatterns.py#L614

And unescape() removes all markup and leaves only text content:
https://github.com/Python-Markdown/markdown/blob/3.1.1/markdown/inlinepatterns.py#L240
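
To illustrate the effect described above, here is a minimal standard-library sketch. It does not call Python-Markdown at all. It assumes that unescape() reduces a stashed element to its text content in roughly the way ElementTree's itertext() does, and the variable names are made up for the example.

```python
import xml.etree.ElementTree as etree

# Rebuild the element that handle_match_inline produced in the question.
node = etree.Element('script')
node.set('type', 'math/tex')
node.text = '3x + 2'

# Serialized, the markup is intact (this matches the debug print in the question).
serialized = etree.tostring(node)

# Flattening to text content keeps only the character data.
# Assumption: this mirrors what unescape() does to a stashed element.
text_only = ''.join(node.itertext())

print(serialized)
print(text_only)
```

The second value is the same string that appears in the alt attribute and in the figcaption in the outputs above, which is consistent with the tag being dropped at the point where the image text is unescaped.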

Answer 4 · 2019-07-30T19:46:58.000Z

Thanks. I switched my extension to a LinkInlineProcessor and set the priority just above the ImageInlineProcessor and it works now.