How to use extension tokens for parsing?
I'd like to parse something like
Some math $a+b=c$
```math
x^2 = -1
```
into Markdown with explicit math blocks. I see that mistletoe has the Math extension token (https://mistletoe-ebp.readthedocs.io/en/latest/api/extension_tokens.html), but I can't for the life of me figure out how to use it. The documentation (e.g., here) seems outdated since the imports in the examples don't even work anymore.
MWE:
import mistletoe
from mistletoe.markdown_renderer import MarkdownRenderer
doc = mistletoe.Document("a ~~st~~ $b+c$")
print()
print(doc)
print()
print(doc.children)
print()
print(doc.children[0].children)
print()
with MarkdownRenderer() as mdr:
    print(repr(mdr.render(doc)))
<mistletoe.block_token.Document with 1 child at 0x7f51dd939a90>
[<mistletoe.block_token.Paragraph with 3 children at 0x7f51dda2a090>]
[
<mistletoe.span_token.RawText content='a ' at 0x7f51dd72dd10>,
<mistletoe.span_token.Strikethrough with 1 child at 0x7f51dd72ddd0>,
<mistletoe.span_token.RawText content=' $b+c$' at 0x7f51dd72de50>
]
'a ~~st~~ $b+c$\n'
Any hints?
Hi @nschloe, I'm not sure if I get your question. In mistletoe (not mistletoe-ebp which is/was a fork which I don't know deeply), you can use e.g. MathJaxRenderer if you are interested in rendering HTML together with the MathJax JS library.
Related mistletoe documentation:
- https://github.com/miyuchina/mistletoe/#usage
- https://github.com/miyuchina/mistletoe/blob/master/dev-guide.md#creating-a-custom-token-and-renderer - more in-depth view on custom tokens and renderers
> (not mistletoe-ebp which is/was a fork which I don't know deeply)
Ah, hadn't realized they were different. (When googling I always get to their documentation.)
> if you are interested in rendering HTML
My interest is in parsing. I'd like to parse, change some things, and render back to Markdown. For this to work, I need Strikethrough (~~...~~), math, tables, etc. parsed correctly.
OK, so what about the following code? Essentially, you need to pass the additional token class(es) to the parsing process and define the corresponding render_... method(s). You do both by defining your own renderer class:
from typing import Iterable

import mistletoe
from mistletoe import block_token
from mistletoe.latex_token import Math
from mistletoe.markdown_renderer import Fragment, MarkdownRenderer


class MyMarkdownRenderer(MarkdownRenderer):
    def __init__(self, **kwargs):
        """
        Args:
            **kwargs: additional parameters to be passed to the ancestors'
                      constructors.
        """
        # Passing Math registers the extra token with the parser.
        super().__init__(Math, **kwargs)

    def render_math(self, token) -> Iterable[Fragment]:
        yield Fragment(token.content + " (Math rules :))")

    # @override
    def render_fenced_code_block(
        self, token: block_token.BlockCode, max_line_length: int
    ) -> Iterable[str]:
        indentation = " " * token.indentation
        yield indentation + token.delimiter + token.info_string + (
            " (Math rules :))" if token.info_string == "math" else ""
        )
        yield from self.prefix_lines(token.content[:-1].split("\n"), indentation)
        yield indentation + token.delimiter


print(
    MyMarkdownRenderer().render(
        mistletoe.Document(
            """
a paragraph with math: $ 2^3 $
$$ c^2 = a^2 + b^2 $$
```math
x^2 = -1
```
"""
        )
    )
)
This outputs:
a paragraph with math: $ 2^3 $ (Math rules :))
$$ c^2 = a^2 + b^2 $$ (Math rules :))
```math (Math rules :))
x^2 = -1
```
Closing this as answered, feel free to "reopen" by commenting.