Relax disallowing multiple words in `code block` / `div` first line?
chrisjsewell opened this issue · 12 comments
Currently
```name
content
```
:::name
content
:::
goes to
code_block text="content\n" lang="name"
div class="name"
  para
    str text="content"
but
```name a
content
```
:::name a
content
:::
goes to
para
  verbatim text="name a\ncontent\n"
para
  str text=":::name a"
  soft_break
  str text="content"
  soft_break
  str text=":::"
This feels a little unintuitive to me. Is there a strong reason why this has to be the case?
Could the "additional" first-line content not be stored on the AST nodes?
It would then not be used in the standard HTML renderer,
but could be used by macros
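A rough sketch of what "storing the additional first-line content" could look like, in Python and purely illustrative: `BlockOpener`, `parse_opener`, and the `extra` field are hypothetical names, not part of djot or any of its implementations, and the opener pattern is a simplification.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Simplified, hypothetical opener pattern: three or more backticks or colons,
# followed by whatever else is on the first line.
OPENER = re.compile(r"^(?P<fence>`{3,}|:{3,})[ \t]*(?P<info>.*?)[ \t]*$")


@dataclass
class BlockOpener:
    kind: str   # "code_block" or "div"
    name: str   # first word: what ends up as lang / class today
    extra: str  # remainder of the first line, kept verbatim for macros


def parse_opener(line: str) -> Optional[BlockOpener]:
    """Split a fence-opening line into its first word and the rest."""
    m = OPENER.match(line)
    if m is None:
        return None
    kind = "code_block" if m.group("fence").startswith("`") else "div"
    parts = m.group("info").split(None, 1)
    name = parts[0] if parts else ""
    extra = parts[1] if len(parts) > 1 else ""
    return BlockOpener(kind, name, extra)
```

A renderer that only looks at `name` would behave as it does now, while a macro could read `extra`.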
Let's see what GFM does with it:
``` python more
hi
```
becomes
hi
and there is no trace of `more` in the rendered HTML. So this behavior is implementing a kind of standard. But you're right that we could, in principle, treat the rest as additional attributes. But how? Split by spaces and make them classes? What if there is punctuation not normally allowed in classes?
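To make the punctuation concern concrete, here is a minimal Python sketch of the "split by spaces" option. The `SAFE_CLASS` pattern is an assumption chosen for illustration (it is not taken from djot or from any spec), and `split_into_classes` is a hypothetical helper.

```python
import re

# Assumption: a deliberately conservative notion of a "safe" class name.
SAFE_CLASS = re.compile(r"^[A-Za-z_][A-Za-z0-9_-]*$")


def split_into_classes(rest: str):
    """Split the rest of the first line on whitespace.

    Returns (classes, rejected): tokens that look like plain class names,
    and tokens containing punctuation that would need some other home.
    """
    classes, rejected = [], []
    for token in rest.split():
        if SAFE_CLASS.match(token):
            classes.append(token)
        else:
            rejected.append(token)
    return classes, rejected
```

With an input such as `more linenos=true {1,3}`, only `more` would pass as a class here; the other two tokens are exactly the kind of content that has no obvious place if everything must become a class.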
Let's see what GFM does with it
Commonmark stores the entire first line as `info`: https://spec.commonmark.org/dingus/?text=%60%60%60name%20a%0Acontent%0A%60%60%60%0A%0A
I've revised my note above.
Split by spaces and make them classes? What if there is punctuation not normally allowed in classes?
Do you need to split it at all? Just have the whole string be the `lang`.
For `div`, hmmm; firstly I would ask, is there really a need to have this "store the first word as a class" semantics?
You already have block attributes for setting classes, why not just store the whole string under a key as well and be done with it?
(thanks as always for the rabid rapid replies!)
rabid
Pandoc doesn't store these in the `lang` attribute; it adds them as classes. (This is the way it has always behaved, and changing it now is probably not a good idea.)
rabid
🤦‍♂️
Pandoc
This is the way it has always behaved, and changing it now is probably not a good idea
Does djot have to do what pandoc does though?
It seems like Pandoc does not follow commonmark here: https://spec.commonmark.org/0.30/#info-string
Pandoc's commonmark and gfm and commonmark_x parsers will ignore the additional content in the case of code blocks. This could be modified to store the whole line in an `info` attribute, or perhaps to do so only if this content would differ from the class already stored.
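The two variants mentioned here could be sketched roughly as follows. This is Python for illustration only, not Pandoc's code or API; `code_block_attrs` and the `always_store_info` flag are hypothetical.

```python
def code_block_attrs(info: str, always_store_info: bool = False) -> dict:
    """Build attributes for a code block from its first-line string.

    The first word becomes the class, as today. The full string is stored
    under "info" either always, or only when it differs from that class.
    """
    info = info.strip()
    attrs = {}
    if not info:
        return attrs
    attrs["class"] = info.split()[0]
    if always_store_info or info != attrs["class"]:
        attrs["info"] = info
    return attrs
```

Under the second variant, a plain `python` block would carry no extra attribute, and only blocks such as `python more` would gain an `info` entry.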
Pandoc's markdown parser is different. Part of the motivation here is to avoid confusing inline code that happens to start at the beginning of the line and uses three backticks with a code block.
Part of the motivation here is to avoid confusing inline code that happens to start at the beginning of the line and uses three backticks with a code block.
It feels like, if you have "committed" to writing three backticks at the start of the line, then you are expecting to write a code block.
I can't imagine there being any time that you actually want this as inline?
Note, commonmark prohibits backticks being in the info string (then it is parsed as inline), so you can still write inline:
```inline something```
just not
```inline something
```
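A small Python sketch of that CommonMark rule, covering backtick fences only (tilde fences and closing fences are left out); `opens_backtick_fence` is a hypothetical helper name.

```python
import re

# Up to three spaces of indentation, at least three backticks,
# then an info string that must not contain a backtick.
BACKTICK_OPENER = re.compile(r"^ {0,3}`{3,}[^`]*$")


def opens_backtick_fence(line: str) -> bool:
    """Return True if the line can open a backtick-fenced code block."""
    return BACKTICK_OPENER.match(line.rstrip("\n")) is not None
```

Here a line like the inline example above, which has backticks after the opening run, is rejected as a fence opener and so stays available for inline parsing, while the same text without the trailing backticks is accepted.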
``` ``Markdown code spans with ` inside them`` ``` can be quoted with ` ``` `.
Let's see how GH renders it:
``Markdown code spans with ` inside them`` can be quoted with ```.