jgm/djot.lua

Consider providing APIs for parsing only inlines

matklad opened this issue · 1 comment

I've found a couple of cases where I'd love to ask djot to parse only inline elements. This might be an XY problem, so let me describe specific use-cases.

First is the formatting in code blocks:

```
dtw[i, j] =
    min(dtw[i-1, j-1], dtw[i, j-1], dtw[i-1, j]) + (xs[i] - ys[j])^2^
```

Here, I'd love to render this as a verbatim block, except that the trailing `^2^` should render as a superscript 2.

Second is formatting in the attributes:

{caption="Code with _inline markup_"}
```adoc
[source,rust,subs="+quotes"]
----
let x = 1;
let r: &i32;
{
    let y = 2;
    r = [.hl-error]##&y##;  // borrowed value does not live long enough
}
println!("{}", *r);
----
```

Here, as we don't have dedicated syntax for captions yet, I want to use a `caption=` attribute whose value is a djot inline.

The way asciidoctor solves this is by letting attributes control the parser's behavior (amusingly, the example above shows asciidoctor's version of this feature: `+quotes` enables processing of some inlines for the following code block).

I don't like this solution: it feeds semantic information back into the parsing, which is a the-lexer-hack-shaped layering violation.

What I'd love to do here is to parse this as a normal string-valued attribute/verbatim block, and then leave it to the conversion layer to recursively feed those strings to djot for processing. And that's why I think I want an API to parse only inlines! Implementation-wise, I can of course just manually call the relevant Lua function, but I think making it accessible via the CLI would signal that this is an official way to use djot, and something that the spec and other implementations should support.
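To make the idea concrete, here is a rough sketch of such a conversion-layer pass. Everything in it is an assumption for illustration: the dict-based AST shape, the `expand_captions` helper, and `toy_parse_inlines` (a stand-in that only understands `_emphasis_`) are hypothetical and are not part of djot's actual API. The point is only that the block parser stays unaware of captions, and a later pass re-parses the attribute string as inlines.

```python
from __future__ import annotations

import re
from typing import Callable

# Hypothetical AST shape: plain dicts, NOT djot's real node format.
Node = dict


def toy_parse_inlines(text: str) -> list[Node]:
    """Stand-in for an inline-only parser; recognizes _emphasis_ and nothing else."""
    nodes: list[Node] = []
    pos = 0
    for m in re.finditer(r"_([^_]+)_", text):
        if m.start() > pos:
            nodes.append({"tag": "str", "text": text[pos:m.start()]})
        nodes.append({"tag": "emph",
                      "children": [{"tag": "str", "text": m.group(1)}]})
        pos = m.end()
    if pos < len(text):
        nodes.append({"tag": "str", "text": text[pos:]})
    return nodes


def expand_captions(node: Node,
                    parse_inlines: Callable[[str], list[Node]]) -> Node:
    """Return a copy of the tree in which every string-valued `caption`
    attribute has been moved to a `caption` field holding parsed inlines."""
    out = dict(node)
    if "attr" in node:
        attrs = dict(node["attr"])
        caption = attrs.pop("caption", None)
        if isinstance(caption, str):
            out["caption"] = parse_inlines(caption)
        out["attr"] = attrs
    if "children" in node:
        out["children"] = [expand_captions(child, parse_inlines)
                           for child in node["children"]]
    return out


# The parser produced an ordinary code block with a string attribute.
doc = {
    "tag": "doc",
    "children": [
        {"tag": "code_block", "lang": "adoc",
         "attr": {"caption": "Code with _inline markup_"},
         "text": "let x = 1;\n"},
    ],
}

result = expand_captions(doc, toy_parse_inlines)
print(result["children"][0]["caption"])
```

With an official inline-only entry point, `toy_parse_inlines` would simply be replaced by the real parser, and the pass itself would not need to change.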

Learning from asciidoctor's experience, which has a lot of toggles to control what exactly is parsed, I think we also might want this to be more fine-grained than just inlines. "Just inlines" would work perfectly for the "caption as an attribute" use case, but for the code block case I'd probably want to disable `_` and `*`, as those are very likely to need escaping.
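A small sketch of what such a toggle could look like, again under loud assumptions: `render_inlines`, the `enabled` parameter, and the delimiter table are hypothetical, and the matching rule is a toy (first closing delimiter wins, opener must not be followed by whitespace), not djot's real inline grammar.

```python
from __future__ import annotations

import html

# Hypothetical delimiter table: delimiter character -> HTML tag.
DELIMITERS = {"^": "sup", "~": "sub", "_": "em", "*": "strong"}


def render_inlines(text: str,
                   enabled: frozenset[str] = frozenset(DELIMITERS)) -> str:
    """Render `text` to HTML, honoring only the delimiters in `enabled`.

    Disabled delimiters are emitted literally, so they need no escaping.
    """
    out: list[str] = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in enabled and ch in DELIMITERS:
            close = text.find(ch, i + 1)
            # Toy rule: non-empty span, and no whitespace right after the opener.
            if close > i + 1 and not text[i + 1].isspace():
                tag = DELIMITERS[ch]
                inner = render_inlines(text[i + 1:close], enabled)
                out.append(f"<{tag}>{inner}</{tag}>")
                i = close + 1
                continue
        out.append(html.escape(ch))
        i += 1
    return "".join(out)


line = "(a_i - b_i)^2^"

# Code-block use case: only superscript is active, `_` stays literal.
print(render_inlines(line, enabled=frozenset({"^"})))
# (a_i - b_i)<sup>2</sup>

# Everything enabled: the underscores are swallowed as emphasis.
print(render_inlines(line))
```

The second call shows why a plain "all inlines" mode is awkward for code: identifiers containing `_` would have to be escaped throughout the block.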

jgm commented

I'm certainly open to this. Actually, I'm not at all happy with the current API of the Lua implementation, and I plan to work on that soon.