lierdakil/pandoc-crossref

How to make wrapping `<div>` not break a paragraph?

UlyssesZh opened this issue · 19 comments

test.yml:

tableEqns: true
eqnBlockTemplate: |
  <table><tr><td>$$t$$</td><td>$$i$$</td></tr></table>

test.md:

test
$$x$$
test
$$y$$ {#eq:test}
test

Run:

pandoc -f markdown -t html5 -M crossrefYaml=test.yml -F pandoc-crossref test.md

Output:

<p>test <span class="math display"><em>x</em></span> test</p>
<div id="eq:test">
<table>
<tr>
<td>
<span class="math display"><em>y</em></span>
</td>
<td>
<span class="math display">(1)</span>
</td>
</tr>
</table>
</div>
<p>test</p>

Expected:

<p>test <span class="math display"><em>x</em></span> test <div id="eq:test">
<table>
<tr>
<td>
<span class="math display"><em>y</em></span>
</td>
<td>
<span class="math display">(1)</span>
</td>
</tr>
</table>
</div> test</p>

Here is why I need this. I am trying to use CSS to indent the first line of each paragraph (like p { text-indent: 2em; }). In general I want to control whether the text after a displayed equation starts a new paragraph or not, so I don't want the wrapping <div> to break the paragraph when there is no empty line in the Markdown.

In the code snippet above, the displayed equation without a label already behaves the way I expect. Therefore, this is not a bug in pandoc.

Highly recommend the complex-paragraphs lua filter here. Works for both latex/pdf and docx outputs.

I need HTML. Also, I have many articles with displayed math, so there are far too many "complex paragraphs" to mark each one by hand.

This is a limitation of Pandoc's document model. A paragraph can't contain divs. So... yeah, practically speaking, you don't.
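To make the limitation concrete, here is a minimal Python sketch of the block/inline split as it appears in pandoc's JSON AST. The constructor names are the ones from pandoc's document model; the helper `fits_in_paragraph` is hypothetical and only illustrates why a Div or Table cannot sit inside a Para while a Span or RawInline can.

```python
# Inline constructors of pandoc's document model; a Para holds only these.
INLINE_TYPES = {
    "Str", "Emph", "Underline", "Strong", "Strikeout", "Superscript",
    "Subscript", "SmallCaps", "Quoted", "Cite", "Code", "Space",
    "SoftBreak", "LineBreak", "Math", "RawInline", "Link", "Image",
    "Note", "Span",
}


def fits_in_paragraph(node):
    """Hypothetical helper: True if a JSON AST node is an inline element."""
    return node["t"] in INLINE_TYPES


# A Div is a block, so it always ends the surrounding paragraph.
div = {"t": "Div", "c": [["eq:test", [], []], []]}
# A Span is an inline, so it can stay inside the paragraph.
span = {"t": "Span", "c": [["eq:test", [], []], []]}
```

This is why the later suggestions in this thread all move towards spans and inline raw HTML.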

Side note, when targeting HTML with MathJax or somesuch, tableEqns isn't necessarily what you want. Using \tag (see here) produces generally better results, typographically speaking. In case you weren't already aware.

What about <span>?

I use KaTeX, so \tag is not available.

The problem is, tables are also block-level elements, so those can't be in paragraphs either, as far as Pandoc is concerned.

That being said, try eqnBlockInlineMath: true. It's a dirty hack, but since you're using inline HTML anyway, you're not particularly concerned about those.

Ah, no, sorry, that option does something different.

Meh. I don't have a ready solution, and I don't have the bandwidth to implement something at the moment.

Would it be feasible to add an additional class (such as class="not-closing-paragraph") to the div whenever it is in the same paragraph as the block or text after it? I could then use something like .not-closing-paragraph + p { text-indent: 0; }.

> The problem is, tables are also block-level elements, so those can't be in paragraphs either, as far as Pandoc is concerned.

Hmm, but this is not a table. This is inline raw HTML.

You could arguably slap together a lua filter to postprocess pandoc-crossref's output. My brain is toast at the moment, so I won't try to write code, but the idea is to set eqnInlineTemplate: $$e$$☸$$i$$ and then use Lua to replace every equation that contains ☸ (or any other symbol you pick) with your raw HTML, splitting on the symbol. Should be relatively straightforward? I can't recall off the top of my head whether Lua deals with Unicode properly, though, so the Unicode symbol I'm proposing might not be your best bet.
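A minimal sketch of that post-processing idea, written as a Python JSON filter rather than the Lua filter suggested above, using only the standard library. It assumes pandoc-crossref emits a single display-math element whose TeX contains the marker between the equation and its number; the CSS class names and all function names are hypothetical.

```python
import json

MARKER = "☸"  # assumption: the same symbol used in eqnInlineTemplate


def raw_html(text):
    """Build a RawInline node carrying literal HTML."""
    return {"t": "RawInline", "c": ["html", text]}


def display_math(tex):
    """Build a display-math inline node."""
    return {"t": "Math", "c": [{"t": "DisplayMath"}, tex]}


def split_equation(node):
    """Return replacement inlines for a marked display equation, else None."""
    if not (isinstance(node, dict) and node.get("t") == "Math"):
        return None
    math_type, tex = node["c"]
    if math_type.get("t") != "DisplayMath" or MARKER not in tex:
        return None
    body, number = tex.split(MARKER, 1)
    # Hypothetical class names; style them with flexbox or similar.
    return [
        raw_html('<span class="eqn-row">'),
        raw_html('<span class="eqn-body">'),
        display_math(body.strip()),
        raw_html("</span>"),
        raw_html('<span class="eqn-number">'),
        display_math(number.strip()),
        raw_html("</span>"),
        raw_html("</span>"),
    ]


def walk(node):
    """Copy the AST, splicing replacements into any list holding a marked equation."""
    if isinstance(node, list):
        out = []
        for item in node:
            replacement = split_equation(item)
            if replacement is not None:
                out.extend(replacement)
            else:
                out.append(walk(item))
        return out
    if isinstance(node, dict):
        return {key: walk(value) for key, value in node.items()}
    return node


def run_filter(stream_in, stream_out):
    """Read a pandoc JSON document, transform it, and write it back."""
    json.dump(walk(json.load(stream_in)), stream_out)
```

To try it as a filter, one would add `import sys` and a call to `run_filter(sys.stdin, sys.stdout)` at the end of the script, then pass it to pandoc with `-F` after `-F pandoc-crossref`. Since only inline nodes are produced, the equation stays inside its paragraph.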

> add an additional class

That's doable, but, again, no bandwidth to spare at the moment. I'll accept a PR. You'll want to add a class here:

    split res acc (x@(Span _ [Math DisplayMath _]):ys) =
      split ([x] : reverse (dropSpaces acc) : res)
            [] (dropSpaces ys)

(that span is converted to a div elsewhere because tables are block-level elements)

(for additional context, the classes are in the attributes argument of Span, the one matched by _ above, together with the id and key-value attributes)
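For readers who do not know Haskell, the same attribute layout is visible in pandoc's JSON serialization: a Span carries a triple of identifier, classes, and key-value pairs, followed by its inline content. The Python sketch below is not pandoc-crossref's code; `add_class` is a hypothetical helper, and the class name is the one proposed earlier in this thread.

```python
def add_class(span, cls):
    """Return a copy of a JSON AST Span with cls appended to its classes."""
    (identifier, classes, keyvals), inlines = span["c"]
    if cls not in classes:
        classes = classes + [cls]  # build a new list, leave the input untouched
    return {"t": "Span", "c": [[identifier, classes, keyvals], inlines]}


# A labelled display equation as a Span: attributes first, then content.
equation = {
    "t": "Span",
    "c": [
        ["eq:test", [], []],
        [{"t": "Math", "c": [{"t": "DisplayMath"}, "y"]}],
    ],
}
tagged = add_class(equation, "not-closing-paragraph")
```

With such a class in place, a rule like .not-closing-paragraph + p { text-indent: 0; } could target the paragraph that follows.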

Thank you for confirming that this is feasible, but I know nothing about Haskell. I will try the method of an additional filter first.

Errr, <table> simply cannot be nested in <p>.
https://stackoverflow.com/a/9852381
Though I imagine I may just use <span> for everything instead. The original issue still exists.

Ah. Right, I keep forgetting <p> isn't just a <div> with default styling. But with HTML specifically, flexbox would yield better results anyway.

Currently my workaround is to use this template and use an additional filter to detect and replace this template with spans. This workaround looks good for now, but I still think this is something that pandoc-crossref should and can do.

I've pushed something that seemingly approximates what we've discussed in f15f77c. Mostly this boils down to adding an inline template for the case when tableEqns: false. I believe you can more or less achieve the desired result with something to the tune of this example:

---
eqnDisplayTemplate: |
  <span style="display:flex;align-items:baseline">
      <span style="flex-grow:1">$$e$$</span>
      <span style="flex-grow:0">$$i$$</span>
  </span>
eqnInlineTemplate: $$e$$
header-includes:
  - '<style>p { text-indent: 1em; }</style>'
---

This is perhaps one of the most famous equations:
$$e^{i\pi} = -1,$${#eq:euler}
named after Leonhard Euler. However, in accordance with The Arnold Principle, it
was not discovered by him, or at least there are no written sources to
corroborate this attribution.

produces this:

[image: screenshot of the rendered output]

I've used MathJax, but I imagine it'll behave well enough with KaTeX, too. EDIT: regenerated with KaTeX.

Note eqnInlineTemplate: $$e$$ is important, otherwise you'll get the equation number twice. Unfortunately, I found no way to work around this quirk without introducing some non-obvious implicit behaviour.

Would be nice if you could test-drive and make sure it suits your requirements before I cut a release. If CI gods are not angry today, you should be able to grab binaries from https://github.com/lierdakil/pandoc-crossref/releases/tag/nightlies in about an hour (look for archives with f15f77c at the end of the name)

I tried the nightly release. It works well. Good work!

Thanks for testing! I'll cut a new release probably later this week, hoping to squeeze another (unrelated) feature in there if stars are right.