jgm/pandoc

Link IDs are not generated when converting to markdown_strict with link-citations.

Watterry opened this issue · 1 comment

Discussed in #9729

Originally posted by Watterry May 7, 2024
I use link-citations: true to generate Hugo markdown (that is, markdown_strict). The command I am using is:

pandoc --citeproc --bibliography reference.bib --csl ieee.csl -t markdown_strict -o result.md index.md

I set "link-citations: true" in my index.md file to generate citation links. The strange thing is that the reference list generated by pandoc doesn't contain any ID attribute for the citation links to target. Here is the result:

<span class="csl-left-margin">\[1\]
</span><span class="csl-right-inline">“VikParuchuri/marker: Convert PDF
to markdown quickly with high accuracy.”
<https://github.com/VikParuchuri/marker>, 2024.</span>

<span class="csl-left-margin">\[2\]
</span><span class="csl-right-inline">“Top 5 google scholar APIs and
scrapers in 2024.”
<https://blog.apify.com/best-google-scholar-apis-scrapers/>,
2024.</span>

If I generate HTML from the same markdown file, the result does contain the ID attribute:

<div id="ref-VikParuchuri-marker" class="csl-entry" role="listitem">
<div class="csl-left-margin">[1] </div><div class="csl-right-inline"><span>“VikParuchuri/marker: Convert PDF to
markdown quickly with high accuracy.”</span> <a href="https://github.com/VikParuchuri/marker" class="uri">https://github.com/VikParuchuri/marker</a>, 2024.</div>
</div>
<div id="ref-Top-2024-05-06" class="csl-entry" role="listitem">
<div class="csl-left-margin">[2] </div><div class="csl-right-inline"><span>“Top 5 google scholar APIs and scrapers
in 2024.”</span> <a href="https://blog.apify.com/best-google-scholar-apis-scrapers/" class="uri">https://blog.apify.com/best-google-scholar-apis-scrapers/</a>,
2024.</div>
</div>

The problem is that, without an ID attribute in the markdown output, a citation in the final HTML page rendered by Hugo has no way to jump to the corresponding entry in the reference list. To make sure a citation jumps to its reference entry, I would need to manually add an ID like the ones in the HTML output above.

So, is there any way to write this ID into the span tag when using the link-citations option to convert the markdown file?

Also, I'm wondering, is this considered a bug?

jgm commented

I'm sorry, I missed that you're converting to markdown_strict. That's why the id isn't coming through. There's no very good way to include it.

In the AST, the ID is on a Div. The way to preserve the semantics would be to render the whole Div as HTML, but since markdown_strict doesn't allow markdown inside HTML blocks, that would be quite ugly. So I don't know if that is desirable?

Looking just at this case, of course you suggest putting the id on the initial span. But the writer is a general-purpose renderer; so the question is, how are we in general going to handle a Div with an id when rendering markdown_strict? The Div might not always begin with a Span, and might not even begin with a Para.
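
One possible workaround, outside the writer itself, is a JSON filter that runs after citeproc and copies each Div's id onto a raw HTML anchor. The following is only a minimal sketch under stated assumptions, not pandoc's behavior and not a proposed fix: the names add_anchors, anchor and run_filter are hypothetical, the sketch only descends into nested Divs (not block quotes or lists), and it assumes an empty `<a id="..."></a>` survives Hugo's rendering. It handles the general case raised above in the simplest way available: if the Div begins with a Para or Plain, the anchor is prepended inline; otherwise it is emitted as a separate raw HTML block.

```python
import html
import json


def anchor(ident):
    """Return a raw HTML anchor carrying the Div's identifier."""
    return '<a id="{}"></a>'.format(html.escape(ident, quote=True))


def add_anchors(blocks):
    """Return a new block list in which every Div with an id starts with an anchor."""
    out = []
    for block in blocks:
        if block.get("t") != "Div":
            out.append(block)
            continue
        (ident, classes, kvs), children = block["c"]
        # Recurse first: csl-entry Divs are nested inside the outer bibliography Div.
        children = add_anchors(children)
        if ident:
            if children and children[0]["t"] in ("Para", "Plain"):
                # The Div begins with a paragraph: prepend the anchor inline.
                first = children[0]
                raw = {"t": "RawInline", "c": ["html", anchor(ident)]}
                children = [{"t": first["t"], "c": [raw] + first["c"]}] + children[1:]
            else:
                # The Div begins with something else (or is empty): use a raw block.
                children = [{"t": "RawBlock", "c": ["html", anchor(ident)]}] + children
        out.append({"t": "Div", "c": [[ident, classes, kvs], children]})
    return out


def run_filter(instream, outstream):
    """Read a pandoc JSON document, add anchors, and write the JSON back out."""
    doc = json.load(instream)
    doc["blocks"] = add_anchors(doc["blocks"])
    json.dump(doc, outstream)


# Small in-memory demonstration on a hand-written AST fragment.
sample = [{"t": "Div", "c": [["ref-example", ["csl-entry"], []],
                             [{"t": "Para", "c": [{"t": "Str", "c": "[1]"}]}]]}]
print(json.dumps(add_anchors(sample)))
```

To use it as a filter, a script (here hypothetically named anchors.py) would call run_filter(sys.stdin, sys.stdout) and be passed with --filter anchors.py after --citeproc on the command line, so that it sees the bibliography Divs citeproc has already produced.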