jgm/skylighting

Bash parse error: trailing bracket parsed wrong as 'error' class

gwern opened this issue · 2 comments

gwern commented

I was reinstalling Pandoc/skylighting from HEAD last week, and that has led to a crop of sudden appearances of the "er" syntax highlighting class on code that ran fine when I wrote it*, and didn't have any syntax highlighting issues before. It looks like only 'Bash' syntax is affected, and it usually seems to involve '>' and periods in identifiers.

* since I don't use "er" in any code samples deliberately, I have a grep test to make sure none appear in generated outputs
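For anyone wanting a similar check, here is a minimal sketch of such a grep test. It is not my actual script: the function name `check_no_error_spans` is hypothetical, and it assumes the error class always shows up in the generated HTML as the literal attribute `class="er"`.

```bash
#!/usr/bin/env bash

# Hypothetical helper (name is an assumption, not part of pandoc or skylighting).
# Returns 0 if none of the given HTML files contain an error-class span,
# 1 if at least one does, and 2 if grep itself failed (e.g. unreadable file).
check_no_error_spans() {
    local status=0
    # -l prints only the names of matching files; -F matches a fixed string.
    grep -lF 'class="er"' -- "$@" || status=$?
    case "$status" in
        0) echo 'error-class spans found in the files listed above' >&2
           return 1 ;;
        1) return 0 ;;   # grep exits 1 when nothing matched: the good case
        *) return 2 ;;   # any other status means grep hit an error
    esac
}

# Example call (the path is illustrative only):
#   check_no_error_spans _site/*.html
```

Treating grep's "no match" exit status as success and any other failure as its own error keeps a missing file from silently passing the check.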

The simplest example I could construct by hand:

f() {
    echo > f
}

(This is valid Bash and executing f simply creates an empty file named f.)

~~~{.Bash}
f() {
    echo > f
}
~~~

pandoc -w html

<div class="sourceCode" id="cb1"><pre class="sourceCode Bash"><code class="sourceCode bash"><span class="fu">f()</span> <span class="kw">{</span>
    <span class="bu">echo</span> <span class="op">&gt;</span> f
<span class="er">}</span></code></pre></div>


Delete the > and the final bracket gets correctly parsed as <span class="kw">}</span>. This can be simplified further to >f but I'm not sure whether that's really sensible Bash; it does suggest that the error is triggered by a '>' with anything after it.

(One caveat here: to work around the self-linking line numbering in skylighting, which completely screws up gwern.net CSS and can't be disabled in any way even when numbered lines are never used, I manually edit skylighting to delete the H.span stuff from sourceLineToHtml; I can't see how that patch would screw up only a handful of my Bash code samples, but I mention it just in case.)

jgm commented

This bit of trace output shows what I think is the problem: it looks like we're doing one Pop too many on line end. We should end up still in CommandArgs, I think. So my guess is that there is a problem in the logic where checkLineEnd is run.

RegExpr MATCHED Just (NormalTok,"f")
CONTEXT STACK ["PathThenPop","WordRedirection2","CommandArgs","Group","Start"]
checkLineEnd for "PathThenPop" eol = True cLineEndContext = [Pop,Pop]
checkLineEnd for "WordRedirection2" eol = True cLineEndContext = [Pop]
checkLineEnd for "CommandArgs" eol = True cLineEndContext = [Pop]
CONTEXT STACK ["Group","Start"]

Never mind, I don't think that's the issue. This is fine, I just misinterpreted it.

jgm commented

Real problem is shown just a bit further down:

CONTEXT STACK ["PathThenPop","WordRedirection2","CommandArgs","Group","Start"]
checkLineEnd for "PathThenPop" eol = True cLineEndContext = [Pop,Pop]
Doing context switches for [Pop,Pop]
checkLineEnd for "WordRedirection2" eol = True cLineEndContext = [Pop]
Doing context switches for [Pop]
checkLineEnd for "CommandArgs" eol = True cLineEndContext = [Pop]
Doing context switches for [Pop]
CONTEXT STACK ["Group","Start"]
CONTEXT STACK ["Group","Start"]
CONTEXT STACK ["Group","Start"]
CONTEXT STACK ["Start"]

We shouldn't be exiting from the Group context until we hit the }.