Parsing failure in CSharp script (.csx) file with `#r` directives
blazkowolf opened this issue · 2 comments
I haven't looked into the code for this myself, but I imagine some edge case in the tree-sitter grammar is causing this anomaly. When `#r` directives appear at the top of a `.csx` file, the highlighting provided by tree-sitter breaks and only recovers partway down the source file.
Then a little ways down the file, you can observe the highlighting kicks back in.
Then, with the `#r` directives commented out, the highlighting works as expected.
My apologies in advance if this is the incorrect forum for an issue like this.
Options
We have a couple of options here:
- We add support for these `.csx`-specific preprocessor directives
This isn't difficult, except that I can't find an official list of them anywhere. They would be coloured correctly, but they would also be accepted as valid in normal C# files. Likewise, we might allow things in CSX files that aren't valid there; again, I can't find a spec.
I think this is what Roslyn does: it recognises the `#r` syntax even inside `.cs` files and then tells you it's only valid in scripts. We don't have this second level of problem flagging in tree-sitter.
- We add a "bad directive" preprocessor instruction
Roslyn also does this to handle all sorts of scenarios, and it lets the parser recover from preprocessor directives it doesn't understand. This should be easy to do and would mean we continue to parse a whole lot of invalid scenarios nicely, e.g. putting `#exit` in a file. Roslyn allows this too, so the whole file still looks nicely highlighted, and it uses the second level to flag that there is no such directive.
- We nest/extend the C# syntax from a new C# Script syntax
I don't know how to do this, but I suspect it's possible and that examples exist in other languages' tree-sitter grammars.
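To make the first two options concrete, here is a minimal Python sketch of the two-level idea described above. It is not tree-sitter grammar code and not how Roslyn is implemented. The first level accepts any `#name` line, falling back to a "bad directive" kind for unknown names. A separate second pass then reports what is not valid. The function names `classify` and `diagnose` are hypothetical, and the `SCRIPT_ONLY` set is an assumption, given that no official list of script directives has turned up.

```python
import re

# Directive names taken from the C# language reference.
STANDARD = {"if", "elif", "else", "endif", "define", "undef", "region",
            "endregion", "error", "warning", "line", "pragma", "nullable"}
# Script-only directives: an assumed, incomplete set (no official list found).
SCRIPT_ONLY = {"r", "load"}

# A '#' at the start of a line (after optional whitespace), then an optional name.
DIRECTIVE_RE = re.compile(r"^\s*#\s*([A-Za-z_]\w*)?")


def classify(line):
    """First level: return (kind, name) for a directive line, None for other code."""
    match = DIRECTIVE_RE.match(line)
    if match is None:
        return None
    name = match.group(1) or ""
    if name in STANDARD or name in SCRIPT_ONLY:
        return ("directive", name)
    # Unknown names are still consumed as one node, so parsing can carry on.
    return ("bad_directive", name)


def diagnose(lines, is_script):
    """Second level: report directives that parsed but are not valid here."""
    problems = []
    for number, line in enumerate(lines, start=1):
        result = classify(line)
        if result is None:
            continue
        kind, name = result
        if kind == "bad_directive":
            problems.append((number, f"unknown directive '#{name}'"))
        elif name in SCRIPT_ONLY and not is_script:
            problems.append((number, f"'#{name}' is only valid in scripts"))
    return problems
```

With only the first level, which is all tree-sitter gives us, `#r` and `#exit` both come through as single well-formed nodes, so the rest of the file keeps its highlighting. The validity checks in `diagnose` would have to live in some other tool.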
Going forward
Right now I'm tempted to do option 2, as it opens up a whole lot of recovery options. It would be great if there were a way of indicating that a parsing rule is there for recovery but isn't itself valid. Is this possible, @maxbrunsfeld?
> I can't find a specific official list of them anywhere
I don't know if there's an official list that includes the scripting directives, but here is the code in Roslyn that parses directives (don't miss the handling of `#!` at the end), and here is the code that maps directive names to the `SyntaxKind` used by the parser.
Though you may already know that.