tree-sitter/tree-sitter-php

Html closing tag in Php file

dvandyy opened this issue · 6 comments

I ran into an issue with highlighting PHP combined with HTML. Whenever I write PHP inside an HTML element, the closing tag is not highlighted. I am new to all of this 'parsers / treesitter' stuff, so I don't know if it's a bug or if I am doing something wrong.


See #119. I'd say it's basically the same issue. It only happens when you have a <?php ... ?> block between the opening and closing tag. AFAIK there's no solution yet to this problem.

Can you open this issue on Neovim or Nvim-treesitter? This repository is just providing a PHP parser, and is not controlling the way that HTML and PHP are combined in Neovim specifically.

I opened the issue here because this was in the treesitter repo:

I have inspected the syntax tree using https://github.com/nvim-treesitter/playground and made sure
that no ERROR nodes are in the syntax tree. nvim-treesitter can not guarantee correct highlighting in the
presence of ERRORs -- in this case, please report the bug directly at corresponding parser's repository. (You can find all repository URLs in README.md.)

Sorry if it's not related to this.

@maxbrunsfeld this seems to be a parsing issue. The parse tree shows no ERRORs but is clearly not correct (the text_interpolation range is selected).


Our playground also shows the contents of injected languages, and those show errors because the injection ranges provided by the parent language seem to be a bit off.

Disclaimer: I don't know anything about PHP. I'm just assuming that the reported code is supposed to be valid PHP code (and not HTML; the HTML parser does not parse it validly either).

EDIT: apparently this is just a script parsing issue in the HTML parser. Sorry for any noise!

This is the expected behavior. In PHP, you can insert a ?> anywhere within the program, and everything after that is treated as arbitrary text, until the following <?php tag. So we model that behavior using a node called text_interpolation, which begins with a ?>, and ends with either EOF or a <?php. A text_interpolation is treated as an extra node, like a comment, which can appear anywhere in the tree. In that way, we're able to still produce a tree structure that describes an arbitrary PHP document.
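To make the "everything after ?> is text until the next <?php" rule concrete, here is a minimal Python sketch that finds those text regions with a plain string scan. It is not how the grammar is implemented: the names `text_ranges`, `OPEN_TAG`, and `CLOSE_TAG` are made up for illustration, and the scan assumes that ?> never appears inside a PHP string literal or heredoc, which a real parser has to handle.

```python
import re

# Hypothetical helper names; a plain scan, not the tree-sitter grammar.
OPEN_TAG = re.compile(r"<\?(?:php\b|=)")  # "<?php" or the short echo tag "<?="
CLOSE_TAG = "?>"


def text_ranges(source):
    """Return (start, end) offsets of the text that lies outside PHP tags."""
    ranges = []
    pos = 0  # start of the current text region
    while pos < len(source):
        opening = OPEN_TAG.search(source, pos)
        if opening is None:
            # No further PHP block: the text runs to EOF.
            ranges.append((pos, len(source)))
            break
        if opening.start() > pos:
            ranges.append((pos, opening.start()))
        closing = source.find(CLOSE_TAG, opening.end())
        if closing == -1:
            # The PHP block is never closed, so there is no more text.
            break
        pos = closing + len(CLOSE_TAG)
    return ranges


source = '<div>\n<?php echo "hi"; ?>\n</div>\n'
regions = [source[start:end] for start, end in text_ranges(source)]
```

For the sample above, `regions` holds the text before the PHP block and the text after it, so the opening and closing `div` tags end up in separate regions.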

To syntax highlight PHP, you just need to combine all text nodes' ranges and parse them as a single HTML document. That is what GitHub does for its syntax highlighting, see #119 (comment).
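A rough Python sketch of that idea, under some assumptions: the ranges would normally come from the PHP parser's text nodes, but here they are computed by hand for one sample, and Python's standard `html.parser` stands in for a real HTML grammar. The helpers `blank`, `mask_to_ranges`, and `TagCollector` are hypothetical names. Everything outside the text ranges is replaced with spaces so that offsets in the combined document still line up with the original file.

```python
from html.parser import HTMLParser


def blank(text):
    """Replace every character except newlines with a space."""
    return "".join(ch if ch == "\n" else " " for ch in text)


def mask_to_ranges(source, ranges):
    """Keep only the characters inside `ranges`; blank out the rest."""
    out = []
    pos = 0
    for start, end in sorted(ranges):
        out.append(blank(source[pos:start]))
        out.append(source[start:end])
        pos = end
    out.append(blank(source[pos:]))
    return "".join(out)


class TagCollector(HTMLParser):
    """Record the order of opening and closing tags."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))


source = '<div>\n<?php echo "hi"; ?>\n</div>\n'

# Stand-in for the text nodes' ranges (assumed, computed by hand here).
php_start = source.index("<?php")
php_end = source.index("?>") + len("?>")
ranges = [(0, php_start), (php_end, len(source))]

masked = mask_to_ranges(source, ranges)

parser = TagCollector()
parser.feed(masked)
parser.close()
```

Because both text regions are parsed as one document, the closing `div` is matched with its opening tag even though a PHP block sits between them. Parsing each region on its own would leave the closing tag unmatched, which is the symptom reported in this issue.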

Note that the one bit of complexity for GitHub's highlighting is that, inside of a fenced code block, GitHub inserts an implicit <?php tag at the beginning.

We discovered earlier that we don't exclude child nodes of the parent language from the injection ranges, which would be needed to get nested behavior (nvim-treesitter/nvim-treesitter#2557). We also need to think about whether the implicit <?php could be applied in Neovim as well.