Support HEEx/Surface markup
jonatanklosko opened this issue Β· 40 comments
We need to handle these syntaxes. These should be standalone grammar handling .heex/.surface files, but we will also use tree-sitter injection to parse the relevant sigil nodes.
@connorlay already worked on both grammars, so we can just integrate them.
Regarding code organisation, on one hand it seems like a tree-sitter convention to keep all grammars as separate repositories, but separating heex/surface feels quite scattered. I saw that in tree-sitter-typescript they keep typescript and tsx grammars in separate directories, but they are bundled together (as the node/rust package), however that's a rather specific case because these are two variants of the same language and the grammars share most of the code. Since we are dealing with separate grammars, I feel that we should have separate tree-sitter projects, but we could place them in a single repo, each in its own directory. I'm just not sure if this will work well when installing the package directly from GitHub (which seems GitHub does). That said, I would be totally ok with separate repositories, just wouldn't want to spam elixir-lang with these :)
My main concern about being a single repo or multiple repos is about further functionality. If we implement "Go to definition", what are the odds we want to share functionality between them? Especially when HEEx and Surface templates do allow calling local functions?
There is also EEx, which is definitely part of Elixir. If we want to support it, would we make it a separate language package? That sounds a bit too much to me.
If we are not sure, we can also wait until we start "Go to definition" before making a decision.
To support code navigation for <Mod.fun>
I think we would parse Mod.fun
into a single node and then use injections to parse it as Elixir code, which would be a regular function call. Not sure about <.fun>
though, if we skip the dot and parse fun
then it's gonna be an identifier rather than call π€ Either way, we certainly want separate heex/surface grammars, hence injections seems like the only reasonable way to integrate go-to-definition there.
As for sharing functionality, as far as I understand we need to add Elixir support in github/semantic, so go-to-definition code will live outside tree-sitter-elixir anyway.
There is also EEx, which is definitely part of Elixir. If we want to support it, would we make it a separate language package? That sounds a bit too much to me.
There is already an official tree-sitter-embedded-template grammar for ERB and EJS, we could probably reuse that given how similar EEx is?
Alright, so it is either three directories in this repo or three separate repositories. Both are fine by me. It depends on how many colocated changes we will have to perform.
There is already an official tree-sitter-embedded-template grammar for ERB and EJS, we could probably reuse that given how similar EEx is?
Sounds good to me!
Alright, so it is either three directories in this repo or three separate repositories. Both are fine by me. It depends on how many colocated changes we will have to perform.
I don't expect any collocated changes, as the only integration point should be the injection queries.
So I'd lean towards separate repositories to adhere to the tree-sitter convention.
So perhaps it is best to start a discussion about moving all of those to the tree-sitter organization. Or alternatively we move all of them to elixir-editors (or we start an elixir-tree org).
Ideally we would move to the tree-sitter org yeah, otherwise the elixir-editors
sounds good to me (just so we don't have elixir-tree-sitter/tree-sitter-elixir
^^).
Alright, so let's give tree-sitter-elixir a try on neovim, and, if solid, we should request GitHub to move to tree-sitter-elixir/surface/heex for syntax highlighting.
Hey all, I've been busy the past few days and now am catching up on this thread :)
To support code navigation for <Mod.fun> I think we would parse Mod.fun into a single node and then use injections to parse it as Elixir code, which would be a regular function call. Not sure about <.fun> though, if we skip the dot and parse fun then it's gonna be an identifier rather than call thinking Either way, we certainly want separate heex/surface grammars, hence injections seems like the only reasonable way to integrate go-to-definition there.
This is the approach I took when integrating Surface with Neovim. I suggest we try to keep the HEEx/Surface/EEx grammars as similar to each other as possible to keep the maintenance overhead low when wiring up these injections.
As for sharing functionality, as far as I understand we need to add Elixir support in github/semantic, so go-to-definition code will live outside tree-sitter-elixir anyway.
I found this merged PR that added Rust support that illustrates what might be involved integrating Elixir into semantic. I think tag definitions are what drive the code navigation capability?
Alright, so let's give tree-sitter-elixir a try on neovim, and, if solid, we should request GitHub to move to tree-sitter-elixir/surface/heex for syntax highlighting.
Can GitHub move to tree-sitter-elixir for syntax highlighting before semantic support is added? Or is semantic integration blocking that as well? Either way, I'll get started on a branch migrating Neovim to the new grammar and see how that turns out.
Can GitHub move to tree-sitter-elixir for syntax highlighting before semantic support is added?
From my understanding, yes. :)
Either way, I'll get started on a branch migrating Neovim to the new grammar and see how that turns out.
Yes, beautiful.
I think tag definitions are what drive the code navigation capability?
I think so, although code navigation is not supported for Rust at this point. I think the Rust tags are missing the normalization parts, see Ruby/Tags.hs. Generally Ruby seems like a minimal example, it has just three files, and AST.hs is automatically generated, so it's pretty much just Tags.hs as far as I can see.
@josevalim My PR migrating Neovim to the new parser is open for full review now nvim-treesitter/nvim-treesitter#1904 π
@connorlay beautiful! π
Hi @connorlay, quick question: do you think we are ready to ask the GitHub team to move Elixir and Surface (and HEEx?) upstream?
@josevalim For Elixir, I think we are ready to ask GitHub to switch over. @jonatanklosko I'm curious to hear what you think.
For Surface and HEEx, I have a couple questions:
- How often does GitHub update their parser versions? If new syntax is introduced to the parsers, what does the upgrade path look like? Surface and HEEx are pre-1.0, so I expect more changes will occur to those compared to Elixir.
- I wrote both Surface and HEEx parsers initially for neovim, which utilizes additional query capture definitions than what tree-sitter provides alone. Will we need to port the highlight queries from nvim-treesitter to tree-sitter-surface and tree-sitter-heex before GitHub can utilize them for syntax highlighting? @the-mikedavis has a PR integrating HEEx with the Helix editor that might solve this for us.
- If GitHub is going to use the Surface and HEEx parsers, would it make sense to transfer the those repositories out of my personal namespace and to the elixir-lang organization? I am happy to do so as long as I retain my status as a maintainer π
@connorlay my plan is to ask to move all of them to the tree-sitter org. I will ask your questions upstream and follow up! :)
@connorlay as for HEEx the only thing that I'm unsure about is how to parse <.tag>
, if we want to use injection we need to parse "tag" into it's own node. But then, even changing the AST at any point should probably be fine, because the highlight queries will live in the same repo and can be updated together.
@josevalim when you get a chance please also ask if it's actually possible to integrate jump-to-definition with injections. I was looking at an example Rails repo and references don't seem to work for inside ERB templates, so maybe there's some limitation to that?
as for HEEx the only thing that I'm unsure about is how to parse <.tag>, if we want to use injection we need to parse "tag" into it's own node. But then, even changing the AST at any point should probably be fine, because the highlight queries will live in the same repo and can be updated together.
@jonatanklosko that is a good point. The existing tests for the HEEx and Surface parsers are pretty bare and could use some work to cover more scenarios. I'm not sure how much time I'll have this week to dedicate to this π€
That reminds me: we will also want to check to see if the HEEx parser works with the new slots feature.
Hi, popping into this discussion. I have a friend who works on the jump-to-definition stuff at GitHub. He says if the highlighting queries are ready, he can submit a proposal to integrate it into the highlight service.
Do y'all already have a GitHub contact you're working with to try to get things moved down the pipe?
Hi @maco! Help on this front would be appreciated. The highlight queries is ready but we do have one final question about repository organization. In particular, we are wondering how jump-to-definition works across multiple repositories.
For example, take https://github.com/tree-sitter/tree-sitter-embedded-template. In Ruby, you could do <%= SomeModule.method(123) %>
. However, there is currently no go-to-definition for those. I am wondering if this is because this was not implemented, if this is a limitation of how injections work, or if it is a limitation on the GitHub side of things.
The reason we ask is because we have a similar setup. We have the Elixir language and we have our embedded templates, and we want to make sure our go-to-definitions will work for those embedded templates too. Maybe we need to keep all of those in the same repository and, if that's the case, we want to merge the repositories before we send it upstream!
Good news, someone from GitHub reached out and they will enable tree-sitter-elixir on GitHub. They told me that both injections and definitions should work across repositories, so there is no reason to put everything together in the same repository.
We should tackle embedded template (.eex) definitions for Elixir and go-to definition in the future.
We should also work to have official integration for both HEEx and Surface on GitHub. @connorlay, please let us know if you need help on this front.
Also, if you want to move tree-sitter-heex to the Phoenix org, we would be glad to welcome you there! :)
@josevalim Exciting! To clarify, is GitHub now using tree-sitter-elixir and tree-sitter-heex, or just tree-sitter-elixir?
I'll get ready to move tree-sitter-heex to the Phoenix organization π
@connorlay only tree-sitter-elixir because I believe the highlight queries are not ready for heex and surface? Or have I misunderstood?
@josevalim No you are correct, I still need to port the highlight queries from nvim-treesitter to tree-sitter-heex and tree-sitter-surface before we can integrate them with GitHub. Thanks!
@josevalim @jonatanklosko Thanks to the efforts of @the-mikedavis , we now have highlight queries for tree-sitter-heex! Is this ready to send to GitHub?
I think on the heex side the queries are ready but we should probably await a resolution to #24 so that the injections are in good shape as well
Revisiting this thread, do folks think it would be worth moving forward with adding tree-sitter-heex
injections despite #24 reported as a known issue? I'm asking because today there is no syntax highlighting at all for ~H
sigils in GitHub, which is making it difficult to review pull-requests for my team at work. Enabling language injection and syntax highlighting would be a good first step towards improving the Live View developer experience in GitHub, knowing that the scenarios detailed in #24 might look off when highlighted.
@jonatanklosko @the-mikedavis @josevalim I am happy to work on the PR to add tree-sitter-heex
injections if you are comfortable sending it to GitHub once ready π
I think it would be a good increment to start using the heex grammar now, especially since there currently isn't ~H
highlighting π
The cases in #24 might be pretty hard to fix so I think we shouldn't hold off for their sake. (Also thanks for reminding about it, I was dragging my feet on investigating the scoped combined injections part π )
What do you think, Jonatan and JosΓ©?
Letβs go ahead! Having HEEx support with go to definitions for components will be glorious!
We can definitely move to tree-sitter-heex for syntax highlighting. As of right now, we donβt do any code navigation from templates, but weβve just added it to our roadmap, and Iβll be sure to get in touch when weβve verified that everything is working. (Thereβs nothing about our data model that should forbid code nav in templates, but weβd probably want to get it working for ERb first so we can dogfood it.)
phoenixframework/tree-sitter-heex#11 and #34 are open and ready for feedback π
@patrickt Those two PRs have now been merged π Is there anything else you need from us to enable tree-sitter-heex for syntax highlighting in GitHub?
Should tree-sitter-heex be moved under the phoenixframework org first?
Should tree-sitter-heex be moved under the phoenixframework org first?
Yes I think so π
To transfer a repository that you own to an organization, you must have permission to create a repository in the target organization.
According to the GitHub documentation, I need permission (does that mean membership?) to create a new repository under the phoenixframework organization.
@josevalim I think you are the only participant in this thread that is a member of the phoenixframework organization, how do you want me to handle transferring ownership?
@connorlay please transfer to me and I can transfer it to the org. :) Who should have access in the org besides you, @connorlay?
@josevalim I just submitted the transfer, you should see a request appear soon.
Who should have access in the org besides you?
@the-mikedavis if interested?
Transferred and invited. You also got access to other editor related projects, but feel free to ignore them. :)
@connorlay Iβve added it to our roadmap. It hopefully shouldnβt take long, but is probably going to be slightly more involved because we have to take care of the template injection.