elixir-lang/tree-sitter-elixir

Emacs tree-sitter support

wingyplus opened this issue · 31 comments

It would be interesting if we had support for Emacs. Currently, I'm setting up Emacs and tree-sitter on my local machine using https://github.com/emacs-tree-sitter/elisp-tree-sitter. I can help with the integration. :)

Sounds good, feel free to create an integration :D Is there anything necessary on this end to support it?

Currently, I'm looking at https://github.com/emacs-tree-sitter/tree-sitter-langs to see how to implement the query file to make syntax highlighting work. I'll try it tomorrow or Monday and will comment on the progress in this issue. :)

Cool! For some existing queries see highlights.scm in this repo and also nvim-treesitter/nvim-treesitter#1904. You probably can reuse most of that and just change the @annotations to whatever is expected :)

I just opened a PR to point at this repo instead of the now archived elixir grammar. emacs-tree-sitter/tree-sitter-langs#51

Sorry for not responding for a long time. I've been trying the highlights.scm from nvim-treesitter. I tried it on lib/elixir/lib/code/formatter.ex in the elixir repository, and the result looks like this:

Before applying tree-sitter (syntax highlighting from elixir-mode):

Screen Shot 2564-11-13 at 23 50 34

Screen Shot 2564-11-13 at 23 50 51

Screen Shot 2564-11-13 at 23 51 05

After applying tree-sitter:

Screen Shot 2564-11-13 at 23 51 26

Screen Shot 2564-11-13 at 23 51 38

Screen Shot 2564-11-13 at 23 52 17

I found an issue: some atoms in required_parens_on_binary_operands get highlighted and some don't, and I'm not sure why. But overall it looks good.

Hmm, the module attributes look off, and some of the keywords like def, case are not highlighted properly. You can play around with the order of the queries. I think the built-in highlighter always uses whatever matches first (so queries are ordered from most to least specific), while nvim-treesitter works the other way round (later matches take precedence).
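To make the ordering point concrete, here is a minimal Python sketch of the two precedence strategies. It does not use any tree-sitter API: the `PATTERNS` list, the tuple layout, and the function names are all hypothetical, and which tool follows which strategy is only what the comment above suggests.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a highlights query: each entry is
# (node type, required node text or None, capture name).
# Listed from most specific to least specific.
Pattern = Tuple[str, Optional[str], str]

PATTERNS: List[Pattern] = [
    ("identifier", "def", "keyword"),
    ("identifier", None, "variable"),
]


def matching_captures(node_type: str, text: str, patterns: List[Pattern]) -> List[str]:
    """Return the capture names of all patterns that match, in file order."""
    return [
        capture
        for (ptype, ptext, capture) in patterns
        if ptype == node_type and (ptext is None or ptext == text)
    ]


def first_match_wins(node_type: str, text: str, patterns: List[Pattern]) -> Optional[str]:
    """Strategy where the earliest matching pattern decides the highlight."""
    captures = matching_captures(node_type, text, patterns)
    return captures[0] if captures else None


def last_match_wins(node_type: str, text: str, patterns: List[Pattern]) -> Optional[str]:
    """Strategy where later matching patterns override earlier ones."""
    captures = matching_captures(node_type, text, patterns)
    return captures[-1] if captures else None
```

With the same pattern order, `def` comes out as `keyword` under the first strategy and as `variable` under the second, which is why a query file written for one tool may need its patterns reordered for the other.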

FTR here's what I get using the built-in highlighter (npx tree-sitter highlight) with these queries.

image
image

Thanks for your advice. The syntax highlighter is more straightforward than I thought.

Screen Shot 2564-11-14 at 14 58 37

Screen Shot 2564-11-14 at 14 58 11

Screen Shot 2564-11-14 at 14 57 51

I opened a branch on my fork here. I need help testing before sending a patch to upstream. :)

Progress update: I sent a PR to the tree-sitter-langs package to register a highlight query here.

I am playing with the feature/tree-sitter branch and have some version working with font-locking, indentation and navigation. I don't know tree-sitter very well and have not worked a lot on major modes, so not sure if what I am doing is correct. Maybe we can merge some version of this into the elixir-mode repository for when emacs 29 is out?

It is ready enough for me to use during the week while I tweak it and iron things out: https://github.com/wkirschbaum/elixir-mode/blob/main/elixir-mode.el

It's got some advantages over the current font-lock, but there seem to be some strange issues which I am looking at atm. Here is an example:
image

Not sure if this is what you’re looking for for the moduledoc issue https://github.com/wingyplus/tree-sitter-langs/blob/elixir-queries/queries/elixir/highlights.scm

@wingyplus i see the arrows are misleading. i meant those as an advantage, not an issue. I used the highlights file to generate the treesit-settings function, it is almost the same except for some tweaks.

There is a moduledoc issue, but it seems to be either an Emacs treesit bug (I see there are some fixes still rolling in) or a tree-sitter-elixir bug. I will investigate a bit this week.

This latest commit I made seems to have solved 95% of the issues with highlighting, it seems to be working very well.

Ah I see. It would be great if we could mark the string after doc attributes (moduledoc, doc, typedoc) as a comment.

but there seem to be some strange issues which I am looking at atm. Here is an example

Some of these are issues that I see almost everywhere that I see tree-sitter-elixir being used. Like doc comments not being treated as comments.

For example:

It makes other editors I want to try unusable except for VSCode because it drives me nuts (I write a lot of docs in my code).

e.g. zed-industries/zed#5865

I wish I knew what GitHub was doing to fix it on their end, since they apparently also use tree-sitter-elixir, and handle the @doc, @moduledoc, and @typedoc syntax highlighting as comments.

Hi @ryanwinchester, I think wkirschbaum/elixir-ts-mode has already marked it as comment style. 😎

https://github.com/wkirschbaum/elixir-ts-mode/blob/main/elixir-ts-mode.el#L70

Is this not something that tree-sitter-elixir could fix, or does every editor that uses it have to set that up on their own?

In fact, the highlight provided by tree-sitter-elixir also marks it as comment style.

https://github.com/elixir-lang/tree-sitter-elixir/blob/main/queries/highlights.scm#L8

So maybe those editors don't set the correct colors for the comment.doc style.

Weird. It doesn't look correct in here either: https://elixir-lang.org/tree-sitter-elixir

Because it uses a simple version of the highlighting rules.
You can copy the complete highlighting rules I mentioned above into the lower left corner of the demo web page.

Yes I did, and it is still not correct for me. Does it work for you?

@ryanwinchester tree-sitter-elixir handles the parsing part, that is turning code into meaningful nodes. The rules for highlighting nodes are generally up to the user (or in this case the editor projects). In this repo we have queries/highlights.scm, which is what GitHub uses and we explicitly mark docstrings as comments:

; * doc string
(unary_operator
  operator: "@" @comment.doc
  operand: (call
    target: (identifier) @comment.doc.__attribute__
    (arguments
      [
        (string) @comment.doc
        (charlist) @comment.doc
        (sigil
          quoted_start: _ @comment.doc
          quoted_end: _ @comment.doc) @comment.doc
        (boolean) @comment.doc
      ]))
  (#match? @comment.doc.__attribute__ "^(moduledoc|typedoc|doc)$"))
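The `#match?` predicate at the end of that query is what restricts it to doc attributes. As a rough illustration only, the same regular expression can be tried in Python; the helper name `is_doc_attribute` is made up for this sketch, and Python's `re` is assumed to behave like the query engine's regex for a pattern this simple.

```python
import re

# The pattern from the #match? predicate above, copied verbatim.
DOC_ATTRIBUTE = re.compile(r"^(moduledoc|typedoc|doc)$")


def is_doc_attribute(name: str) -> bool:
    """Return True if an attribute name would satisfy the predicate."""
    return DOC_ATTRIBUTE.search(name) is not None


# Attribute names as they might appear after "@" in Elixir source.
for name in ["doc", "moduledoc", "typedoc", "foo", "docs", "spec"]:
    print(name, is_doc_attribute(name))
```

Because the pattern is anchored on both ends, names such as `docs` or `foo` are rejected, so an ordinary attribute like `@foo "attribute string"` is not captured as `@comment.doc`.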

Weird. It doesn't look correct in here either: https://elixir-lang.org/tree-sitter-elixir

The demo page isn't meant for highlighting, it's a playground where you can see how the code gets parsed and then match queries against the parse tree. The colors are only to distinguish the query results (and they are arbitrary).

In this repo we have queries/highlights.scm, which is what GitHub uses and we explicitly mark docstrings as comments

Thanks, so it's easy-mode for editors if they use that? Thanks :)

Thanks, so it's easy-mode for editors if they use that?

Yeah, when integrating with the given editor you'd use pretty much the same query and just make sure to use node names that the editor understands for highlighting (e.g. @comment).
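One way to pick the name to write into an editor's copy of the query is to trim a dotted capture name until it is one the editor knows. The Python sketch below shows that idea under stated assumptions: the `resolve` helper and the sets of known names are hypothetical, and real editors may resolve names differently or not at all.

```python
from typing import Optional, Set


def resolve(capture: str, known: Set[str]) -> Optional[str]:
    """Trim trailing dotted components until the name is one the editor knows.

    Returns None when no prefix of the capture name is known.
    """
    parts = capture.split(".")
    while parts:
        candidate = ".".join(parts)
        if candidate in known:
            return candidate
        parts.pop()
    return None


# Hypothetical editor that only styles plain comments and strings.
BASIC_EDITOR = {"comment", "string"}

# Hypothetical editor that also has a dedicated doc-comment style.
RICH_EDITOR = {"comment", "comment.doc", "string"}

print(resolve("comment.doc", BASIC_EDITOR))  # falls back to the plain comment name
print(resolve("comment.doc", RICH_EDITOR))   # keeps the more specific name
```

Under this scheme an editor with only a generic comment style would still render docstrings as comments, while an editor with a dedicated doc-comment style could keep the two distinct.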

Here is an example in Nvim :)

this is what it currently looks like in elixir-ts-mode ( modus-vivendi )
image

Should comment use the same font as doc comments?

I don't know about other editors, but in Emacs it is really easy to modify the faces. Giving fewer options is not better.

Maybe we're getting into subjective territory but as long as it's comment-like and not rendered like a normal module attribute + string, it's good. Especially if you can just configure it yourself easily.

Personally I prefer it to just be the same as comments. Like how GitHub and VS Code handle it:

@doc "foobar"

@moduledoc """
foobar
"""

# comment

"string"

~S"""
foobar
"""

@foo "attribute string"

@ryanwinchester nothing stops you in Emacs, with the current elixir-ts-mode, from setting them to the same face. If we tag moduledocs with comment, then the user won't be able to easily specify two different faces for them if they want to. I don't think it is subjective: as it is now, you have more options to customize your fontification, and I don't think we should remove that option.

No, I agree.

@wingyplus is there any reason for this to still be open? Is there anything which needs to be done? In the next month or two I will try to get elixir-ts-mode into emacs core and currently we have elixir-ts-mode and heex-ts-mode on MELPA for the next year or so.

@wkirschbaum Thank you for creating those ts-modes. I'm very impressed. I think we have nothing left to do on this issue, so I'll close it now. 🙇