[Language Request] Ruby
connorshea opened this issue · 13 comments
A grammar already exists for Ruby, so hopefully it wouldn't be too hard to implement? :) https://github.com/tree-sitter/tree-sitter-ruby
I set it up in https://github.com/georgewfraser/vscode-tree-sitter/tree/ruby
But I'm not a Ruby user, so I don't have a great sense of what good Ruby syntax coloring should be. Would you clone this repo, run npm install, then hit F5 to debug, and take a look? The main thing you'll need to modify is the colorRuby function in extension.ts. You can use the tree-sitter playground to see what the tree-sitter syntax tree looks like.
I'm kind of out of my depth here as to actually contributing to the colorization, I'm not really sure what exactly should be considered a function v. field.
One good example of the colorizer not quite working right is with this code:
def upcase_array(array)
  array.map { |item| item.upcase }
end
array = ['hello', 'lowercase', 'strings']
puts upcase_array(array).join(' ')
If you put that in a Ruby file colorized by this extension, it looks like this:
Notably, puts upcase_array(array).join(' ') acts a bit weird (puts is a more common equivalent to print). upcase_array is a method call and then join is chained onto it. join is correctly highlighted as a 'function' (or 'field'? Like I said, I don't think I fully grok what a field is vs. a function), but upcase_array is not.
This is that line as represented by the tree sitter:
method_call [6, 0] - [6, 34])
  identifier [6, 0] - [6, 4])
  argument_list [6, 5] - [6, 34])
    method_call [6, 5] - [6, 34])
      call [6, 5] - [6, 29])
        method_call [6, 5] - [6, 24])
          identifier [6, 5] - [6, 17])
          argument_list [6, 17] - [6, 24])
            identifier [6, 18] - [6, 23])
        identifier [6, 25] - [6, 29])
      argument_list [6, 29] - [6, 34])
        string [6, 30] - [6, 33])
identifier [6, 5] - [6, 17]) is the upcase_array part of the line.
I think this part of the colorRuby() function needs to be updated to catch both the upcase_array and join method calls, rather than just the first one:
else if (x.type == 'call' && x.lastChild!.type == 'identifier') {
    fields.push(x.lastChild!)
}
If you add another branch to the code, so the function looks like this, the highlighting improves in this case:
function colorRuby(x: Parser.SyntaxNode, editor: VS.TextEditor) {
    var types: Parser.SyntaxNode[] = []
    var fields: Parser.SyntaxNode[] = []
    var functions: Parser.SyntaxNode[] = []
    function scan(x: Parser.SyntaxNode) {
        if (!isVisible(x, editor)) return
        if (x.type == 'method') {
            console.log(x.children)
            fields.push(x.children[1]!)
        } else if (x.type == 'singleton_method') {
            fields.push(x.children[3])
        } else if (x.type == 'instance_variable') {
            fields.push(x)
        } else if (x.type == 'call' && x.lastChild!.type == 'identifier') {
            fields.push(x.lastChild!)
        // Handle additional method calls
        } else if (x.type == 'method_call' && x.firstChild!.type == 'identifier') {
            fields.push(x.firstChild!)
        }
        for (const child of x.children) {
            scan(child)
        }
    }
    scan(x)
    return {types, fields, functions}
}
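To check which identifiers those two branches pick up without launching the extension, here is a minimal sketch that runs the same logic over a hand-built copy of the tree shown earlier. MiniNode and mini are hypothetical stand-ins for Parser.SyntaxNode, not part of the extension or of tree-sitter, and the tree is typed in by hand rather than parsed.

```typescript
// MiniNode is a hypothetical stand-in for Parser.SyntaxNode (assumption for illustration).
interface MiniNode {
  type: string
  text: string
  children: MiniNode[]
}

function mini(type: string, text: string, children: MiniNode[] = []): MiniNode {
  return { type, text, children }
}

// Collect the names picked up by the 'call' and 'method_call' branches.
function calledMethods(root: MiniNode): string[] {
  const found: string[] = []
  function scan(x: MiniNode): void {
    const first = x.children[0]
    const last = x.children[x.children.length - 1]
    if (x.type === 'call' && last !== undefined && last.type === 'identifier') {
      found.push(last.text)
    } else if (x.type === 'method_call' && first !== undefined && first.type === 'identifier') {
      found.push(first.text)
    }
    for (const child of x.children) {
      scan(child)
    }
  }
  scan(root)
  return found
}

// Hand-built tree for: puts upcase_array(array).join(' ')
const sampleTree = mini('method_call', "puts upcase_array(array).join(' ')", [
  mini('identifier', 'puts'),
  mini('argument_list', "upcase_array(array).join(' ')", [
    mini('method_call', "upcase_array(array).join(' ')", [
      mini('call', 'upcase_array(array).join', [
        mini('method_call', 'upcase_array(array)', [
          mini('identifier', 'upcase_array'),
          mini('argument_list', '(array)', [mini('identifier', 'array')]),
        ]),
        mini('identifier', 'join'),
      ]),
      mini('argument_list', "(' ')", [mini('string', "' '")]),
    ]),
  ]),
])

console.log(calledMethods(sampleTree))
```

Because the scan is pre-order, join (found on the call node) is reported before upcase_array (found on the method_call nested inside it). Note that the method_call branch also picks up puts.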
The tree-sitter with my change:
Sorry this comment is a bit rambly, I figured out how to fix it near the end of the comment and I don't want to rewrite everything :D
As it happens I was working on Ruby colors this morning, how's this?
I just published a new version, you should be able to download it now https://marketplace.visualstudio.com/items?itemName=georgewfraser.vscode-tree-sitter
👍 Highlighting it with yellow is also better since it makes things look less noisy.
Also FWIW, this code would typically use interpolation:
# Before
print "A: ", a.get, " ", b.get, "\n";
# After
print "A: #{a.get} #{b.get}\n"
And Ruby doesn't really use semicolons except for separating calls in single-line code :)
I'm currently comparing the vscode-tree-sitter coloring to the Ruby coloring in Atom, I'll give some more feedback in a bit.
Here are a few notable bits of syntax that aren't currently highlighted properly (Atom on the left, vscode-tree-sitter on the right).
This seems like a tree-sitter problem since tree-sitter just marks it as an identifier, but private in Ruby can be used to tell a module/class that all subsequent methods are supposed to be private. Can the extension be made to treat it as a keyword?
module Velma
  # This method is public
  def example_public_method
    'test'
  end

  private

  # This method is private
  def example_private_method
    'test'
  end
end
It doesn't seem like 'constants' – as they're called by the tree-sitter – are treated specially in terms of highlighting? e.g. with classes and modules:
require 'uri'
begin
  URI.open('https://google.com')
rescue URI::InvalidURIError => e
  puts "Error: #{e}"
end
Client.new('test')
Client::Subclient.method('test')
It doesn't look like, in a lot of situations, symbols have any special handling either. Interestingly, using the 'hashrocket' syntax it does highlight things.
hash = {
  key1: 'value1',
  key2: 'value2'
}
hash2 = {
  :key1 => 'value1',
  :key2 => 'value2'
}
progress_bar = ProgressBar.create(
  total: 'test',
  format: "\e[0;32m%c/%C |%b>%i| %e\e[0m"
)
Splat and block parameters aren't highlighted in any special way (I'm not really sure if they should be, but I thought it was worth noting).
# Block parameter
def guests(&block)
  puts 'test'
end
# Splat parameter
def guests_array(*array)
  puts 'test'
end
It also doesn't highlight class or global variables as variables (@@class_var, $global_var), at least not in the same way it does for instance variables (@instance_var).
class Human
  # A class variable. It is shared by all instances of this class.
  @@species = 'Homo sapiens'
end
$global = 'this is a global'
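One way to treat all three variable kinds alike is to classify them by sigil, sketched below. The variableKind helper and the scope strings are assumptions for illustration: the scope names are modeled on common Ruby TextMate grammars and may not match what a given theme styles, and the extension itself may key off tree-sitter node types instead of text.

```typescript
type VariableKind = 'class' | 'instance' | 'global'

// Assumed TextMate-style scope names; adjust to whatever the theme supports.
const variableScopes: Record<VariableKind, string> = {
  class: 'variable.other.readwrite.class',
  instance: 'variable.other.readwrite.instance',
  global: 'variable.other.readwrite.global',
}

// Classify a Ruby variable by its sigil; '@@' must be tested before '@'.
function variableKind(text: string): VariableKind | undefined {
  if (text.slice(0, 2) === '@@') return 'class'
  if (text.charAt(0) === '@') return 'instance'
  if (text.charAt(0) === '$') return 'global'
  return undefined
}

for (const sample of ['@@species', '@instance_var', '$global', 'local']) {
  const kind = variableKind(sample)
  console.log(sample, kind === undefined ? '(no sigil)' : variableScopes[kind])
}
```

Checking '@@' first matters, since a class variable also starts with a single '@'.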
I think I fixed all the things you pointed out, please update and see if everything looks right.
Looks pretty good :)
One thing I noticed is that slashes seem to cause problems.
In Ruby, you can use forward slashes for regex, e.g. /^*.+$/, and I'm not sure if the Textmate language file or the tree sitter is being overly-aggressive with the forward slashes.
I've been using the Learn X in Y Minutes doc for Ruby to test the tree-sitter with Ruby code, and it's a pretty good example of the problem.
Everything in this breaks after the 100.methods.include?(:/) #=> true line:
# This is a comment
# In Ruby, (almost) everything is an object.
# This includes numbers...
3.class #=> Integer
# ...and strings...
"Hello".class #=> String
# ...and even methods!
"Hello".method(:class).class #=> Method
# Some basic arithmetic
1 + 1 #=> 2
8 - 1 #=> 7
10 * 2 #=> 20
35 / 5 #=> 7
2 ** 5 #=> 32
5 % 3 #=> 2
# Bitwise operators
3 & 5 #=> 1
3 | 5 #=> 7
3 ^ 5 #=> 6
# Arithmetic is just syntactic sugar
# for calling a method on an object
1.+(3) #=> 4
10.* 5 #=> 50
100.methods.include?(:/) #=> true
# Special values are objects
nil # equivalent to null in other languages
true # truth
false # falsehood
nil.class #=> NilClass
true.class #=> TrueClass
false.class #=> FalseClass
# Equality
1 == 1 #=> true
2 == 1 #=> false
# Inequality
1 != 1 #=> false
2 != 1 #=> true
# Apart from false itself, nil is the only other 'falsey' value
!!nil #=> false
!!false #=> false
!!0 #=> true
!!"" #=> true
It can be fixed by adding a forward slash after the existing forward slash, which is why I suspect that's the source of the problem.
Also, FWIW I think symbols should be constant.language.symbol instead of constant.numeric, though I don't know how widely constant.language.symbol is supported by themes?
Right now the symbol itself is colored as a number but the : is colored as a symbol. Both should probably be colored as a symbol :)
EDIT: The blue coloring of the : was apparently caused by the VS Code Ruby extension enabling itself without me noticing, my bad. The symbol color issue still stands, however.
The end isn't caught here:
case level.to_sym
when :notice then "is-info"
when :success then "is-success"
when :error then "is-danger"
when :alert then "is-warning"
end
It also doesn't seem to be caught in an if / else / end or if / elsif / end block.
if user.avatar.attached?
  puts 'test'
else
  puts 'test'
end
EDIT: These instances of end not being highlighted are because the theme I'm using doesn't implement keyword.control, so it's not really a problem with this extension. My bad!
Method parameters also aren't highlighted as far as I can tell (I'm not sure why I failed to notice this before). Should be variable.parameter.function, I think?
Left is VS Code Ruby, right is Tree Sitter:
def meta_description(description)
  return description.presence
end
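A rough sketch of how parameter names could be collected from a method node follows. It assumes the grammar wraps parameters in a method_parameters node and uses splat_parameter and block_parameter for *array and &block; those node type names should be confirmed in the tree-sitter playground. ParamNode, param, and parameterNames are hypothetical names, not part of the extension.

```typescript
// ParamNode is a hypothetical stand-in for Parser.SyntaxNode (assumption for illustration).
interface ParamNode {
  type: string
  text: string
  children: ParamNode[]
}

function param(type: string, text: string, children: ParamNode[] = []): ParamNode {
  return { type, text, children }
}

// Return the identifier nodes that name a method's parameters.
// Node type names here are assumptions about the Ruby grammar.
function parameterNames(method: ParamNode): ParamNode[] {
  const found: ParamNode[] = []
  for (const child of method.children) {
    if (child.type !== 'method_parameters') continue
    for (const p of child.children) {
      if (p.type === 'identifier') {
        found.push(p)
      } else if (p.type === 'splat_parameter' || p.type === 'block_parameter') {
        // *array and &block wrap the identifier one level down
        for (const inner of p.children) {
          if (inner.type === 'identifier') found.push(inner)
        }
      }
    }
  }
  return found
}

// Hand-built tree for: def meta_description(description) ... end
const sampleMethod = param('method', 'def meta_description(description) ... end', [
  param('def', 'def'),
  param('identifier', 'meta_description'),
  param('method_parameters', '(description)', [
    param('(', '('),
    param('identifier', 'description'),
    param(')', ')'),
  ]),
  param('end', 'end'),
])

console.log(parameterNames(sampleMethod).map(n => n.text))
```

The returned nodes could then be pushed onto whichever list ends up mapped to variable.parameter.function.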
And there are a decent number of keywords that aren't currently handled in any way: initialize|new|loop|include|extend|prepend|raise|fail|attr_reader|attr_writer|attr_accessor|attr|catch|throw|private|private_class_method|module_function|public|public_class_method|protected|refine|using
Unfortunately it doesn't look like most of these are handled in any special way by the tree-sitter.
VS Code Ruby handles them in its TextMate file here:
Sorry I keep finding more nitpicks >.<
I tried messing around with the textmate file and added the keywords from the VS Code Ruby extension, but unfortunately the method matcher for tree-sitter overrides the TextMate grammar for a lot of them.
For example, protected works fine, but attr_accessor is caught by the tree-sitter method matcher:
What'd be the best way to fix that?
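One possible approach, sketched under assumptions: have the method matcher skip identifiers whose text is one of the keyword-like names listed above, so the TextMate scope for them is left in place. The keywordLikeMethods list is copied from that list; shouldColorAsMethod is a hypothetical helper, and where exactly it would be called inside colorRuby is not settled here.

```typescript
// Keyword-like method names, taken from the list earlier in this thread.
const keywordLikeMethods: string[] = [
  'initialize', 'new', 'loop', 'include', 'extend', 'prepend', 'raise', 'fail',
  'attr_reader', 'attr_writer', 'attr_accessor', 'attr', 'catch', 'throw',
  'private', 'private_class_method', 'module_function', 'public',
  'public_class_method', 'protected', 'refine', 'using',
]

// Hypothetical helper: true when the method matcher should color this name itself,
// false when it should leave the name to the TextMate grammar.
function shouldColorAsMethod(methodName: string): boolean {
  return keywordLikeMethods.indexOf(methodName) === -1
}

console.log(shouldColorAsMethod('attr_accessor')) // false
console.log(shouldColorAsMethod('join'))          // true
```

In the method_call branch this would amount to checking shouldColorAsMethod(x.firstChild!.text) before pushing the node. A cost of this design is that a user-defined method sharing one of these names would also be skipped.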