Proposal: syntax highlighting
torwart opened this issue · 11 comments
First things first: I am not a pro at Rust, and this is only a proposal, nothing more.
Iota is a text editor, and I think every text editor needs syntax highlighting. I don't know if termbox/rustbox offers this, but it would be a great journey. How to do this? That's the point where I think #51 comes to the rescue. With TOML, YAML or JSON, a simple language definition would be (YAML example):
# Go
extension: "*.go"
keywords: func type #...
Keywords would be read as an array and registered with the given colors. When opening a file, Iota would check the file extension; if the extension is known, the matching language would be set.
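A minimal sketch of that lookup, assuming a simplified line-based `key: value` format rather than real YAML. The names `LanguageDef`, `parse_definition` and `find_language` are hypothetical and are not part of Iota.

```rust
/// A hypothetical language definition: a file extension plus its keywords.
#[derive(Debug, PartialEq)]
struct LanguageDef {
    extension: String,
    keywords: Vec<String>,
}

/// Parse a tiny `key: value` definition (an assumed format, not real YAML).
/// Text after `#` is treated as a comment; keywords are whitespace-separated.
fn parse_definition(text: &str) -> Option<LanguageDef> {
    let mut extension = None;
    let mut keywords = Vec::new();
    for line in text.lines() {
        // Drop the comment part and surrounding whitespace.
        let line = line.split('#').next().unwrap_or("").trim();
        if let Some((key, value)) = line.split_once(':') {
            match key.trim() {
                "extension" => {
                    // Store "go" for "*.go" (quoted or not) so lookups compare plain extensions.
                    let ext = value.trim().trim_matches('"').trim_start_matches("*.");
                    extension = Some(ext.to_string());
                }
                "keywords" => {
                    keywords = value.split_whitespace().map(String::from).collect();
                }
                _ => {} // Unknown keys are ignored in this sketch.
            }
        }
    }
    // A definition without an extension cannot be matched to a file.
    extension.map(|extension| LanguageDef { extension, keywords })
}

/// Pick the definition whose extension matches the file name, if any.
fn find_language<'a>(defs: &'a [LanguageDef], file_name: &str) -> Option<&'a LanguageDef> {
    let ext = file_name.rsplit_once('.')?.1;
    defs.iter().find(|def| def.extension == ext)
}
```

With the Go example above, `find_language(&defs, "main.go")` would return the Go definition, while a file with an unknown or missing extension would get no highlighting.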
Also, I think writing all of this in one file (languages.toml or languages.iota) would be nice, so it's a little bit more hackable.
What do you think about this? I would help implement it, but first I need to start learning Rust.
Good Evening.
Yep, syntax highlighting is definitely something I want to include in the future.
I'm not sure what is the best way to achieve this. I've certainly seen the approach you mention above, where we give a list of keywords, which map to specific file types and that's the language definition done, though I'm not sure if that's the best way to go about it.
I like the idea of allowing the user to completely customise the editor's behaviour for a specific language. I think this should include some syntax highlighting customisation. It would be cool if the user could define their own keywords which would be highlighted, or define extra patterns which would get treated differently.
To use a completely useless example: if the user wanted all import statements in Python files to be highlighted in blue, I think they should be able to do that.
I think much of how this works will come down to how we implement the customisation layer. But I agree that Iota definitely needs something like this!
I think that can be done by doing something like this:
# general colors
keyword_color: # colors for keywords
# go
keywords: # keywords...
An implementation example would be:
colors.config -> iota (read, set variables to colors)
languages.config -> iota (read as array, match keywords with colors from variables)
iota -> screen
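The three steps above could be sketched roughly as follows. Everything here is an assumption for illustration: the `Color` palette, the function names, and the use of ANSI escape codes as a stand-in for drawing through termbox/rustbox.

```rust
/// A hypothetical, tiny palette that a colors.config entry could name.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Color {
    Default,
    Blue,
    Yellow,
}

/// Step 1: turn a color name read from colors.config into a value.
fn parse_color(name: &str) -> Option<Color> {
    match name.trim() {
        "default" => Some(Color::Default),
        "blue" => Some(Color::Blue),
        "yellow" => Some(Color::Yellow),
        _ => None,
    }
}

fn is_word_char(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Step 2: split a line into word and non-word runs, and give every word
/// that appears in `keywords` (from languages.config) the keyword color.
fn colorize<'a>(line: &'a str, keywords: &[&str], keyword_color: Color) -> Vec<(&'a str, Color)> {
    let mut spans = Vec::new();
    let mut rest = line;
    while let Some(first) = rest.chars().next() {
        let in_word = is_word_char(first);
        // The run ends where the character class changes, or at end of line.
        let end = rest
            .find(|c: char| is_word_char(c) != in_word)
            .unwrap_or(rest.len());
        let (piece, tail) = rest.split_at(end);
        let is_keyword = in_word && keywords.iter().any(|k| *k == piece);
        spans.push((piece, if is_keyword { keyword_color } else { Color::Default }));
        rest = tail;
    }
    spans
}

/// Step 3: render the spans; ANSI escapes stand in for the real screen.
fn render(spans: &[(&str, Color)]) -> String {
    let mut out = String::new();
    for (text, color) in spans {
        match color {
            Color::Default => out.push_str(text),
            Color::Blue => out.push_str(&format!("\x1b[34m{}\x1b[0m", text)),
            Color::Yellow => out.push_str(&format!("\x1b[33m{}\x1b[0m", text)),
        }
    }
    out
}
```

Splitting on word boundaries keeps a keyword such as func from matching inside a longer identifier like funcs, which plain substring search would get wrong.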
It looks easy, but I think it isn't. It would be a hard step, but let's see what we get in #51.
Isn't plain keyword matching a little too simple? How would it match dynamic type names in declarations, etc.? I guess one could go with regex.
How are vim / emacs and others implementing this feature? Maybe it would be a good idea to 'borrow' their implementations.
@SebastianKeller Yep, regex would also be a nice thing. This is just an example for now.
Found vis, something like a Vim clone. It uses regex:
https://github.com/martanne/vis/blob/master/syntax.h
https://github.com/martanne/vis/blob/master/config.def.h#L900-1269
I think there are perhaps two different "levels" of syntax highlighting: "pattern" and "semantic".
Pattern highlighting is the easiest (and I think most editors only have this), simply coloring whatever matches a given set of regexes.
Semantic highlighting is (in my experience) typically offered only by the larger IDEs, and depends on the existence of a parsed AST for the buffer in question. Once you have the AST it's pretty easy to do all sorts of fun stuff, like highlighting instances of a particular member variable, or generic parameters, or what have you. (In fact, now that I think about it, with enough work, we might even be able to highlight specifiers in Rust's format strings... That can come later though. :D )
I think for now @torwart's approach makes a lot of sense: one language file maps regexes to categories, then the main config file maps categories to colors.
(If/when I (or anybody) finds time, it may be worth investigating possible applications of the TextObject material to this problem)
Having an AST for Rust should be doable, and awesome :) but the first step would be having a generic way to provide color information to text objects.
For reference, atom.io uses the TextMate grammar format, and you can find a definition here:
http://manual.macromates.com/en/language_grammars.html
It provides everything you suggested, plus additional ideas such as support for embedded syntax highlighting (highlighting JS inside HTML, for instance).
You can find the rust grammar definition here to have a concrete sample:
Note that there are a few issues; for instance, impl is not particularly well defined.
IMO, even if it looks great and defining a grammar is relatively simple, using only regex is surely not sufficient for a precise syntax highlighter.
Consider a function body. It's hard to get highlighting right inside a function body. How would you differentiate a type from an enum? You need an AST for that. Not necessarily the same AST that rustc uses, but at least one that understands the use/mod semantics of Rust.
However, defining a file format that supports any language's AST does not seem doable. You would be better off offering a package manager, as Atom does, and letting the community create a package for each language.
Hey all, it's been a while since I posted an update here, my apologies.
This week I started experimenting with syntax highlighting. I decided to implement a simple lexer which produces tokens from a given blob of text. I then use these tokens to determine how to render the corresponding text to the user.
The default lexer can be extended for each language; right now I've only added Python & Rust.
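To illustrate the general shape of such a lexer, here is a minimal sketch. It is not the code on the branch: the `TokenKind`, `Token` and `Language` names, the keyword lists, and the one-trait-per-language design are all assumptions, and string literals are not handled.

```rust
/// Token kinds a renderer could style (hypothetical, not Iota's actual types).
#[derive(Debug, Clone, Copy, PartialEq)]
enum TokenKind {
    Keyword,
    Ident,
    Number,
    Comment,
    Other,
}

/// A token is a kind plus the slice of source text it covers.
#[derive(Debug, PartialEq)]
struct Token<'a> {
    kind: TokenKind,
    text: &'a str,
}

/// Per-language hooks; the generic lexer below handles everything else.
trait Language {
    fn keywords(&self) -> &'static [&'static str];
    fn line_comment(&self) -> &'static str;
}

struct Rust;
impl Language for Rust {
    fn keywords(&self) -> &'static [&'static str] { &["fn", "let", "struct", "impl"] }
    fn line_comment(&self) -> &'static str { "//" }
}

struct Python;
impl Language for Python {
    fn keywords(&self) -> &'static [&'static str] { &["def", "class", "import"] }
    fn line_comment(&self) -> &'static str { "#" }
}

/// Turn a blob of text into tokens, consulting `lang` for the language-specific parts.
fn lex<'a>(src: &'a str, lang: &dyn Language) -> Vec<Token<'a>> {
    let mut tokens = Vec::new();
    let mut rest = src;
    while let Some(first) = rest.chars().next() {
        let marker = lang.line_comment();
        let (kind, len) = if !marker.is_empty() && rest.starts_with(marker) {
            // A line comment runs to the end of the line.
            (TokenKind::Comment, rest.find('\n').unwrap_or(rest.len()))
        } else if first.is_alphabetic() || first == '_' {
            let len = rest
                .find(|c: char| !(c.is_alphanumeric() || c == '_'))
                .unwrap_or(rest.len());
            let word = &rest[..len];
            let is_keyword = lang.keywords().iter().any(|k| *k == word);
            (if is_keyword { TokenKind::Keyword } else { TokenKind::Ident }, len)
        } else if first.is_ascii_digit() {
            let len = rest.find(|c: char| !c.is_ascii_digit()).unwrap_or(rest.len());
            (TokenKind::Number, len)
        } else {
            // Whitespace and punctuation: one character per token.
            (TokenKind::Other, first.len_utf8())
        };
        let (text, tail) = rest.split_at(len);
        tokens.push(Token { kind, text });
        rest = tail;
    }
    tokens
}
```

Under these assumptions, adding a language means implementing two small methods, and the renderer only needs to map each `TokenKind` to a color.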
It is by no means complete, and I'm not even convinced yet if I want to keep it like this. But right now it seems to work pretty well. I'm going to keep extending it for a while and see how it handles all the use-cases I put to it. That will help me decide whether or not it works.
I'm also toying with the idea of using the Rust lexer & token system (syntax::parse::lexer), but I haven't tried it out yet.
If you want, you can check it out on the syntax-highlight branch. Happy to hear anyone's thoughts on this!
Just merged the syntax-highlighting branch. It's by no means a perfect solution, and I think I'll eventually transition to a regex or AST based solution, but it's good enough for me right now.
Syntax highlighting is behind a feature gate for now, until I find a better solution. You can build it with cargo build --features syntax-highlighting.