Question about scopes and selectors
@gamecreature Hi Rick, I hope this off-topic question won't bother you... Here's the thing: I'd like to implement my own version of Sublime's view.extract_scope and view.scope_name methods in Python, but I'm a little bit lost.
So far I've got a Python Oniguruma regex engine ready to go, as well as a few dozen tmLanguage.json files ready to be consumed.
That said, could you please guide/mentor me on what I'd need to implement to achieve this little goal? I'm asking because my knowledge of scoping is still pretty limited, and even though I've rewritten some bits of your code, I still don't know very well how to use it or what the big picture is hehe :D
Thx in advance!
I will try ;-)
Accessing already parsed scopes:
In the edbee-app code all active scopes are displayed in the status bar:
edbee-lib/edbee-lib/edbee/texteditorcontroller.cpp
Lines 454 to 462 in d7a6c96
This is how you can access the parsed scopes.
Scope Parsing
When parsing a file, the lexer tries to match the grammar's regexps against the text.
There are multi-line rules (a begin and an end regexp) and single-line rules (a match regexp).
edbee-lib/edbee-lib/edbee/io/tmlanguageparser.cpp
Lines 96 to 149 in d7a6c96
When a multi-line regexp matches, a scope is opened and becomes active.
It stays in this scope until the end regexp matches.
This way it builds a tree of scopes.
(See edbee-lib/edbee-lib/edbee/lexers/grammartextlexer.cpp, lines 459 to 470 in d7a6c96.)
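To make the begin/end versus match distinction concrete, here is a minimal Python sketch of the idea. It is not edbee's code: the toy grammar, the ScopeNode class and the parse function are all made up for illustration, Python's re module stands in for Oniguruma, and patterns nested inside a multi-line scope are deliberately ignored.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ScopeNode:
    name: str
    start: int
    end: int = -1                      # -1 while the scope is still open
    children: list = field(default_factory=list)


# Hypothetical toy grammar (not a real tmLanguage file).
MATCH_RULES = [("constant.numeric.toy", re.compile(r"\d+"))]
BEGIN_END_RULES = [("comment.block.toy", re.compile(r"/\*"), re.compile(r"\*/"))]


def parse(text):
    root = ScopeNode("source.toy", 0)
    stack = [(root, None)]             # (open node, regexp that closes it)
    pos = 0
    while pos < len(text):
        node, end_re = stack[-1]
        if end_re is not None:
            # Inside a multi-line scope: stay there until its end regexp matches.
            m = end_re.search(text, pos)
            if m is None:
                break                  # never closed; handled after the loop
            node.end = m.end()
            stack.pop()
            pos = m.end()
            continue
        # Outside: pick whichever rule matches earliest in the remaining text.
        best = None
        for name, begin_re, closing_re in BEGIN_END_RULES:
            m = begin_re.search(text, pos)
            if m and (best is None or m.start() < best[0].start()):
                best = (m, name, closing_re)
        for name, match_re in MATCH_RULES:
            m = match_re.search(text, pos)
            if m and (best is None or m.start() < best[0].start()):
                best = (m, name, None)
        if best is None:
            break
        m, name, closing_re = best
        child = ScopeNode(name, m.start())
        node.children.append(child)
        if closing_re is None:
            child.end = m.end()        # single-line rule: closed immediately
        else:
            stack.append((child, closing_re))  # multi-line rule: scope opens
        pos = m.end()
    for open_node, _ in stack:         # close whatever is still open at EOF
        if open_node.end == -1:
            open_node.end = len(text)
    return root


root = parse("12 /* a\n 34 */ 5")
for child in root.children:
    print(child.name, child.start, child.end)
```

Note that the 34 inside the comment gets no numeric scope, because while the comment scope is open this sketch only looks for its end regexp.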
Thank you very much, that helps! Today I'll try to allocate some time to port that to the experimental pyblime.
Btw... I already asked in some freenode irc channels like {python, pyside2, pyqt, sublimetext} for people's help to make something real out of that little experiment but so far nobody got interested. I guess that's pretty normal in any case, from what I've seen here in github people will just start contributing to very mature projects or they will just open PR requesting features they want :D
At this point I've learned all the basics about scopes & selectors, but one thing still remains unclear to me: how extract_scope is implemented in SublimeText. Consider this data, extracted from SublimeText using this command:
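import sublime_plugin


class TestScopeCommand(sublime_plugin.TextCommand):
    def run(self, edit, block=False):
        print('-' * 80)
        view = self.view
        # One row per character offset in the buffer.
        for i in range(view.size()):
            a = i                                         # offset
            b = repr(view.substr(i))                      # character at offset
            c = str(view.extract_scope(i))                # range, as text
            d = repr(view.substr(view.extract_scope(i)))  # text in that range
            e = view.scope_name(i)                        # scope names at offset
            print("{:<5}{:<5}{:<10}{:<65}{}".format(a, b, c, d, e))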
used on this test file foo.py:
# I'm a comment

def foo():
    print('# No comment')
If you apply that command to foo.py you'll get the data below, where the 3rd column contains the ranges obtained by extract_scope:
table = [
[0, '#', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python punctuation.definition.comment.python "],
[1, ' ', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[2, 'I', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[3, "'", (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[4, 'm', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[5, ' ', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[6, 'a', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[7, ' ', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[8, 'c', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[9, 'o', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[10, 'm', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[11, 'm', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[12, 'e', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[13, 'n', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[14, 't', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[15, '\n', (0, 16), "# I'm a comment\n", "source.python comment.line.number-sign.python "],
[16, '\n', (0, 54), "# I'm a comment\n\ndef foo():\n    print('# No comment')\n", "source.python "],
[17, 'd', (17, 24), 'def foo', "source.python meta.function.python storage.type.function.python "],
[18, 'e', (17, 24), 'def foo', "source.python meta.function.python storage.type.function.python "],
[19, 'f', (17, 24), 'def foo', "source.python meta.function.python storage.type.function.python "],
[20, ' ', (17, 24), 'def foo', "source.python meta.function.python "],
[21, 'f', (17, 24), 'def foo', "source.python meta.function.python entity.name.function.python meta.generic-name.python "],
[22, 'o', (17, 24), 'def foo', "source.python meta.function.python entity.name.function.python meta.generic-name.python "],
[23, 'o', (17, 24), 'def foo', "source.python meta.function.python entity.name.function.python meta.generic-name.python "],
[24, '(', (24, 26), '()', "source.python meta.function.parameters.python punctuation.section.parameters.begin.python "],
[25, ')', (24, 26), '()', "source.python meta.function.parameters.python punctuation.section.parameters.end.python "],
[26, ':', (25, 27), '):', "source.python meta.function.python punctuation.section.function.begin.python "],
[27, '\n', (0, 54), "# I'm a comment\n\ndef foo():\n    print('# No comment')\n", "source.python "],
[28, ' ', (0, 54), "# I'm a comment\n\ndef foo():\n    print('# No comment')\n", "source.python "],
[29, ' ', (0, 54), "# I'm a comment\n\ndef foo():\n    print('# No comment')\n", "source.python "],
[30, ' ', (0, 54), "# I'm a comment\n\ndef foo():\n    print('# No comment')\n", "source.python "],
[31, ' ', (0, 54), "# I'm a comment\n\ndef foo():\n    print('# No comment')\n", "source.python "],
[32, 'p', (32, 38), 'print(', "source.python meta.function-call.python meta.qualified-name.python support.function.builtin.python "],
[33, 'r', (32, 38), 'print(', "source.python meta.function-call.python meta.qualified-name.python support.function.builtin.python "],
[34, 'i', (32, 38), 'print(', "source.python meta.function-call.python meta.qualified-name.python support.function.builtin.python "],
[35, 'n', (32, 38), 'print(', "source.python meta.function-call.python meta.qualified-name.python support.function.builtin.python "],
[36, 't', (32, 38), 'print(', "source.python meta.function-call.python meta.qualified-name.python support.function.builtin.python "],
[37, '(', (37, 38), '(', "source.python meta.function-call.python punctuation.section.arguments.begin.python "],
[38, "'", (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python punctuation.definition.string.begin.python "],
[39, '#', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[40, ' ', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[41, 'N', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[42, 'o', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[43, ' ', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[44, 'c', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[45, 'o', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[46, 'm', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[47, 'm', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[48, 'e', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[49, 'n', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[50, 't', (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python "],
[51, "'", (38, 52), "'# No comment'", "source.python meta.function-call.python meta.function-call.arguments.python meta.string.python string.quoted.single.python punctuation.definition.string.end.python "],
[52, ')', (51, 53), "')", "source.python meta.function-call.python punctuation.section.arguments.end.python "],
[53, '\n', (0, 54), "# I'm a comment\n\ndef foo():\n    print('# No comment')\n", "source.python "],
]
By looking at those ranges, would you be able to infer the algorithm extract_scope uses behind the curtains? I've already asked in the Sublime forums and nobody has been able to provide a positive answer, so I'm asking you as I'm aware you've got quite deep knowledge of this subject... crossing fingers :)
PS: I've tried to extract the asm code of this particular function with the debugger but I haven't even been able to find it... no clue if it lives in plugin_host.exe or sublime_text.exe... hehe ;)
Thanks in advance.
I don't know exactly what Sublime uses.
But edbee searches for 'multi-line scopes' with a start and an end regular expression.
It finds single-line scopes by applying the regexps in a given context (regexps can be conditional on the active scopes).
What Sublime returns in the example above are just the multi-line scope ranges (column 3).
As you can see, the first character of the comment also has the scope 'punctuation.definition.comment.python'.
The range for that scope is (0,0).
Mmm, not sure if I've understood correctly... Do you mean that the exact equivalent of Sublime's extract_scope is what you call multi-line scopes? And what do you mean by the range of that scope being (0,0)? In Sublime, pt 0 gives (0,16).
Anyway, the idea is to use https://github.com/brupelo/pysyntect to mimic Sublime's extract_scope. At this point I know how to get column 5 and I know how to apply selectors... Would that be enough to implement extract_scope? If you could give more details about how to implement it, that would be great. It's the only missing function for me to have a 1:1 equivalent of Sublime's toggle comments on a QScintilla widget, as I've already reverse engineered the rest of the ST functions, i.e. view.insert, view.erase and view.replace ;-)
Of course, once I've confirmed it works on a QScintilla widget we could adapt the code to edbee. Toggle comments is one of the most important features a text editor should have, and Sublime's works fantastically well.
What I see is that extract_scope returns the scope for every character (offset in the document).
It is not efficient, but you could fetch the scopes at every character (pseudo code):
foreach(offset in document) {
    document->scopes()->scopesAtOffset(offset);
}
I guess it's more efficient to fetch all scopes once and fill in the characters yourself.
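A rough Python sketch of that second approach, fetching all scoped ranges once and filling in the per-character scope names. The (start, end, name) tuples and the scopes_per_offset helper are assumptions made for illustration; the ranges are read off the first line of the table above, not obtained from edbee or Sublime.

```python
# Hypothetical scoped ranges as (start, end, name), end exclusive,
# listed outermost first. Read off the table above for illustration only.
RANGES = [
    (0, 54, "source.python"),
    (0, 16, "comment.line.number-sign.python"),
    (0, 1, "punctuation.definition.comment.python"),
]


def scopes_per_offset(length, scoped_ranges):
    """Return one scope-name string per character offset."""
    per_offset = [[] for _ in range(length)]
    for start, end, name in scoped_ranges:
        # Stamp the scope name onto every offset the range covers.
        for offset in range(max(start, 0), min(end, length)):
            per_offset[offset].append(name)
    return [" ".join(names) for names in per_offset]


names = scopes_per_offset(54, RANGES)
print(names[0])    # the '#' character
print(names[1])    # inside the comment
print(names[16])   # the blank line after the comment
```

This walks each range once instead of querying the scope list at every offset, which is the trade-off suggested above.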
[Ctrl + Shift + X, S] dumps the scopes (Mac Os X, [Command + Shift + X, S] )
The DebugCommand.cpp file contains the code which dumps the scopes
https://github.com/edbee/edbee-lib/blob/master/edbee-lib/edbee/commands/debugcommand.cpp
I don't know if this is the answer to your question.
The TextDocumentScopes class contains all scopes that have been parsed
Mmm, hehe, I guess my poor English and the fact that I'm texting from my phone make this difficult to understand. About the explanation in your previous comment: the name of the scope at each character is something I already know how to get; I've got that information (5th column) at my disposal. What I don't know how to compute is the 3rd column.
extract_scope receives either a position or a range as input and returns a range as output. A range in this context is just a tuple of 2 integers, the start and end offsets. Hopefully now my question makes more sense ;)
The range displayed there seems to be the range of the last active multi-line scope.
(In this example, the 'comment.line.number-sign.python' scope, which spans offsets 0 to 15; 16 is exclusive.)
In edbee, that would be the last scope of:
document->scopes()->multiLineScopedRangesBetweenOffsets(0,0)
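As a sketch of that guess in Python: take the last multi-line range that contains the point, and fall back to the whole document when none does. This is only the hypothesis from this thread, not Sublime's confirmed algorithm. The extract_scope function below, its parameters and the two ranges (the comment and the string, taken from the table above) are assumptions for illustration.

```python
def extract_scope(point, multi_line_ranges, doc_length):
    """Guess at extract_scope: range of the last multi-line scope at point.

    multi_line_ranges is a list of (start, end) tuples, end exclusive,
    ordered outermost first so the last hit is the innermost scope.
    """
    result = (0, doc_length)           # fallback: the whole document
    for start, end in multi_line_ranges:
        if start <= point < end:
            result = (start, end)
    return result


# Hypothetical input: the comment and the string from foo.py.
RANGES = [(0, 16), (38, 52)]

print(extract_scope(0, RANGES, 54))
print(extract_scope(16, RANGES, 54))
print(extract_scope(40, RANGES, 54))
```

With these two ranges the sketch agrees with the comment, blank-line and string rows of the table. Rows such as 'def foo' at (17, 24) would only come out right if those ranges were also part of the input, so treat this as a starting point rather than a 1:1 equivalent.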
Mmm, is that so? Wow, cool... Maybe I should re-evaluate this project then... About the other feature, multi-selection with the mouse... How hard do you think it would be to have it ready? I'm asking because maybe it's worth implementing toggle comment directly in this project and not on a QScintilla widget...
I haven't implemented extract_scope in pyblime yet, but theoretically, once I figure out how to do it, I'll be able to use Sublime's existing code directly and it will work out of the box.
Btw, I've read somewhere in your blog few days ago you liked the demoscene... was one of the main reasons you created edbee so you could use it in some demotool/tool intended for 3d graphics? I'm quite curious about it actually :)
Let me tell you I'm a scener myself and one of the things I love the most is making demotools... creating 3d tools using python and c++ is a really enjoyable and fun experience... highly recommended :)
Awesome! Cool stuff ;)
Let me tell you I'm a total geek about demotools (especially the ones intended to create 4k/64k) and I know almost every existing tool in the scene... I'm curious about that texture/mesh editor, do you use nodes?
In my case, I'll also use pyblime as a drop-in widget replacement eventually for some of my tools ;)