rdbende/chlorophyll

Serious performance issues

rdbende opened this issue · 14 comments

@Moosems:

As mentioned previously, the CodeView widget can be quite slow, and further testing shows some serious performance issues that need to be addressed. The code below produces a very slow CodeView widget after inserting 200 long lines of highlightable chars. To truly understand how slow it is, you must scroll around. Test code used:

from tkinter import Tk, Text, Scrollbar
from chlorophyll import CodeView
import time
time_reg = []
time_CodeView = []
def perform_test(msg, master, text_widget):
    # Scroll to every character of every line, then resize the window 1000 times
    start_time = time.time()
    current_text = text_widget.get("1.0", "end").splitlines()
    for i, line in enumerate(current_text):
        for j in range(len(line)):
            text_widget.see(f"{i+1}.{j}")
    for i in range(1000):
        master.geometry(f"{i+1}x{i+1}")
    total_time = time.time() - start_time
    print(f"Time taken for {msg} to scroll down and resize: {total_time}")
    if msg == "Text":
        time_reg.append(total_time)
    else:
        time_CodeView.append(total_time)
    master.destroy()

def testRegText():
    root = Tk()
    text = Text(root, wrap="none")
    text.grid(row=0, column=0)
    xscrollbar = Scrollbar(root, orient="horizontal", command=text.xview)
    xscrollbar.grid(row=1, column=0, sticky="ew")
    text.config(xscrollcommand=xscrollbar.set)
    yscrollbar = Scrollbar(root, orient="vertical", command=text.yview)
    yscrollbar.grid(row=0, column=1, sticky="ns")
    text.config(yscrollcommand=yscrollbar.set)
    text.insert("1.0", "| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |\n"*200)
    root.after(1000, lambda: perform_test("Text", root, text))
    root.mainloop()

def testCodeView():
    root = Tk()
    text = CodeView(root, wrap="none")
    text.pack()
    text.insert("1.0", "| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | |\n"*200)
    root.after(1000, lambda: perform_test("CodeView", root, text))
    root.mainloop()

testRegText()
testCodeView()
test_count = 5
for i in range(test_count):
    print(f"Test {i+1} of {test_count}")
    testRegText()
    testCodeView()

print("Average time taken for Text to scroll down and resize:", sum(time_reg)/len(time_reg))
print("Average time taken for CodeView to scroll down and resize:", sum(time_CodeView)/len(time_CodeView))

Output:

Time taken for Text to scroll down and resize: 3.411916971206665
Time taken for CodeView to scroll down and resize: 82.07269406318665
Test 1 of 5
Time taken for Text to scroll down and resize: 3.085193157196045
Time taken for CodeView to scroll down and resize: 82.81531071662903
Test 2 of 5
Time taken for Text to scroll down and resize: 3.0634210109710693
Time taken for CodeView to scroll down and resize: 81.8645350933075
Test 3 of 5
Time taken for Text to scroll down and resize: 3.0675089359283447
Time taken for CodeView to scroll down and resize: 82.0897867679596
Test 4 of 5
Time taken for Text to scroll down and resize: 3.0569241046905518
Time taken for CodeView to scroll down and resize: 81.81994986534119
Test 5 of 5
Time taken for Text to scroll down and resize: 3.06306791305542
Time taken for CodeView to scroll down and resize: 81.95325803756714
Average time taken for Text to scroll down and resize: 3.124672015508016
Average time taken for CodeView to scroll down and resize: 82.132455301284794

@rdbende:

Apparently, if you don't have a space between the |-s, the two widgets take about the same time. Hmmm...

@Moosems:

Hmmm

Swapping in other chars does change the speed, though. I think this is important to look into.

Far better results for me:

Time taken for Text to scroll down and resize: 0.5080173015594482
Time taken for CodeView to scroll down and resize: 9.920884847640991
Test 1 of 5
Time taken for Text to scroll down and resize: 0.5013024806976318
Time taken for CodeView to scroll down and resize: 9.0430326461792
Test 2 of 5
Time taken for Text to scroll down and resize: 0.5166349411010742
Time taken for CodeView to scroll down and resize: 8.503015518188477
Test 3 of 5
Time taken for Text to scroll down and resize: 0.4846668243408203
Time taken for CodeView to scroll down and resize: 8.886438608169556
Test 4 of 5
Time taken for Text to scroll down and resize: 0.5025997161865234
Time taken for CodeView to scroll down and resize: 9.270618677139282
Test 5 of 5
Time taken for Text to scroll down and resize: 0.5011169910430908
Time taken for CodeView to scroll down and resize: 8.964476108551025
Average time taken for Text to scroll down and resize: 0.5023897091547648
Average time taken for CodeView to scroll down and resize: 9.098077734311422

I'm particularly interested in how the changing of chars changes speed. I doubt it's anything to do with tabs or graphics but likely more due to OS.

If I set the lexer in CodeView to be TextLexer, then they produce about the same results.
The problem is that chlorophyll highlights every | character as a Python pipe operator, so 19000 tags are produced. Text tags in Tk are very slow.

> I'm particularly interested in how the changing of chars changes speed. I doubt it's anything to do with tabs or graphics but likely more due to OS.

It depends on their meaning in the language the CodeView uses. In this case the bar is an operator in CodeView's default language (Python).
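The 19000 figure can be sanity-checked without Tk. The sketch below is only an approximation: chlorophyll uses Pygments lexers, while this uses Python's stdlib `tokenize` module as a stand-in, and `PIPES_PER_LINE = 95` is an assumption inferred from 19000 tags over 200 lines rather than counted from the test string.

```python
import io
import tokenize

PIPES_PER_LINE = 95  # assumption: 19000 tags / 200 lines
LINE_COUNT = 200     # matches the "*200" in the test script


def count_operator_tokens(source: str) -> int:
    """Count operator tokens using the stdlib tokenizer (a stand-in for Pygments)."""
    readline = io.StringIO(source).readline
    return sum(
        1
        for token in tokenize.generate_tokens(readline)
        if token.type == tokenize.OP
    )


# Rebuild a line shaped like the one in the test script: pipes separated by spaces.
spaced_line = " ".join("|" * PIPES_PER_LINE) + "\n"
operator_count = count_operator_tokens(spaced_line * LINE_COUNT)
print(operator_count)
```

Every bar is its own operator token here, so under these assumptions each one would need its own tag range in the Text widget.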

Which is why I proposed view-only tags a while back.

But I'm still not sure whether constantly removing and re-adding the tags wouldn't introduce other performance issues.

I think that may be something for me to test, and if it doesn't yield any success then it can be ditched.

> But I'm still not sure whether constantly removing and re-adding the tags wouldn't introduce other performance issues.

Well, it would only be one line at a time, and you can only fit so many tags into one line, so unless it's a VERY dense line it should be faster. And if it is very dense, wouldn't it be faster to highlight only the tags in view? The only issue is that if you want to make a peer widget later for a minimap, it wouldn't have the highlighting. It would make sense to highlight only what's in view there too, though: you're likely not reading the code from the minimap or checking the highlighting there, and it also makes the viewable area pop out more. @rdbende Would we like to give it a try? If it's slower it can be removed. It might also fix the docstring issue in #10 by getting the tokens for all lines and only highlighting stuff in view.
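A minimal sketch of the view-only idea, assuming tokens have already been computed for the whole buffer and stored per line. `Span`, `spans_in_view`, `visible_line_range` and `apply_view_only_tags` are hypothetical names, not part of chlorophyll; the only widget calls used are the standard tkinter `Text` methods `index`, `winfo_height`, `tag_remove` and `tag_add`. Multi-line tokens such as docstrings are not handled here.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Span(NamedTuple):
    """Hypothetical record for one highlighted token on a single line."""
    tag: str        # tag name, e.g. "Token.Operator"
    line: int       # 1-based line number, as in Tk text indices
    start_col: int
    end_col: int


def spans_in_view(spans: Iterable[Span], first_line: int, last_line: int) -> List[Span]:
    """Keep only the spans whose line falls inside the visible range."""
    return [s for s in spans if first_line <= s.line <= last_line]


def visible_line_range(text_widget) -> Tuple[int, int]:
    """Return the first and last line currently shown by a tkinter Text widget."""
    first = int(text_widget.index("@0,0").split(".")[0])
    last = int(text_widget.index(f"@0,{text_widget.winfo_height()}").split(".")[0])
    return first, last


def apply_view_only_tags(text_widget, spans: Iterable[Span], tag_names: Iterable[str]) -> None:
    """Clear the given syntax tags everywhere, then re-add only the visible ones."""
    first, last = visible_line_range(text_widget)
    for tag in tag_names:
        text_widget.tag_remove(tag, "1.0", "end")
    for s in spans_in_view(spans, first, last):
        text_widget.tag_add(s.tag, f"{s.line}.{s.start_col}", f"{s.line}.{s.end_col}")
```

Something like `apply_view_only_tags` would presumably run on scroll and resize events. Whether the repeated clear-and-re-add is cheaper than keeping every tag is exactly the open question in this thread, so it would need to be measured.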

Yes, we could give it a try, but even though I know that the slowness is kinda annoying, this isn't a priority for me right now.

Yeah, I get it. I kinda assume that if I'm making anything for chlorophyll or tklinenums that I'm on my own and any help I get is something to be greatly appreciative of.

> that I'm on my own and any help I get is something to be greatly appreciative of.

Yeah, sorry bout that

No no no, don't apologize. You've been super helpful with everything and I'm really glad you've spent time to work with me on this. It means a lot.

Hello, I have modified the highlight_area function in codeview.py, making it smaller and improving performance a little. I have tested it on a Python file with 1500 lines of code.

def highlight_area(self, start_line: int | None = None, end_line: int | None = None) -> None:
    if start_line is None or end_line is None:
        return

    # Remove all existing syntax tags
    for tag in self.tag_names(index=None):
        if tag.startswith("Token"):
            self.tag_remove(tag, "1.0", "end")

    # Highlight all text again
    self.highlight_all()

Here is the time it took to insert and highlight all the text:

Insert time: 0.10043692588806152

I'm not sure why this would be any faster than the existing solution. Your code is basically just a slower wrapper for the highlight_all method, as it removes all relevant tags, and then calls highlight_all which will remove every tag and re-highlight everything.
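The duplicated work can be made visible with a small counting stub. This is a sketch, not chlorophyll's code: `CountingView` is a hypothetical stand-in that only counts `tag_remove` calls, and its `highlight_all` follows the behaviour described in the comment above (clear the syntax tags, then re-highlight), with the re-highlighting step left out.

```python
class CountingView:
    """Hypothetical stand-in for CodeView that only counts tag_remove calls."""

    def __init__(self, tags):
        self._tags = list(tags)
        self.remove_calls = 0

    def tag_names(self, index=None):
        return tuple(self._tags)

    def tag_remove(self, tag, start, end):
        self.remove_calls += 1

    def highlight_all(self):
        # Assumed behaviour: clear every syntax tag before re-highlighting.
        for tag in self.tag_names(index=None):
            if tag.startswith("Token"):
                self.tag_remove(tag, "1.0", "end")
        # (re-highlighting omitted in this sketch)

    def highlight_area(self, start_line=None, end_line=None):
        # The proposed wrapper from the comment above, unchanged in logic.
        if start_line is None or end_line is None:
            return
        for tag in self.tag_names(index=None):
            if tag.startswith("Token"):
                self.tag_remove(tag, "1.0", "end")
        self.highlight_all()


view = CountingView(["Token.Operator", "Token.Name", "sel"])
view.highlight_all()
baseline_removals = view.remove_calls

view.remove_calls = 0
view.highlight_area(1, 10)
wrapped_removals = view.remove_calls
print(baseline_removals, wrapped_removals)
```

Under these assumptions the wrapper issues every removal twice, so any measured speed-up would have to come from somewhere else, such as differences in how the two runs were timed.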