Page trail: checking the existence of items slows down the response time for larger wikis
UlrichB22 opened this issue · 5 comments
I tested an imported wiki with 2400 items and about 12000 revisions. Each Whoosh index query takes about 120 ms.
At the start of my tests the browser history was empty. I then viewed TestItem01, TestItem02, ... TestItem05 in turn, and finally TestItem01 again.
The following server output was produced with additional clock timers:
DEBUG 2024-10-17 21:55:54,261 moin.utils.clock:48 timer total(0): 1564.11ms /TestItem01
DEBUG 2024-10-17 21:55:58,699 moin.utils.clock:48 timer total(0): 1602.45ms /TestItem02
DEBUG 2024-10-17 21:56:03,658 moin.utils.clock:48 timer total(0): 1795.20ms /TestItem03
DEBUG 2024-10-17 21:56:08,248 moin.utils.clock:48 timer total(0): 2165.31ms /TestItem04
DEBUG 2024-10-17 21:56:13,020 moin.utils.clock:48 timer total(0): 2321.03ms /TestItem05
DEBUG 2024-10-17 21:56:17,832 moin.utils.clock:48 timer total(0): 2376.22ms /TestItem01
With 5 entries in the page trail, the response time increased by 800 ms. Two additional index queries are used for each item.
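As a quick check of that figure, the timer lines above can be parsed and compared. This is only a sketch: the regular expression and the helper name `parse_timings` are my own, and the division by 5 simply assumes the growth is spread evenly over the trail entries.

```python
import re

# Timer lines copied from the server output above.
LOG = """\
DEBUG 2024-10-17 21:55:54,261 moin.utils.clock:48 timer total(0): 1564.11ms /TestItem01
DEBUG 2024-10-17 21:55:58,699 moin.utils.clock:48 timer total(0): 1602.45ms /TestItem02
DEBUG 2024-10-17 21:56:03,658 moin.utils.clock:48 timer total(0): 1795.20ms /TestItem03
DEBUG 2024-10-17 21:56:08,248 moin.utils.clock:48 timer total(0): 2165.31ms /TestItem04
DEBUG 2024-10-17 21:56:13,020 moin.utils.clock:48 timer total(0): 2321.03ms /TestItem05
DEBUG 2024-10-17 21:56:17,832 moin.utils.clock:48 timer total(0): 2376.22ms /TestItem01
"""

# Hypothetical pattern matching the "timer total(0): <ms>ms <path>" part.
PATTERN = re.compile(r"timer total\(0\): ([\d.]+)ms (\S+)")


def parse_timings(log):
    """Return a list of (path, milliseconds) tuples in log order."""
    return [(m.group(2), float(m.group(1))) for m in PATTERN.finditer(log)]


timings = parse_timings(LOG)
first_ms = timings[0][1]   # first view, empty browser history
last_ms = timings[-1][1]   # same item again, 5 entries in the page trail
increase_ms = last_ms - first_ms
per_entry_ms = increase_ms / 5  # assumption: cost is spread evenly

print(f"increase: {increase_ms:.2f} ms, per trail entry: {per_entry_ms:.2f} ms")
```

The difference between the first and the last request for TestItem01 is a little over 800 ms, which matches the number quoted above.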
The relevant parts in the code are:
- moin/src/moin/themes/__init__.py, line 352 in 807bfe1
- moin/src/moin/themes/__init__.py, line 373 in 807bfe1
IMO there are two solutions to regain performance:
- add the exists status to the page trail entry when an item is shown. If the item is deleted by someone else in the meantime, the status may not be accurate.
- do not add non-existing items to the page trail and remove the checks.
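Both ideas could look roughly like the sketch below. It is not the moin code: `TrailEntry`, `add_to_trail`, `max_len` and `skip_missing` are hypothetical names, and the caller is assumed to already know whether the item exists because it has just shown it (or shown the create dialog).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrailEntry:
    """Hypothetical trail entry that carries the exists status with it."""
    name: str
    exists: bool


def add_to_trail(trail: List[TrailEntry], name: str, exists: bool,
                 max_len: int = 5, skip_missing: bool = False) -> List[TrailEntry]:
    """Return a new trail with `name` as the most recent entry.

    exists       -- status known at the time the item was shown (solution 1);
                    it can go stale if the item is deleted later.
    skip_missing -- if True, non-existing items are never added (solution 2),
                    so no exists check is needed when rendering the trail.
    """
    if skip_missing and not exists:
        return list(trail)
    # Drop an older entry for the same item so it moves to the end.
    updated = [entry for entry in trail if entry.name != name]
    updated.append(TrailEntry(name, exists))
    return updated[-max_len:]


# Solution 1: remember the status, no index query when rendering the trail.
trail: List[TrailEntry] = []
for item_name in ["TestItem01", "TestItem02", "TestItem01"]:
    trail = add_to_trail(trail, item_name, exists=True)

# Solution 2: a non-existing item never enters the trail.
unchanged = add_to_trail(trail, "NoSuchItem", exists=False, skip_missing=True)
```

Either way the theme code would only read the stored entries when rendering the trail instead of querying the index for each one.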
When I try to open an item by entering a non-existing item name in the browser URL, the create dialog is shown. At this moment the item is already added to the page trail, even if I leave the dialog without creating the item. Not sure if this is useful.
120 ms seems like a long time just to see if an item exists. I keep thinking that somewhere in the processing below storage.get_item there is some code that opens the data file of every item in the page trail and includes it as part of the Item object, but I have been unable to find it. It would be more comforting if the procedure name were storage.get_item_meta.
I agree that having non-existent items in the page trail is not very useful, and I agree with your solutions above.
Thanks, I will check the code in storage.get_item.
As far as I can see, the storage.get_item method does not read the data file.
Agree, storage.get_item neither reads nor opens the data file.
Prior performance improvements avoided opening a file that was never read, for example when an Item object was created with an open file although only metadata was needed.