moinwiki/moin

Page trail: checking the existence of items slows down the response time for larger wikis

UlrichB22 opened this issue · 5 comments

I tested an imported wiki with 2400 items and about 12000 revisions. Each whoosh index query takes about 120 ms.
At the start of my tests the browser history was empty. I then viewed TestItem01, TestItem02, ... TestItem05 and finally TestItem01 again.
The following server output was produced with additional clock timers:

DEBUG 2024-10-17 21:55:54,261 moin.utils.clock:48 timer total(0): 1564.11ms /TestItem01
DEBUG 2024-10-17 21:55:58,699 moin.utils.clock:48 timer total(0): 1602.45ms /TestItem02
DEBUG 2024-10-17 21:56:03,658 moin.utils.clock:48 timer total(0): 1795.20ms /TestItem03
DEBUG 2024-10-17 21:56:08,248 moin.utils.clock:48 timer total(0): 2165.31ms /TestItem04
DEBUG 2024-10-17 21:56:13,020 moin.utils.clock:48 timer total(0): 2321.03ms /TestItem05
DEBUG 2024-10-17 21:56:17,832 moin.utils.clock:48 timer total(0): 2376.22ms /TestItem01

With 5 entries in the page trail, the response time increased by 800 ms. Two additional index queries are used for each item.
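To make the growth easier to see, here is a small sketch that computes the request-to-request differences from the totals logged above. The numbers are copied from the log; the variable names are just illustrative.

```python
# Total response times in ms, copied from the clock timer output above,
# in request order: TestItem01 ... TestItem05, then TestItem01 again.
totals_ms = [1564.11, 1602.45, 1795.20, 2165.31, 2321.03, 2376.22]

# Increase of each request over the previous one.
deltas = [round(b - a, 2) for a, b in zip(totals_ms, totals_ms[1:])]

# Increase from the first request (empty trail) to the last one (full trail).
overall = round(totals_ms[-1] - totals_ms[0], 2)

print("per-request increase (ms):", deltas)
print("overall increase (ms):", overall)
```

The overall increase comes out at roughly 812 ms, which is the "800 ms" figure mentioned above.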

The relevant parts in the code are:

breadcrumbs.append((fq_segment, fq_current, bool(self.storage.get_item(**fq_current.query))))

exists = bool(self.storage.get_item(**fqname.query))

IMO there are two solutions to regain performance:

  1. Store the exists status in the page trail at the time an item is shown. If the item is deleted in parallel by someone else, the stored status may be out of date.
  2. Do not add non-existing items to the page trail, and remove the existence checks.
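As a rough illustration of both options, here is a minimal standalone sketch. It is not moin code: `TrailEntry`, `PageTrail`, `visit` and `keep_missing` are hypothetical names, and the caller is assumed to already know whether the item exists because it has just been looked up for display.

```python
from dataclasses import dataclass, field


@dataclass
class TrailEntry:
    name: str
    exists: bool  # recorded when the item was shown; may become stale later


@dataclass
class PageTrail:
    max_len: int = 5
    entries: list = field(default_factory=list)

    def visit(self, name, exists, keep_missing=True):
        """Record a visit without querying the index again.

        keep_missing=True  -> option 1: store the status alongside the name.
        keep_missing=False -> option 2: skip non-existing items entirely.
        """
        if not exists and not keep_missing:
            return
        # Move a revisited item to the end instead of duplicating it.
        self.entries = [e for e in self.entries if e.name != name]
        self.entries.append(TrailEntry(name, exists))
        # Drop the oldest entries once the trail is too long.
        overflow = len(self.entries) - self.max_len
        if overflow > 0:
            del self.entries[:overflow]


trail = PageTrail(max_len=3)
for item_name, item_exists in [("A", True), ("B", False), ("C", True), ("A", True)]:
    trail.visit(item_name, item_exists)

# Rendering the trail now needs no index query per entry.
rendered = [(e.name, e.exists) for e in trail.entries]
```

In both variants the cost of rendering the trail no longer depends on the number of entries, at the price of possibly showing a stale status in option 1.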

When I try to open an item by entering a non-existing item name in the browser URL, the create dialog is shown. At this point the item has already been added to the page trail, even if I leave the dialog without creating the item. I am not sure if this is useful.

120 ms seems like a long time just to see if an item exists. I keep thinking that somewhere in the processing below storage.get_item there is some code that opens the data file of every item in the page trail and includes it as part of the Item object. But I have been unable to find it. It would be more comforting if the procedure name were storage.get_item_meta.

I agree that having non-existent items in the page trail is not very useful, and I agree with your solutions above.

Thanks, I will check the code in storage.get_item.

As far as I can see, the storage.get_item method does not read the data file.

Agreed, storage.get_item neither opens nor reads the data file.

Prior performance improvements avoided opening a file that was opened but never read, as happened when an Item object was created with an open file although only metadata was needed.