Proposal: Jupyter should be able to handle large notebooks

Question

Opened this issue 3 years ago · 2 comments

Problem

We should add a benchmark test and make changes so that 2k-cell notebooks feel good to work with. In practice, I have seen some users make notebooks in the 1k-ish range, so 2k is an arbitrary number that is bigger than that (maybe it should be 10k?).

We'd first need to define "feels good to work with" a bit more, which I'll state as something like:

Allows a user to interact with it in no more than 10 seconds
Cells clicked within the notebook become interactive as fast as in a 10 cell notebook
Characters typed in code cells are rendered as fast as in a 10 cell notebook
Switching tabs should be no more than 20% slower than with a 10 cell notebook
Scrolling/jumping to a cell (e.g. via ToC) should be interactive in less than 500ms
It does not significantly interfere with the rest of the page (e.g. a button click takes no more than 20% longer than if the notebook were not on the page)
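
As a rough illustration of how the criteria above could be turned into a pass/fail check inside a benchmark, here is a minimal sketch. Everything named in it is hypothetical: the `Timings` fields, the `check_criteria` function, and especially the `noise` allowance, which is an assumption added because "as fast as a 10 cell notebook" cannot be measured as exact equality. It takes the first criterion to mean 10 seconds, and it assumes the timings are collected elsewhere by whatever harness drives the browser.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Timings:
    """Measured durations in milliseconds (hypothetical field names)."""

    time_to_interactive: float  # opening the notebook until it accepts input
    cell_click: float           # clicking a cell until it is interactive
    keystroke: float            # key press until the character is rendered
    tab_switch: float           # switching to the notebook's tab
    jump_to_cell: float         # scrolling/jumping to a cell (e.g. via ToC)
    page_button_click: float    # clicking an unrelated button on the page


def check_criteria(
    baseline: Timings, large: Timings, noise: float = 0.05
) -> Dict[str, bool]:
    """Compare a large notebook against a baseline, one criterion per key.

    `baseline` holds the 10 cell notebook timings, except for
    `page_button_click`, which should be measured with no notebook on
    the page. `noise` is an assumed allowance for measurement jitter
    on the "as fast as" criteria; it is not part of the proposal.
    """
    same = 1.0 + noise
    return {
        # Absolute limit: 10 seconds, expressed in milliseconds.
        "time_to_interactive": large.time_to_interactive <= 10_000,
        # "As fast as a 10 cell notebook", within the assumed noise.
        "cell_click": large.cell_click <= baseline.cell_click * same,
        "keystroke": large.keystroke <= baseline.keystroke * same,
        # No more than 20% slower than the baseline.
        "tab_switch": large.tab_switch <= baseline.tab_switch * 1.20,
        # Absolute limit: less than 500ms.
        "jump_to_cell": large.jump_to_cell < 500,
        # No more than 20% slower than a page without the notebook.
        "page_button_click": (
            large.page_button_click <= baseline.page_button_click * 1.20
        ),
    }
```

Returning one boolean per criterion, rather than a single pass/fail, would let a benchmark report show which of the metrics regress as the cell count grows.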

All the numbers and metrics above are just a starting point. Happy to put in other metrics and/or change any of the numbers, as I chose them somewhat arbitrarily as well. That being said, today none of these metrics pass.

What is it like today?

Given this generated notebook (note, there is no output for any cell which makes this simpler than in the real world):

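import json
import nbformat

# Number of code cells to generate
NUM_CELLS = 2000

# Start from an empty v4 notebook with a Python 3 kernelspec
nb = nbformat.v4.new_notebook()
nb.metadata.kernelspec = {
    "display_name": "Python 3",
    "language": "python",
    "name": "python3",
}
# Each cell holds a single comment and has no outputs
for n in range(NUM_CELLS):
    nb.cells.append(nbformat.v4.new_code_cell("# cell {}".format(n + 1)))

# The notebook node is a dict, so it can be serialized with json directly
with open(
    "generated-{}cells.ipynb".format(NUM_CELLS),
    "w",
) as f:
    f.write(json.dumps(nb, indent=4))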

In JupyterLab 3.1 I am finding the following performance when I open the above notebook and try to use it:

Zooming in, all the work seems to be this CodeMirror pattern over and over again:

While we are working on things like jupyterlab/jupyterlab#10370 and jupyterlab/lumino#231, I thought it would be good to both set a more clearly defined goal and give everyone the same example to test against.

@blois, I'm curious how this notebook performs with the new colab virtualization (#68 (comment)).

What do others think?

CC those who have come to performance meetings as this size notebook was a topic of our first meeting. @fcollonval @sagemaster @echarles @Zsailer @jasongrout @afshin @ellisonbg @3coins @goanpeca

Answer 1 · 2021-10-27T22:50:34.000Z

@mlucool scrolling is definitely hitting some layout issues in Colab: the switch between the virtualized and the real editor incurs some solid layout issues. But typing isn't terrible. https://colab.research.google.com/gist/blois/68f1dc5b50ea315de5071c978d0b3f35/generated-2000cells.ipynb

Answer 2 · 2021-12-19T13:30:16.000Z

Thanks for the example! Cross-referencing jupyterlab/jupyterlab#9757 - large notebooks can choke the UI due to overuse of layout reflows. It looks like the fix will need to land in lumino.