jupyter/jupyter_client

Notebook execution started hanging in jupyter_client 8

gibsondan opened this issue · 5 comments

Hello,

Starting last week, a test in the Dagster papermill / Jupyter integration began hanging. The hang coincided with several Jupyter-related releases, but we isolated the problem to the jupyter-client 8 release: downgrading to the previous release made it go away.

To reproduce the problem:

  • Check out the dagster github repo at tag 1.1.13, with a python 3.8 venv
  • Install tox
  • In the python_modules/libraries/dagstermill folder, run:
      tox -vv -e py38 -- -k test_hello_world -x -s

The test will hang while executing a notebook, but the hang will go away if you force the previous version of jupyter-client to be installed instead.
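To confirm which side of the version boundary an environment is on, a small check like the following could help. This is a minimal sketch: the helper names and the cutoff of 8 are illustrative assumptions taken from this report, not part of jupyter_client itself.

```python
from importlib import metadata


def is_major_at_least(version: str, first_bad_major: int = 8) -> bool:
    """Return True if the major component of `version` is >= first_bad_major.

    Hypothetical helper: the default cutoff of 8 comes from this report only.
    """
    major = int(version.split(".")[0])
    return major >= first_bad_major


def installed_jupyter_client_version():
    """Return the installed jupyter-client version string, or None if absent."""
    try:
        return metadata.version("jupyter-client")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_jupyter_client_version()
    if version is None:
        print("jupyter-client is not installed")
    elif is_major_at_least(version):
        print(f"jupyter-client {version}: 8 or later, the hang may reproduce")
    else:
        print(f"jupyter-client {version}: older than 8")
```

Running this inside the tox environment shows whether the dependency resolver picked up the 8.x release or the previous one.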

Code in Dagster that executes the notebook: https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagstermill/dagstermill/factory.py#L194-L213

Console output during the hang (this shows forever):

Executing:  50%|█████     | 1/2 [00:03<00:03,  3.18s/cell]
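Because the progress bar never advances, a CI job can stall until the runner's own limit is reached. One way to turn the hang into a prompt failure is to run the test command in a child process with a timeout. The sketch below is an assumption about how one might do that with the standard library; `run_with_timeout` is a hypothetical helper, not something Dagster or papermill provides.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run argv as a child process.

    Returns True if it exits with code 0 within the timeout, False otherwise.
    """
    try:
        completed = subprocess.run(argv, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before re-raising on timeout.
        return False
    return completed.returncode == 0


if __name__ == "__main__":
    # A stand-in for a hanging command: sleeps far longer than the timeout.
    hanging = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_timeout(hanging, timeout_seconds=1))
```

Note that only the direct child is killed on timeout. Processes it started itself, such as a notebook kernel, may need separate cleanup.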

Stack trace of the notebook while execution is hanging:

Thread 0x10D504600 (idle): "MainThread"
    select (selectors.py:558)
    _run_once (asyncio/base_events.py:1823)
    run_forever (asyncio/base_events.py:570)
    run_until_complete (asyncio/base_events.py:603)
    just_run (nbclient/util.py:57)
    wrapped (nbclient/util.py:78)
    execute (papermill/clientwrap.py:46)
    execute_managed_notebook (dagstermill/engine.py:75)
    execute_notebook (papermill/engines.py:343)
    execute_notebook_with_engine (papermill/engines.py:49)
    execute_notebook (papermill/execute.py:107)
    _t_fn (dagstermill/asset_factory.py:86)
    iterate_with_context (dagster/_utils/__init__.py:457)
    _yield_compute_results (dagster/_core/execution/plan/compute.py:145)
    execute_core_compute (dagster/_core/execution/plan/compute.py:177)
    _step_output_error_checked_user_event_sequence (dagster/_core/execution/plan/execute_step.py:94)
    core_dagster_event_sequence_for_step (dagster/_core/execution/plan/execute_step.py:382)
    dagster_event_sequence_for_step (dagster/_core/execution/plan/execute_plan.py:265)
    inner_plan_execution_iterator (dagster/_core/execution/plan/execute_plan.py:114)
    __iter__ (dagster/_core/execution/api.py:1103)
    execute (dagster/_core/executor/multiprocess.py:93)
    _execute_command_in_child_process (dagster/_core/executor/child_process_executor.py:79)
    run (multiprocessing/process.py:108)
    _bootstrap (multiprocessing/process.py:315)
    _main (multiprocessing/spawn.py:129)
    spawn_main (multiprocessing/spawn.py:116)
    <module> (<string>:1)
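A comparable dump can be collected from inside a Python process with the standard-library faulthandler module, which may help when attaching an external tool is not practical. This is a minimal sketch; `dump_all_thread_stacks` is a hypothetical helper name, and its output format differs from the trace above.

```python
import faulthandler
import tempfile


def dump_all_thread_stacks() -> str:
    """Write the current stack of every thread to a temp file and return the text."""
    with tempfile.TemporaryFile(mode="w+") as f:
        # faulthandler writes directly to the file descriptor behind `f`.
        faulthandler.dump_traceback(file=f, all_threads=True)
        f.seek(0)
        return f.read()


if __name__ == "__main__":
    print(dump_all_thread_stacks())
```

For a test that may hang, calling `faulthandler.dump_traceback_later(300, exit=True)` at startup would print every thread's stack to stderr and exit if the process is still running after 300 seconds; the 300-second value here is only an example.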

Let me know if any other information would be helpful.

I wonder if this is the same issue that we're seeing with nbclient+jupyter_client 8: jupyter/nbclient#272

Our CI also started failing, and we narrowed it down to jupyter_client 8 (we run some of our examples with papermill, which uses jupyter_client).

We didn't try downgrading to jupyter_client 7, since we had already built an in-house notebook executor and migrating to it fixed the issues.

Ah, this was a gap in testing for nbclient, because we've been testing it against main as a downstream project.

I'm closing this in favor of jupyter/nbclient#272.

Actually, the error is in this repo; I'm investigating in #925.