Notebook execution started hanging in jupyter_client 8
gibsondan opened this issue · 5 comments
Hello,
Starting last week, a test in the Dagster papermill / jupyter integration started hanging. The hang coincided with a number of jupyter-related releases, but we were able to isolate the problem to the jupyter-client 8 release - downgrading to the previous release made the problem go away.
To reproduce the problem:
- Check out the dagster github repo at tag 1.1.13, with a python 3.8 venv
- Install tox
- In the python_modules/libraries/dagstermill folder, run: tox -vv -e py38 -- -k test_hello_world -x -s
The test will hang while executing a notebook, but the hang will go away if you force the previous version of jupyter-client to be installed instead.
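When reproducing, it can help to confirm which jupyter-client release the tox environment actually resolved. The following is only a minimal sketch using the standard library: the helper names are hypothetical, and the "major version 8" check simply mirrors what this report observed rather than a confirmed list of affected releases.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a dotted version string such as '8.0.1'."""
    return int(version.split(".")[0])


def is_suspect_version(version: str) -> bool:
    """True if the version is in the 8.x series that this report ties to the hang.

    Assumption: only the major version matters; the report does not say
    which 8.x releases are affected.
    """
    return major_version(version) == 8


def installed_jupyter_client_version() -> Optional[str]:
    """Return the installed jupyter_client version, or None if it is not installed."""
    try:
        return metadata.version("jupyter_client")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_jupyter_client_version()
    if found is None:
        print("jupyter_client is not installed in this environment")
    elif is_suspect_version(found):
        print(f"jupyter_client {found}: in the 8.x series that this report ties to the hang")
    else:
        print(f"jupyter_client {found}: not in the 8.x series")
```

Running this inside the tox environment shows whether the test picked up the 8 release or the previous one.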
Code in Dagster that executes the notebook: https://github.com/dagster-io/dagster/blob/master/python_modules/libraries/dagstermill/dagstermill/factory.py#L194-L213
Console output during the hang (this shows forever):
Executing: 50%|█████ | 1/2 [00:03<00:03, 3.18s/cell]
Stack trace of the notebook while execution is hanging:
Thread 0x10D504600 (idle): "MainThread"
select (selectors.py:558)
_run_once (asyncio/base_events.py:1823)
run_forever (asyncio/base_events.py:570)
run_until_complete (asyncio/base_events.py:603)
just_run (nbclient/util.py:57)
wrapped (nbclient/util.py:78)
execute (papermill/clientwrap.py:46)
execute_managed_notebook (dagstermill/engine.py:75)
execute_notebook (papermill/engines.py:343)
execute_notebook_with_engine (papermill/engines.py:49)
execute_notebook (papermill/execute.py:107)
_t_fn (dagstermill/asset_factory.py:86)
iterate_with_context (dagster/_utils/__init__.py:457)
_yield_compute_results (dagster/_core/execution/plan/compute.py:145)
execute_core_compute (dagster/_core/execution/plan/compute.py:177)
_step_output_error_checked_user_event_sequence (dagster/_core/execution/plan/execute_step.py:94)
core_dagster_event_sequence_for_step (dagster/_core/execution/plan/execute_step.py:382)
dagster_event_sequence_for_step (dagster/_core/execution/plan/execute_plan.py:265)
inner_plan_execution_iterator (dagster/_core/execution/plan/execute_plan.py:114)
__iter__ (dagster/_core/execution/api.py:1103)
execute (dagster/_core/executor/multiprocess.py:93)
_execute_command_in_child_process (dagster/_core/executor/child_process_executor.py:79)
run (multiprocessing/process.py:108)
_bootstrap (multiprocessing/process.py:315)
_main (multiprocessing/spawn.py:129)
spawn_main (multiprocessing/spawn.py:116)
<module> (<string>:1)
Let me know if any other information would be helpful.
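For anyone who wants a similar stack dump from inside the hanging process itself, one option is the standard library's faulthandler module. This is a sketch under assumptions: run_with_watchdog is a hypothetical helper, not part of Dagster, papermill, or nbclient, and the trace above may well have been captured with a different tool.

```python
import faulthandler
import sys


def run_with_watchdog(fn, timeout_s, log_file=None):
    """Call fn(); if it is still running after timeout_s seconds, dump every
    thread's stack to log_file (stderr by default).

    The dump does not interrupt fn(). It only records where each thread is
    blocked, which is the kind of information shown in the trace above.
    """
    if log_file is None:
        log_file = sys.stderr
    # Arm a one-shot watchdog; log_file must stay open until it is cancelled.
    faulthandler.dump_traceback_later(timeout_s, file=log_file)
    try:
        return fn()
    finally:
        # Disarm the watchdog once fn() returns or raises.
        faulthandler.cancel_dump_traceback_later()


if __name__ == "__main__":
    # Stand-in workload; in practice this would wrap the notebook execution call.
    print(run_with_watchdog(lambda: sum(range(10)), timeout_s=30))
```

Wrapping the call that executes the notebook in such a watchdog would print the blocked frames once the timeout passes, without needing to attach an external profiler.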
I wonder if this is the same issue that we're seeing with nbclient+jupyter_client 8: jupyter/nbclient#272
Our CI also started failing, and we narrowed it down to jupyter_client 8 (we run some of our examples with papermill, which uses jupyter_client).
We didn't try downgrading to jupyter_client 7, since we had already built an in-house notebook executor and migrating to it fixed the issues.
Ah, this was a gap in testing for nbclient, because we've been testing it against main as a downstream project.
I'm closing this in favor of jupyter/nbclient#272.