Jupyterlab adds jupytext metadata, but then `jupytext --sync` removes it again
Closed this issue · 9 comments
Use case
My team's use case: we edit .ipynb
notebooks in Jupyterlab. We use jupytext to create companion .py
files for code review.
Historically, we've relied on everyone installing jupytext locally to update the .py
files after notebook edits. Now, we'd like to add jupytext --sync
via pre-commit to ensure the files are kept in sync.
Issue
- With jupytext installed locally, edits to the notebook via Jupyterlab also automatically update the companion
.py
file. jupyterlab/jupytext also adds ajupytext.text_representation
dict to the notebook'smetadata
. - When
jupytext --sync
runs, it removesmetadata.jupytext.text_representation
from the notebook file again. - This causes problems when running
jupytext
via pre-commit: it updates the .ipynb notebook file to remove the extra jupytext metadata and then fails because the index is outdated:
$ git commit -a -m 'updated notebook'
jupytext.................................................................Failed
- hook id: jupytext
- exit code: 2
- files were modified by this hook
[jupytext] Reading test_notebook.ipynb in format ipynb
[jupytext] Loading test_notebook.py
[jupytext] Updating test_notebook.ipynb
[jupytext] Error: the git index is outdated.
Please add the paired notebook with:
git add test_notebook.ipynb
[jupytext] Updating the timestamp of test_notebook.py
[jupytext] Reading test_notebook.py in format py
[jupytext] Loading test_notebook.ipynb
[jupytext] Error: the git index is outdated.
Please add the paired notebook with:
git add test_notebook.ipynb
[jupytext] Updating the timestamp of test_notebook.py
Rerunning git commit -a
works, since we add jupytext's change to the git index. However, this isn't ideal because it's confusing, and it's annoying to have to do this every time we edit notebooks.
Details
I'm wondering why jupytext is adding this metadata in one context (editing via Jupyterlab) but removing it in another (called directly with jupytext --sync
). I believe this is the code that removes the metadata:
Line 461 in 57d83aa
It also seems that this only happens when using --sync
. If I use --from ipynb --to py
instead, then the original notebook isn't written, which avoids this issue. I'm using this as a workaround for now.
Versions
This is happening with recent versions:
$ python -V
Python 3.8.12
$ pip list | grep -i jup
jupyter-client 7.1.0
jupyter-core 4.9.1
jupyter-server 1.13.1
jupyterlab 3.2.5
jupyterlab-pygments 0.1.2
jupyterlab-server 2.10.2
jupytext 1.13.5
Hi @jli , thank you for reporting this. Well that is interesting! May I ask you to run this Python command?
from jupytext.config import find_jupytext_configuration_file
find_jupytext_configuration_file('.')
in the folder where you have the notebook? If any configuration file is found, can you share its content?
Note: It may happen that Jupytext CLI and Jupyter Lab are using different configurations, as Jupyter Lab cannot access files above the Jupyter root.
@mwouts Thanks for the quick response! Yep, I do have a jupytext.toml
config file. It looks like:
formats = "ipynb,py:percent"
(With further experimentation, I've found that without this config file, jupyterlab won't automatically create the companion .py
file when I create new notebooks. But with the config, jupyterlab does automatically create the companion file.)
I've also put together a small demo repo for this issue: https://github.com/jli/jupytext_sync_900
It has jupytext.toml
, requirements.txt
, and (optional) .pre-commit-config.yaml
files which I'm able to use to reproduce this issue, along with a README.md with specific reproduction steps.
Thanks!
Thank you @jli for the detailed information! I will make sure I can reproduce this, and then we will fix the issue. I'll keep you posted.
Hi again @jli , I am working on a test at https://github.com/mwouts/jupytext/compare/new_test_for_pre_commit_sync, but at the moment I have not been able yet to reproduce the issue.
The test looks like what you reported by maybe I have missed something. Do you think you could evolve that test into something that better reproduces the problematic flow? Thanks!
I believe the difference between that test and my situation is that when I modify the notebook file, something is adding the jupytext.text_representation
to the notebook metadata.
This is what's happening in my case:
- start: clean working tree, ipynb and py files in sync, no
jupytext
in notebook metadata. - edit and save ipynb file with Jupyterlab + jupytext: both files are updated with the edit, but the notebook metadata also now has a
jupytext.text_representation
section. - run jupytext --sync: invoking jupytext via the CLI removes the newly added
jupytext
metadata.
So in your test after updating the notebook, this assert should be true in order to match my situation:
assert "text_representation" in tmpdir.join("test.ipynb").read()
I tried adding this, and the assert fails. So, the test isn't reproducing the part where editing the notebook adds the jupytext
metadata section.
Thanks @jli for the additional information. I'll add steps 1. and 2. to the test and hopefully we'll reproduce this problematic jupytext.text_representation
(which in theory belongs only to text files)
Indeed with the new version of the test at 8e2f3bb I can reproduce the issue that you documented above (thanks for pointing out at the exact pattern!)
# modify the ipynb file in Jupyter
# Reload the notebook
nb = cm.get("test.ipynb")['content']
nb.cells.append(new_markdown_cell("A new cell"))
cm.save(dict(type="notebook", content=nb), "test.ipynb")
# The text representation metadata is in the py file
assert "text_representation" in tmpdir.join("test.py").read()
# But not in the ipynb notebook
> assert "text_representation" not in tmpdir.join("test.ipynb").read()
E assert 'text_representation' not in ('{\n'\n ' "cells": [\n'\n ' {\n'\n ' "cell_type": "markdown",\n'\n ' "id": "cell-1",\n'\n ' "metadata": {},\n'\n ' "source": [\n'\n ' "A short notebook"\n'\n ' ]\n'\n ' },\n'\n ' {\n'\n ' "cell_type": "markdown",\n'\n ' "id": "cell-3",\n'\n ' "metadata": {},\n'\n ' "source": [\n'\n ' "A new cell"\n'\n ' ]\n'\n ' }\n'\n ' ],\n'\n ' "metadata": {\n'\n ' "jupytext": {\n'\n ' "text_representation": {\n'\n ' "extension": ".py",\n'\n ' "format_name": "percent",\n'\n ' "format_version": "1.3",\n'\n ' "jupytext_version": "1.13.5"\n'\n ' }\n'\n ' },\n'\n ' "kernelspec": {\n'\n ' "display_name": "Python 3",\n'\n ' "language": "python",\n'\n ' "name": "python_kernel"\n'\n ' }\n'\n ' },\n'\n ' "nbformat": 4,\n'\n ' "nbformat_minor": 5\n'\n '}\n')
E 'text_representation' is contained here:
Hi @jli , this should be fixed in the new version 1.13.6 that is being published on pypi. Would you mind giving it a try and report if the problem disappears? Thanks