/jupydiff

Quickly compare changes made to Jupyter notebooks in GitHub repositories with jupydiff!

Primary LanguageJupyter NotebookMIT LicenseMIT


jupydiff Docker Action for GitHub Actions

jupydiff is a GitHub Action that allows data scientists to quickly compare changes made to Jupyter notebooks in GitHub repositories. It works with regular commits and pull requests. When a change is made, jupydiff computes the code additions and deletions within each notebook, and summarises these as a comment on the associated commit or pull request.

Table of contents

Why use jupydiff

jupydiff helps you streamline data science code reviews. Without jupydiff, to compute the exact code difference between two Jupyter notebooks a reviewer would need to clone the repository, download and install nbdime and then run nbdime diff on the command line. Alternatively, observing the regular diff in a code editor, version control tool or on GitHub itself involved interpreting lines of underlying JSON Jupyter notebook structure.

How to use jupydiff

jupydiff is a GitHub action. This means it can be used in any GitHub repostitory (public or private) by configuring a GitHub Action workflow in your repository.

See below for an example workflow that runs jupydiff on any commit made to the repository. Comments on commits will be automatically added to relevant pull requests.

name: jupydiff

on: [ push ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
        with:
          fetch-depth: 2
      - uses: Faethm-ai/jupydiff@v1

To configure jupydiff in your repostiory, add a the YAML above to a file at .github/workflows/jupydiff.yml.

Important notes:

  • The GitHub checkout action (action/checkout) is required to populate the GitHub Action runner with the repostiory contents. fetch-depth: 2 fetches the current and previous commits which are required for nbdime to compute the diff.
  • All of jupydiff's magic is contained in a single Docker action, which is added as a step by referencing Faethm-ai/jupydiff@v1 in your workflow.
  • When a push is made or a pull request is opened in your repository, this workflow will leave a comment with any differences in the Jupyter notebooks. This may take up to 2 minutes.

Dependencies

jupydiff is based on nbdime and builds on the official Python Docker image. It also requires nodejs and the official GitHub npm modules to leave a comment.

License

jupydiff is licensed under the MIT license.

Contributors

jupydiff has been developed and used by data scientists and engineers at Faethm AI. The original project was authored by Mikhail Thornhill and Blair Hudson.