A simple Python CLI tool to convert Jupyter notebooks to Databricks notebook format.
Clone locally:

```bash
git clone https://github.com/rafaelvp-db/bryckster.git
```

Create a virtual environment and install the requirements:

```bash
python -m venv .venv && source .venv/bin/activate && \
pip install --upgrade pip && pip install -r requirements.txt
```

Open a terminal and run:

```bash
python bryckster.py run \
  --input_path JUPYTER_NOTEBOOKS_FOLDER \
  --output_path DATABRICKS_NOTEBOOKS_FOLDER
```
This repo has been set up to showcase the following example:

- Jupyter notebooks are stored in the `jupyter` folder.
- A CI/CD pipeline looks into the `jupyter` folder, converts all of the notebooks into Databricks format, and outputs them into the `databricks` folder.
- As a last step, the input `.ipynb` files are removed.
- To give it a try, you can fork this repo and add `.ipynb` files to the `jupyter` folder. Once you commit, GitHub Actions will run the steps above.
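For illustration, the conversion boils down to rewriting each notebook cell into Databricks' plain-text "source" format. The sketch below is not bryckster's actual implementation — just a minimal, self-contained example of that format, assuming Python notebooks with code and markdown cells:

```python
import json

HEADER = "# Databricks notebook source"
SEPARATOR = "# COMMAND ----------"

def to_databricks_source(ipynb_text: str) -> str:
    """Render a Jupyter notebook (JSON string) in Databricks 'source' format."""
    nb = json.loads(ipynb_text)
    rendered = []
    for cell in nb.get("cells", []):
        body = "".join(cell.get("source", []))
        if cell.get("cell_type") == "markdown":
            # Markdown becomes a %md magic cell: one '# MAGIC' prefix per line.
            lines = ["%md"] + body.splitlines()
            body = "\n".join(f"# MAGIC {line}".rstrip() for line in lines)
        rendered.append(body)
    # The first cell sits right under the header; later cells get a separator.
    return HEADER + "\n" + ("\n\n" + SEPARATOR + "\n\n").join(rendered) + "\n"
```

The `# Databricks notebook source` header on the first line is what tells Databricks to render the file as a notebook rather than a plain Python script.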
- It is possible to import Jupyter Notebooks directly into a Databricks Workspace and have them converted automatically.
- However, when using the Repos integration, the conversion does not work out of the box.
- You can add `bryckster` as a step in your CI/CD pipeline, so that whenever someone pushes an `.ipynb` file into your repo, it gets converted to the format expected by Databricks.
- Apart from `bryckster`, it is also possible to import Jupyter notebooks directly into a Databricks workspace by using the Workspace REST API or the `databricks workspace` CLI.
- One common caveat with Jupyter notebooks is that since they store cell outputs, they can also get pretty big. With `bryckster` you can simply strip those outputs (if needed) before pushing to a Git repo, for cleanliness' sake.
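Stripping outputs can be sketched in a few lines of Python. This is a generic illustration, not bryckster's own code: it clears `outputs` and `execution_count` from every code cell of an `.ipynb` file in place:

```python
import json
from pathlib import Path

def strip_outputs(path: str) -> None:
    """Remove cell outputs and execution counts from a .ipynb file in place."""
    nb_path = Path(path)
    nb = json.loads(nb_path.read_text(encoding="utf-8"))
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    nb_path.write_text(json.dumps(nb, indent=1), encoding="utf-8")
```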
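The direct Workspace REST API import mentioned above can be sketched as follows. The endpoint and fields come from the `/api/2.0/workspace/import` API (with `format=JUPYTER`, Databricks converts the notebook on import); the host, token, and paths here are placeholders:

```python
import base64
import json
import urllib.request

def build_import_request(host: str, token: str,
                         local_path: str, workspace_path: str):
    """Build a POST request that imports a Jupyter notebook into a workspace."""
    with open(local_path, "rb") as f:
        content = base64.b64encode(f.read()).decode("ascii")
    payload = {
        "path": workspace_path,     # e.g. /Users/you@example.com/my_notebook
        "format": "JUPYTER",        # Databricks converts .ipynb on import
        "language": "PYTHON",
        "content": content,         # base64-encoded notebook JSON
        "overwrite": True,
    }
    return urllib.request.Request(
        url=f"{host}/api/2.0/workspace/import",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

# To actually send it (requires a real workspace and token):
# urllib.request.urlopen(build_import_request(host, token, "nb.ipynb", "/Users/me/nb"))
```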