/eda-devx-techsummit-fy23

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Tech Summit FY23: Developer Experience and EDA

This repository demonstrates the first-party development experience on Databricks. It focuses on exploratory data analysis (EDA) with bamboolib and on incorporating software engineering best practices into Databricks notebooks. Most of the repo is structured as a companion to the article "Software engineering best practices for Databricks notebooks" (AWS | Azure | GCP).

Going through these examples, you will:

  • Learn more about the building blocks of bamboolib - IPyWidgets
  • Learn how to use bamboolib to explore, clean, and visualize raw data
  • Extend the base functionality of bamboolib with plugins

  • Add notebooks to Databricks Repos for version control.
  • Extract portions of code from one of the notebooks into a shareable component.
  • Test the shared code.
  • Automatically run notebooks stored in Git on a schedule using a Databricks job.
  • Optionally, apply CI/CD to the notebooks and the shared code.
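
As a rough illustration of the "extract and test" steps above, the sketch below shows logic that might start life in a notebook cell, moved into plain functions with a pytest-style test next to them. The function names (`filter_country`, `total_cases`), the row layout, and the sample values are hypothetical and not taken from this repo; it uses only the standard library so it runs anywhere.

```python
from typing import Dict, List

# Hypothetical row layout: each record is a dict with "country" and "cases" keys.
Row = Dict[str, object]


def filter_country(rows: List[Row], country: str) -> List[Row]:
    """Return only the rows whose 'country' field equals `country`."""
    return [row for row in rows if row.get("country") == country]


def total_cases(rows: List[Row]) -> int:
    """Sum the 'cases' field across all rows."""
    return sum(int(row["cases"]) for row in rows)


# pytest-style test; in a real project this would live in a separate
# test module that imports the functions above.
def test_filter_and_total() -> None:
    rows = [
        {"country": "USA", "cases": 10},
        {"country": "DZA", "cases": 3},
        {"country": "USA", "cases": 5},
    ]
    usa_rows = filter_country(rows, "USA")
    assert len(usa_rows) == 2
    assert total_cases(usa_rows) == 15
    # An unknown country yields no rows and a total of zero.
    assert filter_country(rows, "FRA") == []
    assert total_cases([]) == 0


if __name__ == "__main__":
    test_filter_and_total()
    print("all tests passed")
```

Keeping the functions free of notebook-only globals is what makes them importable from both a notebook and a test runner.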

The example is hands-on. We recommend working through it step by step to learn how to apply these techniques to your own Databricks notebooks.