This repo contains the code as demonstrated in the session: Implementing a Data Quality Framework in Purview.
- A Purview Account
- An instance of Spark running Python, such as Databricks or Synapse Spark Pools. Alternatively, some of this can be executed solely within Python - allowing for adaptation for non-Spark environments
- An Azure Key Vault, or secret store, containing all the secrets and credentials used to connect to the Purview Account
The example notebooks are contained within the folder Notebooks
. They should be executed in order from 0 through to 3, in order to emulate the demo.
In due course, there will be a full python library which will extend upon the pyapacheatlas library - to manipulate and abstract the API calls for the purposes of Data Quality Extraction.