This repository is for this blog post on CodeMinusTears.com:
https://codeminustears.com/2019/03/23/real-time-alerting-databricks/
It showcases how to use a Structured Streaming pipeline in Azure Databricks, together with an Azure Event Hub and an Azure Logic App, to route anomalous data or alert you to conditions you have set up monitoring for.
The end result is a pipeline that filters anomalous data out of a real-time stream and sends an email containing that data.
Name | Purpose
---|---
senderApp\sender.py | Python app that emulates a stream of data
InstallAzureEventHubsSparkConnector.docx | Word document explaining how to install the Microsoft Event Hubs Spark Connector
SetUpLogicAppForAlerting.docx | Word document explaining how to set up the Logic App for this demo
Real-Time Alerting.ipynb | Notebook file for the demo, ready for import into Databricks
Real-Time Alerting.py | Source file for the Databricks notebook for this demo
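The sender app pushes simulated readings into the `ingestion` event hub. A minimal sketch of what such a generator might look like (the field names, value ranges, and anomaly rate here are hypothetical, not taken from sender.py):

```python
import json
import random


def make_reading(device_id: str, anomaly_rate: float = 0.1) -> str:
    """Build one JSON-encoded telemetry reading.

    Most readings fall in a normal temperature band; a small
    fraction are deliberately out of range so the downstream
    filter has something to alert on.
    """
    if random.random() < anomaly_rate:
        temperature = random.uniform(90.0, 120.0)   # anomalous spike
    else:
        temperature = random.uniform(18.0, 25.0)    # normal range
    return json.dumps({"deviceId": device_id,
                       "temperature": round(temperature, 2)})


# Example: generate a small batch of readings
batch = [make_reading("device-1") for _ in range(5)]
```

In the actual sender.py these payloads would be sent to the Event Hub with the Azure Event Hubs SDK rather than collected in a list.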
A thorough walkthrough of how to set up this solution can be found here: https://codeminustears.com/2019/03/23/real-time-alerting-databricks/
- Clone this repository to your local machine
- Deploy an Azure Event Hub Namespace and create two event hubs within it named `ingestion` and `alerting`
- Deploy an Azure Databricks workspace
- Import the notebook to your Databricks workspace
- Fill in the Event Hub Configuration details
- Run the notebook on the cluster
- Go to the sender.py file and fill in the Event Hub configuration details
- Run the Python file with `python sender.py` in a Command Prompt or Bash terminal
- Deploy an Azure Logic App
- Follow the instructions in the SetUpLogicAppForAlerting.docx
- See emails pop into your inbox (this can generate a lot of emails, so consider creating a filter rule!)
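Conceptually, the notebook's streaming query boils down to a filter that routes out-of-range readings to the `alerting` event hub, which triggers the Logic App. The predicate can be sketched in plain Python (the threshold and field names are illustrative, not the notebook's actual values):

```python
import json


def is_anomalous(event_body: str, max_temperature: float = 80.0) -> bool:
    """Return True when a JSON telemetry reading exceeds the threshold.

    In the notebook this condition would appear as a Structured
    Streaming filter on the parsed event body; here it is shown as
    a standalone predicate for clarity.
    """
    reading = json.loads(event_body)
    return reading["temperature"] > max_temperature


# Only the out-of-range reading passes the filter
events = [
    '{"deviceId": "device-1", "temperature": 21.5}',
    '{"deviceId": "device-2", "temperature": 97.3}',
]
alerts = [e for e in events if is_anomalous(e)]
```

Everything the filter lets through is written to the `alerting` event hub, and the Logic App turns each of those events into an email.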