/adf_databricks_terraform

ADF, Databricks and Azure native logging, all together

Primary language: HCL. License: MIT.

ADF and Databricks connectivity

The codebase creates an end-to-end ADF-to-Databricks pipeline with logging integrated through Application Insights and Log Analytics.

Tech Stack

  • Terraform
  • ARM templates
  • Azure Data Factory
  • Application Insights
  • Log Analytics

Usage

terraform plan -var-file="../secret.tfvars" -out=plan.tfplan
terraform apply plan.tfplan

The secret.tfvars file referenced above is not included in the repository; you need to supply your own.
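
A variables file for the commands above might look like the following sketch. The variable names are taken from the Inputs table of this README; the values are placeholders, and it is an assumption that these four service principal settings are what the file is meant to carry.

```hcl
# secret.tfvars (hypothetical contents; replace every placeholder value)
subscription_id = "00000000-0000-0000-0000-000000000000"
tenant_id       = "00000000-0000-0000-0000-000000000000"
client_id       = "00000000-0000-0000-0000-000000000000"
client_secret   = "replace-with-service-principal-secret"
```

On a fresh checkout, terraform init has to run before terraform plan so that the providers and modules are downloaded. Keep the file out of version control, since it holds credentials.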

Requirements

| Name | Version |
|------|---------|
| azurerm | 3.30.0 |
| databricks | 1.6.2 |
| external | 2.2.2 |
| local | 2.2.3 |
| random | 3.4.3 |
| template | 2.2.0 |
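
These pins would typically be declared in a required_providers block. The sketch below uses the versions listed above; the registry source addresses are assumptions, since this README does not state them.

```hcl
terraform {
  required_providers {
    # Versions come from the Requirements table; sources are assumed.
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "3.30.0"
    }
    databricks = {
      source  = "databricks/databricks"
      version = "1.6.2"
    }
    external = {
      source  = "hashicorp/external"
      version = "2.2.2"
    }
    local = {
      source  = "hashicorp/local"
      version = "2.2.3"
    }
    random = {
      source  = "hashicorp/random"
      version = "3.4.3"
    }
    template = {
      source  = "hashicorp/template"
      version = "2.2.0"
    }
  }
}
```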

Providers

| Name | Version |
|------|---------|
| azurerm | 3.30.0 |
| databricks | 1.6.2 |
| local | 2.2.3 |
| random | 3.4.3 |
| template | 2.2.0 |
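
One plausible way to configure the two main providers from the documented inputs is sketched below. It assumes service principal authentication and that the Databricks provider is pointed at the workspace created by azurerm_databricks_workspace.databricks; the repository may wire this differently.

```hcl
provider "azurerm" {
  features {}

  # Service principal credentials, supplied through secret.tfvars.
  subscription_id = var.subscription_id
  tenant_id       = var.tenant_id
  client_id       = var.client_id
  client_secret   = var.client_secret
}

provider "databricks" {
  # Assumption: the provider targets the workspace created in this configuration.
  azure_workspace_resource_id = azurerm_databricks_workspace.databricks.id
  azure_client_id             = var.client_id
  azure_client_secret         = var.client_secret
  azure_tenant_id             = var.tenant_id
}
```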

Modules

| Name | Source | Version |
|------|--------|---------|
| core_logging | ./core_logging | n/a |
| data_storage_external | ./module_storage | n/a |
| data_storage_internal | ./module_storage | n/a |
| databricks_logging | ./databricks_logging | n/a |
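
The table shows the same local module, ./module_storage, instantiated twice. A minimal sketch of that pattern follows; the module inputs are not documented in this README, so the commented-out name arguments are hypothetical.

```hcl
# One module definition, two instances under different names.
module "data_storage_external" {
  source = "./module_storage"

  # Hypothetical input; the real variables of ./module_storage are not listed here.
  # name = "${var.prefix}external"
}

module "data_storage_internal" {
  source = "./module_storage"

  # Hypothetical input, as above.
  # name = "${var.prefix}internal"
}
```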

Resources

| Name | Type |
|------|------|
| azurerm_application_insights_analytics_item.databricks_logging_query | resource |
| azurerm_data_factory.adf | resource |
| azurerm_data_factory_linked_service_data_lake_storage_gen2.external | resource |
| azurerm_data_factory_linked_service_data_lake_storage_gen2.internal | resource |
| azurerm_data_factory_pipeline.external_internal | resource |
| azurerm_data_factory_trigger_schedule.external_internal | resource |
| azurerm_databricks_workspace.databricks | resource |
| azurerm_monitor_diagnostic_setting.adf | resource |
| azurerm_role_assignment.adf_databricks | resource |
| azurerm_role_assignment.external_adf | resource |
| azurerm_role_assignment.internal_adf | resource |
| azurerm_storage_container.destination | resource |
| azurerm_storage_container.source | resource |
| azurerm_template_deployment.dataset_data_lake_storage_gen2_destination | resource |
| azurerm_template_deployment.dataset_data_lake_storage_gen2_source | resource |
| azurerm_template_deployment.linked_databricks | resource |
| databricks_cluster.my-cluster | resource |
| databricks_notebook.adf_calling_databricks | resource |
| databricks_notebook.logging_python | resource |
| databricks_notebook.logging_scala | resource |
| random_id.adf_random | resource |
| azurerm_client_config.current | data source |
| azurerm_monitor_diagnostic_categories.adf | data source |
| azurerm_resource_group.logging | data source |
| local_file.adf_calling_databricks | data source |
| local_file.databricks_logging_query | data source |
| local_file.dataset_data_lake_storage_gen2 | data source |
| local_file.linked_databricks | data source |
| local_file.logging_python | data source |
| local_file.logging_scala | data source |
| template_file.pipeline_external_internal | data source |
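
As an illustration of how the ADF-to-Databricks connectivity could be granted, the sketch below shows one common shape for azurerm_role_assignment.adf_databricks. It assumes the data factory has a system-assigned managed identity and that the Contributor role is used; neither detail is confirmed by this README.

```hcl
resource "azurerm_role_assignment" "adf_databricks" {
  # Scope the grant to the Databricks workspace only.
  scope = azurerm_databricks_workspace.databricks.id

  # Assumed role; the role used in the repository is not documented here.
  role_definition_name = "Contributor"

  # Requires an identity block of type "SystemAssigned" on azurerm_data_factory.adf.
  principal_id = azurerm_data_factory.adf.identity[0].principal_id
}
```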

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| client_id | n/a | string | "Invalid" | no |
| client_secret | n/a | string | "Invalid" | no |
| instance | n/a | number | 3 | no |
| location | Common resource group to target | string | "centralus" | no |
| log_retention_days | n/a | number | 30 | no |
| prefix | n/a | string | "demo" | no |
| subscription_id | n/a | string | "Invalid" | no |
| suffix | n/a | string | "logging" | no |
| tenant_id | n/a | string | "Invalid" | no |
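
Declarations matching a few rows of this table would look roughly like the sketch below; the remaining inputs follow the same pattern. Names, types, defaults and the one description are taken from the table, while the layout of the actual variables file is an assumption.

```hcl
variable "location" {
  description = "Common resource group to target"
  type        = string
  default     = "centralus"
}

variable "log_retention_days" {
  type    = number
  default = 30
}

variable "prefix" {
  type    = string
  default = "demo"
}

variable "client_secret" {
  # The "Invalid" default is a stand-in; a real value must come from secret.tfvars.
  type    = string
  default = "Invalid"
}
```

Because every input has a default, none is marked as required, but the four credential inputs default to "Invalid" and need real values before the configuration can authenticate.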

Outputs

No outputs.