This module creates and configures service principals with appropriate permissions and entitlements to run CI/CD for a project, and creates a workspace directory as a container for project-specific resources for the Azure Databricks staging and prod workspaces. It also creates the relevant Azure Active Directory (AAD) applications for the service principals.
Language: HCL. License: Apache-2.0.
MLOps Azure Project Module with Service Principal Creation
In both of the specified staging and prod workspaces, this module:
Creates an AAD application and associates it with a newly created Azure Databricks service principal, configuring appropriate permissions and entitlements to run CI/CD for a project.
Creates a workspace directory as a container for project-specific resources.
The service principals are granted CAN_MANAGE permissions on the created workspace directories.
NOTE:
This module is in preview so it is still experimental and subject to change. Feedback is welcome!
The Databricks providers that are passed into the module must be configured with workspace admin permissions.
The Azure Active Directory (AzureAD) provider that is passed into the module must be configured with Application.ReadWrite.All permissions to allow AAD application creation to link to an Azure Databricks service principal. This provider can be authenticated via an AAD service principal with the Application.ReadWrite.All permission.
The module assumes that one of the two Azure Infrastructure Modules (with Creation or Linking) has already been applied, so that a service principal group with token usage permissions exists in each workspace, either under the default name "mlops-service-principals" or under the name passed in the service_principal_group_name field.
The service principal AAD tokens are short-lived (<60 minutes in most cases). If a long-lived token is desired, the AAD token can be used to authenticate into a Databricks provider and provision a personal access token (PAT) for the service principal.
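The long-lived token approach from the last note can be sketched as follows. The AAD token output by the module backs a second Databricks provider, which then provisions a PAT through the `databricks_token` resource. The alias `staging_sp`, the resource and output names, and the 90-day lifetime are assumptions chosen for illustration; the module output and the `databricks_current_user` data source are used the same way as in the Git credentials example in this README.

```hcl
# Look up the staging workspace URL with the admin-configured provider.
data "databricks_current_user" "staging_user" {
  provider = databricks.staging
}

# Hypothetical provider alias that authenticates as the staging service
# principal, using the short-lived AAD token output by the module.
provider "databricks" {
  alias = "staging_sp"
  host  = data.databricks_current_user.staging_user.workspace_url
  token = module.mlops_azure_project_with_sp_creation.staging_service_principal_aad_token
}

# Provision a PAT owned by the service principal.
resource "databricks_token" "staging_sp_pat" {
  provider         = databricks.staging_sp
  comment          = "PAT for the staging CI/CD service principal"
  lifetime_seconds = 7776000 # 90 days; illustrative value, adjust as needed
}

# Expose the PAT as a sensitive output so it is redacted in plan/apply logs.
output "staging_sp_pat" {
  value     = databricks_token.staging_sp_pat.token_value
  sensitive = true
}
```

The same pattern would apply to the prod workspace by swapping in `databricks.prod` and the `prod_service_principal_aad_token` output.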
Usage
provider "databricks" {
  alias = "staging"
  # Authenticate using preferred method as described in Databricks provider
}

provider "databricks" {
  alias = "prod"
  # Authenticate using preferred method as described in Databricks provider
}

provider "azuread" {} # Authenticate using preferred method as described in AzureAD provider

module "mlops_azure_project_with_sp_creation" {
  source = "databricks/mlops-azure-project-with-sp-creation/databricks"
  providers = {
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
    azuread            = azuread
  }
  service_principal_name = "example-name"
  project_directory_path = "/dir-name"
  azure_tenant_id        = "a1b2c3d4-e5f6-g7h8-i9j0-k9l8m7n6o5p4"
}
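The usage above leaves the AzureAD provider block empty. One possible way to fill it in, assuming authentication through an AAD service principal that holds the Application.ReadWrite.All permission (as the notes require), is a client secret configuration. The variable names `azure_tenant_id`, `azure_client_id`, and `azure_client_secret` are hypothetical; other authentication methods supported by the AzureAD provider work equally well.

```hcl
# Hypothetical input variables for the AzureAD provider credentials.
variable "azure_tenant_id" {
  type        = string
  description = "Azure tenant ID; must match the azure_tenant_id passed to the module."
}

variable "azure_client_id" {
  type        = string
  description = "Client ID of the AAD service principal with Application.ReadWrite.All."
}

variable "azure_client_secret" {
  type        = string
  description = "Client secret of that AAD service principal."
  sensitive   = true
}

# AzureAD provider authenticated with a service principal and client secret.
provider "azuread" {
  tenant_id     = var.azure_tenant_id
  client_id     = var.azure_client_id
  client_secret = var.azure_client_secret
}
```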
Usage example with Git credentials for service principal
This can be helpful for common use cases such as Git authorization for Remote Git Jobs.
data "databricks_current_user" "staging_user" {
  provider = databricks.staging
}

data "databricks_current_user" "prod_user" {
  provider = databricks.prod
}

provider "databricks" {
  alias = "staging_sp"
  host  = data.databricks_current_user.staging_user.workspace_url
  token = module.mlops_azure_project_with_sp_creation.staging_service_principal_aad_token
}

provider "databricks" {
  alias = "prod_sp"
  host  = data.databricks_current_user.prod_user.workspace_url
  token = module.mlops_azure_project_with_sp_creation.prod_service_principal_aad_token
}

resource "databricks_git_credential" "staging_git" {
  provider              = databricks.staging_sp
  git_username          = var.git_username
  git_provider          = var.git_provider
  personal_access_token = var.git_token # This should be configured with `repo` scope for Databricks Repos.
}

resource "databricks_git_credential" "prod_git" {
  provider              = databricks.prod_sp
  git_username          = var.git_username
  git_provider          = var.git_provider
  personal_access_token = var.git_token # This should be configured with `repo` scope for Databricks Repos.
}
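The Git credentials example refers to `var.git_username`, `var.git_provider`, and `var.git_token` without declaring them. A minimal sketch of matching declarations is shown here; the descriptions are assumptions, and marking the token as sensitive is a suggested precaution rather than something the module requires.

```hcl
variable "git_username" {
  type        = string
  description = "Username of the Git account whose token the service principals will use."
}

variable "git_provider" {
  type        = string
  description = "Git provider identifier accepted by the databricks_git_credential resource."
}

variable "git_token" {
  type        = string
  description = "Git personal access token, configured with `repo` scope for Databricks Repos."
  sensitive   = true # keeps the token out of plan and apply output
}
```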
Usage example with the MLOps Azure Infrastructure Module with Service Principal Creation
provider "databricks" {
  alias = "dev"
  # Authenticate using preferred method as described in Databricks provider
}

provider "databricks" {
  alias = "staging"
  # Authenticate using preferred method as described in Databricks provider
}

provider "databricks" {
  alias = "prod"
  # Authenticate using preferred method as described in Databricks provider
}

provider "azuread" {} # Authenticate using preferred method as described in AzureAD provider

module "mlops_azure_infrastructure_with_sp_creation" {
  source = "databricks/mlops-azure-infrastructure-with-sp-creation/databricks"
  providers = {
    databricks.dev     = databricks.dev
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
    azuread            = azuread
  }
  staging_workspace_id          = "123456789"
  prod_workspace_id             = "987654321"
  azure_tenant_id               = "a1b2c3d4-e5f6-g7h8-i9j0-k9l8m7n6o5p4"
  additional_token_usage_groups = ["users"] # This field is optional.
}

module "mlops_azure_project_with_sp_creation" {
  source = "databricks/mlops-azure-project-with-sp-creation/databricks"
  providers = {
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
    azuread            = azuread
  }
  service_principal_name       = "example-name"
  project_directory_path       = "/dir-name"
  azure_tenant_id              = "a1b2c3d4-e5f6-g7h8-i9j0-k9l8m7n6o5p4"
  service_principal_group_name = module.mlops_azure_infrastructure_with_sp_creation.service_principal_group_name
  # The above field is optional, especially since in this case service_principal_group_name will be mlops-service-principals either way,
  # but this also serves to create an implicit dependency. Can also be replaced with the following line to create an explicit dependency:
  # depends_on = [module.mlops_azure_infrastructure_with_sp_creation]
}
Inputs
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| project_directory_path | Path/Name of Azure Databricks workspace directory to be created for the project. NOTE: The parent directories in the path must already be created. | string | N/A | yes |
| azure_tenant_id | The Azure tenant ID of the AAD subscription. Must match the one used for the AzureAD Provider. | string | N/A | yes |
| service_principal_group_name | The name of the service principal group in the staging and prod workspace. The created service principals will be added to this group. | string | "mlops-service-principals" | no |
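Because the parent directories of project_directory_path must already exist, one option is to create them in the same configuration before the module runs. The sketch below assumes a hypothetical parent directory `/projects` and uses the `databricks_directory` resource in each workspace; the resource names are illustrative, and the explicit `depends_on` is there so the directories exist before the module creates the project directory.

```hcl
# Hypothetical parent directory, created in both workspaces.
resource "databricks_directory" "staging_parent" {
  provider = databricks.staging
  path     = "/projects"
}

resource "databricks_directory" "prod_parent" {
  provider = databricks.prod
  path     = "/projects"
}

module "mlops_azure_project_with_sp_creation" {
  source = "databricks/mlops-azure-project-with-sp-creation/databricks"
  providers = {
    databricks.staging = databricks.staging
    databricks.prod    = databricks.prod
    azuread            = azuread
  }
  service_principal_name = "example-name"
  project_directory_path = "/projects/dir-name" # nested under the parent created above
  azure_tenant_id        = "a1b2c3d4-e5f6-g7h8-i9j0-k9l8m7n6o5p4"

  # Ensure the parent directories exist before the module runs.
  depends_on = [
    databricks_directory.staging_parent,
    databricks_directory.prod_parent,
  ]
}
```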
Outputs
| Name | Description | Type | Sensitive |
|------|-------------|------|-----------|
| project_directory_path | Path/Name of Azure Databricks workspace directory created for the project. | string | no |
| staging_service_principal_application_id | Application ID of the created Azure Databricks service principal in the staging workspace. Identical to the Azure client ID of the created AAD application associated with the service principal. | string | no |
| staging_service_principal_aad_token | Sensitive AAD token value of the created Azure Databricks service principal in the staging workspace. | string | yes |
| staging_service_principal_client_secret | Sensitive AAD client secret of the created AAD application associated with the staging service principal. NOTE: Client secret is created with a default lifetime of 2 years. | string | yes |
| prod_service_principal_application_id | Application ID of the created Azure Databricks service principal in the prod workspace. Identical to the Azure client ID of the created AAD application associated with the service principal. | string | no |
| prod_service_principal_aad_token | Sensitive AAD token value of the created Azure Databricks service principal in the prod workspace. | string | yes |
| prod_service_principal_client_secret | Sensitive AAD client secret of the created AAD application associated with the prod service principal. NOTE: Client secret is created with a default lifetime of 2 years. | string | yes |
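To pass these values on to a CI/CD system or another configuration, the root module can re-export them. The following is a minimal sketch with hypothetical output names; Terraform requires `sensitive = true` on any root output whose value derives from a sensitive module output, which applies to the AAD tokens and client secrets listed above.

```hcl
# Non-sensitive: the application (client) ID of the staging service principal.
output "staging_sp_application_id" {
  value = module.mlops_azure_project_with_sp_creation.staging_service_principal_application_id
}

# Sensitive: must be marked as such, or Terraform rejects the configuration.
output "staging_sp_client_secret" {
  value     = module.mlops_azure_project_with_sp_creation.staging_service_principal_client_secret
  sensitive = true
}

# The workspace directory created for the project.
output "project_directory_path" {
  value = module.mlops_azure_project_with_sp_creation.project_directory_path
}
```

Sensitive outputs are redacted in CLI output but are still stored in plain text in the Terraform state, so the state backend should be access-controlled accordingly.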
Providers
| Name | Authentication | Use |
|------|----------------|-----|
| databricks.staging | Provided by the user. | Create group, directory, and service principal module in the staging workspace. |
| databricks.prod | Provided by the user. | Create group, directory, and service principal module in the prod workspace. |