SRA Components Breakdown

This section outlines the core components of the Security Reference Architecture (SRA). Several .tf scripts contain direct links to the Databricks Terraform documentation. The full Databricks Terraform Provider documentation can be found here.

Core Azure Components

  • VNet Injection: VNet injection allows Azure Databricks workspaces to be deployed directly into a customer-managed virtual network (VNet), providing control over network configuration to meet organizational security and governance requirements.
  • Private Endpoints: Leveraging Azure Private Link, private endpoints connect the customer’s VNet to Azure services without using public IP addresses, ensuring secure, private communication.
  • PrivateLink Connectivity: Private Link establishes private network paths between the customer’s data plane and the Databricks control plane, preventing traffic from traversing the public internet. This template configures Back-End Private Link for communication to the Databricks control plane from classic compute clusters.
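As an illustration of the Private Link pattern described above, a back-end private endpoint for the workspace could be sketched as follows. This is a hypothetical fragment, not copied from the template's .tf files: the resource names, references, and subnet are placeholders, and the subresource value is an assumption.

```hcl
# Hypothetical sketch of a back-end Private Link endpoint. All names and
# references here are placeholders; the template's actual wiring may differ.
resource "azurerm_private_endpoint" "backend" {
  name                = "pe-databricks-backend"
  location            = azurerm_resource_group.spoke.location
  resource_group_name = azurerm_resource_group.spoke.name
  subnet_id           = azurerm_subnet.privatelink.id

  private_service_connection {
    name                           = "psc-databricks-backend"
    private_connection_resource_id = azurerm_databricks_workspace.this.id
    is_manual_connection           = false
    # "databricks_ui_api" is the workspace sub-resource used for
    # front-end and back-end connectivity (assumed here).
    subresource_names = ["databricks_ui_api"]
  }
}
```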

Core Databricks Components

  • Unity Catalog: Unity Catalog is a unified governance solution for data and AI assets such as files, tables, and machine learning models. Unity Catalog enforces fine-grained access controls, centralized policy management, auditing, and lineage tracking—all integrated into the Databricks workflow.
  • Network Connectivity Configuration: Serverless network connectivity is managed with network connectivity configurations (NCCs), account-level regional constructs used to manage private endpoint creation and firewall enablement at scale. An NCC is created and attached to the workspace; it contains a list of stable Azure service subnets that serverless compute in that workspace uses to connect to Azure resources over service endpoints.
  • Restrictive Network Policy: Network policies implement egress controls for serverless compute; this template enforces a restrictive policy that permits outbound traffic only to the required storage locations.
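The NCC attachment described above can be sketched in Terraform roughly as below. This is an assumption-laden illustration, not the template's actual code: the provider alias, variable names, and workspace reference are placeholders.

```hcl
# Hypothetical sketch of creating an NCC and binding it to a workspace.
# Provider alias and variable names are assumptions, not from the template.
resource "databricks_mws_network_connectivity_config" "ncc" {
  provider = databricks.account
  name     = "ncc-${var.resource_suffix}"
  region   = var.location
}

resource "databricks_mws_ncc_binding" "this" {
  provider                       = databricks.account
  network_connectivity_config_id = databricks_mws_network_connectivity_config.ncc.network_connectivity_config_id
  workspace_id                   = azurerm_databricks_workspace.this.workspace_id
}
```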

Adding Additional Spokes

To add additional spokes to this configuration, follow the steps below.

  1. Add a New Key to the spoke_config Variable:
# Terraform variables (for example, terraform.tfvars)
spoke_config = {
  spoke = {
    resource_suffix = "spoke"
    cidr            = "10.1.0.0/20"
    tags = {
      environment = "dev"
    }
  }
  spoke_b = { # <----- Add a new spoke config
    resource_suffix = "spoke_b"
    cidr            = "10.2.0.0/20"
    tags = {
      environment = "test"
    }
  }
}
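The shape of the spoke_config value above implies a variable declaration along these lines. This is a sketch inferred from the tfvars example, not necessarily the template's actual variables.tf:

```hcl
# variables.tf (sketch; attribute names inferred from the tfvars example)
variable "spoke_config" {
  type = map(object({
    resource_suffix = string
    cidr            = string
    tags            = map(string)
  }))
  description = "Per-spoke settings keyed by spoke name."
}
```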
  2. Add a New Provider in providers.tf for the New Spoke:
# providers.tf

# New spoke provider
provider "databricks" {
  alias = "spoke_b"
  host  = module.spoke_b.workspace_url
}
  3. Copy the spoke.tf File: Copy spoke.tf to a new file (for example, spoke_b.tf).
  4. Modify the Module Names and References:
# spoke_b.tf

module "spoke_b" { # <----- Modify the name of the module to something unique
  source = "./modules/spoke"

  # Update these per spoke
  resource_suffix = var.spoke_config["spoke_b"].resource_suffix
  vnet_cidr       = var.spoke_config["spoke_b"].cidr
  tags            = var.spoke_config["spoke_b"].tags

  depends_on = [module.hub]
}

module "spoke_b_catalog" {
  source = "./modules/catalog"

  # Update these per catalog for the catalog's spoke
  catalog_name        = module.spoke_b.resource_suffix
  dns_zone_ids        = [module.spoke_b.dns_zone_ids["dfs"]]
  ncc_id              = module.spoke_b.ncc_id
  resource_group_name = module.spoke_b.resource_group_name
  resource_suffix     = module.spoke_b.resource_suffix
  subnet_id           = module.spoke_b.subnet_ids.privatelink
  tags                = module.spoke_b.tags

  providers = {
    databricks.workspace = databricks.spoke_b
  }
}
  5. Apply the Configuration: terraform apply.

Security Analysis Tool (SAT)

The Security Analysis Tool (SAT) is enabled by default to continuously monitor the security posture of your Databricks environment. By default, SAT is installed in the hub workspace, also known as the WEB_AUTH workspace.

Changing the SAT Workspace

To deploy the Security Analysis Tool (SAT) in a different workspace, three modifications are required in customizations.tf:

  1. Update the Databricks provider in the SAT module:
# Default
providers = {
  databricks = databricks.hub
}

# Modified
providers = {
  databricks = databricks.spoke
}
  2. Update the local sat_workspace reference:
# Default
locals {
  sat_workspace = module.hub
}

# Modified
locals {
  sat_workspace = module.spoke
}
  3. Update the databricks_permission_assignment.sat_workspace_admin resource:
# Default
resource "databricks_permission_assignment" "sat_workspace_admin" {
  count = length(module.sat)
  ...
  provider = databricks.hub
}

# Modified
resource "databricks_permission_assignment" "sat_workspace_admin" {
  count = length(module.sat)
  ...
  provider = databricks.spoke
}

NOTE: The Security Analysis Tool (SAT) is designed to be deployed once per Azure subscription. To deploy SAT in multiple regions, provision SAT in multiple spokes using the same modifications above.

SAT Service Principal

Some users may not have permissions to create Entra ID service principals. In this case, a pre-existing service principal can be used:

# example.tfvars
sat_service_principal = {
  client_id     = "00000000-0000-0000-0000-000000000000"
  client_secret = "some-secret"
}

If no service principal is provided, the template creates one named spSAT by default. The name can be customized:

# example.tfvars
sat_service_principal = {
  name = "spSATDev"
}

SAT Compute

The Security Analysis Tool (SAT) is installed on classic compute by default, because SAT does not yet support inspecting workspaces other than the current one when running on serverless compute. To run SAT on serverless compute instead, set run_on_serverless in the sat_configuration variable:

sat_configuration = {
  run_on_serverless = true
}

NOTE: When running the Security Analysis Tool (SAT) on serverless compute, SAT will only inspect the current workspace.