Getting Started
SRA Installation and Deployment Steps
Follow the steps below to deploy the Security Reference Architecture (SRA) using Terraform:
- Clone the SRA repository.
- Install Terraform.
- Navigate to the `aws/tf` folder and open the `template.tfvars.example` file.
- Fill in the required values for all of the variables and the feature flags relevant to the deployment (a hedged sketch of a completed file follows this list).
  - NOTE: If using `custom` mode for the network configuration, do not uncomment the variables for the `isolated` configuration. Simply uncomment the variables for the `custom` configuration and populate each with its respective value.
- Rename the file to `terraform.tfvars`.
- Navigate to the `provider.tf` file and configure the AWS and Databricks Terraform provider authentication (an example sketch also follows this list).
- From the terminal, ensure your working directory is the `tf` folder.
- Run `terraform init`.
- Run `terraform validate`.
- Run `terraform plan`.
- Run `terraform apply`.
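For orientation, here is a minimal sketch of what a completed `terraform.tfvars` might look like. Every variable name below (`resource_prefix`, `network_configuration`, the `custom_*` values, and so on) is a hypothetical placeholder, not the repository's exact schema; use the names actually defined in `template.tfvars.example`.

```hcl
# Hypothetical terraform.tfvars sketch -- variable names are illustrative
# placeholders; use the names defined in template.tfvars.example.
aws_account_id  = "111122223333"
region          = "us-east-1"
resource_prefix = "sra"

# In custom network mode, uncomment and populate only the custom
# variables; the isolated-mode variables stay commented out.
network_configuration     = "custom"
custom_vpc_id             = "vpc-0123456789abcdef0"
custom_private_subnet_ids = ["subnet-aaaa1111", "subnet-bbbb2222"]
custom_sg_ids             = ["sg-0123456789abcdef0"]

# isolated_vpc_cidr = "10.0.0.0/18"   # leave commented in custom mode
```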
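And a hedged sketch of the `provider.tf` authentication step, assuming an AWS CLI profile and a Databricks account-level service principal; the profile name, alias, and placeholder values are assumptions, not the repository's exact configuration.

```hcl
provider "aws" {
  region  = "us-east-1"      # assumed region
  profile = "security-admin" # assumed AWS CLI profile
}

# Account-level Databricks provider, authenticating as a service principal.
provider "databricks" {
  alias         = "mws"
  host          = "https://accounts.cloud.databricks.com"
  account_id    = "00000000-0000-0000-0000-000000000000" # your Databricks account ID
  client_id     = "<service-principal-application-id>"
  client_secret = "<oauth-secret>" # prefer the DATABRICKS_CLIENT_SECRET environment variable
}
```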
Critical Next Steps
The following steps outline essential security configurations that should be implemented after the initial deployment to further harden and operationalize the Databricks environment.
- Implement a Front-End Mitigation Strategy:
  - IP Access Lists: Terraform configurations for enabling IP access lists are available in the `customizations` folder (a hedged sketch follows this list).
  - Front-End PrivateLink: Establishes a private connection to the Databricks web application over the AWS backbone, preventing exposure to the public internet. See the Databricks documentation on PrivateLink.
- Identity & Access Management:
  - Configure Single Sign-On and Multi-Factor Authentication: Enterprise deployments should implement SSO and MFA for secure authentication and identity management.
  - Set up SCIM (System for Cross-domain Identity Management) Provisioning: For automated user and group provisioning, integrate SCIM through the Databricks account console.
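As a sketch of the kind of configuration the `customizations` folder provides (not its exact contents), the Databricks Terraform provider exposes `databricks_workspace_conf` and `databricks_ip_access_list`; the label and CIDR range below are placeholders.

```hcl
# Enable IP access list enforcement on the workspace.
resource "databricks_workspace_conf" "this" {
  custom_config = {
    "enableIpAccessLists" = true
  }
}

# Allow only traffic from an assumed corporate CIDR range.
resource "databricks_ip_access_list" "corp_vpn" {
  label        = "corp-vpn"      # placeholder label
  list_type    = "ALLOW"
  ip_addresses = ["10.0.0.0/16"] # placeholder CIDR
  depends_on   = [databricks_workspace_conf.this]
}
```

Note that `enableIpAccessLists` must be switched on before the allow list takes effect, hence the explicit `depends_on`.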
Additional Security Recommendations
The following recommendations help maintain a strong security posture across Databricks deployments. Some of these configurations extend beyond the SRA Terraform implementation and may require customer-specific setup (e.g., SCIM, SSO, or Front-End PrivateLink).
- Segment Workspaces for Data Separation: Use distinct workspaces for different teams or functions (e.g., security, marketing) to enforce data access boundaries and reduce risk exposure.
- Avoid Storing Production Datasets in Databricks File Store (DBFS): The DBFS root is accessible to all users in a workspace. Use external storage locations for production data and databases to ensure proper access control and auditing (see the sketch after this list).
- Back Up Assets from the Databricks Control Plane: Regularly export and back up notebooks, jobs, and configurations using tools such as the Databricks Terraform Exporter.
- Regularly Restart Classic Compute Clusters: Restart clusters periodically to ensure the latest compute images and security patches are applied.
- Integrate CI/CD and Code Management: Evaluate workflow needs for Git-based version control and CI/CD automation. Incorporate code scanning, permission enforcement, and secret detection to enhance governance and operational efficiency.
- Deploy and Run the Security Analysis Tool (SAT): SAT analyzes your Databricks account and workspace configurations, providing recommendations to help you follow Databricks' security best practices.
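To make the DBFS recommendation concrete, below is a minimal sketch using the Databricks provider's `databricks_external_location` and `databricks_grants` resources; the bucket, credential, and group names are assumed placeholders, and this is not part of the SRA Terraform itself.

```hcl
# Register external S3 storage for production data instead of the DBFS root.
resource "databricks_external_location" "prod_data" {
  name            = "prod-data"                     # placeholder name
  url             = "s3://example-prod-bucket/data" # placeholder bucket
  credential_name = "prod-credential"               # placeholder: an existing Unity Catalog storage credential
  comment         = "Production data kept outside the DBFS root"
}

# Grant access to a specific group rather than all workspace users.
resource "databricks_grants" "prod_data" {
  external_location = databricks_external_location.prod_data.id
  grant {
    principal  = "data-engineers" # placeholder group
    privileges = ["READ_FILES", "WRITE_FILES"]
  }
}
```

Scoping grants to a named group, as above, is what restores the access control and auditability that the shared DBFS root cannot provide.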