Getting Started

SRA Installation and Deployment Steps

Follow the steps below to deploy the Security Reference Architecture (SRA) using Terraform:

  1. Clone the SRA repository.
  2. Install Terraform.
  3. Navigate to the aws -> tf folder and open the template.tfvars.example file.
    • Fill in the required variable values and the feature flags relevant to your deployment.
    • NOTE: If using custom mode for the network configuration, do not uncomment the variables for the isolated configuration. Uncomment only the variables for the custom configuration and populate each with its respective value.
    • Rename the file to terraform.tfvars.
  4. Navigate to the provider.tf file and configure the AWS and Databricks Terraform provider authentication.
  5. From the terminal, ensure your working directory is the tf folder.
  6. Run terraform init.
  7. Run terraform validate.
  8. Run terraform plan.
  9. Run terraform apply.
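
As an illustration of step 3, a completed terraform.tfvars might look like the sketch below. The variable names and values here are hypothetical placeholders; use the names actually defined in the template.tfvars.example file in your clone of the repository.

```hcl
# Hypothetical terraform.tfvars sketch -- variable names and values are
# illustrative placeholders; use the names from template.tfvars.example.

databricks_account_id = "00000000-0000-0000-0000-000000000000"
aws_account_id        = "111111111111"
region                = "us-east-1"
resource_prefix       = "sra"

# Feature flags relevant to the deployment (illustrative)
enable_ip_boundary = true
```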

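For step 4, authentication for the AWS and Databricks Terraform providers typically follows the pattern sketched below (a service-principal OAuth example; adapt it to your authentication method and to the provider blocks already present in provider.tf):

```hcl
# Sketch of provider authentication -- adapt to the blocks already in
# provider.tf; the variables shown here are illustrative.

provider "aws" {
  region = var.region
}

# Account-level Databricks provider, authenticated as a service principal
provider "databricks" {
  alias         = "mws"
  host          = "https://accounts.cloud.databricks.com"
  account_id    = var.databricks_account_id
  client_id     = var.client_id
  client_secret = var.client_secret
}
```
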
Critical Next Steps

The following steps outline essential security configurations that should be implemented after the initial deployment to further harden and operationalize the Databricks environment.

  • Implement a Front-End Mitigation Strategy:
    • IP Access Lists: Terraform configurations for enabling IP access lists are available in the customizations folder.
    • Front-End PrivateLink: Establishes a private connection to the Databricks web application over the AWS backbone, preventing exposure to the public internet. See the Databricks documentation on front-end PrivateLink.
  • Identity & Access Management:
    • Configure Single Sign-On and Multi-Factor Authentication: Enterprise deployments should implement SSO and MFA for secure authentication and identity management.
    • Set Up SCIM (System for Cross-domain Identity Management) Provisioning: For automated user and group provisioning, integrate SCIM through the Databricks account console.
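
For reference, enabling IP access lists with the Databricks Terraform provider generally follows the pattern below. The label and CIDR range are placeholders; the configuration shipped in the customizations folder is the authoritative version for this project.

```hcl
# Enable workspace IP access lists, then define an allow list.
resource "databricks_workspace_conf" "this" {
  custom_config = {
    "enableIpAccessLists" = true
  }
}

resource "databricks_ip_access_list" "allowed" {
  label        = "corporate-egress"       # placeholder label
  list_type    = "ALLOW"
  ip_addresses = ["203.0.113.0/24"]       # placeholder CIDR
  depends_on   = [databricks_workspace_conf.this]
}
```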

Additional Security Recommendations

The following recommendations help maintain a strong security posture across Databricks deployments. Some of these configurations extend beyond the SRA Terraform implementation and may require customer-specific setup (e.g., SCIM, SSO, or Front-End PrivateLink).

  • Segment Workspaces for Data Separation: Use distinct workspaces for different teams or functions (e.g., security, marketing) to enforce data access boundaries and reduce risk exposure.
  • Avoid Storing Production Datasets in Databricks File Store (DBFS): The DBFS root is accessible to all users in a workspace. Use external storage locations for production data and databases to ensure proper access control and auditing.
  • Back Up Assets from the Databricks Control Plane: Regularly export and back up notebooks, jobs, and configurations using tools such as the Databricks Terraform Exporter.
  • Regularly Restart Classic Compute Clusters: Restart clusters periodically to ensure the latest compute images and security patches are applied.
  • Integrate CI/CD and Code Management: Evaluate workflow needs for Git-based version control and CI/CD automation. Incorporate code scanning, permission enforcement, and secret detection to enhance governance and operational efficiency.
  • Deploy and Run the Security Analysis Tool (SAT): SAT analyzes your Databricks account and workspace configurations, providing recommendations to help you follow Databricks' security best practices.
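
As a sketch of the external-storage recommendation above, a Unity Catalog external location for production data can be declared roughly as follows. The bucket name, IAM role, and resource names are hypothetical placeholders, not part of the SRA configuration.

```hcl
# Hypothetical external location that keeps production data out of the
# DBFS root; bucket and IAM role ARN are placeholders.
resource "databricks_storage_credential" "prod" {
  name = "prod-data-credential"
  aws_iam_role {
    role_arn = "arn:aws:iam::111111111111:role/prod-data-access" # placeholder
  }
}

resource "databricks_external_location" "prod" {
  name            = "prod-data"
  url             = "s3://example-prod-bucket/data" # placeholder
  credential_name = databricks_storage_credential.prod.name
}
```

Unlike the DBFS root, access to an external location is governed by Unity Catalog grants, so production datasets can be scoped to specific principals and audited.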