Secure and Scalable Project Collaboration with Databricks Mutualization:
A Step-by-Step Guide
What is Databricks Mutualization?
Databricks Mutualization is the practice of placing Azure Databricks (ADB) behind a Virtual Network (VNet) for data security and sharing that single Databricks workspace across multiple projects.
Steps to Create a Mutualized Workspace
1. Initial Setup
Use a common Azure Databricks workspace that is behind a VNet.
Set up an Azure Key Vault dedicated to Azure Databricks and make it reachable from the same VNet/subnet as the ADB workspace (for example, through a private endpoint or service endpoint).
2. Service Principal (SP) Configuration
For each project:
Create a Service Principal (SP) specific to the project.
Grant the SP (identified by its Client ID) access to each folder it needs on the CDL (or other storage services).
3. Secrets Management
In the Key Vault, create two secrets for each project:
The Client ID of the Service Principal.
The client secret (secret key) of the Service Principal.
Create a Key Vault-backed Secret Scope for the project in the Databricks workspace of each environment, using that workspace's createScope page:
For Dev: https://<workspace-name>.azuredatabricks.net/?o=<org-id>#secrets/createScope
For UAT: https://<workspace-name>.azuredatabricks.net/?o=<org-id>#secrets/createScope
For PRD: https://<workspace-name>.azuredatabricks.net/?o=<org-id>#secrets/createScope
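The same scope can also be created without the UI. The sketch below only builds the request body for the Databricks Secrets API endpoint POST /api/2.0/secrets/scopes/create and the {{secrets/<scope>/<key>}} placeholder that cluster Spark configuration uses to read a secret. The scope naming convention (<project>-<env>-scope), the secret key name sp-client-id, and the sample project values are assumptions made for illustration, not part of this guide. Sending the request (with an HTTP client and an Azure AD token) is left out.

```python
import json


def build_scope_request(project: str, env: str,
                        key_vault_resource_id: str,
                        key_vault_dns_name: str) -> dict:
    """Build the JSON body for a Key Vault-backed secret scope."""
    # Hypothetical naming convention: <project>-<env>-scope, lower-cased.
    scope_name = f"{project}-{env}-scope".lower()
    return {
        "scope": scope_name,
        "scope_backend_type": "AZURE_KEYVAULT",
        "backend_azure_keyvault": {
            "resource_id": key_vault_resource_id,
            "dns_name": key_vault_dns_name,
        },
    }


def secret_reference(scope: str, key: str) -> str:
    """Return the placeholder Databricks resolves to the secret value."""
    return "{{secrets/" + scope + "/" + key + "}}"


if __name__ == "__main__":
    # Placeholder values; replace with the project's own Key Vault details.
    body = build_scope_request(
        "Sales", "dev",
        "/subscriptions/<subscription-id>/resourceGroups/<rg>"
        "/providers/Microsoft.KeyVault/vaults/<vault-name>",
        "https://<vault-name>.vault.azure.net/",
    )
    print(json.dumps(body, indent=2))
    print(secret_reference(body["scope"], "sp-client-id"))
```

Run once per environment (Dev, UAT, PRD), changing only the env argument and the Key Vault details.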
4. User and Group Management
In the common Databricks workspace:
Sync the project-specific Azure AD groups into Databricks so that Unity Catalog (UC) can use them to manage access permissions.
Add users in the Admin Console with the following entitlements:
Workspace Access.
Databricks SQL Access.
Assign users to project-specific groups according to their roles and access requirements.
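For teams that prefer to script user onboarding, the step above can be sketched as a SCIM request body for the workspace endpoint POST /api/2.0/preview/scim/v2/Users. The entitlement values workspace-access and databricks-sql-access correspond to the two checkboxes in the Admin Console. The user name and group ID below are hypothetical, and the function only builds the payload; it does not call the API.

```python
from typing import Dict, List


def build_scim_user(user_name: str, group_ids: List[str]) -> Dict:
    """Build a SCIM payload for a user with both entitlements."""
    return {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
        "userName": user_name,
        "entitlements": [
            {"value": "workspace-access"},       # Workspace Access
            {"value": "databricks-sql-access"},  # Databricks SQL Access
        ],
        # Each entry is the Databricks ID of a project-specific group.
        "groups": [{"value": group_id} for group_id in group_ids],
    }


if __name__ == "__main__":
    # Hypothetical user and group ID, for illustration only.
    print(build_scim_user("jane.doe@example.com", ["123456"]))
```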
5. Compute and Pool Setup
Navigate to Compute -> Pools:
Create Project-Specific Pools.
Add appropriate tags and use the standard Databricks Runtime 11.3 LTS.
Navigate to Compute -> Policies:
Create a Project-Specific Policy linked to the secret scope and pool ID.
Navigate to Compute -> All-Purpose Compute:
Create a new Cluster using the project-specific policy.
Name the cluster according to the naming convention.
Enable auto-termination after 30 minutes of inactivity.
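One way to link a policy to the secret scope and pool ID is sketched below as a cluster policy definition. It assumes the project data sits in ADLS Gen2 and is read with the Service Principal over OAuth; the storage account, tenant ID, pool ID, scope name and secret key names (sp-client-id, sp-client-secret) are all hypothetical. The policy pins the pool, the 11.3 LTS runtime and the 30-minute auto-termination, and injects the SP credentials through {{secrets/...}} references so that no secret value appears in the policy itself.

```python
import json
from typing import Any, Dict


def fixed(value: Any) -> Dict[str, Any]:
    """Policy rule that forces an attribute to a single value."""
    return {"type": "fixed", "value": value}


def build_policy_definition(pool_id: str, scope: str,
                            storage_account: str, tenant_id: str,
                            client_id_key: str = "sp-client-id",
                            client_secret_key: str = "sp-client-secret") -> Dict:
    """Build a cluster policy definition tied to one pool and one scope."""
    host = f"{storage_account}.dfs.core.windows.net"  # assumes ADLS Gen2
    conf = "spark_conf.fs.azure.account."
    return {
        "instance_pool_id": fixed(pool_id),
        "spark_version": fixed("11.3.x-scala2.12"),
        "autotermination_minutes": fixed(30),
        f"{conf}auth.type.{host}": fixed("OAuth"),
        f"{conf}oauth.provider.type.{host}": fixed(
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"),
        f"{conf}oauth2.client.id.{host}": fixed(
            "{{secrets/" + scope + "/" + client_id_key + "}}"),
        f"{conf}oauth2.client.secret.{host}": fixed(
            "{{secrets/" + scope + "/" + client_secret_key + "}}"),
        f"{conf}oauth2.client.endpoint.{host}": fixed(
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"),
    }


if __name__ == "__main__":
    # Placeholder identifiers; substitute the project's real values.
    definition = build_policy_definition(
        "<pool-id>", "sales-dev-scope", "<storage-account>", "<tenant-id>")
    print(json.dumps(definition, indent=2))
```

The printed JSON can be pasted into the policy definition editor under Compute -> Policies. Because every attribute is fixed, clusters created from the policy cannot override the pool, the runtime or the credentials.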
6. Workspace and Repo Setup
Navigate to Workspace -> Create -> Folder:
Create a project-specific folder for organizing workspace content.
Navigate to Repos -> Create -> Folder:
Create a project-specific repository folder.
7. Access Control and Permissions
Grant permissions to project users on:
Compute resources.
Pools and policies.
Workspace folders and repositories.
Assign permissions as Read-Write (RW) or Read-Only (RO) based on project requirements.
Add the managed identity of the DevOps agent VM to the project-specific RW group in ADB.
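The grants above can be sketched as requests to the Databricks Permissions API (PATCH /api/2.0/permissions/<object-type>/<object-id>). The mapping from RW and RO to concrete permission levels is an assumption and should be adjusted to the project's own rules; pools and policies expose no read-only level, so both map to the same value here. Group names and object IDs are hypothetical, and the function only builds the path and body.

```python
from typing import Dict, Tuple

# Assumed mapping of RW / RO onto Databricks permission levels.
PERMISSION_LEVELS: Dict[str, Dict[str, str]] = {
    "clusters": {"RW": "CAN_RESTART", "RO": "CAN_ATTACH_TO"},
    "instance-pools": {"RW": "CAN_ATTACH_TO", "RO": "CAN_ATTACH_TO"},
    "cluster-policies": {"RW": "CAN_USE", "RO": "CAN_USE"},
    "directories": {"RW": "CAN_EDIT", "RO": "CAN_READ"},
    "repos": {"RW": "CAN_EDIT", "RO": "CAN_READ"},
}


def build_permission_request(object_type: str, object_id: str,
                             group_name: str,
                             access: str) -> Tuple[str, Dict]:
    """Return (API path, JSON body) granting a group RW or RO access."""
    level = PERMISSION_LEVELS[object_type][access]
    path = f"/api/2.0/permissions/{object_type}/{object_id}"
    body = {
        "access_control_list": [
            {"group_name": group_name, "permission_level": level}
        ]
    }
    return path, body


if __name__ == "__main__":
    # Hypothetical group names and cluster ID.
    for group, access in [("sales-rw", "RW"), ("sales-ro", "RO")]:
        print(build_permission_request("clusters", "<cluster-id>", group, access))
```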