Databricks: Set Up Metastore & Map Azure Storage Account with Access Connector, Enable Unity Catalog

This guide walks you through setting up a Unity Catalog metastore in Azure Databricks, connecting it securely to an Azure storage account using the Access Connector, validating the setup, and enabling Unity Catalog for your Databricks workspace.


Step 1: Create a Storage Account and Container for Metastore

  1. Navigate to the Azure portal and create an Azure Data Lake Storage Gen2 account, that is, a storage account with hierarchical namespace enabled (it must be in the same region as your Databricks workspace).
  2. Add a container for metastore-level storage. For example: mycontainer in mydatalakestorage.
  3. Note down the ADLS Gen2 URI:
       abfss://mycontainer@mydatalakestorage.dfs.core.windows.net/unity-metastore
     This will be the root path for managed tables and volumes.
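
The URI noted in this step follows a fixed pattern, so it can be assembled from its parts rather than typed by hand. The following is a minimal Python sketch of that pattern; the function name `build_abfss_uri` is hypothetical and not part of any Databricks or Azure library, and the container, account, and folder names are the example values from this guide.

```python
def build_abfss_uri(container: str, account: str, path: str = "") -> str:
    """Assemble an abfss:// URI for an ADLS Gen2 container (illustrative helper)."""
    if not container or not account:
        raise ValueError("container and account are both required")
    base = f"abfss://{container}@{account}.dfs.core.windows.net"
    # Strip leading/trailing slashes so the result never has doubled separators.
    cleaned = path.strip("/")
    return f"{base}/{cleaned}" if cleaned else base


# Example values taken from this guide.
metastore_root = build_abfss_uri("mycontainer", "mydatalakestorage", "unity-metastore")
print(metastore_root)
```

Building the string in one place makes it less likely that the container and account names get swapped around the `@` sign, which is an easy mistake to make when typing the URI manually.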

Step 2: Create Access Connector (Managed Identity) for Azure Databricks

  1. In the Azure portal:
    • Click “Create a resource” → Search for “Access Connector for Azure Databricks”.
    • Click Create.
    • Choose Subscription, Resource Group, Region, and enter a connector name (e.g., unity-access-connector).
    • On the Managed Identity tab, select System-assigned managed identity (recommended).
    • Click Review + Create.
  2. Grant storage access:
    • Assign the managed identity the Storage Blob Data Contributor role (or a higher role) on the storage account or on the specific container.
    • This lets Databricks read and write data in that storage.
  3. Note the resource ID:
       /subscriptions/<sub_id>/resourceGroups/<rg>/providers/Microsoft.Databricks/accessConnectors/<name>
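
Because the resource ID is pasted into the metastore form later, it can be worth checking that the copied value has the expected shape first. The sketch below is one way to do that in Python. The regular expression is derived only from the ID format shown in this step, the function name `parse_access_connector_id` is hypothetical, and the subscription ID and resource group in the example are placeholders.

```python
import re

# Pattern derived from the resource ID format shown in Step 2 (illustrative only).
_CONNECTOR_ID = re.compile(
    r"^/subscriptions/(?P<subscription>[^/]+)"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.Databricks"
    r"/accessConnectors/(?P<name>[^/]+)$",
    re.IGNORECASE,
)


def parse_access_connector_id(resource_id: str) -> dict:
    """Split an Access Connector resource ID into its parts, or raise ValueError."""
    match = _CONNECTOR_ID.match(resource_id.strip())
    if match is None:
        raise ValueError(f"not an Access Connector resource ID: {resource_id!r}")
    return match.groupdict()


# Placeholder subscription ID and resource group, for illustration.
example_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/my-rg"
    "/providers/Microsoft.Databricks"
    "/accessConnectors/unity-access-connector"
)
print(parse_access_connector_id(example_id))
```

A check like this catches the common copy-paste slip of grabbing the connector's name or the managed identity's object ID instead of the full resource ID.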

Step 3: Create the Metastore in Databricks and Link Storage

  1. Log in to the Databricks Account Console as an Account Admin.
  2. Go to Catalog.
  3. Click “Create Metastore.”
  4. Fill in the form:
    • Name for the metastore
    • Region (must match the storage account and the workspace)
    • ADLS Gen2 path: mycontainer@mydatalakestorage.dfs.core.windows.net/unity-metastore (the path from Step 1; the form adds the abfss:// prefix)
    • Access Connector ID: Paste the resource ID noted in Step 2
  5. Click “Create.”

Step 4: Link Workspaces to the Metastore & Enable Unity Catalog

  1. After metastore creation, select workspaces to assign to the metastore.
    • Alternatively, return to Account Console → Catalog → Metastore → Workspaces tab → “Assign to workspace”.
  2. Confirm the assignment. The workspace is now enabled for Unity Catalog.

Step 5: Validation Steps

  • Workspace Validation:
    • In Databricks Workspace UI, navigate to Data. Confirm you see Unity Catalog concepts (Catalogs, Schemas).
  • Storage Validation:
    • Access data via Unity Catalog and verify files/folders are created in the designated ADLS Gen2 storage account/container.
  • Security Validation:
    • Ensure data access controls and audits appear in the Catalog Explorer.
  • Metastore Validation:
    • Run a simple create table command in Databricks and check physical storage and permission enforcement.
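
For the storage and metastore checks above, one simple test is whether a table's reported storage location falls under the metastore root path from Step 1. The Python sketch below shows that comparison only. It assumes you have already obtained the location string yourself (for example from the table details shown in Catalog Explorer), and both the function name `is_under_root` and the sample table path are hypothetical.

```python
def is_under_root(location: str, root: str) -> bool:
    """Return True if `location` is the root path itself or a path beneath it."""
    root = root.rstrip("/")
    location = location.rstrip("/")
    # Require a "/" boundary so "unity-metastore-old" does not match "unity-metastore".
    return location == root or location.startswith(root + "/")


METASTORE_ROOT = "abfss://mycontainer@mydatalakestorage.dfs.core.windows.net/unity-metastore"

# Hypothetical location string, used only to illustrate the comparison.
table_location = METASTORE_ROOT + "/example-metastore-id/tables/example-table-id"
print(is_under_root(table_location, METASTORE_ROOT))
```

If the check fails for a managed table, the table is likely being written to a different managed location, such as one set at the catalog or schema level.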

Step 6: Set Up Unity Catalog Objects

  1. Create Catalogs, Schemas, and Tables using SQL in the workspace:
       CREATE CATALOG my_catalog;
       CREATE SCHEMA my_catalog.my_schema;
       CREATE TABLE my_catalog.my_schema.my_table (id INT, name STRING);
    • Data will be stored in the mapped container, managed by Unity Catalog.
  2. Create External Locations if needed:
    • For additional storage, register external locations for external tables.
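
Registering an external location is done with a `CREATE EXTERNAL LOCATION` statement that ties a storage URL to a storage credential. The Python sketch below only renders that statement as text so it can be reviewed before running it in a SQL editor. It assumes a storage credential already exists; the names `landing_zone` and `my_storage_credential`, the `landing` container, and the helper `external_location_sql` are all hypothetical.

```python
def external_location_sql(name: str, url: str, credential: str) -> str:
    """Render a CREATE EXTERNAL LOCATION statement (text only, nothing is executed)."""
    if not url.startswith("abfss://"):
        raise ValueError("expected an abfss:// URL for ADLS Gen2")
    if "'" in url:
        raise ValueError("URL must not contain single quotes")
    for identifier in (name, credential):
        if "`" in identifier:
            raise ValueError("identifiers must not contain backticks")
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS `{name}` "
        f"URL '{url}' "
        f"WITH (STORAGE CREDENTIAL `{credential}`)"
    )


# Hypothetical names; the storage credential must already exist in the metastore.
statement = external_location_sql(
    "landing_zone",
    "abfss://landing@mydatalakestorage.dfs.core.windows.net/raw",
    "my_storage_credential",
)
print(statement)
```

The storage credential would normally be backed by the same kind of Access Connector created in Step 2, with a role assignment on the additional container.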

Key Notes & Troubleshooting

  • Azure Global Administrator permissions may be required for the initial account setup; a SCIM connector is recommended for syncing users and groups.
  • You must co-locate storage, connector, metastore, and workspaces in the same region.
  • Managed identities via Access Connector are preferred over service principals for security and simplicity.
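
The region requirement in the notes above is easy to check before creating anything. The following is a small Python sketch that compares planned regions against the workspace region. The resource keys, the region values, and the function name `find_region_mismatches` are illustrative assumptions, and the normalisation simply treats "East US" and "eastus" as the same name.

```python
def _normalise(region: str) -> str:
    """Treat display names such as 'East US' and short names such as 'eastus' alike."""
    return region.replace(" ", "").lower()


def find_region_mismatches(regions: dict) -> dict:
    """Return the resources whose region differs from the workspace's region."""
    expected = _normalise(regions["workspace"])
    return {
        resource: region
        for resource, region in regions.items()
        if _normalise(region) != expected
    }


# Illustrative plan; the metastore entry is deliberately wrong.
planned = {
    "workspace": "East US",
    "storage_account": "eastus",
    "access_connector": "eastus",
    "metastore": "westeurope",
}
print(find_region_mismatches(planned))
```

An empty result means every listed resource shares the workspace's region.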

By following these steps, you will secure, govern, and validate storage and workspace integration with Unity Catalog in Azure Databricks, ready for enterprise-scale Lakehouse governance.
