Databricks Lab – Working with Schemas and External Locations


We will:

  1. Create schemas at different levels (with and without external locations).
  2. Create managed tables inside these schemas.
  3. Validate where Databricks stores the underlying data.
  4. Explore how Unity Catalog organizes data using IDs in Catalog Explorer.

Unity Catalog has a 4-level hierarchy:

Metastore → Catalog → Schema → Table

  • Location fallback:
    • If no schema-level location → fallback to catalog.
    • If no catalog-level location → fallback to metastore.

👉 Today we’ll create three schemas to see how Unity Catalog stores managed table data at different levels:

  1. Schema in Dev catalog (no external location).
  2. Schema in DevEXT catalog (catalog has external location).
  3. Schema in DevEXT catalog but with schema-level external location.

Finally, we’ll create tables in each schema and validate where the actual data files are stored.


Create Schema in Unity Catalog

1️⃣ Create schema under a catalog without external location (Dev)

CREATE SCHEMA Dev.bronze
COMMENT 'Bronze schema under Dev catalog without external location';
  • Here, Dev catalog has no external location.
  • By default, managed table data will go to the metastore location.

✅ Verify in Databricks UI: expand Dev → bronze schema.


2️⃣ Create schema under a catalog with external location (DevEXT)

CREATE SCHEMA DevEXT.bronze
COMMENT 'Bronze schema in DevEXT catalog (catalog has external location)';
  • DevEXT catalog already points to an external location.
  • Managed tables in this schema will store data at the catalog-level external path.

✅ Verify in UI: expand DevEXT → bronze schema.


Create Schema with External Location in Unity Catalog

Now, let’s override the storage location at the schema level.

Step 1: Create schema-level external location

We need a new folder in our ADLS storage:

/data/ADB/schema/

Create a corresponding external location in Databricks:

CREATE EXTERNAL LOCATION `ext_schema`
URL 'abfss://data@<storage_account>.dfs.core.windows.net/ADB/schema/'
WITH STORAGE CREDENTIAL `SC_catalog_storage`;

Step 2: Create schema with managed location

CREATE SCHEMA DevEXT.bronze_ext
MANAGED LOCATION 'abfss://data@<storage_account>.dfs.core.windows.net/ADB/schema/bronze_ext/'
COMMENT 'Bronze schema with schema-level external location';
  • This schema’s managed tables will always store data in the schema external location.

✅ Verify in UI: under DevEXT → bronze_ext schema.


Validate Location for Managed Tables

Let’s now create a test managed table in each schema and check where data lands.

Example Table – Bronze Schema in Dev (no external location)

CREATE TABLE Dev.bronze.raw_sales (id INT, product STRING, amount DOUBLE);

INSERT INTO Dev.bronze.raw_sales VALUES (1, 'Laptop', 1200.50);

Check details:

DESCRIBE EXTENDED Dev.bronze.raw_sales;

📍 Location → Stored under Metastore location.
Path will look like:

/<metastore-id>/tables/<table-id>/

Example Table – Bronze Schema in DevEXT (catalog external location)

CREATE TABLE DevEXT.bronze.raw_sales (id INT, product STRING, amount DOUBLE);

INSERT INTO DevEXT.bronze.raw_sales VALUES (2, 'Phone', 750.00);

Check details:

DESCRIBE EXTENDED DevEXT.bronze.raw_sales;

📍 Location → Stored under catalog external location.
Path will look like:

/ADB/catalogs/<catalog-id>/tables/<table-id>/

Example Table – Bronze_EXT Schema in DevEXT (schema external location)

CREATE TABLE DevEXT.bronze_ext.raw_sales (id INT, product STRING, amount DOUBLE);

INSERT INTO DevEXT.bronze_ext.raw_sales VALUES (3, 'Tablet', 500.00);

Check details:

DESCRIBE EXTENDED DevEXT.bronze_ext.raw_sales;

📍 Location → Stored under schema external location.
Path will look like:

/ADB/schema/bronze_ext/<schema-id>/tables/<table-id>/

Databricks Catalog Explorer

You can visually confirm all this in Catalog Explorer inside Databricks:

  1. Go to Catalog → Dev → bronze → raw_sales → Details
    • Shows Metastore path and Table ID.
  2. Go to Catalog → DevEXT → bronze → raw_sales → Details
    • Shows Catalog-level path.
  3. Go to Catalog → DevEXT → bronze_ext → raw_sales → Details
    • Shows Schema-level path.

🔎 Note:

  • Unity Catalog does not store tables by name, instead uses Table IDs in the storage path.
  • You can find Schema ID, Catalog ID, and Table ID from the Catalog Explorer.

  1. Dev.bronze → data stored at Metastore level.
  2. DevEXT.bronze → data stored at Catalog external location.
  3. DevEXT.bronze_ext → data stored at Schema external location.

Unity Catalog provides flexible storage mapping:

  • Metastore-level (fallback).
  • Catalog-level (optional).
  • Schema-level (optional, overrides catalog).
  • Table-level (for external tables).

👉 In the next part (as per video series), you’ll explore External Tables and operations like DROP/UNDROP.


Would you like me to also create a diagram of the fallback hierarchy (Metastore → Catalog → Schema → Table) so you can visualize how data location resolution works across these levels?

Related Posts

Strategic Cloud Financial Management With Certified FinOps Professional Training

Introduction The Certified FinOps Professional program is a transformative milestone for any engineer or manager looking to master the intersection of finance, technology, and business operations. This…

Read More

Professional Certified FinOps Engineer improves financial performance visibility systems

Introduction In the modern landscape of cloud infrastructure, technical expertise alone is no longer sufficient to drive enterprise success. The Certified FinOps Engineer program has emerged as…

Read More

Complete Cloud Financial Management Guide for Certified FinOps Manager

Introduction The Certified FinOps Manager program is designed to bridge the widening gap between cloud engineering and financial accountability. As cloud environments become more complex, organizations require…

Read More

Industry Ready FinOps Knowledge Through Certified FinOps Architect Program

Introduction The Certified FinOps Architect certification is designed to help professionals bridge the gap between cloud financial management and operational efficiency. This guide is tailored for working…

Read More

Advance Your Data Management Career with CDOM – Certified DataOps Manager

The CDOM – Certified DataOps Manager is a breakthrough certification designed for professionals who want to master the intersection of data engineering and operational agility. This guide…

Read More

Future focused learning with CDOA – Certified DataOps Architect certification

Introduction The CDOA – Certified DataOps Architect is a professional designed to bridge the gap between data engineering and operational excellence. This guide is written for engineers…

Read More