Databricks Lab – Working with Schemas and External Locations


We will:

  1. Create schemas at different levels (with and without external locations).
  2. Create managed tables inside these schemas.
  3. Validate where Databricks stores the underlying data.
  4. Explore how Unity Catalog organizes data using IDs in Catalog Explorer.

Unity Catalog has a 4-level hierarchy:

Metastore → Catalog → Schema → Table

  • Location fallback:
    • If no schema-level location → fallback to catalog.
    • If no catalog-level location → fallback to metastore.

👉 Today we’ll create three schemas to see how Unity Catalog stores managed table data at different levels:

  1. Schema in Dev catalog (no external location).
  2. Schema in DevEXT catalog (catalog has external location).
  3. Schema in DevEXT catalog but with schema-level external location.

Finally, we’ll create tables in each schema and validate where the actual data files are stored.


Create Schema in Unity Catalog

1️⃣ Create schema under a catalog without external location (Dev)

CREATE SCHEMA Dev.bronze
COMMENT 'Bronze schema under Dev catalog without external location';
  • Here, Dev catalog has no external location.
  • By default, managed table data will go to the metastore location.

✅ Verify in Databricks UI: expand Dev → bronze schema.


2️⃣ Create schema under a catalog with external location (DevEXT)

CREATE SCHEMA DevEXT.bronze
COMMENT 'Bronze schema in DevEXT catalog (catalog has external location)';
  • DevEXT catalog already points to an external location.
  • Managed tables in this schema will store data at the catalog-level external path.

✅ Verify in UI: expand DevEXT → bronze schema.


Create Schema with External Location in Unity Catalog

Now, let’s override the storage location at the schema level.

Step 1: Create schema-level external location

We need a new folder in our ADLS storage:

/data/ADB/schema/

Create a corresponding external location in Databricks:

CREATE EXTERNAL LOCATION `ext_schema`
URL 'abfss://data@<storage_account>.dfs.core.windows.net/ADB/schema/'
WITH STORAGE CREDENTIAL `SC_catalog_storage`;

Step 2: Create schema with managed location

CREATE SCHEMA DevEXT.bronze_ext
MANAGED LOCATION 'abfss://data@<storage_account>.dfs.core.windows.net/ADB/schema/bronze_ext/'
COMMENT 'Bronze schema with schema-level external location';
  • This schema’s managed tables will always store data in the schema external location.

✅ Verify in UI: under DevEXT → bronze_ext schema.


Validate Location for Managed Tables

Let’s now create a test managed table in each schema and check where data lands.

Example Table – Bronze Schema in Dev (no external location)

CREATE TABLE Dev.bronze.raw_sales (id INT, product STRING, amount DOUBLE);

INSERT INTO Dev.bronze.raw_sales VALUES (1, 'Laptop', 1200.50);

Check details:

DESCRIBE EXTENDED Dev.bronze.raw_sales;

📍 Location → Stored under Metastore location.
Path will look like:

/<metastore-id>/tables/<table-id>/

Example Table – Bronze Schema in DevEXT (catalog external location)

CREATE TABLE DevEXT.bronze.raw_sales (id INT, product STRING, amount DOUBLE);

INSERT INTO DevEXT.bronze.raw_sales VALUES (2, 'Phone', 750.00);

Check details:

DESCRIBE EXTENDED DevEXT.bronze.raw_sales;

📍 Location → Stored under catalog external location.
Path will look like:

/ADB/catalogs/<catalog-id>/tables/<table-id>/

Example Table – Bronze_EXT Schema in DevEXT (schema external location)

CREATE TABLE DevEXT.bronze_ext.raw_sales (id INT, product STRING, amount DOUBLE);

INSERT INTO DevEXT.bronze_ext.raw_sales VALUES (3, 'Tablet', 500.00);

Check details:

DESCRIBE EXTENDED DevEXT.bronze_ext.raw_sales;

📍 Location → Stored under schema external location.
Path will look like:

/ADB/schema/bronze_ext/<schema-id>/tables/<table-id>/

Databricks Catalog Explorer

You can visually confirm all this in Catalog Explorer inside Databricks:

  1. Go to Catalog → Dev → bronze → raw_sales → Details
    • Shows Metastore path and Table ID.
  2. Go to Catalog → DevEXT → bronze → raw_sales → Details
    • Shows Catalog-level path.
  3. Go to Catalog → DevEXT → bronze_ext → raw_sales → Details
    • Shows Schema-level path.

🔎 Note:

  • Unity Catalog does not store tables by name, instead uses Table IDs in the storage path.
  • You can find Schema ID, Catalog ID, and Table ID from the Catalog Explorer.

  1. Dev.bronze → data stored at Metastore level.
  2. DevEXT.bronze → data stored at Catalog external location.
  3. DevEXT.bronze_ext → data stored at Schema external location.

Unity Catalog provides flexible storage mapping:

  • Metastore-level (fallback).
  • Catalog-level (optional).
  • Schema-level (optional, overrides catalog).
  • Table-level (for external tables).

👉 In the next part (as per video series), you’ll explore External Tables and operations like DROP/UNDROP.


Would you like me to also create a diagram of the fallback hierarchy (Metastore → Catalog → Schema → Table) so you can visualize how data location resolution works across these levels?