We will:
- Create schemas at different levels (with and without external locations).
- Create managed tables inside these schemas.
- Validate where Databricks stores the underlying data.
- Explore how Unity Catalog organizes data using IDs in Catalog Explorer.
Unity Catalog has a 4-level hierarchy:
Metastore → Catalog → Schema → Table
- Location fallback:
- If no schema-level location → fallback to catalog.
- If no catalog-level location → fallback to metastore.
👉 Today we’ll create three schemas to see how Unity Catalog stores managed table data at different levels:
- Schema in Dev catalog (no external location).
- Schema in DevEXT catalog (catalog has external location).
- Schema in DevEXT catalog but with schema-level external location.
Finally, we’ll create tables in each schema and validate where the actual data files are stored.
Create Schema in Unity Catalog
1️⃣ Create schema under a catalog without external location (Dev)
CREATE SCHEMA Dev.bronze
COMMENT 'Bronze schema under Dev catalog without external location';
- Here,
Dev
catalog has no external location. - By default, managed table data will go to the metastore location.
✅ Verify in Databricks UI: expand Dev → bronze schema.
2️⃣ Create schema under a catalog with external location (DevEXT)
CREATE SCHEMA DevEXT.bronze
COMMENT 'Bronze schema in DevEXT catalog (catalog has external location)';
DevEXT
catalog already points to an external location.- Managed tables in this schema will store data at the catalog-level external path.
✅ Verify in UI: expand DevEXT → bronze schema.
Create Schema with External Location in Unity Catalog
Now, let’s override the storage location at the schema level.
Step 1: Create schema-level external location
We need a new folder in our ADLS storage:
/data/ADB/schema/
Create a corresponding external location in Databricks:
CREATE EXTERNAL LOCATION `ext_schema`
URL 'abfss://data@<storage_account>.dfs.core.windows.net/ADB/schema/'
WITH STORAGE CREDENTIAL `SC_catalog_storage`;
Step 2: Create schema with managed location
CREATE SCHEMA DevEXT.bronze_ext
MANAGED LOCATION 'abfss://data@<storage_account>.dfs.core.windows.net/ADB/schema/bronze_ext/'
COMMENT 'Bronze schema with schema-level external location';
- This schema’s managed tables will always store data in the schema external location.
✅ Verify in UI: under DevEXT → bronze_ext schema.
Validate Location for Managed Tables
Let’s now create a test managed table in each schema and check where data lands.
Example Table – Bronze Schema in Dev (no external location)
CREATE TABLE Dev.bronze.raw_sales (id INT, product STRING, amount DOUBLE);
INSERT INTO Dev.bronze.raw_sales VALUES (1, 'Laptop', 1200.50);
Check details:
DESCRIBE EXTENDED Dev.bronze.raw_sales;
📍 Location → Stored under Metastore location.
Path will look like:
/<metastore-id>/tables/<table-id>/
Example Table – Bronze Schema in DevEXT (catalog external location)
CREATE TABLE DevEXT.bronze.raw_sales (id INT, product STRING, amount DOUBLE);
INSERT INTO DevEXT.bronze.raw_sales VALUES (2, 'Phone', 750.00);
Check details:
DESCRIBE EXTENDED DevEXT.bronze.raw_sales;
📍 Location → Stored under catalog external location.
Path will look like:
/ADB/catalogs/<catalog-id>/tables/<table-id>/
Example Table – Bronze_EXT Schema in DevEXT (schema external location)
CREATE TABLE DevEXT.bronze_ext.raw_sales (id INT, product STRING, amount DOUBLE);
INSERT INTO DevEXT.bronze_ext.raw_sales VALUES (3, 'Tablet', 500.00);
Check details:
DESCRIBE EXTENDED DevEXT.bronze_ext.raw_sales;
📍 Location → Stored under schema external location.
Path will look like:
/ADB/schema/bronze_ext/<schema-id>/tables/<table-id>/
Databricks Catalog Explorer
You can visually confirm all this in Catalog Explorer inside Databricks:
- Go to Catalog → Dev → bronze → raw_sales → Details
- Shows Metastore path and Table ID.
- Go to Catalog → DevEXT → bronze → raw_sales → Details
- Shows Catalog-level path.
- Go to Catalog → DevEXT → bronze_ext → raw_sales → Details
- Shows Schema-level path.
🔎 Note:
- Unity Catalog does not store tables by name, instead uses Table IDs in the storage path.
- You can find Schema ID, Catalog ID, and Table ID from the Catalog Explorer.
- Dev.bronze → data stored at Metastore level.
- DevEXT.bronze → data stored at Catalog external location.
- DevEXT.bronze_ext → data stored at Schema external location.
Unity Catalog provides flexible storage mapping:
- Metastore-level (fallback).
- Catalog-level (optional).
- Schema-level (optional, overrides catalog).
- Table-level (for external tables).
👉 In the next part (as per video series), you’ll explore External Tables and operations like DROP/UNDROP.
Would you like me to also create a diagram of the fallback hierarchy (Metastore → Catalog → Schema → Table) so you can visualize how data location resolution works across these levels?