Databricks Lab – Managed vs External Tables + UNDROP (with External Location setup)


Databricks Unity Catalog Tutorial

Managed vs External Tables + UNDROP (with External Location setup)

Introduction (what we’ll build)

You’ll learn to:

  1. Create (or reuse) an external location for tables.
  2. Create one managed table and one external table in the same schema.
  3. Inspect their storage locations and types.
  4. DROP both tables and recover them using UNDROP (by name and by ID).

Prereqs

  • A Unity Catalog–enabled cluster.
  • A catalog Dev and schema bronze you can write to.
  • A storage credential already created (e.g., SC_catalog_storage).
  • Proper permissions: USE CATALOG, USE SCHEMA, CREATE TABLE, etc.
  • You’re on Azure (paths use abfss://). Adjust for S3/GCS if needed.

What’s new in Databricks? (Updates & Releases)

In the workspace, click ? → What’s new to see the monthly release notes (features are rolled out in stages). For reference: August 2024 release notes pages: Azure and Databricks docs. (Microsoft Learn, Databricks Documentation)


Create External Location in Databricks

A) Make a folder in ADLS for external tables

In your ADLS Gen2 container (e.g., data), create:

/ADB/external-tables/

You’ll store all external table data here (the video used externaltables—name doesn’t matter).

B) (If needed) Reuse or create a storage credential

If you already have one (e.g., SC_catalog_storage), reuse it. Otherwise, create it via Catalog → Storage credentials (Azure Managed Identity or SAS/Key as per your setup). Docs: Create external locations (Azure/AWS). (Microsoft Learn, Databricks Documentation)

C) Create the external location (UI or SQL)

UI path (as in the video):
Catalog → External dataExternal locationsCreate

  • Name: ext_tables
  • Storage credential: SC_catalog_storage
  • URL: abfss://data@<storage_account>.dfs.core.windows.net/ADB/external-tables/
  • CreateTest connection (should show all permissions OK).

SQL equivalent (does the same thing):

CREATE EXTERNAL LOCATION `ext_tables`
URL 'abfss://data@<storage_account>.dfs.core.windows.net/ADB/external-tables/'
WITH STORAGE CREDENTIAL `SC_catalog_storage`;

Reference: SQL syntax for CREATE EXTERNAL LOCATION. (Databricks Documentation)


03:37 — Managed Tables vs External Tables (same schema)

We’ll use catalog Dev and schema bronze.

Tip: attach your SQL notebook to a Unity Catalog–enabled cluster before running.

1) Create a managed table and insert a row

-- Managed table (no LOCATION clause)
CREATE TABLE Dev.bronze.sales_managed (
  id INT,
  product STRING,
  amount DOUBLE
);

INSERT INTO Dev.bronze.sales_managed VALUES (1, 'Laptop', 1200.50);

2) Create an external table and insert a row

External table requires a LOCATION pointing into your external location path.

-- External table (explicit LOCATION under the external location root)
CREATE TABLE Dev.bronze.sales_external (
  id INT,
  product STRING,
  amount DOUBLE
)
LOCATION 'abfss://data@<storage_account>.dfs.core.windows.net/ADB/external-tables/sales_external';

INSERT INTO Dev.bronze.sales_external VALUES (2, 'Phone', 750.00);

3) Verify each table’s type and storage path

-- Managed table details
DESCRIBE EXTENDED Dev.bronze.sales_managed;

-- External table details
DESCRIBE EXTENDED Dev.bronze.sales_external;
  • sales_managedtableType = MANAGED, path resolves to your metastore managed storage (Unity Catalog-managed).
  • sales_externaltableType = EXTERNAL, path is exactly the LOCATION you specified under /ADB/external-tables/.
    Background docs on table types. (Databricks Documentation)

07:07 — UNDROP in Databricks (recover dropped tables)

Now we’ll drop both tables and recover them.

1) Drop the tables

DROP TABLE Dev.bronze.sales_managed;
DROP TABLE Dev.bronze.sales_external;
  • In Unity Catalog, dropping a managed table does not immediately delete data from cloud storage; it’s eligible for cleanup within ~7–30 days.
  • You can recover (UNDROP) a dropped relation within 7 days. (Microsoft Learn, Databricks Documentation)

2) List dropped tables (get names/IDs)

USE CATALOG Dev;

-- See dropped tables for a schema
SHOW TABLES DROPPED IN Dev.bronze;

You’ll get rows including tablename, tableid (UUID), deletedat, and more. (Microsoft Learn)

3) UNDROP by name

UNDROP TABLE Dev.bronze.sales_managed;
  • Parent catalog and schema must still exist.
  • Recovers the most recently dropped relation with that name in the schema. (Databricks Documentation)

4) UNDROP by ID (alternative)

Copy the tableid for sales_external from SHOW TABLES DROPPED, then:

UNDROP TABLE WITH ID '<table_uuid_from_show_tables_dropped>';

The WITH ID form is handy if a new table with the same name was recreated and blocks name-based recovery (in that case you can rename the new one or use the ID). (Databricks Documentation, kb.databricks.com)

Keep in mind

  • UNDROP works for both managed and external relations in Unity Catalog.
  • Managed: data is soft-retained for up to the 7-day recovery window, then scheduled for deletion (within ~7–30 days).
  • External: data files always remain at your external path; dropping only removes UC metadata. You can still UNDROP within 7 days to re-register. (Databricks Documentation, Microsoft Learn)

(Optional) Quick sanity checks in storage

  • Managed table (after drop, before cleanup): you’ll still see data under the Unity Catalog managed path (e.g., metastore-scoped “tables/<table-id>/”). Cleanup is deferred. (Microsoft Learn)
  • External table: data remains at /ADB/external-tables/sales_external/ regardless; dropping only removes the table’s entry from the catalog. (Databricks Documentation)

Clean-up (optional)

-- Re-drop if you restored them
DROP TABLE IF EXISTS Dev.bronze.sales_managed;
DROP TABLE IF EXISTS Dev.bronze.sales_external;

References (docs used)