{"id":774,"date":"2025-08-19T16:02:03","date_gmt":"2025-08-19T16:02:03","guid":{"rendered":"https:\/\/dataopsschool.com\/blog\/?p=774"},"modified":"2025-08-19T16:46:22","modified_gmt":"2025-08-19T16:46:22","slug":"databricks-catalog-schemas-tables-with-external-location","status":"publish","type":"post","link":"https:\/\/dataopsschool.com\/blog\/databricks-catalog-schemas-tables-with-external-location\/","title":{"rendered":"Databricks &#8211; Catalog, Schemas &amp; Tables with External Location"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"501\" src=\"https:\/\/dataopsschool.com\/blog\/wp-content\/uploads\/2025\/08\/image-14-1024x501.png\" alt=\"\" class=\"wp-image-778\" srcset=\"https:\/\/dataopsschool.com\/blog\/wp-content\/uploads\/2025\/08\/image-14-1024x501.png 1024w, https:\/\/dataopsschool.com\/blog\/wp-content\/uploads\/2025\/08\/image-14-300x147.png 300w, https:\/\/dataopsschool.com\/blog\/wp-content\/uploads\/2025\/08\/image-14-768x376.png 768w, https:\/\/dataopsschool.com\/blog\/wp-content\/uploads\/2025\/08\/image-14.png 1408w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>Storage-path resolution is at <strong>the core of Unity Catalog\u2019s object model<\/strong>. Where Databricks writes the files of <strong>managed tables<\/strong> depends on <em>where<\/em> you attach the managed storage location. 
Let\u2019s break it down level by level: <strong>Catalog \u2192 Schema \u2192 Table<\/strong>.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">\ud83d\udd11 Unity Catalog Storage Hierarchy<\/h1>\n\n\n\n<p>Unity Catalog object model:<br><strong>Metastore \u2192 Catalog \u2192 Schema \u2192 Table<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>You can set a <strong>managed storage location<\/strong> (a path covered by an external location) at the <strong>Metastore<\/strong>, <strong>Catalog<\/strong>, or <strong>Schema<\/strong> level.<\/li>\n\n\n\n<li>Tables may also define a <strong>LOCATION<\/strong> clause (an explicit path), which makes them external tables.<\/li>\n\n\n\n<li>For managed tables, Databricks applies a <strong>fallback precedence<\/strong>:<br><strong>Schema location (if set) \u2192 Catalog location (if set) \u2192 Metastore location (if set).<\/strong><\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">1\ufe0f\u20e3 Catalog with External Location<\/h2>\n\n\n\n<p>When you <strong>create a catalog with a managed location<\/strong> (a path that an existing external location must cover):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>CREATE CATALOG dev_ext\nMANAGED LOCATION 'abfss:\/\/data@&lt;storage_account&gt;.dfs.core.windows.net\/ADB\/catalog\/';\n<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Effect:<\/strong>\n<ul class=\"wp-block-list\">\n<li>Any <strong>managed tables<\/strong> created inside this catalog (and its schemas) will, by default, be written under this catalog\u2019s managed location.<\/li>\n\n\n\n<li>Unity Catalog generates the actual storage path, which typically looks like: <code>\/ADB\/catalog\/__unitystorage\/catalogs\/&lt;catalog-id&gt;\/tables\/&lt;table-id&gt;\/<\/code><\/li>\n\n\n\n<li>If no schema-level location is set, the catalog\u2019s location is used.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Example:<\/strong> <code>CREATE SCHEMA dev_ext.bronze; CREATE TABLE dev_ext.bronze.sales (...) 
;<\/code> \u2192 Data files are stored under the <strong>catalog-level path<\/strong>.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">2\ufe0f\u20e3 Schema with External Location<\/h2>\n\n\n\n<p>When you <strong>create a schema with a managed location<\/strong> (a path that an existing external location must cover):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>CREATE SCHEMA dev_ext.bronze_ext\nMANAGED LOCATION 'abfss:\/\/data@&lt;storage_account&gt;.dfs.core.windows.net\/ADB\/schema\/bronze_ext\/';\n<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Effect:<\/strong>\n<ul class=\"wp-block-list\">\n<li>Any <strong>managed tables<\/strong> inside this schema are stored under the schema\u2019s managed location.<\/li>\n\n\n\n<li>The schema location <strong>overrides the catalog location<\/strong>.<\/li>\n\n\n\n<li>Unity Catalog generates the actual storage path, which typically looks like: <code>\/ADB\/schema\/bronze_ext\/__unitystorage\/schemas\/&lt;schema-id&gt;\/tables\/&lt;table-id&gt;\/<\/code><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Example:<\/strong> <code>CREATE TABLE dev_ext.bronze_ext.sales (...) 
;<\/code> \u2192 Data files are stored at the <strong>schema-level location<\/strong>.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">3\ufe0f\u20e3 Table with External Location<\/h2>\n\n\n\n<p>When you <strong>create a table with a LOCATION clause<\/strong>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>CREATE TABLE dev_ext.bronze.sales_external (\n  id INT, product STRING, amount DOUBLE\n)\nLOCATION 'abfss:\/\/data@&lt;storage_account&gt;.dfs.core.windows.net\/ADB\/external-tables\/sales_external';\n<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Effect:<\/strong>\n<ul class=\"wp-block-list\">\n<li>This is an <strong>external table<\/strong>, not a managed one.<\/li>\n\n\n\n<li>The data files live exactly at the path you specify.<\/li>\n\n\n\n<li>Unity Catalog manages only the <strong>metadata<\/strong>; it does not delete the underlying files when you drop the table.<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li><strong>Example:<\/strong>\n<ul class=\"wp-block-list\">\n<li>Drop the table \u2192 metadata gone, <strong>files remain<\/strong> at <code>\/ADB\/external-tables\/sales_external\/<\/code>.<\/li>\n\n\n\n<li>Undrop (<code>UNDROP TABLE<\/code>) \u2192 metadata is restored (within 7 days of the drop).<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">\ud83d\udd04 Comparison Across Scenarios<\/h1>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><th>Scenario<\/th><th>Default Data Storage<\/th><th>Cleanup Behavior on <code>DROP TABLE<\/code><\/th><\/tr><\/thead><tbody><tr><td><strong>Catalog with external location<\/strong><\/td><td>Tables stored under the catalog\u2019s managed path (<code>\/catalog\/__unitystorage\/catalogs\/&lt;catalog-id&gt;\/tables\/&lt;table-id&gt;\/<\/code>)<\/td><td>Dropping the table: metadata removed, data files deleted <strong>after 7\u201330 days<\/strong> (grace period for 
undrop).<\/td><\/tr><tr><td><strong>Schema with external location<\/strong><\/td><td>Tables stored under the schema\u2019s managed path (<code>\/schema\/__unitystorage\/schemas\/&lt;schema-id&gt;\/tables\/&lt;table-id&gt;\/<\/code>)<\/td><td>Same as above: metadata gone immediately, data deleted after the retention period.<\/td><\/tr><tr><td><strong>Table with LOCATION (external table)<\/strong><\/td><td>Data stored exactly at the specified path (<code>\/external-tables\/&lt;name&gt;\/<\/code>)<\/td><td>Metadata dropped immediately, <strong>data files always remain<\/strong>. UC never deletes them.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">\u26a1 Key Takeaways<\/h1>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Metastore-level location<\/strong> = global default (used if neither the catalog nor the schema has a location).<\/li>\n\n\n\n<li><strong>Catalog-level location<\/strong> = default for all schemas under it (unless a schema overrides it).<\/li>\n\n\n\n<li><strong>Schema-level location<\/strong> = strongest override for managed tables.<\/li>\n\n\n\n<li><strong>Table with LOCATION<\/strong> = always external \u2192 UC does not manage the lifecycle of the data files.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p>\ud83d\udc49 So, in short:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Catalog managed location<\/strong> = managed tables in the whole catalog default there.<\/li>\n\n\n\n<li><strong>Schema managed location<\/strong> = that schema alone overrides the catalog default.<\/li>\n\n\n\n<li><strong>Table LOCATION<\/strong> = explicit path \u2192 always external, metadata-only management.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n","protected":false},"excerpt":{"rendered":"<p>Storage-path resolution is at the core of Unity Catalog\u2019s object model. 
The way Databricks resolves storage paths for managed tables depends on where you attach the external\/managed location&#8230;. <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-774","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/774","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/comments?post=774"}],"version-history":[{"count":3,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/774\/revisions"}],"predecessor-version":[{"id":779,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/774\/revisions\/779"}],"wp:attachment":[{"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/media?parent=774"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/categories?post=774"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/tags?post=774"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}