{"id":834,"date":"2025-09-07T14:07:06","date_gmt":"2025-09-07T14:07:06","guid":{"rendered":"https:\/\/dataopsschool.com\/blog\/?p=834"},"modified":"2025-09-07T14:07:09","modified_gmt":"2025-09-07T14:07:09","slug":"databricks-hands-on-tutorial-for-dlt-data-quality-expectations","status":"publish","type":"post","link":"https:\/\/dataopsschool.com\/blog\/databricks-hands-on-tutorial-for-dlt-data-quality-expectations\/","title":{"rendered":"Databricks: hands-on tutorial for DLT Data Quality &amp; Expectations"},"content":{"rendered":"\n<p>Here\u2019s a complete, <strong>hands-on tutorial<\/strong> for <strong>DLT Data Quality &amp; Expectations<\/strong>, including how to <strong>define rules<\/strong>, use <strong>warning \/ fail \/ drop<\/strong> actions, and <strong>monitor a DLT pipeline with SQL<\/strong> for observability.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Introduction<\/h1>\n\n\n\n<p>In <strong>Delta Live Tables (DLT)<\/strong>, <em>Expectations<\/em> are declarative data-quality rules you apply to each record as it flows through your pipeline. You can validate schemas, ranges, enumerations, nullability, referential integrity, and more. 
DLT then:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Counts<\/strong> how many rows passed\/failed.<\/li>\n\n\n\n<li><strong>Decides<\/strong> per rule whether to <strong>allow<\/strong>, <strong>drop<\/strong>, or <strong>fail<\/strong> the pipeline.<\/li>\n\n\n\n<li><strong>Surfaces metrics<\/strong> and <strong>event logs<\/strong> you can query and visualize.<\/li>\n<\/ul>\n\n\n\n<p>We\u2019ll build a small pipeline with <strong>Orders<\/strong> and <strong>Customers<\/strong> to demonstrate:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Defining rules <strong>once<\/strong> (as a Python dict) and <strong>reusing<\/strong> them.<\/li>\n\n\n\n<li>Applying rules in <strong>warning<\/strong>, <strong>fail<\/strong>, and <strong>drop<\/strong> modes.<\/li>\n\n\n\n<li>Monitoring <strong>DLT event logs<\/strong> with SQL to track DQ failures over time.<\/li>\n<\/ul>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Pre-reqs (quick):<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Create a DLT pipeline in Databricks; set Product edition to <strong>Advanced<\/strong> (Expectations need Advanced).<\/li>\n\n\n\n<li>Attach to a cluster policy that allows DLT (or use serverless DLT, if enabled).<\/li>\n\n\n\n<li>Use a Notebook language: <strong>Python<\/strong> for the examples below (SQL equivalents included later).<\/li>\n<\/ol>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">What are Expectations in Databricks DLT?<\/h1>\n\n\n\n<p><strong>Expectation<\/strong> = <em>named predicate<\/em> + <em>action<\/em>.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Name<\/strong>: unique within the dataset (table\/view)<\/li>\n\n\n\n<li><strong>Predicate<\/strong>: a boolean SQL expression (e.g., <code>order_price > 0<\/code>)<\/li>\n\n\n\n<li><strong>Action<\/strong> (per rule):\n<ul class=\"wp-block-list\">\n<li><strong>Warning 
(default)<\/strong>: log failures; <strong>do not<\/strong> block or drop rows<\/li>\n\n\n\n<li><strong>Drop<\/strong>: <strong>exclude<\/strong> failing rows from the output<\/li>\n\n\n\n<li><strong>Fail<\/strong>: <strong>stop<\/strong> the pipeline (task fails)<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<p>Expectations can be attached to:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>DLT Tables<\/strong> (<code>@dlt.table<\/code>)<\/li>\n\n\n\n<li><strong>DLT Views<\/strong> (<code>@dlt.view<\/code>)<\/li>\n\n\n\n<li><strong>Joined\/derived datasets<\/strong> (so you can validate <em>post-join<\/em> quality, too)<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">How to define rules for Expectations in DLT?<\/h1>\n\n\n\n<p>Below, we\u2019ll define reusable rule sets as <strong>Python dictionaries<\/strong>, then attach them with DLT decorators.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Example datasets (conceptual)<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Orders<\/strong>: <code>order_key<\/code>, <code>order_status<\/code> in (<code>'O','F','P'<\/code>), <code>order_price<\/code> > 0<\/li>\n\n\n\n<li><strong>Customers<\/strong>: <code>cust_id<\/code>, <code>mkt_segment<\/code> <strong>NOT NULL<\/strong><\/li>\n<\/ul>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>You can source data from Autoloader, streams, or tables. 
For clarity, we\u2019ll read from bronze tables (or views) you maintain in your setup notebook.<\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">Python: rule dictionaries<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>import dlt\nfrom pyspark.sql import functions as F\n\n# Reusable rule sets\norder_rules = {\n  \"valid_order_status\": \"order_status IN ('O','F','P')\",\n  \"valid_order_price\":  \"order_price &gt; 0\"\n}\n\ncustomer_rules = {\n  \"valid_mkt_segment\": \"mkt_segment IS NOT NULL\"\n}\n<\/code><\/pre>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Tips<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Keep rule names <strong>snake_case<\/strong> and descriptive.<\/li>\n\n\n\n<li>Expressions are <strong>SQL strings<\/strong> (not Python boolean expressions).<\/li>\n\n\n\n<li>You can define <strong>one or many<\/strong> rules per dataset.<\/li>\n<\/ul>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Expectation action: Warning<\/h1>\n\n\n\n<p><strong>Warning<\/strong> is the default behavior when you attach expectations with <code>@dlt.expect<\/code> or <code>@dlt.expect_all<\/code>. 
DLT records failures in <strong>metrics<\/strong> and <strong>event logs<\/strong>, but <strong>keeps<\/strong> the rows.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Python: apply rules in <strong>warning<\/strong> mode<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>@dlt.table(name=\"orders_bronze\")\n@dlt.expect_all(order_rules)         # default = WARNING (aka \"allow\")\ndef orders_bronze():\n    # Replace this with your real source (e.g., cloud_files or a table)\n    return spark.table(\"source.orders_raw\")\n\n@dlt.view(name=\"customers_v\")\n@dlt.expect_all(customer_rules)      # default = WARNING\ndef customers_v():\n    return spark.table(\"source.customers_raw\")\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">What happens at runtime?<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>DLT <strong>counts<\/strong> failures for each rule and shows them under the dataset\u2019s <strong>Data quality<\/strong> tab.<\/li>\n\n\n\n<li>Rows <strong>still land<\/strong> in <code>orders_bronze<\/code> \/ are visible in downstream datasets.<\/li>\n<\/ul>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Use case: <strong>Profiling<\/strong> a new source without disrupting downstream consumers. Warning helps you <strong>discover<\/strong> issues first.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Expectation action: Fail<\/h1>\n\n\n\n<p>Use <strong>fail<\/strong> when any failing row should <strong>stop<\/strong> your pipeline. 
Great for <strong>strict contracts<\/strong> and <strong>mission-critical<\/strong> pipelines.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Python: <strong>fail<\/strong> per rule set<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>@dlt.table(name=\"orders_bronze_fail_stop\")\n@dlt.expect_all_or_fail(order_rules)  # if any rule fails =&gt; task fails\ndef orders_bronze_fail_stop():\n    return spark.table(\"source.orders_raw\")\n<\/code><\/pre>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Alternative: per-rule fail<\/p>\n<\/blockquote>\n\n\n\n<pre class=\"wp-block-code\"><code>@dlt.table(name=\"orders_bronze_fail_rule\")\n@dlt.expect_or_fail(\"valid_order_status\", \"order_status IN ('O','F','P')\")\n@dlt.expect(\"valid_order_price\", \"order_price &gt; 0\")  # warning for this one\ndef orders_bronze_fail_rule():\n    return spark.table(\"source.orders_raw\")\n<\/code><\/pre>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Use case: <strong>SLA-backed<\/strong> pipelines where invalid data must <strong>never<\/strong> progress.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">Expectation action: Drop<\/h1>\n\n\n\n<p>Use <strong>drop<\/strong> when you want to <strong>filter out<\/strong> bad rows but keep the pipeline running. 
This suits sources where occasional <strong>malformed<\/strong> or <strong>edge-case<\/strong> rows are expected and can be safely discarded.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Python: <strong>drop<\/strong> per rule set<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>@dlt.table(name=\"orders_bronze_dropped\")\n@dlt.expect_all_or_drop(order_rules)   # failing rows are excluded\ndef orders_bronze_dropped():\n    return spark.table(\"source.orders_raw\")\n<\/code><\/pre>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Alternative: per-rule drop<\/p>\n<\/blockquote>\n\n\n\n<pre class=\"wp-block-code\"><code>@dlt.table(name=\"orders_bronze_mixed\")\n@dlt.expect_or_drop(\"valid_order_status\", \"order_status IN ('O','F','P')\")\n@dlt.expect(\"valid_order_price\", \"order_price &gt; 0\")  # warning for price\ndef orders_bronze_mixed():\n    return spark.table(\"source.orders_raw\")\n<\/code><\/pre>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Use case: allow the pipeline to <strong>continue<\/strong> while keeping bad records <strong>out of downstream tables<\/strong>. Dropped rows are counted in the metrics but not stored, so capture them separately if you need them for remediation.<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Applying expectations on joins \/ views<\/h2>\n\n\n\n<p>You can attach expectations <strong>after transformations<\/strong> (e.g., joins), which validates <strong>post-join<\/strong> quality:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>@dlt.view(name=\"orders_customers_join_v\")\n@dlt.expect_all({\n  \"post_join_status\": \"order_status IN ('O','F','P')\",\n  \"post_join_price\":  \"order_price &gt; 0\",\n  \"post_join_segment\": \"mkt_segment IS NOT NULL\"\n})\ndef orders_customers_join_v():\n    orders = dlt.read(\"orders_bronze_dropped\")\n    custs  = dlt.read(\"customers_v\")   # customers_v is a batch view, so read it as batch\n    return (orders.alias(\"o\")\n                  .join(custs.alias(\"c\"), 
F.col(\"o.cust_id\")==F.col(\"c.cust_id\"), \"left\"))\n<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">SQL equivalents (DLT SQL)<\/h2>\n\n\n\n<p>If you prefer <strong>DLT SQL<\/strong> in the same pipeline:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-- Constraints go in a parenthesized, comma-separated list after the table name.\n-- A streaming table needs a streaming source, hence STREAM(...).\n\n-- Warning (default)\nCREATE OR REFRESH STREAMING LIVE TABLE orders_bronze_sql (\n  CONSTRAINT valid_order_status EXPECT (order_status IN ('O','F','P')),\n  CONSTRAINT valid_order_price  EXPECT (order_price &gt; 0)\n)\nAS SELECT * FROM STREAM(source.orders_raw);\n\n-- Fail\nCREATE OR REFRESH STREAMING LIVE TABLE orders_bronze_fail_sql (\n  CONSTRAINT valid_order_status EXPECT (order_status IN ('O','F','P')) ON VIOLATION FAIL UPDATE,\n  CONSTRAINT valid_order_price  EXPECT (order_price &gt; 0)               ON VIOLATION FAIL UPDATE\n)\nAS SELECT * FROM STREAM(source.orders_raw);\n\n-- Drop\nCREATE OR REFRESH STREAMING LIVE TABLE orders_bronze_drop_sql (\n  CONSTRAINT valid_order_status EXPECT (order_status IN ('O','F','P')) ON VIOLATION DROP ROW,\n  CONSTRAINT valid_order_price  EXPECT (order_price &gt; 0)               ON VIOLATION DROP ROW\n)\nAS SELECT * FROM STREAM(source.orders_raw);\n<\/code><\/pre>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Notes:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>The SQL syntax uses <code>CONSTRAINT name EXPECT (predicate)<\/code>, optionally followed by <code>ON VIOLATION DROP ROW<\/code> or <code>ON VIOLATION FAIL UPDATE<\/code>.<\/li>\n\n\n\n<li>If you omit <code>ON VIOLATION<\/code>, it behaves like <strong>warning<\/strong> (allow).<\/li>\n<\/ul>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Inserting test records (quick setup idea)<\/h2>\n\n\n\n<p>In a separate <strong>setup notebook<\/strong>, you can seed edge cases to see each action:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-- example seed data; in practice, your bronze comes from Autoloader or ingestion 
jobs\nCREATE TABLE IF NOT EXISTS source.orders_raw AS\nSELECT * FROM VALUES\n  (1001, 'O',  125.00,  1),\n  (1002, 'NA',  30.00,  2),     -- invalid status\n  (1003, 'P',  -10.00,  3),     -- invalid price\n  (1004, NULL,  99.99,  4)      -- invalid status (NULL)\nAS T(order_key, order_status, order_price, cust_id);\n\nCREATE TABLE IF NOT EXISTS source.customers_raw AS\nSELECT * FROM VALUES\n  (1, 'SMALL BIZ'),\n  (2, NULL),                    -- invalid segment\n  (3, 'ENTERPRISE'),\n  (4, 'CONSUMER')\nAS T(cust_id, mkt_segment);\n<\/code><\/pre>\n\n\n\n<p>Re-run the pipeline to observe <strong>warning\/fail\/drop<\/strong> effects.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h1 class=\"wp-block-heading\">How to Monitor DLT pipelines (Observability)<\/h1>\n\n\n\n<p>DLT emits rich <strong>event logs<\/strong> you can <strong>query with SQL<\/strong> from a SQL warehouse to build <strong>dashboards<\/strong> (Databricks Dashboards, Lakeview, etc.).<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Find the <strong>Pipeline ID<\/strong>: open your DLT pipeline \u2192 copy the ID from the UI.<\/p>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">1) Raw event log (table-valued function)<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>-- Replace with your pipeline id\nSELECT * FROM event_log('&lt;PIPELINE_ID&gt;');\n<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\">2) Create helpful views for reuse<\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>-- Raw view (persisted in your schema)\nCREATE OR REPLACE VIEW dlt_event_log_raw AS\nSELECT * FROM event_log('&lt;PIPELINE_ID&gt;');\n\n-- Latest update per dataset\n-- origin.flow_name identifies the flow (dataset) that emitted the event\nCREATE OR REPLACE TEMP VIEW dlt_latest_updates AS\nSELECT\n  origin.flow_name AS dataset_name,\n  MAX(timestamp)   AS last_update_ts\nFROM dlt_event_log_raw\nWHERE event_type = 'flow_progress'\nGROUP BY 1;\n<\/code><\/pre>\n\n\n\n<h3 
class=\"wp-block-heading\">3) Data Quality (DQ) metrics over time<\/h3>\n\n\n\n<p>DLT records expectation metrics as a JSON array at <code>details:flow_progress.data_quality.expectations<\/code> inside <code>flow_progress<\/code> events. This query parses that array and <strong>summarizes failures<\/strong> by dataset and rule name:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>WITH dq AS (\n  SELECT\n    timestamp,\n    -- The expectations payload is a JSON string; parse it, then emit one row per rule\n    explode(\n      from_json(\n        details:flow_progress.data_quality.expectations,\n        'array&lt;struct&lt;name: string, dataset: string, passed_records: long, failed_records: long&gt;&gt;'\n      )\n    ) AS expectation\n  FROM dlt_event_log_raw\n  WHERE event_type = 'flow_progress'\n)\nSELECT\n  expectation.dataset             AS dataset,\n  expectation.name                AS rule_name,\n  SUM(expectation.passed_records) AS total_passed,\n  SUM(expectation.failed_records) AS total_failed,\n  MAX(timestamp)                  AS last_seen\nFROM dq\nGROUP BY expectation.dataset, expectation.name\nORDER BY dataset, rule_name;\n<\/code><\/pre>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>What you\u2019ll see:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>One row per <strong>dataset<\/strong> and <strong>rule name<\/strong>; the parsed fields are <code>name<\/code>, <code>dataset<\/code>, <code>passed_records<\/code>, and <code>failed_records<\/code>, so track each rule\u2019s action (warning, drop, fail) in your pipeline code<\/li>\n\n\n\n<li><strong>passed\/failed<\/strong> counts and <strong>last_seen<\/strong> timestamp<\/li>\n\n\n\n<li>Use this to power <strong>SLO dashboards<\/strong> (e.g., \u201c&lt;1% failure over last 7 days\u201d)<\/li>\n<\/ul>\n<\/blockquote>\n\n\n\n<h3 class=\"wp-block-heading\">4) Pull the latest failure samples (optional)<\/h3>\n\n\n\n<p>To <strong>peek at failing rows<\/strong>, design your bronze with a <strong>quarantine table<\/strong> or 
<strong>DLT expectations with DROP<\/strong> + <strong>audit<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>If using <strong>DROP<\/strong>, write failing rows to a side sink (custom logic in bronze).<\/li>\n\n\n\n<li>Or, add <strong>trace columns<\/strong> (ingest time, source file, etc.) to speed triage.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">5) Operational health<\/h3>\n\n\n\n<p>List <strong>flow status<\/strong> and <strong>row counts<\/strong> per dataset as a basis for <strong>update duration<\/strong> and <strong>throughput<\/strong> metrics:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>-- Flow runs and statuses\nSELECT\n  timestamp,\n  details:flow_progress:metrics:num_output_rows::long AS out_rows,\n  -- in_rows is NULL if your runtime does not emit this metric\n  details:flow_progress:metrics:input_rows_processed::long AS in_rows,\n  details:flow_progress:status::string AS status,\n  origin.flow_name AS dataset  -- flow (dataset) that emitted the event\nFROM dlt_event_log_raw\nWHERE event_type = 'flow_progress'\nORDER BY timestamp DESC;\n<\/code><\/pre>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Build visuals:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Bar<\/strong>: failed_records by rule_name over time<\/li>\n\n\n\n<li><strong>Line<\/strong>: throughput (rows\/sec) vs update<\/li>\n\n\n\n<li><strong>KPI<\/strong>: last update latency \/ time since last success<\/li>\n<\/ul>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Practical guidance &amp; patterns<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Start in Warning<\/strong> \u2192 learn the data \u2192 <strong>tighten<\/strong> rules over time \u2192 promote to <strong>Drop<\/strong> or <strong>Fail<\/strong> where appropriate.<\/li>\n\n\n\n<li><strong>Name rules<\/strong> predictably: <code>valid_&lt;col>_&lt;constraint><\/code> (<code>valid_order_price_gt_zero<\/code>).<\/li>\n\n\n\n<li><strong>Centralize rules<\/strong> (dicts or SQL macros) to <strong>reuse<\/strong> 
across datasets.<\/li>\n\n\n\n<li><strong>Post-join validations<\/strong> catch issues that only appear after enrichment.<\/li>\n\n\n\n<li><strong>Quarantine patterns<\/strong>: When using <strong>DROP<\/strong>, optionally <strong>write failures<\/strong> to a side table for remediation (e.g., with a stream that matches the inverse predicate).<\/li>\n\n\n\n<li><strong>Alerting<\/strong>: Schedule <strong>SQL alerts<\/strong> (e.g., failed_records > 0) or use <strong>Jobs<\/strong> to run health checks and send notifications (email\/Webhook\/Slack).<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Troubleshooting<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Wrong column name<\/strong> in a rule? DLT will <strong>fail early<\/strong> and show it in <strong>Event log<\/strong>. Fix the expression and re-run.<\/li>\n\n\n\n<li><strong>Nothing shows up<\/strong> in metrics? Ensure the dataset with expectations actually <strong>runs<\/strong> (new data or full refresh), and you\u2019re viewing <strong>the right pipeline\u2019s<\/strong> event log \/ catalog.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">One-page checklist<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Edition set to <strong>Advanced<\/strong><\/li>\n\n\n\n<li>Rule dicts defined and <strong>unit-tested<\/strong> on small slices<\/li>\n\n\n\n<li>Expectations applied at <strong>bronze<\/strong>, <strong>post-join<\/strong>, <strong>silver<\/strong> as needed<\/li>\n\n\n\n<li><strong>Warning<\/strong> for discovery, <strong>Drop<\/strong> to quarantine, <strong>Fail<\/strong> for contracts<\/li>\n\n\n\n<li><strong>Event log<\/strong> views created; <strong>DQ summary<\/strong> query saved<\/li>\n\n\n\n<li><strong>Dashboards<\/strong> and <strong>alerts<\/strong> wired to your SLOs<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator 
has-alpha-channel-opacity\"\/>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Here\u2019s a complete, hands-on tutorial for DLT Data Quality &amp; Expectations \u2014 including how to define rules, use warning \/ fail \/ drop actions, and monitor a&#8230; <\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-834","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/834","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/comments?post=834"}],"version-history":[{"count":1,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/834\/revisions"}],"predecessor-version":[{"id":835,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/posts\/834\/revisions\/835"}],"wp:attachment":[{"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/media?parent=834"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/categories?post=834"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/dataopsschool.com\/blog\/wp-json\/wp\/v2\/tags?post=834"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}