Databricks: Medallion Architecture in Data Lakehouse

A step-by-step tutorial with detailed explanations and examples: 📘 Medallion Architecture in Data Lakehouse (Bronze, Silver, Gold Layers with Databricks) 1. 🔹 Introduction In a Data…

Databricks: Databricks Auto Loader Tutorial

🚀 Databricks Auto Loader Tutorial (with Schema Evolution Modes & File Detection Modes) Auto Loader in Databricks is the recommended way to ingest files incrementally and reliably…

Databricks: Databricks COPY INTO Command – Idempotent & Exactly-Once Data Loading

1. 🔹 What is COPY INTO? 👉 For millions of files or complex directories, use Auto Loader instead. 2. 🔹 Setup: Managed Volume & Input Files Now we…

Databricks: Databricks Workflows (Jobs, Tasks, Passing Values, If/Else, Re-runs, and Loops)

1. 🔹 Introduction to Workflows 2. 🔹 Jobs UI Overview When creating a job: 3. 🔹 Creating a Job (Example: Process Employee Data) Workflow: Notebook Setup 4….

Databricks: Custom Cluster Policies & Instance Pools in Databricks

1. 🔹 Why Policies and Pools? These features are critical in enterprise Databricks deployments to enforce compliance, control costs, and improve performance. 2. Custom Cluster Policies…

Databricks: Databricks Compute (Clusters, Access Modes, Policies, and Permissions)

1. What is Compute in Databricks? 2. Types of Compute in Databricks 🔹 All-Purpose Compute 🔹 Job Compute 🔹 Serverless Compute (coming in preview/GA by region) 3….

Orchestrating and Scheduling Notebooks in Databricks

This tutorial covers Databricks notebook orchestration and how to parameterize and run notebooks. 1. Introduction Databricks notebooks can be parameterized and orchestrated like workflows. You can: 2. Setup:…

Databricks: Databricks Utilities (dbutils) – Complete Guide

🔹 1. Introduction In Databricks, you often need to interact with: 👉 For these tasks, Databricks Utilities (dbutils) provide built-in helpers. Key points: 🔹 2. What is…

Databricks: Using Volumes in Databricks with Unity Catalog

🔹 1. Introduction In Databricks, we usually store tabular data in Delta tables (structured data). But what about: 👉 For these, Databricks introduces Volumes, which provide a governed,…

Databricks: Delta Tables – Deletion Vectors & Liquid Clustering

Delta Lake keeps improving with features that optimize performance and storage. Two of the most important recent features are: Let’s explore both in detail with examples you…

Databricks: Delta Tables MERGE & UPSERT (SCD1 + Soft Deletes)

This tutorial covers how to perform upserts (MERGE) in Delta tables on Databricks, with both hard deletes and soft deletes (using SCD1 style). 1. 🔹 Introduction In…

Databricks: Delta Tables, Catalogs, Views, and Clones

This tutorial will walk you through core Delta Lake functionality in Databricks, including catalogs, schemas, tables, views, CTAS, deep clone, and shallow clone. Each section is backed…

Databricks – Catalog, Schemas & Tables with External Location

This is the core of Unity Catalog’s object model. The way Databricks resolves storage paths for managed tables depends on where you attach the external/managed location….

Databricks Lab – Managed vs External Tables + UNDROP (with External Location setup)

Databricks Unity Catalog Tutorial Managed vs External Tables + UNDROP (with External Location setup) Introduction (what we’ll build) You’ll learn to: What’s new in Databricks? (Updates &…

Databricks Lab – Working with Schemas and External Locations

We will: Unity Catalog has a 4-level hierarchy: Metastore → Catalog → Schema → Table 👉 Today we’ll create three schemas to see how Unity Catalog stores…

Databricks Lab – Catalog with External Location, & Storage Credentials in Unity Catalog

Good Read – https://dataopsschool.com/blog/databricks-catalog-schemas-tables-with-external-location/ 1. Create Catalog without External Location 2. Create Catalog with SQL 3. Drop Catalog and Drop Catalog Recursively 4. Create External Location in…

Databricks: Unity Catalog vs Catalogs vs Workspace vs Metastore

🔑 Unity Catalog vs Catalogs vs Workspace vs Metastore 1. Unity Catalog (UC) ✅ 👉 Analogy: National Library System – it governs all libraries in a country….

Databricks Components

Databricks Components Hierarchy 1. Account Level (Top Layer) 2. Governance & Data Management 3. Computation & Execution 4. Developer Interfaces 5. Data & AI Layers ✅ In…

Tutorial: Data Democratization in the Context of DataOps

1. Introduction & Overview What is Data Democratization? Data Democratization is the process of making data accessible, understandable, and usable to everyone in an organization—without requiring deep…
