1. 🔹 Introduction to Workflows
- Databricks Workflow (Job) = a pipeline of tasks (notebooks, scripts, SQL, pipelines, etc.).
 - Use cases: ETL orchestration, data quality checks, ML pipelines, conditional branching.
 - Each job = multiple tasks with dependencies, parameters, retries, schedules, etc.
 
2. 🔹 Jobs UI Overview
When creating a job:
- Task types: Notebook, Python script, Wheel, JAR, SQL, dbt, Spark submit, If/Else, For Each.
 - Cluster: Use job clusters (terminate after run) or all-purpose clusters.
- Parameters: Pass values via widgets (`dbutils.widgets.get()` in notebooks); see the sketch after this list.
 - Notifications: Configure success/failure emails or alerts.
 - Retries & Timeouts: Control job resiliency.
 - Schedule/Trigger: Run once, on schedule, or triggered by events.
 - Permissions: Control who can run/edit/manage jobs.
 - Advanced: Queueing, max concurrent runs.
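For example, a notebook task reads its parameters through widgets; a minimal sketch (the `input_date` name mirrors the example in the next section, and the default value is an assumption that only applies when the notebook is run interactively):

```python
# Declare a text widget; the default is used only for interactive runs.
# A job run overrides it with the task/job parameter of the same name.
dbutils.widgets.text("input_date", "2024-01-01T00:00:00")

# Read the effective value
input_date = dbutils.widgets.get("input_date")
print(f"Running for input_date = {input_date}")
```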
 
3. 🔹 Creating a Job (Example: Process Employee Data)
Workflow:
- Get Day (extract current day name from date).
 - Check if Sunday (If/Else branch).
 - If True → Process data by department.
 - If False → Print “Not Sunday”.
 
Notebook Setup
- Notebook 1: Get Day

```python
# Read the run date passed in as a task parameter
dbutils.widgets.text("input_date", "")
input_date = dbutils.widgets.get("input_date")

# Get the day of week as the full name ('EEEE' → e.g. "Sunday"),
# matching the "Sunday" comparison used in the If/Else task
input_day = spark.sql(f"""
    SELECT date_format(to_timestamp('{input_date}', "yyyy-MM-dd'T'HH:mm:ss"), 'EEEE') AS day
""").collect()[0].day

# Publish the value for downstream tasks
dbutils.jobs.taskValues.set(key="input_day", value=input_day)
```

- Notebook 2: Process Data (department-based ETL).
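Notebook 2 is only outlined above; a minimal sketch of the department-based ETL step, assuming a `department` task parameter and hypothetical table names (`employees_raw`, `employees_processed`):

```python
# Hypothetical department-based ETL: filter the raw employee data by the
# department passed in as a task parameter and append it to a processed table
dbutils.widgets.text("department", "")
department = dbutils.widgets.get("department")

df = spark.table("employees_raw").filter(f"department = '{department}'")

(df.write
   .mode("append")
   .saveAsTable("employees_processed"))
```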
 - Notebook 3: Else branch (just print the day).

```python
# Read the day name published by the "01_set_day" task
input_day = dbutils.jobs.taskValues.get(taskKey="01_set_day", key="input_day")
print(f"Today is {input_day}, skipping processing.")
```
4. 🔹 Passing Values Between Tasks
- Use `dbutils.jobs.taskValues.set()` in the producer task.
 - Retrieve with `dbutils.jobs.taskValues.get(taskKey, key)` in the consumer task.
✅ Example: Pass “Sunday” check from Notebook 1 → If/Else task.
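A minimal producer/consumer sketch, assuming the producer task key is `01_set_day` as in the example above (`default` and `debugValue` are optional arguments; `debugValue` makes the consumer runnable outside of a job):

```python
# Producer task: publish a value for downstream tasks
dbutils.jobs.taskValues.set(key="input_day", value="Sunday")

# Consumer task: `default` is returned if the key was never set,
# `debugValue` is returned when the notebook runs outside of a job
input_day = dbutils.jobs.taskValues.get(
    taskKey="01_set_day",
    key="input_day",
    default="Unknown",
    debugValue="Sunday",
)
print(input_day)
```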
5. 🔹 Conditional (If/Else) Tasks
- Add If/Else task in Workflow.
 - Condition:
   - Value from task output = “Sunday”.
   - Operator = equals.
 
 - True branch → Run process notebook.
 - False branch → Run else notebook.
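In the If/Else task configuration, the condition value would typically reference the upstream task value via a dynamic value reference such as `{{tasks.01_set_day.values.input_day}}` (task key taken from the example above), compared with the equals operator against `Sunday`.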
 
6. 🔹 Re-run Failed Jobs
- Go to Run history → Repair run.
 - Select failed tasks → re-execute only those.
 - Saves time (no need to rerun entire pipeline).
 
7. 🔹 Override Parameters at Runtime
- Use “Run now with different parameters” in the UI.
 - Example: Override `input_date` to `"2024-10-27T13:00:00"` → forces the workflow to evaluate the day as Sunday.
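The same override can also be done programmatically when testing; a minimal sketch against the Jobs REST API (`POST /api/2.1/jobs/run-now`), where the workspace URL, token, and job ID `123` are assumptions:

```python
import os

import requests

# Trigger a run with an overridden notebook parameter via the Jobs API.
# DATABRICKS_HOST / DATABRICKS_TOKEN and job_id=123 are assumptions.
host = os.environ["DATABRICKS_HOST"]
token = os.environ["DATABRICKS_TOKEN"]

resp = requests.post(
    f"{host}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {token}"},
    json={
        "job_id": 123,  # hypothetical job ID
        "notebook_params": {"input_date": "2024-10-27T13:00:00"},
    },
)
resp.raise_for_status()
print(resp.json())  # includes the run_id of the triggered run
```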
8. 🔹 For Each Loop in Workflows
- Wrap a task (e.g., Process Department Data) inside a For Each loop.
 - Provide an array of values (static or dynamic), e.g. `["sales", "office"]`; a sketch for building the array dynamically follows the example below.
 - Each loop iteration passes one value → notebook parameter.
 
Example notebook parameter setup:

```python
dbutils.widgets.text("department", "")
department = dbutils.widgets.get("department")
print(f"Processing department: {department}")
```
💡 This runs the same task multiple times (parallel/sequential) for each department.
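A minimal sketch for producing the array dynamically, assuming a hypothetical `employees_raw` table and an upstream task that publishes the list as a task value, which the For Each input can then reference with a dynamic value expression such as `{{tasks.<task_key>.values.departments}}`:

```python
# Upstream task: compute the departments to loop over and publish them as a
# task value that the For Each task can use as its (dynamic) input array
departments = [
    row.department
    for row in spark.sql("SELECT DISTINCT department FROM employees_raw").collect()
]
dbutils.jobs.taskValues.set(key="departments", value=departments)
```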
9. 🔹 Best Practices
- ✅ Always use job clusters (auto-terminate) → cost saving.
 - ✅ Centralize parameters at job level, override at task level when needed.
 - ✅ Use taskValues for cross-task communication.
 - ✅ Use If/Else for conditional ETL or SLA workflows.
 - ✅ Use For Each for department-wise ETL, multi-source ingestion, or model training per dataset.
 - ✅ Leverage repair runs instead of restarting full pipelines.
 
10. 🔹 Summary
- Jobs orchestrate pipelines.
 - Tasks define execution units (Notebook, Python, SQL, etc.).
 - Parameters & TaskValues allow passing dynamic values.
 - If/Else = branch logic.
 - For Each = loop logic.
 - Repair Runs = selective reruns.
 - Override Params = test/debug flexibility.
 
This makes Databricks Workflows a lightweight orchestrator (similar to Airflow but native inside Databricks).