dbutils is a built-in utility module in Databricks notebooks (Python, Scala, R) that provides programmatic access to common workspace tasks, including interacting with the Databricks File System (DBFS), handling secrets, controlling notebook workflow, and creating parameter widgets.
Core Features of dbutils
- File System Access: Use `dbutils.fs` to read, write, list, copy, and move files and directories in DBFS (Databricks File System), which abstracts cloud storage in Databricks.
- Secrets Management: Use `dbutils.secrets` to securely retrieve sensitive credentials (such as passwords, tokens, and database keys) stored in secret scopes.
- Notebook Workflow Control: Use `dbutils.notebook` to run other notebooks programmatically and return results, enabling modular workflows.
- Parameterization: Use `dbutils.widgets` to create input forms in notebooks, enabling dynamic, parameter-driven code.
- Jobs Utility: Use `dbutils.jobs` to pass small values between the tasks of a job through its `taskValues` subutility.
- Other Utilities: Includes experimental modules like `dbutils.data` for dataset interaction and some deprecated modules for library (package) management.
Example Usage in Python
```python
# List files in a DBFS directory
dbutils.fs.ls('/databricks-datasets')

# Get a secret value
secret_value = dbutils.secrets.get(scope='my-scope', key='my-key')

# Create a text input widget
dbutils.widgets.text("my_param", "default")

# Run another notebook from current notebook
result = dbutils.notebook.run("/Users/alice/my_notebook", 60, {"param1": "value"})
```
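Widget values come back as strings, so parameter-driven code usually needs an explicit conversion step. The following is a minimal sketch of that step; `read_int_widget` and the `batch_size` widget name are hypothetical, and the `dbutils` handle is passed in as an argument so the helper does not depend on the notebook global.

```python
def read_int_widget(dbutils, name, default):
    """Create a text widget and return its current value as an int.

    Hypothetical helper: `dbutils` is the handle that Databricks
    notebooks predefine, passed in explicitly.
    """
    # widgets.text takes the widget name and a default value as strings.
    dbutils.widgets.text(name, str(default))
    # widgets.get returns the current value as a string.
    raw = dbutils.widgets.get(name)
    try:
        return int(raw)
    except ValueError:
        raise ValueError(f"Widget {name!r} must be an integer, got {raw!r}")


# In a notebook, where `dbutils` is predefined:
# batch_size = read_int_widget(dbutils, "batch_size", 100)
```

Failing early with a clear message tends to be easier to debug than letting a non-numeric string reach later Spark code.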
Important Notes
- Availability: `dbutils` is available in Databricks notebooks and jobs attached to a Databricks cluster. Outside Databricks (for example, in an IDE using Databricks Connect), only limited features are available (primarily `fs`, `secrets`, and `widgets` via the Databricks SDK).
- Importing in Custom Modules: In Python files (outside notebooks), `dbutils` is not predefined. You may need to pass it in from the calling notebook or instantiate it using `from pyspark.dbutils import DBUtils` and a Spark session.
- Limits & Deprecations: Some submodules (like `dbutils.library`) are deprecated in favor of `%pip` for package management.
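The custom-module note above can be sketched as follows. This is an illustration under assumptions, not a definitive pattern: `get_dbutils` and `total_size_bytes` are hypothetical helper names, and the `pyspark.dbutils` import only resolves in environments that ship it (a Databricks cluster or Databricks Connect), which is why it is deferred until the function is called.

```python
def get_dbutils(spark):
    """Build a dbutils handle from an existing Spark session.

    Hypothetical helper; only works where pyspark.dbutils is available.
    """
    # Imported lazily so this module can still be imported elsewhere.
    from pyspark.dbutils import DBUtils
    return DBUtils(spark)


def total_size_bytes(dbutils, directory):
    """Sum the sizes of the entries listed directly under `directory`."""
    # fs.ls returns FileInfo entries that expose `path`, `name`, and `size`.
    return sum(entry.size for entry in dbutils.fs.ls(directory))


# From a notebook:  total_size_bytes(dbutils, "/databricks-datasets")
# From a .py file:  total_size_bytes(get_dbutils(spark), "/databricks-datasets")
```

Taking `dbutils` as a parameter keeps the module importable outside Databricks and lets unit tests substitute a stand-in object.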
In summary:
dbutils is Databricks’ built-in toolset for notebook automation, workspace management, and facilitating data engineering tasks within the Databricks platform.
In Databricks, `dbutils` is a utility library that comes pre-installed in the workspace and provides helper commands for common tasks inside notebooks or jobs, so you don't have to write long Spark or Python code for them.
What `dbutils` Is Used For
It's Databricks' Swiss Army knife: a collection of convenience functions for:
- File system operations (`dbutils.fs`)
- Secrets management (`dbutils.secrets`)
- Widgets for parameterizing notebooks (`dbutils.widgets`)
- Notebook workflows (`dbutils.notebook`)
- Library installation (`dbutils.library`, deprecated in favor of `%pip`)
- Job/task utilities (`dbutils.jobs`)
- Built-in help and notebook exit values (`dbutils.help`, `dbutils.notebook.exit`, etc.)
Main Modules in dbutils
Module | Purpose | Example |
---|---|---|
dbutils.fs | Manage files in DBFS (Databricks File System) | dbutils.fs.ls("/mnt/data") |
dbutils.secrets | Access secrets from a secret scope | dbutils.secrets.get("scope-name", "key-name") |
dbutils.widgets | Create and read notebook input parameters | dbutils.widgets.text("param1", "default") |
dbutils.notebook | Run other notebooks or exit with a value | dbutils.notebook.run("child_notebook", 60) |
dbutils.library | Install notebook-scoped libraries (deprecated; prefer %pip) | dbutils.library.installPyPI("pandas") |
dbutils.jobs | Set and get task values shared between tasks in a job | dbutils.jobs.taskValues.set(key, value) |
dbutils.help() | Lists all available dbutils commands | dbutils.help() |
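The `dbutils.jobs.taskValues` row is the least self-explanatory entry in the table, so here is a minimal sketch of how one task might publish a value and a later task might read it. The helper names, the `row_count` key, and the `ingest` task name are hypothetical; the `debugValue` argument is what `get` falls back to when the notebook runs outside a job.

```python
def publish_row_count(dbutils, row_count):
    """Publish a value from the current task (hypothetical helper)."""
    # Values should be small and JSON-serializable.
    dbutils.jobs.taskValues.set(key="row_count", value=row_count)


def read_row_count(dbutils, upstream_task):
    """Read the value that `upstream_task` published (hypothetical helper)."""
    return dbutils.jobs.taskValues.get(
        taskKey=upstream_task,  # name of the task that called set()
        key="row_count",
        default=0,              # used if the key was never set
        debugValue=0,           # used when running outside a job
    )


# Upstream task notebook:   publish_row_count(dbutils, 1250)
# Downstream task notebook: rows = read_row_count(dbutils, "ingest")
```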
Example Usage
```python
# List files in DBFS path
dbutils.fs.ls("/databricks-datasets")

# Create a text widget for parameters
dbutils.widgets.text("input_path", "/mnt/data")
param_value = dbutils.widgets.get("input_path")

# Read a secret
api_key = dbutils.secrets.get("my-scope", "api-key")

# Run another notebook and capture output
result = dbutils.notebook.run("process_data", 300, {"path": param_value})
```
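`dbutils.notebook.run` returns the string that the child notebook passed to `dbutils.notebook.exit`, so structured results are commonly exchanged as JSON. The sketch below assumes the child notebook follows that convention; `run_child_notebook` is a hypothetical helper, and the notebook name and result fields are illustrative only.

```python
import json


def run_child_notebook(dbutils, path, timeout_seconds, arguments):
    """Run a child notebook and decode its JSON exit value.

    Hypothetical helper; assumes the child ends with
    dbutils.notebook.exit(json.dumps(...)).
    """
    # notebook.run blocks until the child finishes or the timeout elapses
    # and returns the child's exit value as a string.
    raw = dbutils.notebook.run(path, timeout_seconds, arguments)
    return json.loads(raw)


# In the child notebook (last cell):
# dbutils.notebook.exit(json.dumps({"status": "ok", "rows": 42}))
#
# In the parent notebook:
# result = run_child_notebook(dbutils, "process_data", 300, {"path": "/mnt/data"})
```

Argument values are passed to the child as widget values, so keeping them as strings avoids surprises on the receiving side.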
Key Notes
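- Runs inside Databricks notebooks and jobs. In a local Python environment `dbutils` is not predefined, and only a limited subset is reachable through Databricks Connect or the Databricks SDK.
- Some features require prior setup or permissions (e.g., `dbutils.secrets` requires an existing secret scope that you are allowed to read).
- `dbutils` behavior can differ slightly between Databricks Runtime versions; `dbutils.help()` lists what the current environment supports.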