What Is a Service Principal in Databricks?
A service principal is a specialized, non-human identity within Azure Databricks, designed exclusively for automation, integrations, and programmatic access. Service principals are intended for use by tools, scripts, CI/CD pipelines, or external systems—never by individual users. They provide API-only access to Databricks resources, which increases security and stability by decoupling permissions from user accounts.
Key Features
- Security: No risk of workflow interruptions when users change roles or leave the organization.
- Fine-grained Access: Can be granted specific entitlements (e.g., workspace access, SQL access) or admin roles.
- API-Only: Cannot log into the Databricks UI directly.
Use Cases
At the Databricks Account Console Level
- Global automation across multiple workspaces (e.g., create workspaces, assign users/groups, manage Unity Catalog, auditing, and workspace configurations).
- Central identity for CI/CD pipelines, Terraform/Pulumi scripts, or admin task automations that span all organizational Databricks resources.
At the Databricks Workspace Level
- Manage and automate workspace resources (clusters, jobs, notebooks).
- Programmatic data access and ingest, including API access to tables, Delta Lake resources, and job runs.
- Secure credential for data engineering pipelines or scheduled jobs that need persistent, stable permissions.
- Running jobs “as service principal” so workflows don’t fail if a user account changes or is removed.
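For the last case above, the Jobs API 2.1 lets a job's `run_as` setting name a service principal, so runs execute under its identity instead of a user's. A minimal settings fragment as a sketch (the job name is illustrative; the application ID is a placeholder you must substitute):

```json
{
  "name": "nightly-etl",
  "run_as": {
    "service_principal_name": "<service-principal-application-id>"
  }
}
```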
How to Use a Service Principal: Step-by-Step with cURL
Prerequisites:
- You must be an account or workspace admin.
- You need a registered service principal with appropriate roles/entitlements.
1. Create/Assign Service Principal
Account Console
- Log into the Databricks Account Console.
- Go to “User management” > “Service principals” > “Add service principal”, enter details, and add.
Workspace
- Go to Workspace UI > Settings > Identity and Access > Manage > Add Service Principal.
2. Grant Permissions and Generate Token/Secret
- Assign roles (User/Manager) and required entitlements.
- Generate an OAuth secret or personal access token (PAT) for API usage.
3. Authenticate with cURL for Databricks REST APIs
Example: Create a Personal Access Token for a Service Principal

```bash
curl -X POST \
  "https://<databricks-instance>/api/2.0/token-management/on-behalf-of/tokens" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <ADMIN_PERSONAL_ACCESS_TOKEN>" \
  --data '{
    "application_id": "<service-principal-application-id>",
    "comment": "Token for service principal automation"
  }'
```
You need an admin token or OAuth access token for this initial API call. The token returned in the response is your service principal’s API credential.
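As an alternative to PATs, Databricks also supports OAuth machine-to-machine (M2M) authentication, where the service principal’s application ID and an OAuth secret are exchanged for a short-lived access token. A sketch, with a hypothetical workspace host and placeholder credentials you must substitute:

```shell
# Hypothetical workspace host and placeholder SP credentials -- substitute your own.
DATABRICKS_HOST="https://adb-1234567890.1.azuredatabricks.net"
CLIENT_ID="<sp-application-id>"
CLIENT_SECRET="<sp-oauth-secret>"

# The token endpoint lives under the workspace host.
TOKEN_URL="${DATABRICKS_HOST}/oidc/v1/token"
echo "Requesting token from: ${TOKEN_URL}"

# Uncomment to perform the real request once credentials are filled in:
# curl -X POST "${TOKEN_URL}" \
#   --user "${CLIENT_ID}:${CLIENT_SECRET}" \
#   --data "grant_type=client_credentials&scope=all-apis"
```

The access token in the JSON response can then be used as a Bearer token exactly like a PAT.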
Example: Use the Service Principal to List Databricks Jobs

(Assume `<SP_PAT>` is the token generated for the service principal.)

```bash
curl -X GET \
  "https://<databricks-instance>/api/2.1/jobs/list" \
  --header "Authorization: Bearer <SP_PAT>"
```
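The `jobs/list` response is JSON, so a pipeline usually needs to pull fields out of it. A local sketch using a hard-coded sample shaped like the Jobs API 2.1 response (the real call needs a live workspace; `python3` is used because `jq` may not be installed):

```shell
# Illustrative sample shaped like a Jobs API 2.1 list response.
RESPONSE='{"jobs":[{"job_id":101,"settings":{"name":"nightly-etl"}},{"job_id":102,"settings":{"name":"hourly-ingest"}}],"has_more":false}'

# Print "job_id name" for each job in the response.
echo "$RESPONSE" | python3 -c '
import json, sys
for job in json.load(sys.stdin).get("jobs", []):
    print(job["job_id"], job["settings"]["name"])
'
```

In a real pipeline you would replace the hard-coded `RESPONSE` with the captured output of the curl call above.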
4. Create and Use Storage Credential (Advanced Example)
For Unity Catalog or storage integration, you may need to create a storage credential backed by the service principal for secure data access.
```bash
curl -X POST \
  "https://<databricks-instance>/api/2.1/unity-catalog/storage-credentials" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <SP_PAT>" \
  --data '{
    "name": "sp-credential",
    "azure_service_principal": {
      "directory_id": "<tenant-id>",
      "application_id": "<sp-client-id>",
      "client_secret": "<sp-client-secret>"
    },
    "skip_validation": false
  }'
```
This sets up data access using the service principal identity.
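In Unity Catalog, a storage credential is typically referenced by an external location that maps a cloud storage path to it. A sketch of such a payload (the storage account, container, and names here are hypothetical; substitute your own):

```shell
# Hypothetical ADLS path and names -- substitute your own values.
PAYLOAD='{
  "name": "sp-external-location",
  "url": "abfss://data@mystorageacct.dfs.core.windows.net/landing",
  "credential_name": "sp-credential",
  "read_only": false
}'
echo "$PAYLOAD"

# Uncomment to create the external location via the Unity Catalog API:
# curl -X POST "https://<databricks-instance>/api/2.1/unity-catalog/external-locations" \
#   --header "Content-Type: application/json" \
#   --header "Authorization: Bearer <SP_PAT>" \
#   --data "$PAYLOAD"
```

Tables created under this location then read and write storage through the service principal’s credential.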
Summary Table: Service Principal Use Cases
| Level | Use Case Examples |
|---|---|
| Account Console | Workspace automation, global governance, CI/CD |
| Workspace | Data access, job automation, scheduled pipelines |
To use service principals in Databricks:
- Register and assign them at account or workspace level.
- Grant relevant permissions/entitlements.
- Generate a token for API authentication.
- Execute REST API calls securely with cURL, which is ideal for automation, integration, and stable orchestration of Databricks resources.