Databricks: Using Service Principals in Azure Databricks

What Is a Service Principal in Databricks?

A service principal is a specialized, non-human identity within Azure Databricks, designed exclusively for automation, integrations, and programmatic access. Service principals are intended for use by tools, scripts, CI/CD pipelines, or external systems—never by individual users. They provide API-only access to Databricks resources, which increases security and stability by decoupling permissions from user accounts.

Key Features

  • Security: No risk of workflow interruptions when users change roles or leave the organization.
  • Fine-grained Access: Can be granted specific entitlements (e.g., workspace access, SQL access) or admin roles.
  • API-Only: Cannot log into the Databricks UI directly.

Use Cases

At the Databricks Account Console Level

  • Global automation across multiple workspaces (e.g., create workspaces, assign users/groups, manage Unity Catalog, auditing, and workspace configurations).
  • Central identity for CI/CD pipelines, Terraform/Pulumi scripts, or admin task automations that span all organizational Databricks resources.

At the Databricks Workspace Level

  • Manage and automate workspace resources (clusters, jobs, notebooks).
  • Programmatic data access and ingest, including API access to tables, Delta Lake resources, and job runs.
  • Secure credential for data engineering pipelines or scheduled jobs that need persistent, stable permissions.
  • Running jobs “as service principal” so workflows don’t fail if a user account changes or is removed.
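The "run as service principal" pattern above can be sketched as a Jobs API 2.1 payload. This is a minimal sketch, not a verified recipe: the application ID, job name, and notebook path are placeholders, not values from this article.

```shell
# Sketch: a Jobs API 2.1 create payload whose "run_as" block names a
# service principal by its application ID, so the job runs under that
# identity even if the user who created it leaves the organization.
SP_APP_ID="00000000-0000-0000-0000-000000000000"   # hypothetical application ID

JOB_PAYLOAD=$(cat <<EOF
{
  "name": "nightly-ingest",
  "run_as": { "service_principal_name": "${SP_APP_ID}" },
  "tasks": [
    {
      "task_key": "ingest",
      "notebook_task": { "notebook_path": "/Repos/etl/ingest" }
    }
  ]
}
EOF
)

# To submit it (workspace URL and token are placeholders):
# curl -X POST https://<databricks-instance>/api/2.1/jobs/create \
#   --header "Authorization: Bearer <SP_PAT>" \
#   --data "$JOB_PAYLOAD"
echo "$JOB_PAYLOAD"
```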

How to Use a Service Principal: Step-by-Step with cURL

Prerequisites:

  • You must be an account or workspace admin.
  • You need a registered service principal with appropriate roles/entitlements.

1. Create/Assign Service Principal

Account Console

  • Log into the Databricks Account Console.
  • Go to “User management” > “Service principals” > “Add service principal”, enter details, and add.

Workspace

  • Go to Workspace UI > Settings > Identity and Access > Manage > Add Service Principal.

2. Grant Permissions and Generate Token/Secret

  • Assign roles (User/Manager) and required entitlements.
  • Generate OAuth secret or Personal Access Token (PAT) for API usage.
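If you go the OAuth route instead of a PAT, the service principal's client ID and OAuth secret are exchanged for a short-lived access token via Databricks' machine-to-machine flow. The sketch below only composes the request; the host and credentials are placeholders.

```shell
# Sketch of the OAuth machine-to-machine token exchange: the service
# principal's client ID and OAuth secret are traded for a bearer token.
HOST="https://<databricks-instance>"
CLIENT_ID="<sp-client-id>"
CLIENT_SECRET="<sp-oauth-secret>"

TOKEN_REQUEST="curl -s -X POST ${HOST}/oidc/v1/token \
  -u ${CLIENT_ID}:${CLIENT_SECRET} \
  -d grant_type=client_credentials \
  -d scope=all-apis"

# Against a real workspace, the JSON response carries the bearer token in
# its "access_token" field; that token can then replace a PAT in API calls.
echo "$TOKEN_REQUEST"
```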

3. Authenticate with cURL for Databricks REST APIs

Example: Create a Personal Access Token for a Service Principal

curl -X POST \
  https://<databricks-instance>/api/2.0/token-management/on-behalf-of/tokens \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <ADMIN_PERSONAL_ACCESS_TOKEN>" \
  --data '{
     "application_id": "<sp-application-id>",
     "comment": "Token for service principal automation"
   }'

This initial call requires an admin PAT or OAuth token. The token it returns becomes your service principal's API credential.

Example: Use Service Principal to List Databricks Jobs

(Assume <SP_PAT> is the token generated for the service principal)

curl -X GET \
  https://<databricks-instance>/api/2.1/jobs/list \
  --header "Authorization: Bearer <SP_PAT>"
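To work with the response, the job names can be pulled out of the JSON the endpoint returns. The sketch below uses a canned example of the Jobs 2.1 list response shape rather than a live call:

```shell
# Sketch: extract job names from a Jobs API 2.1 list response.
# RESPONSE is a canned example of the shape the endpoint returns.
RESPONSE='{"jobs":[{"job_id":101,"settings":{"name":"nightly-ingest"}},{"job_id":102,"settings":{"name":"hourly-sync"}}],"has_more":false}'

JOB_NAMES=$(echo "$RESPONSE" | python3 -c '
import json, sys
data = json.load(sys.stdin)
for job in data.get("jobs", []):
    print(job["settings"]["name"])
')
echo "$JOB_NAMES"
```

In a live script, `RESPONSE` would instead capture the output of the `curl` call above.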

4. Create and Use Storage Credential (Advanced Example)

For Unity Catalog or storage integration, you can create a storage credential backed by the service principal so Databricks accesses Azure storage under that identity.

curl -X POST \
  https://<databricks-instance>/api/2.1/unity-catalog/storage-credentials \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer <ADMIN_PERSONAL_ACCESS_TOKEN>" \
  -d '{
    "name": "sp-credential",
    "azure_service_principal": {
      "directory_id": "<tenant-id>",
      "application_id": "<sp-client-id>",
      "client_secret": "<sp-client-secret>"
    },
    "skip_validation": false
  }'

This sets up data access using the service principal identity.
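Once the credential exists, privileges on it are typically granted through the Unity Catalog permissions API. The sketch below is an assumption-laden outline: the securable path segment, principal name, and privilege are illustrative, so verify the exact securable type string against the Unity Catalog API reference before use.

```shell
# Sketch: grant CREATE EXTERNAL LOCATION on the new credential to a
# (hypothetical) group so it can build external locations on top of it.
GRANT_PAYLOAD='{
  "changes": [
    { "principal": "data-eng-group", "add": ["CREATE_EXTERNAL_LOCATION"] }
  ]
}'

# Assumed endpoint shape; confirm the securable type segment in the docs:
# curl -X PATCH \
#   https://<databricks-instance>/api/2.1/unity-catalog/permissions/storage-credential/sp-credential \
#   --header "Authorization: Bearer <ADMIN_PERSONAL_ACCESS_TOKEN>" \
#   --data "$GRANT_PAYLOAD"
echo "$GRANT_PAYLOAD"
```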


Summary Table: Service Principal Use Cases

Level           | Use Case Examples
----------------|-------------------------------------------------
Account Console | Workspace automation, global governance, CI/CD
Workspace       | Data access, job automation, scheduled pipelines

To use service principals in Databricks:

  1. Register and assign them at account or workspace level.
  2. Grant relevant permissions/entitlements.
  3. Generate a token for API authentication.
  4. Execute REST API calls securely with cURL—ideal for automation, integration, and stable orchestration of Databricks resources.
