Data Engineer Professional Certification
Domains and approximate weightings follow the official exam guide (updated 2025). Sources: Databricks, Whizlabs.
Domain 1: Databricks Tooling (≈20%)
- Advanced use of platform tools: CLI, REST API, MLflow tracking integration
- Development workflows: notebooks, Repos, Databricks Asset Bundles (DABs), Databricks Connect
- Spark UI and performance diagnostics: inspecting DAGs, stages, and storage to guide tuning
Hands-on: Use CLI and REST to manage clusters and jobs; create Asset Bundle deployments; tune Spark jobs via Spark UI analytics.
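As a starting point for the REST part of this exercise, here is a minimal sketch that builds authenticated requests for the clusters and jobs list endpoints using only the Python standard library. The workspace URL and token are placeholders, and the helper names build_request and fetch_json are made up for illustration. Nothing is sent over the network until fetch_json is called.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional


def build_request(host: str, token: str, path: str,
                  params: Optional[dict] = None) -> urllib.request.Request:
    """Build an authenticated GET request for a Databricks REST endpoint."""
    url = host.rstrip("/") + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    # Personal access tokens are sent as a standard Bearer token.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_json(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Placeholder workspace URL and token (assumptions); replace before calling fetch_json.
HOST = "https://example.cloud.databricks.com"
TOKEN = "dapi-EXAMPLE"

clusters_req = build_request(HOST, TOKEN, "/api/2.0/clusters/list")
jobs_req = build_request(HOST, TOKEN, "/api/2.1/jobs/list", {"limit": 25})

print(clusters_req.full_url)
print(jobs_req.full_url)
```

Calling fetch_json(jobs_req) against a real workspace should return the job listing as a dictionary. Keeping request construction separate from the network call makes the URL and header logic easy to check offline.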
Domain 2: Data Processing (≈30%)
- Complex ETL pipelines using Spark (Python/SQL), Delta Lake advanced features
- Performance tuning: partitioning, caching, broadcast joins, skew mitigation
- Structured streaming pipelines and batch coordination; fault tolerance
Hands-on: Build and tune streaming jobs; apply caching and broadcast joins; simulate skew and resolve it.
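To build intuition for the skew exercise before running it on a cluster, the sketch below simulates key salting in plain Python. It is not Spark code: partition_of is a stand-in for hash partitioning, and the dataset, partition count, and salt bucket count are fabricated for illustration. In a real salted join, the other side of the join must also be expanded with every salt value. Spark's adaptive query execution can also handle some skewed joins automatically (spark.sql.adaptive.skewJoin.enabled).

```python
import random
import zlib
from collections import Counter

NUM_PARTITIONS = 8   # assumed shuffle partition count for the simulation
SALT_BUCKETS = 8     # assumed number of salt values


def partition_of(key: str) -> int:
    """Deterministic stand-in for hash partitioning."""
    return zlib.crc32(key.encode()) % NUM_PARTITIONS


def partition_sizes(keys) -> Counter:
    """Count how many rows land in each partition."""
    return Counter(partition_of(k) for k in keys)


rng = random.Random(42)

# Fabricated skewed dataset: 90% of the rows share a single hot key.
keys = ["hot"] * 9000 + [f"k{rng.randrange(1000)}" for _ in range(1000)]

plain = partition_sizes(keys)

# Salting: append a random bucket number so the hot key becomes
# SALT_BUCKETS distinct keys that can hash to different partitions.
salted = partition_sizes(f"{k}_{rng.randrange(SALT_BUCKETS)}" for k in keys)

print("largest partition without salt:", max(plain.values()))
print("largest partition with salt:   ", max(salted.values()))
```

Without salting, every "hot" row lands in one partition, so a single task does most of the work. With salting, the largest partition shrinks because the hot key is spread across several partitions.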
Domain 3: Data Modeling (≈20%)
- Designing lakehouse schemas: star and snowflake models; normalized vs. denormalized designs
- Data partitioning strategies, schema evolution best practices
- Databricks-specific modeling patterns, Delta table optimization
Hands-on: Model a realistic star schema dataset, implement partitioning, and evolve the schema.
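The shape of this exercise can be rehearsed locally. The sketch below uses SQLite from the Python standard library as a stand-in for Databricks SQL, which is an assumption made purely for portability. The table names, columns, and rows are fabricated. SQLite has no table partitioning, so that step is omitted here. On Delta tables it would be expressed with PARTITIONED BY or CLUSTER BY in the table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One fact table surrounded by two dimension tables (a minimal star schema).
conn.executescript("""
    CREATE TABLE dim_customer (
        customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE dim_date (
        date_key INTEGER PRIMARY KEY, iso_date TEXT, year INTEGER);
    CREATE TABLE fact_sales (
        customer_key INTEGER REFERENCES dim_customer(customer_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        amount REAL);

    INSERT INTO dim_customer VALUES (1, 'Acme', 'EMEA'), (2, 'Globex', 'AMER');
    INSERT INTO dim_date VALUES
        (20250101, '2025-01-01', 2025), (20250102, '2025-01-02', 2025);
    INSERT INTO fact_sales VALUES
        (1, 20250101, 100.0), (1, 20250102, 50.0), (2, 20250101, 75.0);
""")

# Schema evolution: add a nullable column; existing rows read back as NULL.
conn.execute("ALTER TABLE fact_sales ADD COLUMN currency TEXT")

# Typical star-schema query: aggregate the fact table by a dimension attribute.
revenue_by_region = conn.execute("""
    SELECT c.region, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_customer AS c USING (customer_key)
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()

existing_currency_values = conn.execute(
    "SELECT DISTINCT currency FROM fact_sales").fetchall()

print(revenue_by_region)
print(existing_currency_values)
```

The additive column change leaves existing queries untouched, which is the property to look for when evolving a schema on a production table.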
Domain 4: Security & Governance (≈10%)
- Enterprise-level governance: Unity Catalog advanced configurations, secure clusters, workspace isolation
- Data encryption, ACLs on tables/views, governance policies
Hands-on: Configure secure cluster policies, manage encryption-at-rest and in-transit, assign complex ACLs.
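For the cluster policy part of this exercise, here is a minimal sketch of a policy definition assembled in Python. It uses the fixed, range, and allowlist attribute types. The policy name, node types, and limits are illustrative assumptions, not recommendations. The result is shaped as a body for the cluster policies create endpoint (POST /api/2.0/policies/clusters/create), which takes the definition as a JSON-encoded string.

```python
import json

# Policy rules keyed by cluster attribute. All concrete values are assumptions.
policy_definition = {
    # Pin the access mode so users cannot choose a less isolated one.
    "data_security_mode": {"type": "fixed", "value": "USER_ISOLATION"},
    # Force idle clusters to shut down within a bounded window.
    "autotermination_minutes": {
        "type": "range", "minValue": 10, "maxValue": 60, "defaultValue": 30},
    # Restrict instance types to an approved list (example AWS node types).
    "node_type_id": {"type": "allowlist", "values": ["i3.xlarge", "i3.2xlarge"]},
    # Cap cluster size.
    "num_workers": {"type": "range", "maxValue": 8},
}

create_policy_payload = {
    "name": "restricted-etl-policy",  # hypothetical policy name
    # The definition is sent as a JSON string, not a nested object.
    "definition": json.dumps(policy_definition),
}

print(json.dumps(create_policy_payload, indent=2))
```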
Domain 5: Monitoring & Logging (≈10%)
- Logging frameworks, job-level logs, metrics collection, audit logs
- Set up alerts and monitoring dashboards for data pipeline performance
Hands-on: Enable and interpret job logs, create Databricks SQL dashboards for monitoring pipeline health, configure alerts.
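One way to practice interpreting run history is to compute simple health metrics from job run records. The sketch below assumes records shaped like the runs array of a Jobs API 2.1 runs/list response (state.result_state, plus start_time and end_time in epoch milliseconds). The sample data is fabricated, the function name summarize_runs is made up for illustration, and other API versions may shape run status differently.

```python
from collections import Counter

# Fabricated sample shaped like the "runs" array of a runs/list response.
sample_runs = [
    {"run_id": 1, "start_time": 1_700_000_000_000, "end_time": 1_700_000_120_000,
     "state": {"life_cycle_state": "TERMINATED", "result_state": "SUCCESS"}},
    {"run_id": 2, "start_time": 1_700_003_600_000, "end_time": 1_700_003_660_000,
     "state": {"life_cycle_state": "TERMINATED", "result_state": "FAILED"}},
    {"run_id": 3, "start_time": 1_700_007_200_000, "end_time": 1_700_007_380_000,
     "state": {"life_cycle_state": "TERMINATED", "result_state": "SUCCESS"}},
    # Still running: no result_state yet, so it is excluded from the summary.
    {"run_id": 4, "start_time": 1_700_010_800_000, "end_time": 0,
     "state": {"life_cycle_state": "RUNNING"}},
]


def summarize_runs(runs):
    """Return result counts, failure rate, and mean duration for finished runs."""
    finished = [r for r in runs if "result_state" in r.get("state", {})]
    results = Counter(r["state"]["result_state"] for r in finished)
    failures = sum(n for state, n in results.items() if state != "SUCCESS")
    durations = [(r["end_time"] - r["start_time"]) / 1000 for r in finished]
    return {
        "results": dict(results),
        "failure_rate": failures / len(finished) if finished else 0.0,
        "mean_duration_s": sum(durations) / len(durations) if durations else 0.0,
    }


summary = summarize_runs(sample_runs)
print(summary)
```

Metrics like these can feed a dashboard tile or an alert threshold, for example alerting when the failure rate over recent runs exceeds an agreed limit.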
Domain 6: Testing & Deployment (≈10%)
- Unit testing for Spark/SQL jobs; data quality validation; integration tests
- CI/CD pipelines: Git branching, automated deployments via Asset Bundles and jobs
- Version control, rollback strategies, canary deployments
Hands-on: Write unit tests (e.g. pytest with Delta), simulate CI/CD with GitHub Actions or Azure DevOps, deploy via Asset Bundles.
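A useful habit for the unit-testing exercise is to keep validation logic in a pure function that can be tested without a cluster. The sketch below shows that pattern on plain Python dictionaries rather than Spark DataFrames or Delta tables, which is a simplification. The rules, column names, and function names are illustrative assumptions. The test function follows pytest naming conventions, so pytest would discover it, but it also runs as a plain function call.

```python
def validate_orders(rows):
    """Split rows into (valid, rejected) using simple data-quality rules."""
    valid, rejected = [], []
    seen_ids = set()
    for row in rows:
        problems = []
        if row.get("order_id") is None:
            problems.append("missing order_id")
        elif row["order_id"] in seen_ids:
            problems.append("duplicate order_id")
        amount = row.get("amount")
        if not isinstance(amount, (int, float)) or amount < 0:
            problems.append("invalid amount")
        if problems:
            # Keep the original row and record why it was rejected.
            rejected.append({**row, "problems": problems})
        else:
            seen_ids.add(row["order_id"])
            valid.append(row)
    return valid, rejected


def test_validate_orders_rejects_bad_rows():
    rows = [
        {"order_id": 1, "amount": 10.0},
        {"order_id": 1, "amount": 12.0},    # duplicate key
        {"order_id": None, "amount": 5.0},  # missing key
        {"order_id": 2, "amount": -1},      # negative amount
    ]
    valid, rejected = validate_orders(rows)
    assert [r["order_id"] for r in valid] == [1]
    assert [r["problems"] for r in rejected] == [
        ["duplicate order_id"],
        ["missing order_id"],
        ["invalid amount"],
    ]


test_validate_orders_rejects_bad_rows()
print("data-quality checks passed")
```

In a CI pipeline, the same test file would typically run on every pull request before a deployment step is allowed to proceed.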