
Ripotek Technologies Inc.

Design. Engineer. Deliver.

Calgary, Alberta

www.ripotek.com

training@ripotek.com

Databricks Engineer

Professional Training Program Syllabus

Program Overview

Duration

24 Weeks (3 sessions per week)

72 total sessions, 216 instructional hours

Investment

CAD $1,500

Flexible payment plans available

Schedule

Monday/Wednesday/Saturday

6:00 PM - 9:00 PM Mountain Time

Certification Prep

Databricks Certified Data Engineer Associate & Professional

Program Description

The Databricks Engineer program is a comprehensive 24-week training designed to help you master Apache Spark, Delta Lake, and the Databricks Unified Analytics Platform. This intensive program covers distributed computing fundamentals, advanced PySpark programming, Delta Lake architecture, medallion design patterns, and production-grade orchestration.

You'll build real-world data pipelines, implement data quality frameworks, optimize Spark jobs, and prepare for Databricks certification exams. By completion, you'll have a portfolio demonstrating expertise in building scalable, enterprise-grade data lakehouse solutions.

Prerequisites

Required

  • Strong Python programming skills
  • SQL proficiency (queries, joins, window functions)
  • Understanding of data engineering concepts
  • Familiarity with cloud platforms (Azure/AWS/GCP)
  • Basic understanding of distributed systems

Recommended

  • Experience with ETL/ELT processes
  • Data warehouse or data lake experience
  • Git and version control knowledge
  • Apache Spark awareness (helpful but not required)

Learning Outcomes

Upon successful completion of this program, you will be able to:

  • Write advanced PySpark code for data processing
  • Build Delta Lake pipelines with medallion architecture
  • Optimize Spark jobs for performance and cost
  • Implement CDC and streaming workloads
  • Design data quality and testing frameworks
  • Orchestrate workflows with Databricks Jobs
  • Manage Unity Catalog for data governance
  • Implement CI/CD for Databricks projects
  • Work with Delta Live Tables
  • Monitor and troubleshoot production pipelines
  • Integrate machine learning workflows
  • Pass Databricks certification exams

Curriculum Highlights

Phase 1: Spark Fundamentals (Weeks 1-6)

  • Spark architecture and execution model
  • RDDs, DataFrames, and Datasets
  • Transformations and actions
  • PySpark DataFrame API mastery
  • Reading/writing multiple formats
  • SQL on Spark and catalog integration
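
To give a flavor of the Phase 1 material, here is a minimal PySpark sketch of the read-transform-aggregate-write pattern practiced in the labs (the file paths and column names are hypothetical):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("phase1-lab").getOrCreate()

# Read a raw CSV file into a DataFrame (path and schema are illustrative).
orders = (spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("/data/raw/orders.csv"))

# Chain a narrow transformation (to_date) with a wide, shuffle-inducing
# aggregation (groupBy), then persist the result as Parquet.
daily_revenue = (orders
                 .withColumn("order_date", F.to_date("order_ts"))
                 .groupBy("order_date")
                 .agg(F.sum("amount").alias("revenue")))

daily_revenue.write.mode("overwrite").parquet("/data/curated/daily_revenue")
```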

Phase 2: Delta Lake Architecture (Weeks 7-12)

  • Delta Lake transaction log
  • ACID transactions and time travel
  • Medallion architecture (Bronze/Silver/Gold)
  • MERGE, UPDATE, DELETE operations
  • Schema evolution and enforcement
  • Optimization techniques (OPTIMIZE, Z-ORDER)
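
A short illustration of the Delta Lake operations covered in Phase 2: an upsert via MERGE, a time-travel read, and compaction with Z-ORDER. Table and column names are hypothetical, and `spark` and `updates_df` are assumed to already exist, as they would in a Databricks notebook:

```python
from delta.tables import DeltaTable

# Upsert a batch of changed rows (updates_df) into a Silver Delta table.
silver = DeltaTable.forName(spark, "silver.customers")

(silver.alias("t")
 .merge(updates_df.alias("s"), "t.customer_id = s.customer_id")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())

# Time travel: read the table as it existed at an earlier version.
v0 = spark.read.option("versionAsOf", 0).table("silver.customers")

# Compact small files and co-locate rows on a frequently filtered key.
spark.sql("OPTIMIZE silver.customers ZORDER BY (customer_id)")
```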

Phase 3: Performance and Optimization (Weeks 13-16)

  • Partitioning strategies
  • Broadcast joins and shuffle optimization
  • Caching and persistence strategies
  • Spark UI and query plans
  • Adaptive Query Execution (AQE)
  • Cluster sizing and autoscaling
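
The sketch below combines several of these techniques: enabling Adaptive Query Execution, hinting a broadcast join, caching a reused DataFrame, and inspecting the physical plan (table names are hypothetical; `spark` is the ambient session, as in a Databricks notebook):

```python
from pyspark.sql import functions as F

# AQE re-optimizes plans at runtime using shuffle statistics; it is on by
# default in recent Databricks runtimes, shown here for explicitness.
spark.conf.set("spark.sql.adaptive.enabled", "true")

facts = spark.table("gold.sales_facts")   # large fact table
dims = spark.table("gold.product_dim")    # small dimension table

# Broadcasting the small side joins without shuffling the large table.
joined = facts.join(F.broadcast(dims), "product_id")

# Cache if the result is reused across multiple actions.
joined.cache()

# Inspect the physical plan; look for BroadcastHashJoin in the output.
joined.explain(mode="formatted")
```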

Phase 4: Streaming and Advanced Patterns (Weeks 17-20)

  • Structured Streaming fundamentals
  • Windowing and watermarks
  • Change Data Capture (CDC) patterns
  • Delta Live Tables (DLT)
  • Data quality expectations
  • Event-driven architectures
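
An illustrative Structured Streaming job of the kind built in this phase: a windowed count with a watermark, written incrementally to a Silver Delta table. Table, column, and checkpoint names are hypothetical, and `event_ts` is assumed to be a timestamp column:

```python
from pyspark.sql import functions as F

# Incrementally read a Bronze Delta table as a stream.
events = spark.readStream.table("bronze.events")

# Count events in 10-minute tumbling windows; the watermark lets Spark
# finalize windows and drop records arriving more than 15 minutes late.
counts = (events
          .withWatermark("event_ts", "15 minutes")
          .groupBy(F.window("event_ts", "10 minutes"), "event_type")
          .count())

# Append finalized windows to a Silver table; the checkpoint provides
# exactly-once recovery across restarts.
query = (counts.writeStream
         .outputMode("append")
         .option("checkpointLocation", "/chk/event_counts")
         .toTable("silver.event_counts"))
```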

Phase 5: Production Engineering (Weeks 21-24)

  • Unity Catalog and data governance
  • Workflow orchestration and jobs
  • CI/CD with Databricks Asset Bundles
  • Monitoring and alerting
  • Security best practices
  • Capstone project and certification prep
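
As a taste of the Unity Catalog work in this phase, the sketch below issues governance statements as SQL from PySpark (the catalog, schema, and group names are hypothetical):

```python
# Unity Catalog privileges are granted in SQL; these statements can run
# from a notebook or a scheduled job.
spark.sql("CREATE CATALOG IF NOT EXISTS lakehouse")
spark.sql("GRANT USE CATALOG ON CATALOG lakehouse TO `data_engineers`")

# A schema-level SELECT grant covers every table inside the schema;
# USE SCHEMA is also required to reference objects within it.
spark.sql("GRANT USE SCHEMA ON SCHEMA lakehouse.gold TO `analysts`")
spark.sql("GRANT SELECT ON SCHEMA lakehouse.gold TO `analysts`")
```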

Assessment and Grading

Assessment Component | Weight | Description
Phase Projects (5)   | 40%    | One project per phase
Weekly Labs          | 20%    | Hands-on coding exercises
Participation        | 10%    | Class engagement and peer reviews
Capstone Project     | 30%    | End-to-end lakehouse solution

Grading Scale

A: 90-100%
B: 80-89%
C: 70-79%
F: Below 70%

A minimum grade of 70% is required to receive a certificate of completion.

Materials and Resources

Required Tools

  • Databricks Workspace
    Community or trial edition
  • Python 3.8+
    Local development environment
  • Visual Studio Code
    With Python and Databricks extensions
  • Git
    Version control

Provided Resources

  • Course Materials
    Notebooks, code samples, datasets
  • Cloud Credits
    For lab exercises
  • Certification Voucher
    Databricks certification exam
  • Practice Exams
    Certification preparation

Career Services and Job Placement

Databricks engineers are highly sought after in enterprise organizations undergoing cloud and data modernization initiatives. Our graduates secure senior-level positions with competitive compensation.

Career Support Includes:

  • Advanced career coaching
  • GitHub portfolio development
  • System design interview prep
  • Direct introductions to hiring companies
  • Salary negotiation coaching

Typical Job Titles:

  • Databricks Engineer
  • Spark Data Engineer
  • Lakehouse Architect
  • Big Data Engineer
  • Senior Data Platform Engineer
Placement Rate: 85%
Average Time to Placement: 90 days
Average Starting Salary: $95K+