School of Data & AI - Beginner to Intermediate

Data Engineering

Build the pipelines and data systems that analytics and AI teams depend on.

Learn Python, advanced SQL, data modeling, warehousing, ETL, ELT, Airflow, dbt, cloud workflows, and data engineering projects that turn raw data into trusted business-ready systems.


Target role

Data Engineer

Duration

Flexible duration - Flexible weekly pace

Course sequence

8 courses

Support model

Choose your learning support level

Outcome

Built around a clear role target.

Data Engineer

Learners who want a structured route across connected courses

Course sequence

See how the courses build into the full path.

Each course has a focused job, but the value compounds when you follow the sequence, complete the projects, and use the support model around the full path.

School of Data & AI - Beginner to Intermediate - 8 courses

Data Engineering

Learn how to build the pipelines, data models, warehouses, orchestration workflows, and cloud data systems that power analytics, reporting, machine learning, and AI products.

  1. Data Foundations
    Path only - 2 weeks - Beginner

    Build the essential foundation for working with data, understanding business problems, and preparing for tools like Excel, SQL, Python, Power BI, machine learning, AI, and data engineering.

    Available through the path so the work stays connected to the full outcome.

    Understand how data is used to solve real business problems.
    Explain the difference between data, reports, dashboards, insights, and decisions.
    Understand the major roles across analytics, data science, AI, and data engineering.
    Identify common tools used by modern data teams.
  2. Python for Data Engineering
    Path only - 7 weeks - Beginner to Intermediate

    Learn the Python skills data engineers use to move, clean, transform, validate, and automate data across files, APIs, databases, and pipeline workflows.

    Available through the path so the work stays connected to the full outcome.

    Write Python scripts for data engineering tasks.
    Read, write, and process CSV, JSON, Excel, and structured data files.
    Extract data from APIs and external sources.
    Clean, transform, and validate data with Python.
  3. SQL for Data Analytics
    Standalone + path - 7 weeks - Beginner to Intermediate

    Learn the SQL skills data analysts use to extract, filter, join, group, and analyze data from relational databases.

    Can be started alone, then compounded inside the full path.

    Understand tables, columns, rows, keys, and relationships.
    Write SQL queries to retrieve business data.
    Filter, sort, and structure query results.
    Join data across multiple tables correctly.
  4. Data Warehousing
    Path only - 7 weeks - Intermediate

    Learn how modern teams organize trusted business data for analytics, reporting, and decision-making.

    Available through the path so the work stays connected to the full outcome.

    Understand what a data warehouse is and why teams use it.
    Explain the difference between operational databases and analytical warehouses.
    Understand warehouse layers such as staging, transformation, and reporting.
    Design fact and dimension tables.
  5. ETL & ELT Pipelines
    Path only - 8 weeks - Intermediate

    Learn how to move data from source systems into databases, warehouses, and analytics layers using practical ETL and ELT pipeline workflows.

    Available through the path so the work stays connected to the full outcome.

    Understand ETL and ELT pipeline workflows.
    Extract data from files, APIs, and databases.
    Load data into databases and warehouse-style systems.
    Transform raw data into clean, structured datasets.
  6. Orchestration with Airflow & dbt
    Path only - 8 weeks - Intermediate

    Learn how to manage reliable data workflows using Airflow for orchestration and dbt for structured, tested, analytics-ready transformations.

    Available through the path so the work stays connected to the full outcome.

    Understand data orchestration and workflow scheduling.
    Build Airflow workflows with tasks and dependencies.
    Schedule and monitor pipeline runs.
    Handle retries, failures, and workflow visibility.
  7. Cloud Data Engineering
    Path only - 8 weeks - Intermediate

    Learn how data engineering works in the cloud, including storage, compute, managed databases, warehouses, pipeline deployment, access control, monitoring, and cost-aware architecture.

    Available through the path so the work stays connected to the full outcome.

    Understand how data engineering works in cloud environments.
    Use cloud storage concepts for raw and processed data.
    Understand managed databases and warehouse-style services.
    Choose basic compute options for data workloads.
  8. Data Engineering Studio
    Path only - 8 weeks - Intermediate

    Apply Python, SQL, data modeling, warehousing, pipelines, orchestration, dbt, and cloud workflows to build portfolio-ready data engineering projects.

    Available through the path so the work stays connected to the full outcome.

    Plan complete data engineering projects.
    Extract data from files, APIs, databases, and source systems.
    Clean, validate, and transform raw datasets.
    Design data models for analytics and reporting.

Curriculum

Follow the courses in sequence.

The path moves toward Data Engineer through complete course outlines, from phases and modules down to lesson page topics.

Course 1: Data Foundations
Beginner - 2 weeks - Path only - 6 phases - 7 modules - 28 lessons - 88 pages
Build the essential foundation for working with data, understanding business problems, and preparing for tools like Excel, SQL, Python, Power BI, machine learning, AI, and data engineering.

Phase 1 - Introduction to Data & AI
Introduce data, organizational data use, and the modern data ecosystem.
1 module - 3 lessons - 1 week

Module 1: The World of Data
Understand what data is, how organizations use it, and how the modern data ecosystem works.
3 lessons

Lesson 1: What Is Data?
Understand data as recorded facts, observations, events, and signals that can be interpreted to support decisions.
75 min - article - 5 pages

Welcome and Learning Objectives

Start the lesson and understand the purpose of learning data foundations.

8 min

Data in Plain English

Explain what data is and why context matters.

15 min

Structured, Semi-Structured, and Unstructured Data

Classify the main forms of data students will meet in analytics, data science, AI, and data engineering.

18 min

Data Sources in Everyday Life

Help students see that data is created constantly by normal activities.

15 min

Exercise - Daily Data Source Audit

Students identify real data sources around them and classify them.

19 min

Lesson 2: How Organizations Use Data
Explore how businesses and institutions use data for decisions, reporting, forecasting, optimization, and automation.
80 min - article - 5 pages

Welcome and Learning Objectives

Introduce organizational data use.

8 min

The Five Common Uses of Data

Explain common ways data supports organizations.

22 min

Case Study - Netflix Recommendations

Introduce recommendations as a beginner-friendly example of organizational data use.

20 min

Data Usage Map

Show how one organization can use many kinds of data.

15 min

Exercise - Company Data Usage Map

Students map how data may be used in a real organization.

15 min

Lesson 3: The Modern Data Ecosystem
Understand how data moves from sources into databases, warehouses, dashboards, models, AI systems, and decisions.
80 min - article - 5 pages

Welcome and Learning Objectives

Introduce the modern data ecosystem.

8 min

The Main Components

Explain core components in the ecosystem.

22 min

Complete Data Flow Example

Show a realistic beginner-friendly data flow.

20 min

Diagram Exercise - Draw Complete Data Flow

Students create a data ecosystem diagram.

20 min

Module Summary

Summarize the first module and prepare students for data roles.

10 min

Phase 2 - Data Careers & Roles
Help students understand the major data and AI career paths before choosing a specialization.
1 module - 5 lessons - 1 week

Module 1: Understanding the Data Profession
Compare the responsibilities, deliverables, tools, and thinking patterns across modern data roles.
5 lessons

Lesson 1: Data Analyst
Understand what data analysts do, the problems they solve, and the deliverables they create.
45 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

22 min

Practice Activity

Apply the concept through a guided activity.

15 min

Lesson 2: Data Scientist
Understand how data scientists use statistics, programming, and models to explore patterns and make predictions.
45 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

22 min

Practice Activity

Apply the concept through a guided activity.

15 min

Lesson 3: Data Engineer
Understand how data engineers build pipelines, warehouses, and infrastructure that make data usable.
45 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

22 min

Practice Activity

Apply the concept through a guided activity.

15 min

Lesson 4: AI Engineer
Understand AI engineering at a high level, including LLMs, RAG, agents, and AI product systems.
45 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

22 min

Practice Activity

Apply the concept through a guided activity.

15 min

Lesson 5: Choosing Your Path
Use a practical decision framework to choose a data or AI learning path.
60 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

30 min

Practice Activity

Apply the concept through a guided activity.

22 min

Phase 3 - Data Thinking
Teach students how to ask better questions, define useful metrics, investigate causes, and communicate insights.
1 module - 4 lessons - 1 week

Module 1: Analytical Thinking
Build the thinking habits that separate tool users from real data professionals.
4 lessons

Lesson 1: Questions Before Answers
Learn to define problems and ask useful analytical questions before touching tools.
50 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

25 min

Practice Activity

Apply the concept through a guided activity.

17 min

Lesson 2: Metrics and KPIs
Understand metrics, KPIs, vanity metrics, and actionable measures.
55 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

27 min

Practice Activity

Apply the concept through a guided activity.

20 min

Lesson 3: Root Cause Analysis
Learn how to investigate business problems instead of jumping to shallow conclusions.
55 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

27 min

Practice Activity

Apply the concept through a guided activity.

20 min

Lesson 4: Data Storytelling
Learn how to communicate insights through clear narratives and decision-focused reporting.
55 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

27 min

Practice Activity

Apply the concept through a guided activity.

20 min

Phase 4 - Working with Data
Develop core data literacy: data types, data quality, cleaning concepts, and exploratory analysis.
1 module - 4 lessons - 1 week

Module 1: Data Literacy
Understand datasets, quality issues, cleaning concepts, and beginner exploration.
4 lessons

Lesson 1: Data Types
Classify numeric, categorical, time-series, and text data.
45 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

22 min

Practice Activity

Apply the concept through a guided activity.

15 min

Lesson 2: Data Quality
Identify missing values, duplicates, inconsistencies, and their business impact.
50 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

25 min

Practice Activity

Apply the concept through a guided activity.

17 min

Lesson 3: Data Cleaning Concepts
Understand validation, standardization, and transformation at a beginner level.
50 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

25 min

Practice Activity

Apply the concept through a guided activity.

17 min

Lesson 4: Exploratory Analysis
Learn how to explore patterns, trends, outliers, and initial questions in a dataset.
55 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

27 min

Practice Activity

Apply the concept through a guided activity.

20 min

Phase 5 - Statistics for Decision Making
Teach practical, business-focused statistics without heavy mathematics.
1 module - 4 lessons - 1 week

Module 1: Practical Statistics
Use basic statistics to summarize data and support better decisions.
4 lessons

Lesson 1: Descriptive Statistics
Understand mean, median, mode, variance, and how they describe business data.
50 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

25 min

Practice Activity

Apply the concept through a guided activity.

17 min

Lesson 2: Probability Concepts
Understand uncertainty, risk, and likelihood using business examples.
45 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

22 min

Practice Activity

Apply the concept through a guided activity.

15 min

Lesson 3: Correlation vs Causation
Avoid common mistakes and misleading conclusions when variables move together.
50 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

25 min

Practice Activity

Apply the concept through a guided activity.

17 min

Lesson 4: Making Decisions with Data
Use confidence, evidence, and tradeoffs to make better business decisions.
55 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

27 min

Practice Activity

Apply the concept through a guided activity.

20 min

Phase 6 - Data, AI & Ethics
Introduce responsible data use, privacy, bias, AI ethics, and the future of data work.
2 modules - 8 lessons - 1 week

Module 1: Responsible Data Use
Build responsible habits around privacy, bias, fairness, transparency, and AI accountability.
4 lessons

Lesson 1: Data Privacy
Understand personal data, consent, and compliance concepts at a beginner level.
45 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

22 min

Practice Activity

Apply the concept through a guided activity.

15 min

Lesson 2: Bias in Data
Understand how bias, fairness, and representation affect data conclusions.
50 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

25 min

Practice Activity

Apply the concept through a guided activity.

17 min

Lesson 3: AI Ethics
Understand hallucinations, transparency, accountability, and safe AI usage.
55 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

27 min

Practice Activity

Apply the concept through a guided activity.

20 min

Lesson 4: Future of Data and AI
Explore AI transformation, automation, emerging careers, and personal roadmap planning.
50 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

25 min

Practice Activity

Apply the concept through a guided activity.

17 min

Module 2: Foundations Projects and Graduation
Package learning into mini projects, a final foundations project, and graduation requirements.
4 lessons

Lesson 1: Mini Project 1 - Business Metrics Analysis
Analyze a business domain and design practical metrics and recommendations.
60 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

30 min

Practice Activity

Apply the concept through a guided activity.

22 min

Lesson 2: Mini Project 2 - Data Quality Audit
Audit a messy dataset and produce a quality report.
60 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

30 min

Practice Activity

Apply the concept through a guided activity.

22 min

Lesson 3: Final Foundations Project - Data-Driven Business Analysis
Analyze a real company or product from a data perspective and present recommendations.
90 min - article - 3 pages

Overview and Learning Objectives

Start the lesson and understand what the student should be able to do.

8 min

Concepts and Examples

Introduce the main concepts with practical examples.

45 min

Practice Activity

Apply the concept through a guided activity.

37 min

Lesson 4: Graduation Requirements and Portfolio Outcome
Clarify what students must complete and what they should have in their portfolio.
40 min - article - 1 page

Requirements and Portfolio Checklist

Summarize graduation requirements.

40 min

Course 2: Python for Data Engineering
Beginner to Intermediate - 7 weeks - Path only - 7 phases - 18 modules - 78 lessons - 230 pages
Learn the Python skills data engineers use to move, clean, transform, validate, and automate data across files, APIs, databases, and pipeline workflows.

Phase 1 - Python Foundations for Data Engineering
Build Python foundations specifically for data engineering: pipeline mindset, environment setup, scripts, syntax, control flow, reusable logic, and error handling.
2 modules - 9 lessons - 1–2 weeks

Module 1: Python for Data Engineering Mindset
Understand Python's role in data engineering and set up a professional workspace for script-based workflows.
4 lessons

Lesson 1: What Python Does in Data Engineering
Understand Python as the glue language for extraction, transformation, validation, automation, and pipeline reliability.
85 min - article - 5 pages

Welcome and Learning Objectives

Introduce Python's role in data engineering.

8 min

Python as Pipeline Glue

Explain why Python is used in data workflows.

18 min

Python vs SQL, BI, dbt, Airflow and Warehouses

Clarify tool responsibilities.

22 min

Where Python Fits in Analytics, AI and Data Platforms

Connect Python to later data engineering path courses.

18 min

Exercise - Workflow Tool Decision Matrix

Students decide which tools should handle parts of a workflow.

19 min

Lesson 2: Development Environment Setup
Set up a professional Python data engineering workspace using Python, VS Code, terminal, virtual environments, pip, requirements files, and project folders.
85 min - article - 5 pages

Welcome and Learning Objectives

Introduce environment setup.

8 min

Python, VS Code and Terminal Basics

Explain the core tools.

20 min

Virtual Environments and Requirements

Explain dependency isolation.

20 min

Project Folder Structure

Introduce a simple data engineering layout.

18 min

Exercise - Python Data Engineering Workspace Setup

Students set up their workspace.

19 min

Lesson 3: Running Python Programs
Run Python scripts from the terminal, understand command-line inputs, distinguish notebooks from scripts, and build a simple input-output program.
85 min - article - 5 pages

Welcome and Learning Objectives

Introduce script execution.

8 min

Scripts vs Notebooks

Explain when to use scripts and notebooks.

18 min

Running Scripts from Terminal

Teach basic execution flow.

20 min

Command-Line Inputs

Introduce input arguments conceptually.

18 min

Exercise - Input Output Script

Students create and run a simple program.

21 min

Lesson 4: Python Syntax Essentials
Learn variables, data types, strings, numbers, booleans, comments, naming conventions, and constants for pipeline configuration.
80 min - article - 4 pages

Welcome and Learning Objectives

Introduce syntax essentials.

8 min

Variables and Data Types

Explain basic Python values in data engineering context.

20 min

Comments, Naming and Constants

Teach readable syntax habits.

18 min

Exercise - Pipeline Configuration Variables

Students create simple configuration variables.

34 min

Module 2: Control Flow and Reusable Logic
Use conditions, loops, functions, and error handling to build reusable pipeline logic.
5 lessons

Lesson 1: Conditions for Data Rules
Use if, elif, else, comparison operators, validation rules, and branching logic for record classification.
55 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

27 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

20 min
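
As a rough illustration of the branching logic this lesson describes, the sketch below classifies order records with if, elif, and else. The field names (`order_id`, `amount`) and the 10,000 review threshold are assumptions made for the example, not course material.

```python
def classify_record(record: dict) -> str:
    """Return 'valid', 'review', or 'rejected' for one order record.

    The rules and thresholds here are illustrative assumptions.
    """
    amount = record.get("amount")
    if record.get("order_id") in (None, ""):
        return "rejected"   # no identifier, so the row cannot be loaded
    elif amount is None or amount < 0:
        return "rejected"   # missing or negative amount
    elif amount > 10_000:
        return "review"     # unusually large, flag for a human check
    else:
        return "valid"


records = [
    {"order_id": "A1", "amount": 120.0},
    {"order_id": "", "amount": 50.0},
    {"order_id": "A3", "amount": 25_000.0},
    {"order_id": "A4", "amount": -5.0},
]
labels = [classify_record(r) for r in records]
print(labels)
```

Ordering the checks from most to least severe keeps each branch simple, because later branches can assume the earlier conditions did not apply.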

Lesson 2: Loops for Batch Processing
Use for loops, while loops, file loops, and record loops, and learn to avoid inefficient loops.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 3: Functions for Pipeline Logic
Design reusable functions using parameters, returns, pure functions, side effects, and helpers.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 4: Error Handling
Use try/except to handle common data errors, fail safely, write clear error messages, and choose between fail-fast and continue-safely strategies.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min
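
The fail-fast and continue-safely strategies named in this lesson can be sketched with try/except as follows. This is a minimal sketch under assumptions: the helper names `parse_amount` and `load_amounts` and the `fail_fast` flag are hypothetical, chosen only for illustration.

```python
def parse_amount(raw: str) -> float:
    """Convert a raw string to float; raises ValueError on bad input."""
    return float(raw.strip())


def load_amounts(rows, fail_fast=False):
    """Parse every row, either stopping at the first error or collecting errors."""
    good, errors = [], []
    for line_no, raw in enumerate(rows, start=1):
        try:
            good.append(parse_amount(raw))
        except ValueError as exc:
            if fail_fast:
                # Fail fast: stop the run with a message that names the row.
                raise ValueError(f"row {line_no}: bad amount {raw!r}") from exc
            # Continue safely: remember the bad row and keep going.
            errors.append((line_no, raw))
    return good, errors


good, errors = load_amounts(["10.5", " 3 ", "abc", "7"])
print(good, errors)
```

Fail-fast suits loads where a single bad row makes the whole batch untrustworthy; collecting errors suits loads where good rows should still land and bad rows are reported afterwards.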

Lesson 5: Mini Project 1 - Data File Processor
Build a Python script that processes multiple CSV files and writes a processing summary.
100 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 2 - Working with Files and Data Formats
Read, write, discover, parse, combine, and export files across text, CSV, JSON, Excel, logs, and DataFrames.
3 modules - 13 lessons - 2 weeks

Module 1: File Systems and Data Ingestion
Use file paths, directories, batch processing, file metadata, and safe file movement.
4 lessons

Lesson 1: File Paths and Directories
Use absolute paths, relative paths, pathlib, folders, file naming, and file discovery.
55 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

27 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

20 min

Lesson 2: Reading and Writing Text Files
Use open, read, write, append, encoding, and newline handling.
55 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

27 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

20 min

Lesson 3: Batch File Processing
Process folders with file loops, use input/output and processed/archive folders, and avoid accidental overwrites.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 4: File Metadata
Capture file size, created/modified time, extension, source system, batch date, and load time.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min
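
One possible way to capture the metadata fields listed for this lesson is shown below, using only the standard library. The function name `file_metadata`, the dictionary keys, and the sample file are assumptions for illustration. Creation time is left out because its meaning differs between operating systems.

```python
from datetime import datetime, timezone
from pathlib import Path
import tempfile


def file_metadata(path: Path, source_system: str, batch_date: str) -> dict:
    """Collect basic metadata for one input file (illustrative field names)."""
    stat = path.stat()
    return {
        "file_name": path.name,
        "extension": path.suffix.lower(),
        "size_bytes": stat.st_size,
        "modified_at": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
        "source_system": source_system,   # supplied by the caller, not read from the file
        "batch_date": batch_date,         # supplied by the caller
        "load_time": datetime.now(timezone.utc).isoformat(),
    }


# Demonstrate on a throwaway file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "orders_2024-01-15.CSV"
    sample.write_text("order_id,amount\nA1,10\n", encoding="utf-8")
    meta = file_metadata(sample, source_system="shop", batch_date="2024-01-15")

print(meta["file_name"], meta["extension"], meta["size_bytes"])
```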

Module 2: Structured Data Formats
Work with CSV, JSON, Excel, and log/semi-structured formats.
4 lessons

Lesson 1: CSV Files
Understand CSV structure, delimiters, headers, missing values, malformed rows, and encoding issues.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 2: JSON Data
Work with JSON objects, arrays, nested JSON, API-style JSON, and flattening concepts.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min
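
To make the flattening concept from this lesson concrete, here is a small sketch that turns nested JSON objects into a single-level dictionary with dotted keys. The `flatten` helper and the sample payload are hypothetical; lists are deliberately left untouched to keep the example short.

```python
import json


def flatten(obj: dict, parent: str = "", sep: str = ".") -> dict:
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in obj.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))   # recurse into nested objects
        else:
            flat[name] = value                       # scalars and lists are kept as-is
    return flat


payload = json.loads(
    '{"id": 7, "customer": {"name": "Ada", "address": {"city": "Lagos"}}, "tags": ["new"]}'
)
row = flatten(payload)
print(row)
```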

Lesson 3: Excel Files
Handle multiple sheets, sheet names, and inconsistent headers, and write Excel outputs.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 4: Logs and Semi-Structured Data
Parse server and application logs, timestamps, and patterns, and extract structured fields.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min
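
Extracting structured fields from a log line might look like the sketch below. The log layout (`YYYY-MM-DD HH:MM:SS LEVEL message`) is an assumed format for illustration; real application logs vary, and the pattern would need to be adapted to them.

```python
import re
from datetime import datetime

# Assumed line layout: "2024-01-15 08:30:02 ERROR some message"
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)


def parse_log_line(line: str):
    """Return a dict of fields, or None when the line does not match the pattern."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["timestamp"] = datetime.strptime(fields["timestamp"], "%Y-%m-%d %H:%M:%S")
    return fields


parsed = parse_log_line("2024-01-15 08:30:02 ERROR load failed for orders.csv")
print(parsed)
```

Returning None for unmatched lines lets the caller count or quarantine them instead of crashing the whole parse.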

Module 3: DataFrames for Data Engineering
Use Pandas carefully for loading, schema inspection, combining files, and exporting outputs.
5 lessons

Lesson 1: Pandas for Data Engineering
Use DataFrames to load data and inspect schema, data types, and memory use.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 2: Schema Awareness
Detect expected columns, unexpected columns, missing columns, column ordering, data types, and schema drift.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min
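
A schema check of the kind this lesson lists (missing columns, unexpected columns, column ordering) can be sketched in plain Python before any DataFrame library is involved. The expected column list and the `check_schema` helper are assumptions for illustration; data type checks are omitted here.

```python
# Hypothetical expected schema for an orders file.
EXPECTED_COLUMNS = ["order_id", "customer_id", "amount", "order_date"]


def check_schema(actual_columns, expected_columns=EXPECTED_COLUMNS):
    """Compare incoming column names with the expected ones."""
    actual, expected = list(actual_columns), list(expected_columns)
    return {
        "missing": [c for c in expected if c not in actual],
        "unexpected": [c for c in actual if c not in expected],
        "order_matches": actual == expected,
    }


report = check_schema(["order_id", "amount", "customer_id", "channel"])
print(report)
```

With a DataFrame, the same function could be called on its column names; running it on every batch is one simple way to notice schema drift early.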

Lesson 3: Combining Files
Concatenate files, append daily batches, track source, use batch IDs, and manage duplicate risk.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 4: Exporting Data
Export CSV, JSON, Excel, partitioned outputs, and batch-date filenames.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 5: Milestone Project 1 - Multi-Format Ingestion Pipeline
Build an ingestion pipeline for CSV, JSON, and Excel files with schema validation and reporting.
120 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 3 - Data Transformation and Validation
Clean, transform, validate, quarantine, and report on data quality in Python pipelines.
3 modules - 14 lessons - 2 weeks

Module 1: Data Cleaning for Engineering Workflows
Clean common business data problems in text, dates, numeric fields, and identifiers.
4 lessons

Lesson 1: Common Data Quality Problems
Identify missing values, duplicates, inconsistent categories, invalid dates, invalid numeric values, broken IDs, and out-of-range values.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 2: Cleaning Text Fields
Trim whitespace, standardize casing, categories, special characters, names, and null-like strings.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 3: Date and Time Handling
Parse dates, handle timezone basics, date formats, invalid dates, year/month/day extraction, and batch dates.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

35 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

27 min

Lesson 4: Numeric Transformation
Clean currency fields, percentages, negative values, rounding, invalid numeric strings, and type conversion.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Module 2: Data Transformation Patterns
Apply mapping, business rules, merges, aggregations, and incremental processing concepts.
5 lessons

Lesson 1: Mapping and Standardization
Use lookup maps, category mapping, code mapping, region mapping, and product mapping.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 2: Filtering and Business Rules
Apply filters for active records, valid transactions, excluded statuses, and date windows, along with other business filters.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 3: Joins and Merges in Python
Merge DataFrames, manage join keys, one-to-many issues, missing matches, and duplicate keys.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

35 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

27 min

Lesson 4: Aggregations
Use groupby counts, sums, averages, min/max, grouped outputs, and summary tables.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 5: Incremental Processing Concepts
Understand full load, incremental load, batch date, new records, changed records, and late-arriving data.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Module 3: Data Validation and Quality Checks
Build validation rules, quality reports, rejection outputs, and framework-ready checklists.
5 lessons

Lesson 1: Validation Rules
Check required fields, unique keys, accepted values, date ranges, numeric ranges, and foreign keys.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 2: Data Quality Reports
Create row counts, null counts, duplicate counts, invalid records, warnings, and failure thresholds.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 3: Quarantine Bad Records
Separate valid records from rejected records, record rejection reasons, write error files, and preserve auditability.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min
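
To make the quarantine idea concrete, here is a minimal sketch, assuming the only rule is that required fields must be present and non-empty. The function name `split_records`, the `rejection_reason` field, and the sample rows are hypothetical illustrations, not the course's own code.

```python
def split_records(records, required_fields):
    """Split records into valid and rejected lists, tagging each reject with a reason."""
    valid, rejected = [], []
    for record in records:
        # A field counts as missing when it is absent, None, or an empty string.
        missing = [f for f in required_fields if record.get(f) in (None, "")]
        if missing:
            # Keep the original values so the reject can be audited later.
            rejected.append({**record, "rejection_reason": "missing: " + ", ".join(missing)})
        else:
            valid.append(record)
    return valid, rejected


# Hypothetical input rows.
rows = [
    {"order_id": "A1", "amount": "10.50"},
    {"order_id": "", "amount": "7.00"},
    {"order_id": "A3"},
]
valid, rejected = split_records(rows, ["order_id", "amount"])
print(len(valid), "valid,", len(rejected), "rejected")
```

Keeping the rejected rows with their reasons, rather than dropping them, is what makes the load auditable: every input row ends up in exactly one of the two outputs.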

Lesson 4: Great Expectations and Validation Framework Concepts
Understand expectations, validation suites, automated checks, data contracts, and when frameworks help.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 5: Milestone Project 2 - Data Cleaning and Quality Pipeline
Build a pipeline that cleans, validates, rejects bad records, reports quality, and writes clean outputs.
120 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 4 - Databases and SQL with Python
Connect Python to relational databases, read/write SQL data, load staged data, and reconcile pipeline results.
3 modules, 13 lessons, 1–2 weeks

Module 1: Database Integration Foundations
Understand database use cases and connect Python safely to databases.
4 lessons

Lesson 1: Why Data Engineers Use Databases
Compare files, transactional databases, analytical databases, staging tables, raw/clean layers, and loading pipelines.
55 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

27 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

20 min

Lesson 2: Connecting Python to Databases
Use connection strings, credentials, environment variables, connection safety, drivers, and SQLAlchemy concepts.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

35 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

27 min

Lesson 3: Reading Data from SQL
Use SELECT queries, read results into DataFrames, apply query parameters and limits, and avoid full-table accidents.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 4: Writing Data to SQL
Use insert, append, replace, staging tables, bulk-load concepts, and data type mapping.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

35 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

27 min

Module 2: Database Loading Patterns
Use staging tables, upserts, audit columns, and safe write patterns.
4 lessons

Lesson 1: Staging Tables
Use staging tables for raw loads, validation after load, temporary tables, and audit columns.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 2: Upserts and Deduplication
Understand insert vs update, natural keys, surrogate keys, duplicate handling, and conflict resolution.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

35 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

27 min

Lesson 3: Audit Columns and Load Tracking
Add batch ID, source file, loaded_at, processed_at, record status, and error reason.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 4: Transactions and Safe Writes
Use commits and rollbacks, handle partial failures, and apply idempotency concepts for safe reruns.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Module 3: Database Validation and Reconciliation
Validate loads using row counts, key checks, freshness checks, and run summary tables.
5 lessons

Lesson 1: Row Count Reconciliation
Compare source count, loaded count, rejected count, and mismatch detection.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min
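
The reconciliation described in this lesson reduces to simple arithmetic: every source row should be either loaded or rejected. The following is a minimal sketch under that assumption; the function name `reconcile_counts` and the example counts are hypothetical.

```python
def reconcile_counts(source_count, loaded_count, rejected_count):
    """Check that every source row is accounted for as loaded or rejected."""
    unaccounted = source_count - (loaded_count + rejected_count)
    return {
        "source": source_count,
        "loaded": loaded_count,
        "rejected": rejected_count,
        # Positive: rows went missing. Negative: more rows than the source had.
        "unaccounted": unaccounted,
        "status": "ok" if unaccounted == 0 else "mismatch",
    }


# Hypothetical counts for two runs.
clean_run = reconcile_counts(source_count=1000, loaded_count=990, rejected_count=10)
bad_run = reconcile_counts(source_count=1000, loaded_count=980, rejected_count=10)
print(clean_run["status"], bad_run["status"], bad_run["unaccounted"])
```

A negative `unaccounted` value is worth treating as seriously as a positive one, since it usually points to duplicated rows from a rerun or a join.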

Lesson 2: Duplicate and Key Checks
Validate primary keys, unique keys, duplicate records, and referential checks.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 3: Data Freshness Checks
Check latest load date, missing batch, stale data, and source delays.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 4: Load Summary Tables
Store pipeline run logs, status, record counts, duration, and errors.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 5: Milestone Project 3 - File-to-Database Loading Pipeline
Build a pipeline that validates daily files, loads clean data into database tables, stores rejected records, and records load metadata.
130 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 5 - Building Batch Data Pipelines
Design batch pipelines with extraction, transformation, loading, configuration, idempotency, logging, and run reports.
3 modules, 13 lessons, 2 weeks

Module 1: Pipeline Design Fundamentals
Understand pipeline flows, batch vs streaming, layers, and safe reruns.
4 lessons

Lesson 1: What Is a Data Pipeline?
Understand source, extraction, transformation, loading, validation, monitoring, and downstream consumers.
55 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

27 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

20 min

Lesson 2: Batch vs Streaming
Compare batch pipelines, streaming pipelines, scheduled jobs, real-time needs, and when batch is enough.
55 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

27 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

20 min

Lesson 3: Pipeline Layers
Design raw, staging, cleaned, curated, reporting, and audit layers.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 4: Idempotency and Reruns
Build rerunnable pipelines with duplicate prevention, deterministic outputs, batch IDs, and overwrite vs append logic.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min
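
One common way to get a rerunnable load is to replace all rows for a batch ID inside a single transaction, so running the same batch twice leaves the same result. Here is a minimal sketch using Python's built-in `sqlite3` module. The `sales` table, its columns, and the `load_batch` function are hypothetical, and this delete-then-insert approach is only one of several overwrite strategies.

```python
import sqlite3


def load_batch(conn, batch_id, rows):
    """Replace all rows for batch_id so that reruns do not create duplicates."""
    # Used as a context manager, the connection commits on success
    # and rolls back if an exception is raised inside the block.
    with conn:
        conn.execute("DELETE FROM sales WHERE batch_id = ?", (batch_id,))
        conn.executemany(
            "INSERT INTO sales (batch_id, order_id, amount) VALUES (?, ?, ?)",
            [(batch_id, order_id, amount) for order_id, amount in rows],
        )


# Hypothetical in-memory table and batch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (batch_id TEXT, order_id TEXT, amount REAL)")
rows = [("A1", 10.5), ("A2", 7.0)]

load_batch(conn, "2024-01-01", rows)
load_batch(conn, "2024-01-01", rows)  # rerun of the same batch

row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(row_count)
```

With plain append logic the rerun would have doubled the table. Here the row count stays the same because the batch ID scopes what gets overwritten.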

Module 2: Extraction Patterns
Extract data from files, APIs, and databases with pagination, high-watermarks, and logs.
4 lessons

Lesson 1: File Extraction
Handle file drops, naming conventions, batch folders, archive folders, missing files, and validation.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 2: API Extraction
Handle API pagination, date filters, incremental extraction, authentication, rate limits, and retries.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

37 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

30 min

Lesson 3: Database Extraction
Use SQL extraction, incremental queries, updated_at fields, high-watermark concept, and performance basics.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

35 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

27 min

Lesson 4: Extraction Logging
Log source, start/end time, records extracted, errors, retry count, and next cursor/high-watermark.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Module 3: Transformation and Loading Patterns
Build reusable transformations, choose load strategies, use configuration, and generate run reports.
5 lessons

Lesson 1: Transformation Functions
Refactor clean, map, standardize, and validate steps into testable transformation functions.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 2: Load Strategies
Choose append, overwrite, merge/upsert, partitioned loads, staging-to-final, and failure recovery.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 3: Pipeline Configuration
Use config files, environment variables, source configs, table configs, schedule configs, and reusable pipelines.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

35 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

27 min

Lesson 4: Pipeline Reports
Generate run summary, data quality summary, load summary, error summary, and stakeholder notification.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 5: Milestone Project 4 - End-to-End Batch Pipeline
Build a batch pipeline that extracts, transforms, validates, loads, logs, supports reruns, and reports pipeline results.
140 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 6 - Reliability, Logging and Project Structure
Improve pipeline reliability through logging, observability, alerts, debugging, testing, configuration, documentation, and collaboration.
3 modules, 13 lessons, 1–2 weeks

Module 1: Logging, Monitoring and Alerts
Add operational visibility to pipelines.
4 lessons

Lesson 1: Logging Fundamentals
Replace print statements with logging, log levels, log files, structured logs, and useful messages.
55 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

27 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

20 min

Lesson 2: Pipeline Observability
Track run status, row counts, duration, error counts, quality failures, and freshness.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 3: Alerts and Notifications
Design failure, warning, missing file, quality, and notification channel alerts.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 4: Debugging Pipelines
Read logs, trace failures, isolate bad data, reproduce errors, then fix and rerun.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

35 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

27 min

Module 2: Testing Data Engineering Code
Test functions, pipeline components, data quality checks, and regression bugs.
4 lessons

Lesson 1: Testing Python Functions
Write unit tests, test cases, expected outputs, edge cases, and pytest basics.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 2: Testing Pipeline Components
Test extract, transform, and load steps independently using fake inputs and sample outputs.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 3: Data Quality Tests
Test schema, nulls, uniqueness, accepted values, and row counts.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 4: Regression Testing for Pipelines
Prevent old bugs using test datasets, expected files, rerun checks, and safe refactoring.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Module 3: Professional Project Structure
Structure repositories, manage config/secrets, document pipelines, and collaborate with Git.
5 lessons

Lesson 1: Data Engineering Repository Structure
Organize src, configs, raw/processed data, tests, logs, scripts, and docs.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 2: Configuration and Secrets
Use config files, .env, credentials, .gitignore, secret safety, and environment-specific configs.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 3: Documentation
Write README, pipeline overview, setup, source documentation, data dictionary, and runbook.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

32 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

25 min

Lesson 4: Git and Collaboration
Use commits, branches, pull requests, code reviews, reproducible work, and documenting changes.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a realistic data engineering workflow.

30 min

Practice Activity

Apply the lesson through a guided data engineering exercise.

22 min

Lesson 5: Mini Project 2 - Pipeline Reliability Upgrade
Improve a rough pipeline by adding logging, configuration, tests, documentation, error handling, and structure.
110 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 7 - Capstone
Complete a production-aware Python data engineering capstone pipeline.
1 module, 3 lessons, 1 week

Module 1: Python Data Engineering Capstone
Build a production-aware Python data pipeline that collects, cleans, validates, loads, logs, tests, and documents data.
3 lessons

Lesson 1: Capstone Options
Choose a realistic data engineering capstone option.
55 min, article, 1 page

Choose Your Python Data Engineering Capstone

Review approved capstone options.

55 min

Lesson 2: Final Capstone - Python Data Engineering Capstone
Build a production-aware Python data pipeline that collects, cleans, validates, loads, logs, tests, and documents data from one or more sources.
180 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Lesson 3: Graduation Requirements and Portfolio Outcome
Clarify completion requirements, portfolio outcomes, path position, and why the course matters.
55 min, article, 1 page

Requirements and Portfolio Checklist

Summarize graduation requirements and portfolio assets.

55 min

3
Beginner to Intermediate, 7 weeks
SQL for Data Analytics
Learn the SQL skills data analysts use to extract, filter, join, group, and analyze data from relational databases.
8 phases, 8 modules, 36 lessons, 107 pages

Phase 1 - Relational Database Foundations
Build database thinking before writing SQL: files vs databases, tables, records, relationships, primary keys, foreign keys, and basic data modeling.
1 module, 5 lessons, 1–2 weeks

Module 1: Understanding Data and Databases
Understand why databases exist and how relational database structures support business analytics.
5 lessons

Lesson 1: Why Databases Exist
Understand why databases exist, how they differ from files, and why SQL matters for business analytics.
75 min, article, 5 pages

Welcome and Learning Objectives

Start the course and understand why databases matter.

8 min

Files vs Databases

Explain the difference between files and databases.

16 min

Business Data and Structured Data

Introduce business data and structured records.

17 min

Relational Databases in Plain English

Introduce relational databases and why they matter.

18 min

Exercise - Application Database Discovery

Students identify databases behind common applications.

16 min

Lesson 2: Database Fundamentals
Understand tables, rows, columns, and records, the foundation of relational database thinking.
75 min, article, 5 pages

Welcome and Learning Objectives

Introduce the basic building blocks of databases.

8 min

Tables, Rows, Columns and Records

Explain core database structure.

18 min

From Business Entity to Table

Show how real business objects become database tables.

18 min

Design a Simple Customer Database

Guide students through customer table design.

18 min

Exercise - Customer Database Blueprint

Students submit a simple customer database design.

13 min

Lesson 3: Relationships
Understand one-to-one, one-to-many, and many-to-many relationships and how they shape analytical queries.
80 min, article, 4 pages

Welcome and Learning Objectives

Introduce relationships between tables.

8 min

Types of Relationships

Explain common relationship patterns.

22 min

E-Commerce Relationship Model

Show a realistic relationship model.

25 min

Exercise - E-Commerce Relationship Model

Students model an e-commerce database.

25 min

Lesson 4: Primary Keys and Foreign Keys
Understand primary keys, foreign keys, constraints, and how they protect data integrity.
80 min, article, 5 pages

Welcome and Learning Objectives

Introduce keys and integrity.

8 min

Primary Keys

Explain primary keys.

20 min

Foreign Keys and Constraints

Explain foreign keys and constraints.

22 min

Keys in Analytical Queries

Connect keys to SQL analysis.

18 min

Exercise - Key Integrity Diagram

Students create a key-based relationship diagram.

12 min

Lesson 5: Mini Project 1 - Database Blueprint Challenge
Design a beginner-friendly relational database blueprint for a CRM, LMS, or e-commerce business.
90 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 2 - Querying Data
Learn the SQL fundamentals required to retrieve, filter, sort, and calculate business data.
1 module, 5 lessons, 1–2 weeks

Module 1: SQL Fundamentals
Build core SQL query confidence using SELECT, WHERE, sorting, limiting, expressions, and CASE logic.
5 lessons

Lesson 1: SELECT Statements
Learn how to select columns, create aliases, and build expressions.
55 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

27 min

Practice Activity

Apply the lesson through a guided SQL exercise.

20 min

Lesson 2: Filtering Data
Use WHERE, AND, OR, IN, BETWEEN, and LIKE to retrieve relevant business records.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

30 min

Practice Activity

Apply the lesson through a guided SQL exercise.

22 min

Lesson 3: Sorting and Limiting
Use ORDER BY and LIMIT/TOP concepts to rank and inspect records.
50 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

25 min

Practice Activity

Apply the lesson through a guided SQL exercise.

17 min

Lesson 4: Calculated Fields
Create derived columns using arithmetic and CASE statements.
55 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

27 min

Practice Activity

Apply the lesson through a guided SQL exercise.

20 min

Lesson 5: Mini Project 2 - Sales Insight Query Pack
Write a practical set of SQL queries for sales, revenue, customer, and product analysis.
80 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 3 - Aggregation and Reporting
Use SQL aggregation to build business KPI reporting datasets.
1 module, 5 lessons, 1–2 weeks

Module 1: Business Reporting with SQL
Use COUNT, SUM, AVG, MIN, MAX, GROUP BY, HAVING, and KPI logic for reporting.
5 lessons

Lesson 1: Aggregate Functions
Use COUNT, SUM, AVG, MIN, and MAX to summarize business data.
55 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

27 min

Practice Activity

Apply the lesson through a guided SQL exercise.

20 min

Lesson 2: GROUP BY
Group records by category, region, time, or customer segment.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

30 min

Practice Activity

Apply the lesson through a guided SQL exercise.

22 min

Lesson 3: HAVING
Filter aggregated results using HAVING.
50 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

25 min

Practice Activity

Apply the lesson through a guided SQL exercise.

17 min

Lesson 4: Business KPI Reporting
Build SQL reports for revenue, customer growth, and retention metrics.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

32 min

Practice Activity

Apply the lesson through a guided SQL exercise.

25 min

Lesson 5: Milestone Project 1 - Executive KPI Reporting Dataset
Produce dashboard-ready SQL datasets for executive KPI reporting.
100 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 4 - SQL Joins Mastery
Master the SQL joins needed to combine data across business tables accurately.
1 module - 6 lessons - 1–2 weeks

Module 1: Combining Data
Use joins to connect customers, orders, products, payments, and other business tables.
6 lessons

Lesson 1: INNER JOIN
Use INNER JOIN to return matching records across tables.
55 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

27 min

Practice Activity

Apply the lesson through a guided SQL exercise.

20 min

Lesson 2: LEFT JOIN
Use LEFT JOIN to preserve records from the left table and analyze missing activity.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

30 min

Practice Activity

Apply the lesson through a guided SQL exercise.

22 min

Lesson 3: RIGHT JOIN and FULL JOIN
Understand completeness analysis using RIGHT JOIN and FULL JOIN concepts.
50 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

25 min

Practice Activity

Apply the lesson through a guided SQL exercise.

17 min

Lesson 4: Multi-Table Joins
Join several tables to build full business intelligence datasets.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

35 min

Practice Activity

Apply the lesson through a guided SQL exercise.

27 min

Lesson 5: Join Optimization Concepts
Understand efficient joins, common mistakes, and how poor joins create slow or wrong results.
55 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

27 min

Practice Activity

Apply the lesson through a guided SQL exercise.

20 min

Lesson 6: Project - Customer Revenue Analytics
Build customer value, order, and revenue analysis using joins.
90 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 5 - Advanced SQL for Analytics
Use subqueries, CTEs, window functions, trends, and retention analysis for advanced reporting.
1 module - 6 lessons - 2 weeks

Module 1: Analytical SQL
Develop advanced analytical SQL patterns for complex business questions.
6 lessons

Lesson 1: Subqueries
Use nested and correlated queries for advanced reporting.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

30 min

Practice Activity

Apply the lesson through a guided SQL exercise.

22 min

Lesson 2: Common Table Expressions
Use WITH statements to organize complex SQL logic.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

32 min

Practice Activity

Apply the lesson through a guided SQL exercise.

25 min

Lesson 3: Window Functions
Use ROW_NUMBER, RANK, DENSE_RANK, LEAD, and LAG for analytical reporting.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

35 min

Practice Activity

Apply the lesson through a guided SQL exercise.

27 min

Lesson 4: Running Totals and Trends
Use SQL to calculate running totals, growth, and time-series reporting.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

32 min

Practice Activity

Apply the lesson through a guided SQL exercise.

25 min

Lesson 5: Cohort and Retention Analysis
Build beginner-friendly cohort and retention analysis for subscription or repeat-use businesses.
75 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

37 min

Practice Activity

Apply the lesson through a guided SQL exercise.

30 min

Lesson 6: Milestone Project 2 - Subscription Growth and Retention Analytics
Analyze subscription growth, retention, and cohorts with advanced SQL.
110 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 6 - Real-World Data Analysis
Use SQL to solve customer, product, marketing, and operations problems.
1 module - 4 lessons - 1–2 weeks

Module 1: Solving Business Problems
Apply SQL to practical business domains and reporting needs.
4 lessons

Lesson 1: Customer Analytics
Use SQL for customer segmentation, value analysis, and lifetime value concepts.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

30 min

Practice Activity

Apply the lesson through a guided SQL exercise.

22 min

Lesson 2: Product Analytics
Analyze product performance, product adoption, and revenue contribution.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

30 min

Practice Activity

Apply the lesson through a guided SQL exercise.

22 min

Lesson 3: Marketing Analytics
Analyze campaign performance and conversion metrics using SQL.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

30 min

Practice Activity

Apply the lesson through a guided SQL exercise.

22 min

Lesson 4: Operational Analytics
Build operational datasets for process, efficiency, and performance analysis.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

30 min

Practice Activity

Apply the lesson through a guided SQL exercise.

22 min

Phase 7 - Professional SQL
Develop production-quality SQL habits for performance, maintainability, documentation, and BI collaboration.
1 module - 3 lessons - 1 week

Module 1: Production SQL Skills
Write SQL that is readable, maintainable, dashboard-ready, and safer for production-scale work.
3 lessons

Lesson 1: Query Optimization
Learn how to read queries, recognize performance basics, and understand index concepts.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

32 min

Practice Activity

Apply the lesson through a guided SQL exercise.

25 min

Lesson 2: Documentation and Maintainability
Write SQL that other analysts and engineers can understand and reuse.
55 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

27 min

Practice Activity

Apply the lesson through a guided SQL exercise.

20 min

Lesson 3: Working with BI Tools
Prepare SQL datasets for Power BI, dashboards, and analytics platforms.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Business Examples

Explain the concept with practical business examples.

30 min

Practice Activity

Apply the lesson through a guided SQL exercise.

22 min

Phase 8 - Capstone, Graduation and Portfolio
Complete an end-to-end SQL analytics capstone and package portfolio-ready SQL work.
1 module - 2 lessons - 1 week

Module 1: End-to-End Business Analytics Capstone
Students complete an industry-based SQL analytics capstone and prepare their portfolio outcome.
2 lessons

Lesson 1: Final Capstone - End-to-End Business Analytics Project
Build a professional SQL analytics project for one selected industry.
150 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Lesson 2: Graduation Requirements and Portfolio Outcome
Clarify completion requirements and expected portfolio outputs.
45 min - article - 1 page

Requirements and Portfolio Checklist

Summarize graduation requirements and portfolio assets.

45 min

4. Data Warehousing
Intermediate - 7 weeks - Path only
Learn how modern teams organize trusted business data for analytics, reporting, and decision-making.
8 phases - 16 modules - 72 lessons - 211 pages

Phase 1 - Data Warehouse Foundations
Build the conceptual foundation for warehouses: why they exist, how they differ from operational systems, and where they fit in the modern data stack.
2 modules - 9 lessons - 1–2 weeks

Module 1: Why Data Warehouses Exist
Understand the business and technical problems that make warehouses necessary.
4 lessons

Lesson 1: The Analytics Problem in Growing Companies
Understand why growing companies struggle with scattered data, conflicting reports, slow dashboards, Excel chaos, and inconsistent KPI definitions.
85 min - article - 6 pages

Welcome and Learning Objectives

Introduce the business problem that creates the need for warehouses.

8 min

Scattered Data and Conflicting Reports

Explain scattered data problems.

18 min

Why Excel Chaos Happens

Explain spreadsheet-driven reporting issues.

18 min

Operational Systems Are Not Built for Analytics

Explain why production systems should not power every report.

20 min

Leadership Needs Trusted Data

Explain the executive need.

18 min

Exercise - Conflicting Revenue Scenario

Students analyze a company scenario with conflicting revenue numbers.

21 min

Lesson 2: What Is a Data Warehouse?
Explain a data warehouse as a central analytical store for historical, integrated, subject-oriented, trusted data that supports BI, analytics, and decisions.
85 min - article - 5 pages

Welcome and Learning Objectives

Introduce the warehouse concept.

8 min

Warehouse Definition in Plain English

Explain the concept simply.

20 min

Warehouse Use Cases

Show where warehouses support teams.

20 min

Trusted Reporting Layer

Explain trust and documentation.

18 min

Exercise - Explain the Warehouse

Students explain the warehouse to a business manager.

39 min

Lesson 3: Operational Database vs Data Warehouse
Differentiate OLTP and OLAP systems, transaction processing, analytical querying, normalized operational systems, analytical models, read-heavy workloads, and why production databases should not power all analytics.
85 min - article - 4 pages

Welcome and Learning Objectives

Introduce OLTP vs OLAP.

8 min

OLTP and OLAP

Explain operational vs analytical systems.

22 min

Why Production Databases Should Not Power Every Dashboard

Explain production risk.

20 min

Exercise - Operational or Analytical?

Students classify use cases.

35 min

Lesson 4: Data Warehouse vs Data Lake vs Lakehouse
Compare warehouses, lakes, and lakehouses, including structured and semi-structured data, raw storage, curated analytics, tradeoffs, and where AI/ML fit.
85 min - article - 4 pages

Welcome and Learning Objectives

Introduce warehouse, lake, and lakehouse distinctions.

8 min

Warehouse, Lake and Lakehouse

Explain the three architectures.

24 min

Where AI and ML Fit

Connect architectures to AI and ML.

20 min

Exercise - Architecture Choice Scenarios

Students choose the right architecture for business scenarios.

37 min

Module 2: The Modern Warehouse Ecosystem
Understand the warehouse ecosystem, platforms, users, success criteria, and high-level architecture.
5 lessons

Lesson 1: Where Warehouses Fit in the Data Stack
Map source systems, ingestion, storage, transformation, orchestration, BI tools, ML/AI consumers, and governance.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 2: Common Warehouse Platforms
Compare BigQuery, Snowflake, Redshift, Databricks SQL, Azure Synapse/Fabric Warehouse, PostgreSQL as a small warehouse, and cloud-native tradeoffs.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 3: Warehouse Users and Consumers
Map executives, analysts, BI developers, data scientists, engineers, AI systems, product, and operations teams to datasets.
55 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

27 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

20 min

Lesson 4: Warehouse Success Criteria
Define success using trust, performance, freshness, consistency, documentation, cost, access control, maintainability, and usefulness.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 5: Mini Project 1 - Warehouse Architecture Review
Review a messy company data scenario and propose a high-level warehouse architecture.
100 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 2 - Data Modeling Foundations for Warehouses
Build the modeling depth required for warehouse design: entities, grain, normalized/denormalized models, keys, relationships, many-to-many patterns, and assumptions.
2 modules - 9 lessons - 1–2 weeks

Module 1: Understanding Data Models
Understand entities, attributes, relationships, grain, normalized models, and analytical models.
4 lessons

Lesson 1: What Is Data Modeling?
Explain entities, attributes, relationships, business rules, data structure as business meaning, modeling importance, and trust.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 2: Data Grain
Define row-level meaning across transaction, daily, customer, product, and event grains.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 3: Normalized Models
Use normalization, reduced duplication, relational integrity, transactional systems, operational databases, and source structure.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 4: Denormalized Analytical Models
Use denormalization, analytics performance, reporting convenience, wide tables, metric consistency, benefits, and risks.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Module 2: Keys, Relationships and Modeling Rules
Design stable keys, relationships, referential checks, many-to-many patterns, and documentation.
5 lessons

Lesson 1: Primary Keys and Natural Keys
Choose primary, natural, surrogate, composite, stable, business, and warehouse keys.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 2: Foreign Keys and Referential Integrity
Use foreign keys, parent-child relationships, orphan records, referential checks, and fact-to-dimension relationships.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 3: Many-to-Many Relationships
Model bridge tables, junction tables, enrollments, product categories, user roles, tags, and analytics issues.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 4: Modeling Assumptions and Documentation
Document business assumptions, model assumptions, ownership, definitions, and dashboard-impacting assumptions.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 5: Milestone Project 1 - Warehouse Modeling Foundation
Design a warehouse modeling foundation for LMS, CRM, e-commerce, fintech, healthcare, logistics, or SaaS.
120 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 3 - Warehouse Architecture and Layers
Design layered warehouse architecture, naming standards, mapping, lineage, and documentation.
2 modules - 10 lessons - 1–2 weeks

Module 1: Warehouse Layering Strategy
Design raw, staging, intermediate, curated, and mart layers with separation of concerns.
5 lessons

Lesson 1: Why Warehouse Layers Matter
Explain raw, staging, intermediate, curated, mart layers, separation of concerns, and source-to-consumer flow.
55 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

27 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

20 min

Lesson 2: Raw Layer Design
Design raw storage with source tracking, load timestamps, schema drift, auditability, immutability, and traceability.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 3: Staging Layer Design
Design a staging layer with one-to-one cleanup, renamed fields, data casts, light standardization, preserved source meaning, and no heavy logic.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 4: Intermediate and Curated Layers
Design reusable business logic, source combinations, entity/event models, dependency management, and transformation reuse.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 5: Data Mart Layer
Design department-ready sales, finance, product, education, and operations marts.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Module 2: Warehouse Naming, Organization and Lineage
Create professional naming standards, source-to-target mappings, lineage diagrams, and documentation.
5 lessons

Lesson 1: Warehouse Naming Conventions
Name schemas/databases, tables, and columns using prefixes, suffixes, and layer naming, and avoid unclear names.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 2: Source-to-Target Mapping
Document source fields, target fields, transformation notes, quality concerns, refresh frequency, ownership, and lineage.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 3: Data Lineage
Show source-to-target flow, dependencies, impact, lineage diagrams, debugging, and BI impact.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 4: Warehouse Documentation Standards
Write table descriptions, column descriptions, metric definitions, owners, freshness, limitations, and assumptions.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 5: Milestone Project 2 - Warehouse Layer Design
Design warehouse layers, naming standards, mappings, lineage, and a documentation plan for one domain.
120 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 4 - Dimensional Warehouse Design
Design dimensional models using business processes, fact tables, dimensions, star schemas, conformed dimensions, date dimensions, and advanced patterns.
2 modules - 10 lessons - 2 weeks

Module 1: Dimensional Modeling for Warehouses
Understand analytical modeling, business processes, fact tables, and dimension tables.
4 lessons

Lesson 1: Why Analytical Models Are Different
Compare operational systems and analytical systems for speed, business-friendliness, repeatable KPIs, history, BI, and trust.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 2: Business Process Modeling
Identify business processes, events, transactions, lifecycle milestones, and measurable activities, and build a process matrix.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 3: Fact Tables
Design facts, measures, event/transaction grains, transaction facts, periodic snapshots, accumulating snapshots, and factless facts.
75 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

37 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

30 min

Lesson 4: Dimension Tables
Design descriptive attributes, customer/product/date/location/status/user dimensions, and slowly changing attributes.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Module 2: Star Schema and Advanced Dimensional Patterns
Design star schemas, snowflakes, conformed dimensions, date dimensions, role-playing dates, and degenerate dimensions.
6 lessons

Lesson 1: Star Schema Design
Design fact-centered star schemas with dimensions for BI usability, simple queries, and performance, while avoiding common mistakes.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 2: Snowflake Schema
Understand normalized dimensions, when snowflaking helps, complexity tradeoffs, storage vs usability, and BI experience.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 3: Conformed Dimensions
Design shared dimensions across multiple fact tables for consistent reporting.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 4: Date Dimensions
Design calendar tables, fiscal periods, week/month/quarter/year, business calendars, time intelligence, and role-playing dates.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 5: Role-Playing and Degenerate Dimensions
Use order/payment/delivery dates, invoice numbers, transaction references, order numbers, and operational identifiers in facts.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 6: Milestone Project 3 - Dimensional Warehouse Model
Produce a complete dimensional warehouse model.
140 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 5 - Historical Data and Slowly Changing Dimensions
Design history-aware warehouses using SCD strategies, full refresh, incremental loads, snapshots, late data, and backfills.
2 modules - 9 lessons - 1–2 weeks

Module 1: History in Data Warehouses
Understand why history matters and choose SCD strategies.
4 lessons

Lesson 1: Why History Matters
Track changing customer details, product categories, subscription plans, employee assignments, and current vs historical truth.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 2: Slowly Changing Dimensions Overview
Compare SCD Type 0, Type 1, Type 2, and Type 3 at a high level, along with selection strategy and business tradeoffs.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 3: Type 1 Dimensions
Design overwrite logic, current-state reporting, acceptable Type 1 cases, history loss risk, and update logic.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 4: Type 2 Dimensions
Design effective_from, effective_to, is_current, surrogate keys, change detection, historical joins, and versioning.
75 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

37 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

30 min
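
The Type 2 lesson above names effective_from, effective_to, is_current, surrogate keys, change detection, and historical joins. As a rough illustration only, and not the course's own material, the sketch below shows how those pieces might fit together in plain Python. The table layout, the `customer_sk` key, the `plan` attribute, and the far-future `OPEN_END` sentinel are all assumptions made for the example.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # assumed sentinel meaning "still current"


def apply_scd2(dimension, incoming, load_date, tracked=("plan",)):
    """Close the current row and append a new version when a tracked attribute changes."""
    # Surrogate key: a simple running integer, independent of the business key.
    next_key = max((r["customer_sk"] for r in dimension), default=0) + 1
    current = next(
        (r for r in dimension
         if r["customer_id"] == incoming["customer_id"] and r["is_current"]),
        None,
    )
    if current is not None:
        # Change detection: compare only the tracked attributes.
        if all(current[c] == incoming[c] for c in tracked):
            return dimension
        current["effective_to"] = load_date
        current["is_current"] = False
    dimension.append({
        "customer_sk": next_key,
        "customer_id": incoming["customer_id"],
        **{c: incoming[c] for c in tracked},
        "effective_from": load_date,
        "effective_to": OPEN_END,
        "is_current": True,
    })
    return dimension


def version_as_of(dimension, customer_id, as_of):
    """Historical join: return the version whose validity interval contains as_of."""
    for r in dimension:
        if (r["customer_id"] == customer_id
                and r["effective_from"] <= as_of < r["effective_to"]):
            return r
    return None
```

The half-open interval (`effective_from <= as_of < effective_to`) is one common convention; it means a fact dated exactly on the change date joins to the new version rather than to both.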

Module 2: Warehouse Loading and History Strategies
Design full refresh, incremental load, snapshot, late-arriving data, and backfill strategies.
5 lessons

Lesson 1: Full Refresh
Understand rebuild-from-scratch loads: their simplicity, cost, and performance, when they work, and when they fail.
55 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

27 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

20 min

Lesson 2: Incremental Loads
Design append, merge, high-watermark, updated_at, batch IDs, late-arriving records, and idempotency concepts.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min
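
To make the high-watermark, merge, and idempotency ideas from the Incremental Loads lesson more concrete, here is a minimal sketch under stated assumptions: the target is an in-memory dict keyed by a hypothetical `id` column, and `updated_at` values are ISO-8601 strings in one consistent format so that string comparison matches time order.

```python
def incremental_merge(target, source_rows, watermark):
    """Upsert rows changed after the watermark and return the new watermark."""
    # High-watermark filter: only rows updated after the last successful run.
    changed = [r for r in source_rows if r["updated_at"] > watermark]
    for row in changed:
        # Merge by key: insert new ids, overwrite existing ones.
        target[row["id"]] = dict(row)
    # Advance the watermark only as far as the data actually seen.
    return max((r["updated_at"] for r in changed), default=watermark)
```

Because rows are merged by key rather than appended, rerunning the same batch leaves the target unchanged, which is the idempotency property the lesson refers to. A strict `>` comparison can miss late-arriving rows that share the watermark timestamp, so real pipelines often reprocess a small overlap window.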

Lesson 3: Snapshots
Design periodic snapshots for daily balances, subscriptions, inventory, and account status, and understand why snapshots matter.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 4: Late-Arriving Data and Backfills
Plan delayed events, corrections, reprocessing, backfills, dashboard impact, and historical repair.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 5: Milestone Project 4 - Historical Warehouse Design
Produce a history-aware warehouse design package.
130 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 6 - Data Marts, Metrics and BI Readiness
Design data marts, metrics layers, KPI tables, aggregate tables, and BI-ready consumption models.
2 modules, 9 lessons, 1–2 weeks

Module 1: Designing Data Marts
Build subject-specific marts and self-service analytics models.
4 lessons

Lesson 1: What Is a Data Mart?
Define subject-specific analytics, business department views, and common marts.
55 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

27 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

20 min

Lesson 2: Department-Focused Marts
Design marts around sales, finance, customer, operations, product, learning metrics, and stakeholder needs.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 3: Wide Reporting Tables
Use dashboard-ready views, denormalized reporting tables, BI convenience, performance, and safe self-service.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 4: Self-Service Analytics
Design analyst-friendly models, naming clarity, metric definitions, safe joins, discoverability, and reduced confusion.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Module 2: Metrics Layer and KPI Consistency
Design reusable metric definitions, KPI tables, aggregate tables, and BI-ready marts.
5 lessons

Lesson 1: Why Metric Definitions Break
Diagnose inconsistent revenue, active-user, and churn definitions, mismatched filters and grains, spreadsheet drift, and dashboard trust problems.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 2: Designing a Metrics Layer
Specify metric definitions, dimensions, filters, time windows, ownership, documentation, and reuse.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

35 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

27 min

Lesson 3: KPI Tables and Aggregate Tables
Design summary tables, daily KPI tables, performance pre-aggregation, refresh schedules, and ownership.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 4: BI Tool Readiness
Prepare for Power BI and Tableau/Looker concepts, semantic models, date tables, relationships, field naming, and row-level security (RLS) implications.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 5: Milestone Project 5 - BI-Ready Data Mart
Build a BI-ready mart design for one business team.
125 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 7 - Warehouse Performance, Governance and Operations
Address warehouse performance, cost, governance, security, access, data contracts, refresh, monitoring, backfills, and runbooks at the design level.
3 modules, 13 lessons, 1–2 weeks

Module 1: Warehouse Performance and Cost Awareness
Understand performance and cost implications of warehouse design.
4 lessons

Lesson 1: Query Performance in Warehouses
Review table scans, joins, aggregations, sorting, filters, expensive queries, and model design impact.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 2: Partitioning and Clustering Concepts
Choose date partitioning, pruning, clustering keys, sort keys, distribution concepts, and platform differences.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 3: Pre-Aggregation and Materialization
Use materialized tables, views, aggregate tables, incremental marts, acceleration, and refresh tradeoffs.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 4: Cost-Aware Warehouse Design
Control storage, compute, query, and refresh costs, and weigh incremental processing, retention, and unnecessary wide tables.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Module 2: Governance, Security and Access
Design ownership, access control, privacy, compliance concepts, and data contracts.
4 lessons

Lesson 1: Data Governance Basics
Assign ownership, definitions, stewardship, quality responsibility, documentation, approval flows, and metric governance.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 2: Access Control
Design role-based access, department access, sensitive data handling, RLS, column masking, least privilege, and BI implications.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 3: Privacy and Compliance Concepts
Handle personal data, retention, deletion, consent, audit logs, minimization, and sensitive mart fields.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 4: Data Contracts
Draft source expectations, schema stability, breaking changes, producer/consumer agreements, contract testing, and ownership.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Module 3: Warehouse Operations
Design refresh schedules, monitoring metrics, backfill plans, and support runbooks.
5 lessons

Lesson 1: Refresh Strategy
Design daily, hourly, and near-real-time refreshes, batch windows, dependency order, and freshness SLAs.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

30 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

22 min

Lesson 2: Monitoring Warehouse Health
Define freshness, row count, failed transformation, stale mart, query failure, complaint, and quality metrics.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 3: Backfills and Reprocessing
Plan historical corrections, schema changes, new logic, rerunning old periods, risk, and validation.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 4: Warehouse Runbooks
Write incident response, failed load, wrong dashboard, late source, access request, troubleshooting, and escalation guides.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional warehouse design workflow.

32 min

Practice Activity

Apply the lesson through a guided warehouse design exercise.

25 min

Lesson 5: Mini Project 2 - Warehouse Governance and Operations Package
Produce governance and operations assets for a warehouse design.
120 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 8 - Capstone
Complete a full data warehouse design capstone for a realistic business domain.
1 module, 3 lessons, 1–2 weeks

Module 1: Data Warehouse Design Capstone
Design a complete analytics warehouse covering architecture, modeling, layers, facts, dimensions, history, marts, metrics, governance, BI readiness, documentation, and operations.
3 lessons

Lesson 1: Capstone Options
Choose a realistic business domain for the warehouse design capstone.
55 min, article, 1 page

Choose Your Data Warehouse Capstone

Review approved capstone options.

55 min

Lesson 2: Final Capstone - Data Warehouse Design Capstone
Design a complete analytics warehouse for a realistic business domain.
220 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Lesson 3: Graduation Requirements and Portfolio Outcome
Clarify completion requirements and portfolio assets.
55 min, article, 1 page

Requirements and Portfolio Checklist

Summarize graduation requirements and portfolio outcomes.

55 min

5. ETL & ELT Pipelines
Intermediate, 8 weeks, Path only
Learn how to move data from source systems into databases, warehouses, and analytics layers using practical ETL and ELT pipeline workflows.
8 phases, 20 modules, 89 lessons, 262 pages

Phase 1 - Pipeline Foundations
Build core pipeline thinking: pipeline components, ETL vs ELT, timing patterns, consumers, architecture, design principles, and failure modes.
2 modules, 9 lessons, 1–2 weeks

Module 1: Understanding Data Pipelines
Understand what pipelines are, how ETL/ELT works, timing choices, and downstream consumers.
4 lessons

Lesson 1: What Is a Data Pipeline?
Understand a data pipeline as a repeatable workflow that moves data from sources through extraction, loading, transformation, validation, scheduling, monitoring, and downstream consumption.
85 min, article, 6 pages

Welcome and Learning Objectives

Introduce the full pipeline concept.

8 min

Pipeline Building Blocks

Explain core pipeline components.

18 min

Pipeline Flow Example

Show a realistic data pipeline flow.

18 min

BI, Analytics, AI and ML Consumers

Connect pipelines to downstream consumers.

18 min

Reliability, Validation and Monitoring

Explain why pipeline operations matter.

18 min

Exercise - Draw a Business Data Pipeline

Students draw a pipeline for a realistic business domain.

23 min

Lesson 2: ETL vs ELT
Compare extract-transform-load and extract-load-transform workflows, where transformation happens, warehouse-first design, Python-heavy ETL, SQL-heavy ELT, and tradeoffs.
85 min, article, 5 pages

Welcome and Learning Objectives

Introduce ETL and ELT.

8 min

ETL Explained

Explain extract-transform-load.

20 min

ELT Explained

Explain extract-load-transform.

20 min

Tradeoffs and Decision Rules

Compare ETL and ELT decisions.

18 min

Exercise - ETL or ELT Decision Matrix

Students choose ETL or ELT for different scenarios.

39 min

Lesson 3: Batch, Near Real-Time and Streaming
Classify pipelines by latency needs, scheduled jobs, event streams, cost, complexity, and why batch is often enough.
80 min, article, 4 pages

Welcome and Learning Objectives

Introduce pipeline latency patterns.

8 min

Three Pipeline Timing Patterns

Explain timing patterns.

24 min

Cost and Complexity Tradeoffs

Explain tradeoffs.

18 min

Exercise - Timing Pattern Classifier

Students classify use cases.

30 min

Lesson 4: Pipeline Consumers
Map pipeline outputs to BI dashboards, analytics notebooks, ML, AI systems, operational reporting, data marts, reverse ETL, and stakeholder expectations.
80 min, article, 4 pages

Welcome and Learning Objectives

Introduce downstream consumers.

8 min

Common Consumers

Explain pipeline consumers.

20 min

Consumer Quality Expectations

Explain quality expectation mapping.

22 min

Exercise - Consumer Output Mapping

Students map outputs to consumers and expectations.

30 min

Module 2: Pipeline Architecture and Design
Design source-to-target flows, layers, principles, and failure controls before writing pipeline code.
5 lessons

Lesson 1: Source-to-Target Thinking
Map source systems to raw, staging, intermediate, curated, and mart targets with lineage and ownership.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 2: Pipeline Layers
Break pipelines into ingestion, staging, transformation, validation, publishing, monitoring, archiving, and quarantine layers.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 3: Pipeline Design Principles
Apply reliability, repeatability, idempotency, observability, modularity, recoverability, documentation, and auditability.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 4: Pipeline Failure Modes
Plan for missing files, API failures, bad data, schema drift, duplicates, partial failures, broken transformations, stale dashboards, and wrong logic.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 5: Mini Project 1 - Pipeline Design Brief
Students choose one domain and produce a professional pipeline design brief.
100 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 2 - Data Extraction Patterns
Extract data reliably from files, APIs, and databases with metadata, pagination, incremental logic, performance, retries, and logs.
3 modules, 13 lessons, 1–2 weeks

Module 1: File-Based Extraction
Design and build reliable file-drop pipelines with validation, archiving, reprocessing, and extraction logs.
4 lessons

Lesson 1: File Drop Pipelines
Design daily files, folder conventions, naming, batch dates, source tracking, inventory, and missing-file handling.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 2: Reading Files Reliably with Python
Read CSV, Excel, and JSON files while handling encoding issues, malformed files, missing columns, empty files, and schema checks.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 3: File Archiving and Reprocessing
Design raw archive, processed folder, rejected folder, reruns, originals preservation, auditability, and source evidence.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 4: File Extraction Logs
Generate logs with filename, size, row count, checksum/hash concept, timestamp, status, and error reason.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min
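
The log fields listed for the File Extraction Logs lesson (filename, size, row count, checksum, timestamp, status, error reason) could be collected along the following lines. This is only a sketch: the function name `build_log_entry`, the field names, the choice of SHA-256, and the assumption of a UTF-8 CSV with a single header row are illustrative, not taken from the course.

```python
import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def build_log_entry(path):
    """Return one extraction-log record for a CSV file, even when reading fails."""
    path = Path(path)
    entry = {
        "filename": path.name,
        "size_bytes": None,
        "row_count": None,
        "sha256": None,
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "status": "failed",
        "error_reason": None,
    }
    try:
        data = path.read_bytes()
        entry["size_bytes"] = len(data)
        # The checksum lets a rerun detect whether the file content changed.
        entry["sha256"] = hashlib.sha256(data).hexdigest()
        rows = list(csv.reader(data.decode("utf-8").splitlines()))
        if not rows:
            raise ValueError("file is empty")
        entry["row_count"] = len(rows) - 1  # exclude the header row
        entry["status"] = "success"
    except (OSError, UnicodeDecodeError, ValueError) as exc:
        entry["error_reason"] = f"{type(exc).__name__}: {exc}"
    return entry
```

Returning a record on failure as well as on success keeps the log complete, so a missing or unreadable file shows up as a row with a reason instead of a gap. Splitting on lines before parsing is a simplification that would miscount files with quoted multi-line fields.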

Module 2: API Extraction
Extract API data using endpoints, pagination, incremental filters, retries, and failure handling.
4 lessons

Lesson 1: API Data Sources
Inspect REST APIs, endpoints, query parameters, headers, keys, JSON responses, and source contracts.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 2: Pagination
Handle page-based, cursor-based, limit/offset, and next-link pagination, stopping conditions, and duplicate page risks.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

37 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

30 min
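
As a small illustration of the cursor-based pattern, stopping conditions, and duplicate-page risk named in the Pagination lesson, the sketch below loops over a hypothetical `fetch_page(cursor)` callable. The response shape (`items`, `next_cursor`) and the `id` field are assumptions for the example; real APIs vary.

```python
def fetch_all(fetch_page, max_pages=1000):
    """Follow cursors until the source reports no next cursor."""
    records, seen_ids, seen_cursors = [], set(), set()
    cursor = None
    for _ in range(max_pages):
        page = fetch_page(cursor)
        for item in page["items"]:
            # Overlapping pages can repeat records; keep the first copy only.
            if item["id"] not in seen_ids:
                seen_ids.add(item["id"])
                records.append(item)
        cursor = page.get("next_cursor")
        # Stopping conditions: no further cursor, or an empty page.
        if cursor is None or not page["items"]:
            return records
        if cursor in seen_cursors:
            raise RuntimeError(f"cursor {cursor!r} repeated; aborting to avoid a loop")
        seen_cursors.add(cursor)
    raise RuntimeError("max_pages reached before pagination finished")
```

The `max_pages` cap and the repeated-cursor check are defensive choices: they turn a misbehaving source into a visible failure instead of an endless extraction.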

Lesson 3: Incremental API Extraction
Use date filters, updated_since, cursors, high-watermarks, last successful run, and backfill windows.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

37 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

30 min

Lesson 4: Rate Limits, Retries and Failures
Handle 429 errors, timeouts, retries, exponential backoff, partial success, provider outages, and retry limits.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min
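
The retry ideas in the Rate Limits, Retries and Failures lesson (429 errors, timeouts, exponential backoff, retry limits) can be sketched without any HTTP library. In this example `RateLimited` is a hypothetical stand-in for a 429 response, `request` is any zero-argument callable, and the delay values are arbitrary defaults chosen for illustration.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 response."""


def call_with_backoff(request, max_attempts=5, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Call request(), retrying rate limits and timeouts with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except (RateLimited, TimeoutError):
            if attempt == max_attempts:
                raise  # retry limit reached: surface the failure
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Up to 10% jitter so parallel workers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

Passing `sleep` as a parameter is a testing convenience: a test can record the requested delays instead of actually waiting.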

Module 3: Database Extraction
Extract from relational databases safely with incremental logic, CDC awareness, and performance controls.
5 lessons

Lesson 1: Extracting from Relational Databases
Use connection strings, credentials, SELECT extraction, filtered extraction, chunked reads, query limits, and source safety.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 2: Incremental Database Extraction
Use updated_at, created_at, a high-watermark table, and the last successful run, and handle the deleted-records problem and late updates.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

37 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

30 min

Lesson 3: Change Data Capture Concepts
Compare CDC, inserts, updates, deletes, log-based capture, and simple incremental extraction.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 4: Extraction Performance
Optimize indexes, filters, date partitions, batch sizes, source load, extraction windows, and operational safety.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 5: Milestone Project 1 - Multi-Source Extraction Pipeline
Build or simulate extraction from files, API, and database sources.
130 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 3 - Loading, Raw Storage and Staging
Load raw data safely, design raw storage, build production staging layers, add metadata, validate staging, and quarantine bad records.
2 modules, 10 lessons, 1–2 weeks

Module 1: Loading Raw Data
Preserve source data and load raw outputs into files, databases, or staging environments with metadata.
4 lessons

Lesson 1: Raw Data Loading Principles
Preserve source data with a load-first, transform-later approach: raw tables and files, source traceability, batch IDs, timestamps, and the immutable raw concept.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 2: Loading to Files
Write CSV and JSON, understand the Parquet concept, and use folder partitions, batch-based storage, naming, and source/date partitions.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 3: Loading to Databases
Load raw and staging tables using append/replace loads, bulk loading, and data type mapping, and handle failures.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 4: Load Metadata
Track load_id, batch_id, run_id, source name, loaded_at, row count, status, error message, and duration.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Module 2: Production Staging Layer
Build staging models with standards, SQL implementation, validation, and rejected-record workflows.
6 lessons

Lesson 1: Why Staging Exists
Explain raw vs staging, schema standardization, type casting, renaming, light cleaning, lineage, and one-to-one cleanup.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 2: Staging Table Standards
Define naming, data types, source columns, audit columns, unique keys, record status, and tracking standards.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 3: Staging SQL Implementation
Use SELECT from raw, aliases, casting, trimming, date parsing, null normalization, and CASE expressions.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

37 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

30 min

Lesson 4: Staging Validations
Write schema, row count, null, duplicate, accepted value, and source consistency checks.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 5: Rejected and Quarantined Data
Handle invalid records, rejection reasons, quarantine tables, error files, review process, reprocessing, and reporting.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 6: Milestone Project 2 - Raw-to-Staging Load
Build a loading workflow that stores raw extracted data, creates staging tables, validates data, quarantines bad records, and records load status.
130 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 4 - Production SQL and Transformation Patterns
Build production transformation logic with modular SQL/Python, deduplication, standardization, safe joins, marts, tests, contracts, and reviews.
3 modules, 15 lessons, 2 weeks

Module 1: Transformation Design
Decide transformation responsibilities, tool placement, modular logic, and lineage documentation.
4 lessons

Lesson 1: Transformation Responsibilities
Classify cleaning, standardization, deduplication, joins, rules, aggregations, intermediate models, and mart creation.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 2: Python vs SQL Transformations
Choose Python or SQL based on task, location, warehouse compute, maintainability, performance, and team workflow.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 3: Modular Transformation Logic
Design small transformation units, reusable functions, SQL models, intermediate outputs, and dependency management.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 4: Transformation Lineage
Document source-to-target mapping, dependencies, column mapping, business rules, and downstream impact.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Module 2: Production SQL Transformation Patterns
Use SQL patterns for deduplication, standardization, business rules, safe joins, aggregates, and marts.
5 lessons

Lesson 1: Deduplication with SQL
Use ROW_NUMBER, partitioning, latest record logic, source priority, and deterministic deduplication.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

37 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

30 min
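
Illustrative sketch (an assumption for readers of this outline, not course material): one way the ROW_NUMBER pattern named in this lesson could look, run through Python's built-in sqlite3 module. The table `stg_customers`, its columns, and the tie-break order are hypothetical, and window functions need SQLite 3.25 or later.

```python
import sqlite3

def latest_customers(rows):
    """Keep one row per customer_id: newest updated_at, ties broken by source priority."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical staging table used only for this illustration.
    conn.execute(
        "CREATE TABLE stg_customers ("
        "customer_id INTEGER, email TEXT, updated_at TEXT, source_priority INTEGER)"
    )
    conn.executemany("INSERT INTO stg_customers VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT customer_id, email, updated_at
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY customer_id
                       ORDER BY updated_at DESC, source_priority ASC, email ASC
                   ) AS rn
            FROM stg_customers
        )
        WHERE rn = 1
        ORDER BY customer_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

sample = [
    (1, "old@example.com", "2024-01-01", 2),
    (1, "new@example.com", "2024-02-01", 2),
    (2, "crm@example.com", "2024-01-15", 1),
    (2, "web@example.com", "2024-01-15", 2),
]
deduped = latest_customers(sample)
```

The ORDER BY lists every tie-breaker explicitly so that the same input always yields the same surviving row, which is what makes the deduplication deterministic.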

Lesson 2: Standardization SQL
Standardize category, code, status, date, currency, country/region, and null-like strings.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 3: Business Rule Transformations
Use CASE expressions, customer segments, order status flags, transaction risk flags, completion flags, and derived columns.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 4: Joining Sources Safely
Handle join keys, one-to-many issues, unmatched records, many-to-many traps, relationship validation, and fanout risk.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

37 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

30 min

Lesson 5: Aggregation and Mart Creation
Create daily, customer, product, cohort, KPI, and BI-ready mart aggregates.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

37 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

30 min

Module 3: Transformation Testing and Review
Test transformation logic, data contracts, SQL reviews, and transformation models.
6 lessons

Lesson 1: Transformation Unit Tests
Test inputs, expected outputs, edge cases, pure transformations, regression tests, and SQL model tests.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 2: Business Rule Tests
Test statuses, categories, formulas, KPIs, thresholds, and business expectations.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 3: Data Contract Checks
Write source expectations, required columns, expected types, allowed changes, and producer-consumer agreements.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 4: SQL Code Review for Pipelines
Review logic, joins, grain, performance, naming, and business rules in SQL transformations.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 5: Transformation Review
Review logic, performance, downstream impact, documentation, and stakeholder approval.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 6: Milestone Project 3 - Production Transformation Pipeline
Transform staged data into curated and reporting-ready outputs.
140 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 5 - Incremental Pipelines and Change Handling
Upgrade full-refresh pipelines using high-watermarks, append-only design, merge/upsert logic, schema drift handling, change detection, deletes, history, idempotency, reruns, backfills, and recovery.
3 modules, 13 lessons, 2 weeks

Module 1: Incremental Pipeline Patterns
Design incremental pipelines for growing data, cost, speed, reruns, freshness, and performance.
4 lessons

Lesson 1: Why Incremental Pipelines Matter
Understand growing data, cost, speed, reruns, daily loads, historical data, freshness, and warehouse performance.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 2: High-Watermark Pattern
Use last successful run, updated_at, created_at, extraction state, missed updates, late records, and state tables.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

37 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

30 min
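
Illustrative sketch (an assumption, not course material): a minimal high-watermark extraction in plain Python. The function name `extract_incremental`, the in-memory `state` dictionary, and the `updated_at` field are hypothetical; a real pipeline would typically persist the watermark in a state table and advance it only after the load succeeds.

```python
from datetime import datetime

def extract_incremental(source_rows, state, pipeline="orders"):
    """Return rows changed since the stored watermark, plus the advanced state."""
    watermark = state.get(pipeline, datetime.min)
    new_rows = [r for r in source_rows if r["updated_at"] > watermark]
    new_state = dict(state)  # copy so the caller decides when to commit the new state
    if new_rows:
        new_state[pipeline] = max(r["updated_at"] for r in new_rows)
    return new_rows, new_state

orders = [
    {"order_id": 1, "updated_at": datetime(2024, 1, 1, 9, 0)},
    {"order_id": 2, "updated_at": datetime(2024, 1, 2, 9, 0)},
    {"order_id": 3, "updated_at": datetime(2024, 1, 3, 9, 0)},
]
state = {"orders": datetime(2024, 1, 1, 9, 0)}

batch, state = extract_incremental(orders, state)
rerun_batch, rerun_state = extract_incremental(orders, state)  # nothing new to pick up
```

The strict `>` comparison can miss late records that share the watermark timestamp; re-reading a small overlap window and deduplicating downstream is one common way to reduce that risk.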

Lesson 3: Append-Only Pipelines
Design event, transaction, log, immutable data, duplicate risk, partitioning, and late-record handling.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 4: Merge and Upsert Pipelines
Design changing records, natural keys, updates, conflicts, idempotency, merge strategy, and staging-to-final pattern.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

37 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

30 min

Module 2: Change Detection and History
Detect schema drift, changed records, deletes, and historical changes.
4 lessons

Lesson 1: Schema Drift
Detect new columns, missing columns, changed types, renamed fields, breaking changes, and contract violations.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 2: Change Detection
Compare records, checksums/hashes, updated_at logic, changed fields, source-of-truth, and record_hash.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min
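
Illustrative sketch (an assumption, not course material): hash-based change detection using a `record_hash` over tracked fields. The helper names, the field separator, and the choice to treat None like an empty string are hypothetical simplifications.

```python
import hashlib

def record_hash(record, fields):
    """Hash the tracked fields in a fixed order so equal content gives an equal hash."""
    # None and "" hash identically here; that is a simplifying assumption.
    parts = ["" if record.get(f) is None else str(record[f]).strip() for f in fields]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()

def detect_changes(current, incoming, key, fields):
    """Split incoming records into new, changed, and unchanged keys."""
    known = {r[key]: record_hash(r, fields) for r in current}
    result = {"new": [], "changed": [], "unchanged": []}
    for r in incoming:
        if r[key] not in known:
            result["new"].append(r[key])
        elif known[r[key]] != record_hash(r, fields):
            result["changed"].append(r[key])
        else:
            result["unchanged"].append(r[key])
    return result

current = [
    {"id": 1, "name": "Ada", "city": "Lagos"},
    {"id": 2, "name": "Ben", "city": "Accra"},
]
incoming = [
    {"id": 1, "name": "Ada", "city": "Lagos"},
    {"id": 2, "name": "Ben", "city": "Kumasi"},
    {"id": 3, "name": "Cy", "city": "Abuja"},
]
changes = detect_changes(current, incoming, key="id", fields=["name", "city"])
```

Comparing one hash per record avoids a column-by-column comparison, at the cost of not knowing which field changed without a second look.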

Lesson 3: Deletes and Soft Deletes
Handle hard deletes, soft deletes, is_deleted flags, tombstones, tracking, compliance, and reporting impact.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 4: Historical Tracking
Design snapshots, SCD, Type 1 updates, Type 2 inserts, audit history, and record versions.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

37 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

30 min

Module 3: Idempotency and Safe Reruns
Design safe reruns, backfills, rollback, resume, and recovery from partial failure.
5 lessons

Lesson 1: What Idempotency Means
Explain safe reruns, duplicate prevention, deterministic outputs, overwrite vs append, run IDs, and batch IDs.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 2: Designing Rerunnable Loads
Use delete-and-reload partition, merge logic, temp tables, staging-to-final swap, transaction boundaries, and rerun audit.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min
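
Illustrative sketch (an assumption, not course material): a delete-and-reload of one date partition inside a single transaction, using Python's built-in sqlite3 module. The table `daily_sales` and its columns are hypothetical.

```python
import sqlite3

def reload_partition(conn, load_date, rows):
    """Replace one date partition atomically so reruns do not duplicate rows."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("DELETE FROM daily_sales WHERE load_date = ?", (load_date,))
        conn.executemany(
            "INSERT INTO daily_sales (load_date, order_id, amount) VALUES (?, ?, ?)",
            [(load_date, order_id, amount) for order_id, amount in rows],
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (load_date TEXT, order_id INTEGER, amount REAL)")

batch = [(101, 20.0), (102, 35.5)]
reload_partition(conn, "2024-03-01", batch)
reload_partition(conn, "2024-03-01", batch)  # rerun of the same batch
row_count = conn.execute("SELECT COUNT(*) FROM daily_sales").fetchone()[0]
```

Because the delete and the insert share a transaction boundary, a failure partway through leaves the previous partition contents in place rather than a half-loaded table.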

Lesson 3: Backfills
Plan historical reloads, date windows, batch ranges, dependencies, tests, and rollback.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 4: Recovery from Partial Failure
Recover from failed extract/load/transformation using rollback, retry, resume, and checkpoints.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 5: Milestone Project 4 - Incremental Pipeline Upgrade
Upgrade a full-refresh pipeline to support incremental extraction, change handling, safe reruns, and backfills.
140 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 6 - Data Quality, Reconciliation and Reliability
Add quality gates, validation, rejected record reporting, reconciliation, auditability, retries, checkpoints, dependencies, and operational readiness.
3 modules, 13 lessons, 1–2 weeks

Module 1: Data Quality in Pipelines
Define quality dimensions, gates, automated validation, rejected records, and reports.
4 lessons

Lesson 1: Data Quality Dimensions
Define completeness, uniqueness, validity, consistency, freshness, accuracy, timeliness, and lineage quality.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 2: Quality Gates
Design warning vs failure, hard stops, soft alerts, thresholds, tolerance, downstream protection, and gate placement.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 3: Automated Validation
Build schema, row count, null, uniqueness, accepted value, relationship, and reconciliation checks.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

37 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

30 min

Lesson 4: Rejected Records and Quality Reports
Report invalid rows, rejection reasons, failed/warning checks, rejected records, counts, and stakeholder visibility.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Module 2: Reconciliation and Auditability
Reconcile row counts, financial totals, metrics, audit columns, and reports.
4 lessons

Lesson 1: Row Count Reconciliation
Compare source, loaded, transformed, rejected counts, batch reconciliation, and mismatch detection.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min
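
Illustrative sketch (an assumption, not course material): the basic arithmetic behind a row count reconciliation, where every extracted row should be either loaded or rejected. The function name and the result keys are hypothetical.

```python
def reconcile_counts(source_count, loaded_count, rejected_count):
    """Check that every extracted row was either loaded or rejected."""
    accounted = loaded_count + rejected_count
    difference = source_count - accounted
    return {
        "source": source_count,
        "accounted": accounted,
        "difference": difference,
        "status": "pass" if difference == 0 else "fail",
    }

balanced = reconcile_counts(source_count=1000, loaded_count=990, rejected_count=10)
missing = reconcile_counts(source_count=1000, loaded_count=985, rejected_count=10)
```

Here `balanced` passes because 990 + 10 accounts for all 1000 source rows, while `missing` fails with a difference of 5 rows that were neither loaded nor rejected.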

Lesson 2: Financial and Metric Reconciliation
Compare source totals, transformed totals, dashboard totals, tolerance thresholds, and trust reports.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 3: Audit Columns
Add created_at, updated_at, loaded_at, processed_at, batch_id, run_id, source_file, record_hash, and effective dates.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 4: Reconciliation Reports
Generate validation summaries, failures, warnings, pass/fail thresholds, visibility, and audit trail.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Module 3: Pipeline Reliability Engineering
Add retries, checkpoints, dependency management, operational readiness, and runbooks.
5 lessons

Lesson 1: Retries and Timeouts
Handle transient failures, retries, exponential backoff, timeouts, retry limits, and escalation.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min
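
Illustrative sketch (an assumption, not course material): retrying a transient failure with exponential backoff and a retry limit. The function `retry_with_backoff`, its default delays, and the set of retryable exceptions are hypothetical; the `sleep` parameter is injectable so the example runs without waiting.

```python
import time

def retry_with_backoff(func, max_attempts=4, base_delay=0.5, max_delay=8.0,
                       retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call func, retrying transient errors with exponentially growing delays."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # retry limit reached: escalate to the caller
            # Delay doubles each attempt (0.5, 1.0, 2.0, ...) up to max_delay.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))

calls = {"count": 0}

def flaky_extract():
    """Fail twice with a transient error, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary network issue")
    return "ok"

delays = []  # records the waits instead of actually sleeping
result = retry_with_backoff(flaky_extract, sleep=delays.append)
```

Only the exception types listed in `retry_on` are retried, so non-transient errors such as bad credentials or invalid SQL fail immediately instead of being retried.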

Lesson 2: Checkpoints
Save progress, resume from steps, intermediate outputs, pipeline state, and avoid full restart.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 3: Dependency Management
Manage upstream/downstream dependencies, task order, availability, dependency failures, and readiness checks.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 4: Operational Readiness
Write runbooks, ownership, support process, known failure modes, windows, rollback, and escalation.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 5: Milestone Project 5 - Reliable Pipeline Operations Package
Add operational reliability to a pipeline.
130 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 7 - Monitoring, Documentation and Handoff
Document, monitor, hand off, review, and package pipelines for consumers, source owners, reviewers, and portfolios.
3 modules, 13 lessons, 1 week

Module 1: Logging, Monitoring and Alerting
Add structured logs, run tracking, monitoring metrics, alerts, and incident response.
4 lessons

Lesson 1: Pipeline Logging
Use structured logs, run IDs, task logs, error logs, row counts, duration, status, and context.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 2: Pipeline Run Tracking
Create pipeline_run and task_run tables with status, start/end time, errors, retries, and records processed.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 3: Monitoring Metrics
Track success rate, failure rate, duration, freshness, row anomalies, quality failures, and rejected counts.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 4: Alerts and Incident Response
Design missing data, failed run, freshness, anomaly alerts, escalation, runbook, and ownership.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Module 2: Pipeline Documentation
Write README, source-to-target mapping, data dictionary, and runbook documentation.
4 lessons

Lesson 1: Pipeline README
Document purpose, sources, outputs, schedule, setup, run instructions, dependencies, and limitations.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 2: Source-to-Target Mapping
Document source fields, target fields, transformations, rules, owners, assumptions, and lineage.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 3: Data Dictionary
Create table descriptions, column descriptions, types, definitions, examples, quality notes, and refresh notes.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 4: Pipeline Runbook
Document normal run, failed run, backfill, rerun, alert response, ownership, and troubleshooting.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Module 3: Handoff and Collaboration
Collaborate with analysts, BI teams, source owners, reviewers, and portfolio audiences.
5 lessons

Lesson 1: Working with Analysts and BI Teams
Understand reporting needs, metric definitions, dashboards, change communication, expectations, and consumer docs.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 2: Working with Source System Owners
Handle schema changes, reliability, data contracts, ownership, communication, and breaking changes.
60 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

30 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

22 min

Lesson 3: Code Review for Pipelines
Review extraction, transformation, quality checks, idempotency, error handling, SQL, and documentation.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

35 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

27 min

Lesson 4: Portfolio Packaging
Package project story, architecture diagram, flow, screenshots/logs, sample outputs, README, limitations, and next improvements.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional pipeline workflow.

32 min

Practice Activity

Apply the lesson through a guided pipeline exercise.

25 min

Lesson 5: Mini Project 2 - Pipeline Handoff Package
Create a complete handoff package for a pipeline.
110 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 8 - Capstone
Build a complete production-aware ETL or ELT capstone pipeline.
1 module, 3 lessons, 1–2 weeks

Module 1: ETL and ELT Pipeline Capstone
Students build a complete production-aware ETL/ELT pipeline with extraction, loading, staging, transformation, quality, reconciliation, monitoring, documentation, and presentation.
3 lessons

Lesson 1: Capstone Options
Choose a realistic ETL/ELT capstone domain.
55 min, article, 1 page

Choose Your ETL and ELT Pipeline Capstone

Review approved capstone options.

55 min

Lesson 2: Final Capstone - ETL and ELT Pipeline Capstone
Build a complete production-aware ETL or ELT pipeline from realistic sources to analytics-ready outputs with reliability and handoff documentation.
220 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Lesson 3: Graduation Requirements and Portfolio Outcome
Clarify completion requirements and portfolio outputs.
55 min, article, 1 page

Requirements and Portfolio Checklist

Summarize graduation requirements and portfolio assets.

55 min

6. Orchestration with Airflow & dbt
Intermediate, 8 weeks, Path only
Learn how to manage reliable data workflows using Airflow for orchestration and dbt for structured, tested, analytics-ready transformations.
9 phases, 18 modules, 79 lessons, 230 pages

Phase 1 - Orchestration and Modern Data Stack Foundations
Build orchestration thinking: manual pipeline risks, task coordination, tool responsibilities, modern data stack design, and Airflow/dbt strategy.
2 modules, 9 lessons, 1–2 weeks

Module 1: Why Orchestration Matters
Understand why manual pipelines fail and how orchestration creates scheduled, visible, dependency-aware workflows.
4 lessons

Lesson 1: The Problem with Manual Pipelines
Understand why manual scripts, cron jobs, hidden failures, dependency confusion, and weak ownership create unreliable data workflows.
85 min, article, 6 pages

Welcome and Learning Objectives

Introduce why manual pipelines become dangerous.

8 min

Manual Scripts and Cron Jobs

Explain why early workflows are often manual.

18 min

Hidden Failures and No Visibility

Explain operational blindness.

18 min

Dependency Confusion

Explain why task order matters.

18 min

Weak Operational Ownership

Explain people and support issues.

18 min

Exercise - Manual Pipeline Risk Review

Students review a manual pipeline process and identify operational risks.

23 min

Lesson 2: What Is Orchestration?
Understand orchestration as task scheduling, dependency management, retries, monitoring, state, history, ownership, and operational visibility.
85 min, article, 4 pages

Welcome and Learning Objectives

Introduce orchestration.

8 min

Orchestration Building Blocks

Explain the core concepts.

22 min

Why Airflow Is Useful

Connect orchestration to Airflow.

20 min

Exercise - Orchestrated Sales Workflow

Students draw an orchestrated workflow.

35 min

Lesson 3: Orchestration vs Transformation
Separate orchestration, transformation, Airflow, dbt, warehouse, extraction, loading, testing, reporting, and monitoring responsibilities.
85 min, article, 4 pages

Welcome and Learning Objectives

Introduce tool responsibility separation.

8 min

Layer Responsibilities

Explain the difference.

24 min

Avoiding Tool Misuse

Teach boundaries.

20 min

Exercise - Responsibility Classifier

Students classify pipeline steps.

35 min

Lesson 4: Modern Data Stack Overview
Design a modern data stack with source systems, ingestion, raw/staging layers, warehouse/lake, dbt transformations, orchestration, BI, monitoring, documentation, and consumers.
85 min, article, 4 pages

Welcome and Learning Objectives

Introduce the modern data stack.

8 min

Modern Data Stack Components

Explain common components.

24 min

Domain Stack Examples

Show domain examples.

20 min

Exercise - Modern Data Stack Design

Students design a stack for a domain.

33 min

Module 2: Tooling Strategy
Understand Airflow, dbt, Python scripts, setup needs, and tool-selection decisions.
5 lessons

Lesson 1: Airflow Overview
Explain Airflow DAGs, operators, scheduler, web UI, logs, retries, and backfills.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 2: dbt Overview
Explain dbt SQL models, refs, sources, tests, docs, lineage, and analytics engineering workflow.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 3: Airflow vs dbt vs Python Scripts
Choose when to use Python scripts, Airflow, dbt, or combined workflows.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 4: Local Development Setup
Set up Python environment, Docker overview, Airflow, dbt, database/warehouse target, project folders, and environment variables.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 5: Mini Project 1 - Orchestration Design Brief
Students choose a pipeline scenario and produce an orchestration design brief.
110 min, article, 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 2 - Airflow Fundamentals
Build Airflow foundations: architecture, DAGs, tasks, operators, dependencies, scheduling, runs, catchup, backfills, and UI debugging.
2 modules, 9 lessons, 1–2 weeks

Module 1: Airflow Core Concepts
Understand Airflow architecture, DAGs, tasks, operators, and dependency graph design.
4 lessons

Lesson 1: Airflow Architecture
Explain scheduler, webserver, metadata database, workers/executors, DAG files, task instances, logs, and UI.
65 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 2: DAGs
Create DAGs with DAG ID, start date, schedule, catchup, tags, default args, and readable structure.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 3: Tasks and Operators
Use the task concept, PythonOperator, BashOperator, EmptyOperator, SQL operators (conceptually), custom tasks, and task boundaries.
75 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 4: Dependencies
Design upstream/downstream tasks, linear dependencies, branching dependencies, task groups, and readable graph design.
70 min, article, 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Module 2: Scheduling and Execution

Use schedules, DAG runs, task instances, states, retries, backfills, catchup, and Airflow UI debugging.

5 lessons

Lesson 1: Scheduling Concepts

Write schedules using schedule intervals, cron expressions, daily/weekly/monthly schedules, manual triggers, data interval, execution date, and timezone awareness.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 2: DAG Runs and Task Instances

Inspect DAG runs, task instances, and their states (queued, running, success, failed), along with retries, skipped tasks, and failures.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 3: Backfills and Catchup

Understand catchup behavior, historical runs, backfills, date ranges, risks, and rerun safety.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 4: Airflow UI

Use the DAG list, grid view, graph view, logs, task duration, run history, manual trigger, and a debugging workflow.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 5: Milestone Project 1 - First Scheduled Airflow Pipeline

Build a first scheduled Airflow DAG with tasks, dependencies, logs, failure handling, manual trigger support, and README.

130 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 3 - Airflow for Production Data Pipelines

Use Airflow for real data workflows: extraction, loading, validation, transformation, retries, connections, sensors, branching, structure, logs, alerts, and operational risks.

3 modules · 13 lessons · 1–2 weeks

Module 1: Building Real Data Workflows with Airflow

Orchestrate extraction, loading, validation, and transformation tasks.

4 lessons

Lesson 1: Orchestrating Extraction Tasks

Create file, API, and database extraction tasks with boundaries, logs, and source metadata.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 2: Orchestrating Loading Tasks

Add raw loads, staging loads, load metadata, file-to-database and database-to-warehouse loads, and run tracking.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 3: Orchestrating Validation Tasks

Add row count, schema, null, freshness, and reconciliation checks, and fail the task on quality issues.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 4: Orchestrating Transformation Tasks

Add Python and SQL transformations with a dbt preview, plus dependencies, logs, and validation before publishing.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Module 2: Airflow Reliability Patterns

Use retries, variables, connections, sensors, branching, and conditional logic.

4 lessons

Lesson 1: Retries and Failure Handling

Configure retries, retry delay, exponential backoff, and failure callbacks; distinguish permanent from transient failures and learn when not to retry.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 2: Parameters, Variables and Connections

Use Airflow Variables, Connections, environment config, and secrets concepts to avoid hardcoded credentials and keep environments separate.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 3: Sensors and External Dependencies

Use file sensors and external task sensors to wait for upstream data, and handle sensor risks, timeout settings, and missing-data scenarios.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 4: Branching and Conditional Logic

Use branching, conditional paths, skip states, quality gates, alert vs stop decisions, and downstream protection.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Module 3: Airflow Project Structure and Deployment Awareness

Organize DAG projects, improve observability, design alerts, and diagnose operational risks.

5 lessons

Lesson 1: Organizing DAG Projects

Structure the DAG folder, plugins, scripts, configs, and reusable functions, and avoid giant DAG files.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 2: Logging and Observability

Improve task logs with structured messages, run IDs, row counts, durations, error messages, and traces.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 3: Notifications and Alerts

Design email alerts, Slack/webhook concepts, failure alerts, SLA miss alerts, escalation, and alert fatigue controls.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 4: Airflow Operational Risks

Diagnose scheduler issues, queued tasks, bad DAG imports, timezone issues, overloaded workers, dependency mistakes, and unsafe backfills.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 5: Milestone Project 2 - Airflow ETL Orchestration Project

Build an Airflow DAG that orchestrates extraction, raw load, validation, transformation, quality gates, final output, and failure behavior.

140 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 4 - dbt Fundamentals

Build dbt foundations: project structure, models, refs, sources, layered models, staging, intermediate, and marts.

2 modules · 9 lessons · 1–2 weeks

Module 1: Introduction to dbt

Understand why dbt exists and how it brings discipline to SQL transformations.

4 lessons

Lesson 1: Why dbt Exists

Explain the problems dbt addresses (transformation chaos, scattered SQL, missing tests, lineage, and documentation) and introduce analytics engineering and transformation as code.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 2: dbt Project Structure

Inspect a dbt project: models folder, sources, seeds, snapshots, macros, profiles, and project naming.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 3: dbt Models

Create SQL SELECT models and cover materializations (tables, views, and a preview of incremental models), naming, and dependency clarity.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 4: refs and sources

Use source definitions with ref and source to build the dependency graph and lineage, avoid hardcoded table names, and practice dependency thinking.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Module 2: Building dbt Model Layers

Build source declarations, staging, intermediate, and mart models.

5 lessons

Lesson 1: Source Models

Declare sources and cover the freshness concept, source metadata, documentation, and ownership.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 2: Staging Models

Build one-to-one staging models that handle renaming, type casting, standardizing, and light cleaning while preserving source traceability.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 3: Intermediate Models

Create reusable joins, business logic, and entity models; avoid repeated logic through dependency layering and modular SQL.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 4: Mart Models

Build reporting-ready tables, fact models, dimension models, BI-friendly outputs, and KPI-ready tables.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 5: Milestone Project 3 - dbt Transformation Layer

Build a dbt project with sources, staging, intermediate, mart models, refs, lineage, organization, and README.

140 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 5 - dbt Testing, Documentation and Lineage

Add dbt tests, source freshness, model docs, lineage, exposures, standards, and quality reports.

2 modules · 9 lessons · 1 week

Module 1: dbt Testing

Build generic tests, custom tests, freshness checks, and testing strategies.

4 lessons

Lesson 1: Generic Tests

Use not_null, unique, accepted_values, and relationships tests at the model and column level, and configure severity.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 2: Custom Tests

Write custom SQL tests for business rules, metrics, reconciliation, thresholds, and test interpretation.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 3: Source Freshness

Configure freshness, loaded_at, warnings, stale data, trust, and freshness SLAs.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 4: Testing Strategy

Choose what to test and where, weigh staging vs mart tests and warning vs error severity, avoid test overload, and protect outputs.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Module 2: dbt Documentation and Lineage

Document models, generate docs, inspect lineage, define exposures, and create documentation standards.

5 lessons

Lesson 1: Model Documentation

Write model descriptions, column descriptions, definitions, owners, assumptions, examples, and limitations.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 2: dbt Docs

Generate docs and the documentation site, explore the lineage graph and dependencies, and use them for trust and impact analysis.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 3: Exposures and BI Consumers

Document dashboards as exposures, downstream dependencies, reporting impact, visibility, and BI ownership.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 4: Documentation Standards

Create standards for naming consistency, metric definitions, source descriptions, ownership, refresh notes, limitations, and consumer notes.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 5: Milestone Project 4 - Tested and Documented dbt Project

Upgrade the dbt project with tests, freshness, documentation, lineage, generated docs, and a quality report.

130 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 6 - Advanced dbt Patterns

Use dbt materializations, incremental models, snapshots, selected runs, seeds, macros, packages, and project reviews.

2 modules · 9 lessons · 1 week

Module 1: Materializations and Performance

Choose dbt materializations and build incremental and snapshot models.

4 lessons

Lesson 1: Materializations

Compare view, table, incremental, and ephemeral materializations, including performance tradeoffs and cost impact.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 2: Incremental Models

Build incremental models with an incremental strategy, unique keys, updated_at filters, merge logic, full refresh, late-data handling, and tests.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 3: Snapshots

Use SCD concepts, check strategy, timestamp strategy, current/historical records, and dimension history.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 4: Model Selection and Runs

Use selected model runs, tags, dependencies, state-aware runs (at a high level), and build vs run vs test.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Module 2: Macros, Seeds and Reusable Patterns

Use seeds, macros, packages, and project reviews to improve reuse and maintainability.

5 lessons

Lesson 1: Seeds

Use seeds for reference data, mapping tables, static datasets, accepted values, and business-controlled lists.

60 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

30 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

22 min

Lesson 2: Macros and Reusable SQL

Create simple macros, DRY SQL, reusable transformations, audit columns, and readability review.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 3: Packages and Reuse

Evaluate dbt packages, common utilities, package discipline, and dependency management.

60 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

30 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

22 min

Lesson 4: dbt Project Review

Review model structure, naming, tests, docs, materializations, performance, and maintainability.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 5: Milestone Project 5 - Advanced dbt Workflow

Upgrade the dbt project with materialization strategy, incremental model, snapshot/history, seed, macro, selected runs, performance notes, and docs.

130 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 7 - Airflow + dbt Integration

Orchestrate dbt with Airflow and build integrated ELT workflows.

2 modules · 9 lessons · 1 week

Module 1: Why Orchestrate dbt with Airflow?

Understand when dbt alone is enough and when Airflow should coordinate broader workflows.

4 lessons

Lesson 1: dbt Alone vs Airflow + dbt

Compare dbt transformations, Airflow orchestration, extract/load before dbt, alerts, dependencies, and multi-system pipelines.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 2: Running dbt from Airflow

Run dbt using BashOperator or PythonOperator approaches, covering commands, paths, environment config, failures, and logs.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 3: Orchestrating dbt Steps

Orchestrate dbt run, dbt test, dbt docs generate, and dbt source freshness, with model selection, task order, and quality gates.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 4: Handling dbt Failures in Airflow

Handle failed tests, failed models, alerting, retries, stopping downstream tasks, warnings, and stakeholder impact.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Module 2: End-to-End ELT Workflow

Build a complete workflow where Airflow extracts/loads data and dbt transforms/tests/publishes outputs.

5 lessons

Lesson 1: Extract and Load Before dbt

Create extraction and raw loading steps, confirm staging availability, and hand off to dbt with completion checks and freshness.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 2: dbt Transformation Layer

Run source freshness, staging models, intermediate models, marts, tests, docs, and materializations as part of the workflow.

75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

37 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

30 min

Lesson 3: Publish and Notify

Handle final mart readiness, dashboard refresh trigger concept, success/failure notifications, consumers, and release notes.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 4: End-to-End Traceability

Trace through Airflow logs, dbt logs, dbt docs, lineage, run metadata, troubleshooting, and source-to-mart flow.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 5: Milestone Project 6 - Airflow + dbt ELT Pipeline

Build an integrated ELT workflow where Airflow extracts/loads and dbt transforms/tests/documents marts.

150 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 8 - Monitoring, Reliability and Operations

Operate orchestrated data workflows with ownership, triage, runbooks, backfills, monitoring, alerts, change management, and production readiness.

2 modules · 9 lessons · 1 week

Module 1: Operating Orchestrated Data Workflows

Define ownership, triage failures, write runbooks, and plan safe backfills.

4 lessons

Lesson 1: Operational Ownership

Define who owns DAGs, dbt models, failures, consumers, support process, and escalation.

60 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

30 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

22 min

Lesson 2: Failure Triage

Diagnose failed extraction, load, dbt model, dbt test, stale source, bad data, dashboard issue, and schema changes.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 3: Runbooks

Write normal run, manual rerun, backfill, failed task recovery, data quality failure, rollback, and alert response instructions.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 4: Backfills and Reprocessing

Plan historical DAG runs, dbt full refresh, incremental backfills, partition reloads, risk controls, and validation.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Module 2: Monitoring and Data Quality Operations

Design monitoring metrics, alerting, change management, and production readiness reviews.

5 lessons

Lesson 1: Monitoring Metrics

Define DAG success rate, task duration, dbt test failure rate, source freshness, anomalies, rejected trends, and cost/performance signals.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 2: Alerting Strategy

Design failure, freshness, and quality alerts, warning vs critical levels, escalation routing, and alert fatigue controls.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 3: Change Management

Manage model changes, schema changes, source changes, downstream reports, release notes, stakeholder communication, and versioning.

65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

32 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

25 min

Lesson 4: Production Readiness Review

Review scheduling, tests, docs, lineage, alerts, runbook, backfill plan, ownership, and maintainability.

70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional orchestration workflow.

35 min

Practice Activity

Apply the lesson through a guided orchestration exercise.

27 min

Lesson 5: Mini Project 2 - Data Workflow Operations Package

Produce operations assets for an Airflow + dbt workflow.

120 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 9 - Capstone

Build a production-aware orchestrated data workflow using Airflow and dbt.

1 module · 3 lessons · 1–2 weeks

Module 1: Airflow and dbt Orchestration Capstone

Students build a complete Airflow + dbt ELT workflow with extraction/loading, dbt transformations, tests, docs, lineage, marts, operations package, and presentation.

3 lessons

Lesson 1: Capstone Options

Choose a realistic Airflow and dbt orchestration capstone domain.

55 min · article · 1 page

Choose Your Airflow and dbt Orchestration Capstone

Review approved capstone options.

55 min

Lesson 2: Final Capstone - Airflow and dbt Orchestration Capstone

Build a production-aware orchestrated data workflow using Airflow and dbt.

220 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Lesson 3: Graduation Requirements and Portfolio Outcome

Clarify completion requirements and portfolio assets.

55 min · article · 1 page

Requirements and Portfolio Checklist

Summarize graduation requirements and portfolio outcomes.

55 min

Course 7: Cloud Data Engineering
Intermediate - 8 weeks - Path only
Learn how data engineering works in the cloud, including storage, compute, managed databases, warehouses, pipeline deployment, access control, monitoring, and cost-aware architecture.
8 phases - 19 modules - 83 lessons - 245 pages

Phase 1 - Cloud Data Engineering Foundations
Build the cloud data engineering mindset: why data engineering moved to the cloud, platform architecture, team roles, core cloud concepts, and service mapping.
2 modules - 9 lessons - 1–2 weeks

Module 1: Cloud Data Engineering Mindset
Understand why cloud platforms matter, how cloud data architectures work, who owns what, and what core cloud concepts data engineers must know.
4 lessons

Lesson 1: Why Data Engineering Moved to the Cloud
Understand why modern data platforms moved to cloud infrastructure: storage scale, elastic compute, managed services, serverless tools, reliability, global access, and cost tradeoffs.
85 min - article - 6 pages

Welcome and Learning Objectives

Introduce the reason cloud became central to data engineering.

8 min

From Fixed Infrastructure to Elastic Platforms

Explain elastic cloud infrastructure.

18 min

Managed Services and Serverless Data Tools

Explain managed services and serverless concepts.

18 min

Reliability and Cost Tradeoffs

Explain cloud tradeoffs.

18 min

Cloud Does Not Replace Data Engineering Fundamentals

Connect cloud to previous path courses.

18 min

Exercise - On-Premise vs Cloud Pipeline Comparison

Students compare on-premise and cloud data pipelines.

23 min

Lesson 2: Cloud Data Platform Architecture
Design cloud data platform architecture using source systems, ingestion, object storage, data lake, metadata catalog, transformation, warehouse, BI, monitoring, and governance.
85 min - article - 5 pages

Welcome and Learning Objectives

Introduce cloud data platform architecture.

8 min

Core Architecture Components

Explain architecture components.

20 min

AWS-First Reference Architecture

Show AWS architecture.

20 min

Architecture Translation Across Clouds

Introduce multi-cloud translation.

18 min

Exercise - Cloud Data Platform Diagram

Students draw a cloud data platform.

39 min

Lesson 3: Cloud Data Engineering Roles
Map responsibilities across cloud data engineers, analytics engineers, platform engineers, BI engineers, data architects, DevOps, and security collaborators.
80 min - article - 4 pages

Welcome and Learning Objectives

Introduce cloud data platform roles.

8 min

Core Roles in Cloud Data Platforms

Explain key roles.

24 min

Ownership Across Platform Layers

Map responsibilities.

18 min

Exercise - Cloud Data Platform Responsibility Map

Students map team responsibilities.

30 min

Lesson 4: Core Cloud Concepts for Data Engineers
Explain regions, availability zones, managed services, serverless, compute, storage, networking basics, identity, permissions, and billing to non-cloud engineers.
85 min - article - 5 pages

Welcome and Learning Objectives

Introduce core cloud concepts.

8 min

Regions, Availability and Managed Services

Explain cloud infrastructure basics.

20 min

Compute, Storage, Networking and Identity

Explain core layers.

22 min

Billing Basics for Data Engineers

Explain cost awareness.

18 min

Exercise - Explain Cloud Pipeline Components

Students explain cloud components to a non-cloud engineer.

37 min

Module 2: Cloud Service Mapping
Map storage, integration, warehouse/query, orchestration, and monitoring services across AWS, Azure/Microsoft Fabric, and Google Cloud.
5 lessons

Lesson 1: Storage Services
Map AWS S3, Azure Data Lake Storage/OneLake, Google Cloud Storage, buckets, containers, prefixes, lifecycle rules, and object metadata.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 2: Data Integration Services
Compare AWS Glue, Azure Data Factory, Fabric Data Factory, Google Dataflow, managed connectors, transformation engines, and serverless jobs.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 3: Warehouse and Query Services
Compare Amazon Redshift, Athena, BigQuery, Synapse/Fabric Warehouse, serverless query, warehouse compute, and lakehouse patterns.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 4: Orchestration and Monitoring Services
Map MWAA, Cloud Composer, Data Factory pipelines, EventBridge, Step Functions, CloudWatch-style monitoring, Azure Monitor, and Google Cloud Monitoring.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 5: Mini Project 1 - Cloud Data Platform Blueprint
Students produce a cloud platform blueprint with AWS-first service choices and multi-cloud comparison.
120 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 2 - Cloud Storage and Data Lake Design
Design cloud object storage, data lake zones, file formats, partitions, and AWS S3 implementation/documentation.
2 modules - 9 lessons - 1–2 weeks

Module 1: Object Storage for Data Engineering
Design object storage and data lake structures with zones, formats, partitions, and performance tradeoffs.
4 lessons

Lesson 1: Object Storage Fundamentals
Understand buckets, objects, prefixes, metadata, durability, storage classes, lifecycle, and data lake patterns.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 2: Data Lake Zones
Design landing, raw, staging, cleaned, curated, serving, archive, and quarantine zones.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 3: File Formats
Choose among CSV, JSON, and Parquet (with Avro at a conceptual level), weighing compression, row vs columnar storage, performance, and tradeoffs.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 4: Partitioning in Data Lakes
Design date, source, region, and event-type partitions, covering pruning, partition depth, bad partitions, and small-file controls.
75 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

37 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

30 min

Module 2: Hands-On AWS S3 Data Lake
Implement or design an AWS S3 data lake with upload, organization, lifecycle, and documentation.
5 lessons

Lesson 1: Creating Data Lake Buckets
Set up S3 buckets, naming, prefixes, environments, dev/staging/prod concept, and ownership.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 2: Uploading and Organizing Data
Use manual upload, AWS CLI, boto3 basics, batch uploads, source naming, batch dates, and raw evidence preservation.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 3: Lifecycle and Retention
Design retention, archive storage, deletion rules, cost control, compliance, and temporary data cleanup.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 4: Data Lake Documentation
Write source inventory, zone definitions, naming standards, partition standards, ownership, retention, and access notes.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 5: Milestone Project 1 - Cloud Data Lake Setup
Build or design an AWS-first data lake with zones, datasets, partitions, formats, lifecycle, access assumptions, and documentation.
130 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 3 - Cloud Ingestion and Managed ETL
Build cloud batch ingestion, Python-to-cloud storage workflows, managed ETL with Glue, cataloging, incremental loads, and ETL monitoring.
2 modules - 10 lessons - 1–2 weeks

Module 1: Cloud Batch Ingestion
Design file, Python, API, database, and managed ETL ingestion into cloud storage.
4 lessons

Lesson 1: File-Based Cloud Ingestion
Design landing zones, file drops, source folders, upload automation, validation, archive patterns, and rejected files.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 2: Python to Cloud Storage
Use boto3 upload/download, S3 reads/writes, credentials, environment variables, error handling, and metadata.
75 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

37 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

30 min

Lesson 3: Database and API to Cloud Storage
Design extracts to cloud storage, raw landing files, batch IDs, timestamps, incremental outputs, and secure credentials.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 4: Managed ETL Services
Compare AWS Glue, Azure Data Factory/Fabric, Google Dataflow, serverless transformation, connectors, and custom Python tradeoffs.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Module 2: Cloud ETL with AWS Glue
Use AWS Glue concepts, crawlers, cataloging, ETL jobs, incremental logic, and monitoring.
6 lessons

Lesson 1: AWS Glue Concepts
Explain Glue jobs, Studio, crawlers, Data Catalog, connections, triggers, bookmarks, and PySpark at a conceptual level.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 2: Glue Crawlers and Cataloging
Use crawlers, schema discovery, tables, partitions, Data Catalog, schema drift risks, and limitations.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 3: Glue ETL Jobs
Design source, transform, and target steps, with PySpark at a conceptual level, parameters, output formats, and curated outputs.
75 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

37 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

30 min

Lesson 4: Incremental Loads in Glue
Design job bookmarks, new files, changed data, partitions, incremental processing, and rerun risk controls.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 5: Glue Job Monitoring
Review job runs, logs, failures, retries, metrics, debugging, and failed job recovery.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 6: Milestone Project 2 - Cloud Ingestion and ETL Pipeline
Build or simulate cloud ingestion, raw landing, cataloging, managed ETL, curated outputs, logs, and documentation.
140 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 4 - Metadata Catalogs and Serverless SQL
Design metadata catalogs, schema evolution handling, partition metadata, governance, serverless SQL, Athena patterns, performance, and quality checks.
2 modules - 9 lessons - 1 week

Module 1: Metadata Catalogs and Schemas
Build catalog thinking for discoverability, schemas, partitions, ownership, and governance.
4 lessons

Lesson 1: Why Catalogs Matter
Explain discoverability, schema management, table definitions, partition metadata, query engines, governance, and ownership.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

30 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

22 min

Lesson 2: Schema Evolution
Manage new columns, changed types, missing fields, drift, backward compatibility, consumer impact, and breaking changes.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 3: Partition Metadata
Handle partition registration, repair, date partitions, partition projection at a conceptual level, and stale or missing partitions.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 4: Data Catalog Governance
Document ownership, table descriptions, column descriptions, sensitive labels, access notes, and stewardship.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Module 2: Serverless SQL on Data Lakes
Query files with SQL, use Athena-style external tables, optimize serverless queries, and perform lake quality checks.
5 lessons

Lesson 1: Querying Files with SQL
Use external tables, SQL over object storage, schema-on-read, serverless query engines, lakehouse patterns, and query cost.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 2: Athena Fundamentals
Create databases, tables, external locations, SQL queries, result locations, history, workgroups, and cost awareness.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 3: Serverless Query Performance
Optimize queries with Parquet, compression, partition pruning, fewer selected columns, avoidance of full scans, result reuse, and fixes for small-file issues.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 4: Lake Query Quality Checks
Write row count, null, duplicate, partition, freshness, and relationship checks using serverless SQL.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 5: Milestone Project 3 - Serverless Data Lake Analytics
Create cataloged lake tables, serverless SQL queries, quality checks, cost/performance notes, and BI-ready outputs.
130 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 5 - Cloud Data Warehousing and Performance
Design cloud warehouses, compare platforms, load lake data, build schemas/marts, use Redshift concepts, optimize performance, and control cost.
3 modules - 13 lessons - 1–2 weeks

Module 1: Cloud Warehouse Foundations
Understand cloud warehouse architecture, platform choices, loading, and schema design.
4 lessons

Lesson 1: What Makes Cloud Warehouses Different
Explain elastic compute, storage/compute separation, serverless options, concurrency, scaling, managed operations, and billing patterns.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

30 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

22 min

Lesson 2: Warehouse Platform Comparison
Compare Amazon Redshift, BigQuery, Azure Synapse/Fabric Warehouse, and (optionally) Snowflake on workload fit, pricing, and ecosystem.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 3: Loading Data into Warehouses
Design bulk loading, COPY-style commands, external tables, staged files, incremental loads, data types, and metadata.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 4: Warehouse Schema Design
Design datasets/schemas, facts/dimensions, marts, partitions, clustering/sort keys, naming, and access boundaries.
75 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

37 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

30 min

Module 2: AWS Redshift and Warehouse Patterns
Use Redshift concepts, lakehouse/external tables, warehouse transformations, and reconciliation.
4 lessons

Lesson 1: Redshift Concepts
Understand the cluster and serverless options, databases, schemas, tables, COPY from S3, distribution/sort concepts, and workload management at a high level.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 2: Lakehouse and External Table Patterns
Choose between query-in-lake and load-into-warehouse, covering external schemas, lakehouse concepts, and materialization tradeoffs.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 3: Cloud Warehouse Transformations
Build staging, dimensional tables, marts, aggregate tables, incremental transformations, and materialized outputs.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 4: Warehouse Quality and Reconciliation
Validate source-to-target counts, financial totals, duplicates, freshness, BI outputs, and reconciliation reports.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Module 3: Cloud Warehouse Performance and Cost
Optimize cloud warehouse performance and cost through partitioning, clustering, materialization, and cost governance.
5 lessons

Lesson 1: Query Performance in Cloud Warehouses
Review large scans, joins, aggregations, sorting, filters, concurrency, and expensive dashboards.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 2: Partitioning, Clustering and Sorting
Choose partitioning, clustering, sort keys, distribution concepts, pruning, data skipping, and cloud-specific differences.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 3: Materialization and Pre-Aggregation
Choose among views, tables, materialized views, and aggregate tables, weighing dashboard acceleration against refresh cost.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 4: Warehouse Cost Control
Control compute, storage, query scans, concurrency, scheduling, result caching, wide tables, and retention.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 5: Milestone Project 4 - Cloud Warehouse Mart
Build or design warehouse schemas, marts, validations, performance strategy, cost notes, and BI-ready outputs.
140 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 6 - Streaming and Event-Driven Data Pipelines
Understand batch vs streaming, event-driven architecture, streaming services, failure modes, AWS/GCP/Azure patterns, and monitoring.
2 modules - 9 lessons - 1–2 weeks

Module 1: Streaming Foundations
Understand event data, streaming architecture, cloud streaming services, and streaming risks.
4 lessons

Lesson 1: Batch vs Streaming Revisited
Classify event data, near-real-time analytics, queues, streams, latency requirements, cost/complexity, and when not to stream.
60 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

30 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

22 min

Lesson 2: Event-Driven Architecture
Design producers, topics/streams, consumers, messages, event schemas, delivery guarantees (at a high level), and replay.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 3: Streaming Ingestion Services
Map AWS Kinesis, Firehose, Azure Event Hubs, Google Pub/Sub, managed ingestion, sinks, and service comparison.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 4: Streaming Data Quality Risks
Handle duplicates, late events, out-of-order events, schema changes, poison messages, dead-letter queues, and replay.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Module 2: Streaming Pipeline Design
Design AWS, GCP, and Azure streaming patterns and monitoring metrics.
5 lessons

Lesson 1: AWS Streaming Pattern
Design a pattern with a producer, Kinesis/Firehose (conceptual), S3 landing, Glue catalog, Athena query, and Redshift load (conceptual).
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 2: Google Streaming Pattern
Compare Pub/Sub, Dataflow, BigQuery, Cloud Storage, templates, and streaming analytics pattern.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 3: Azure Streaming Pattern
Compare Event Hubs, Data Factory/Fabric concepts, Synapse/Fabric/Power BI integration, and stream processing concepts.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 4: Streaming Monitoring
Define lag, failed events, throughput, error rates, dead-letter queues, replay strategy, and consumer failures.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 5: Mini Project 2 - Streaming Architecture Design
Students design a streaming architecture for a realistic event use case.
120 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 7 - Security, Governance, Monitoring and Cost Control
Secure, govern, monitor, operate, and control cost in cloud data platforms.
4 modules - 17 lessons - 1–2 weeks

Module 1: Cloud Security for Data Engineers
Apply IAM, access control, encryption, secrets, privacy, and sensitive data protections.
4 lessons

Lesson 1: IAM Fundamentals
Design users, roles, policies, least privilege, service roles, temporary credentials, boundaries, and human/service access.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 2: Data Access Control
Design bucket policies, object permissions, warehouse permissions, table/column/row access, and environment isolation.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 3: Encryption and Secrets
Handle encryption at rest/in transit, KMS concept, secrets managers, environment variables, rotation, and leaks.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 4: Privacy and Sensitive Data
Protect PII using masking, anonymization, retention, deletion, audit logs, sensitive marts, and the least-data principle.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Module 2: Governance and Data Contracts
Design governance ownership, data contracts, retention, lifecycle rules, and audit evidence.
4 lessons

Lesson 1: Cloud Data Governance
Define ownership, stewardship, data definitions, catalog responsibilities, quality ownership, and access approval.
60 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

30 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

22 min

Lesson 2: Data Contracts in Cloud Pipelines
Write schema expectations, producer/consumer agreements, breaking changes, contract testing, ownership, and alerts.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 3: Retention and Lifecycle Governance
Create retention, archive, deletion, compliance, temporary expiry, cost, and governance policies.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 4: Auditability
Design access logs, job logs, load metadata, change history, lineage, and incident evidence.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Module 3: Monitoring and Operations
Define logs, metrics, monitoring, incident response, and runbooks.
4 lessons

Lesson 1: Cloud Logging and Metrics
Define job logs, query logs, pipeline logs, service metrics, dashboards, alerts, and retention.
60 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

30 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

22 min

Lesson 2: Pipeline Monitoring
Monitor freshness, run success, row counts, schema drift, failed jobs, quality failures, and query failures.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 3: Incident Response
Respond to failed ETL jobs, bad published data, access leaks, cost spikes, outages, failed loads, and recovery.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 4: Runbooks
Write runbook steps for normal operation, failed runs, reruns, backfills, access requests, troubleshooting, and escalation.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Module 4: Cost Engineering / FinOps for Data
Control storage, query, ETL, warehouse, streaming, logs, transfer, and idle resource costs.
5 lessons

Lesson 1: Cloud Data Cost Drivers
Identify storage, query scans, ETL compute, warehouse compute, streaming ingestion, logs, transfer, and idle resources.
60 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

30 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

22 min

Lesson 2: Storage Cost Control
Use lifecycle policies, compression, retention, archive tiers, temp deletion, small file control, and partitions.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 3: Query and Warehouse Cost Control
Optimize partitions, columnar formats, query limits, pre-aggregation, sizing, scheduling, and dashboards.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 4: Cost Monitoring and Budgets
Use budgets, alerts, tagging, allocation, reports, chargeback/showback concept, and cost review cadence.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 5: Milestone Project 5 - Cloud Data Operations Package
Produce security, governance, monitoring, incident response, runbook, and cost-control assets.
140 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 8 - Multi-Cloud Translation and Capstone
Translate AWS architectures to Azure and Google Cloud, choose a cloud strategy, prepare career answers, and complete the final capstone.
2 modules · 7 lessons · 1–2 weeks

Module 1: Multi-Cloud Translation
Translate AWS-first cloud data architectures into Azure/Microsoft Fabric and Google Cloud equivalents.
4 lessons

Lesson 1: AWS to Azure Translation
Translate S3 to ADLS/OneLake, Glue to Data Factory/Fabric, Redshift/Athena to Synapse/Fabric/SQL endpoints, IAM to Azure RBAC, CloudWatch to Azure Monitor, and Kinesis to Event Hubs.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 2: AWS to Google Cloud Translation
Translate S3 to Cloud Storage, Glue to Dataflow, Athena/Redshift to BigQuery, Kinesis to Pub/Sub, MWAA to Composer, and CloudWatch to Cloud Monitoring.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

32 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

25 min

Lesson 3: Choosing a Cloud Strategy
Evaluate company ecosystem, tools, talent, cost, compliance, BI integration, credits, maturity, and hiring market.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Lesson 4: Cloud Data Engineering Career Readiness
Prepare your portfolio, certification awareness, interview scenarios, architecture explanations, cost/security questions, and production thinking.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a professional cloud platform workflow.

35 min

Practice Activity

Apply the lesson through a guided cloud data engineering exercise.

27 min

Module 2: Cloud Data Engineering Capstone
Students design and build a cloud data platform MVP using an AWS-first implementation with Azure and Google Cloud translation notes.
3 lessons

Lesson 1: Capstone Options
Choose a realistic cloud data engineering capstone domain.
55 min · article · 1 page

Choose Your Cloud Data Engineering Capstone

Review approved capstone options.

55 min

Lesson 2: Final Capstone - Cloud Data Engineering Capstone
Students design and build a cloud data platform MVP using an AWS-first implementation with Azure and Google Cloud translation notes.
220 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Lesson 3: Graduation Requirements and Portfolio Outcome
Clarify completion requirements and portfolio outcomes.
55 min · article · 1 page

Requirements and Portfolio Checklist

Summarize graduation requirements and portfolio assets.

55 min

8. Data Engineering Studio
Intermediate · 8 weeks · Path only
Apply Python, SQL, data modeling, warehousing, pipelines, orchestration, dbt, and cloud workflows to build portfolio-ready data engineering projects.
8 phases · 17 modules · 73 lessons · 213 pages

Phase 1 - Capstone Discovery and Data Platform Scope
Choose and scope the right capstone problem before building: domain, sources, consumers, MVP boundaries, non-goals, risks, and concept brief.
1 module · 5 lessons · 1 week

Module 1: Choosing the Right Data Engineering Problem
Select a realistic, valuable, and portfolio-worthy data engineering capstone problem.
5 lessons

Lesson 1: What Makes a Strong Data Engineering Capstone?
Learn what separates a serious data engineering capstone from a small script project: realistic sources, downstream consumers, warehouse outputs, quality risks, operations, and portfolio strength.
90 min · article · 5 pages

Welcome and Learning Objectives

Introduce the standard for a strong studio capstone.

8 min

Signals of a Strong Capstone

Explain the core ingredients.

20 min

Realism, Difficulty and Business Value

Teach how to score project ideas.

18 min

Examples of Strong Capstone Directions

Give realistic domain examples.

18 min

Exercise - Capstone Idea Ranking

Students review and score project ideas.

26 min

Lesson 2: Avoiding Weak Data Engineering Projects
Identify weak capstone ideas and upgrade them into stronger platform projects with validation, orchestration, modeling, incremental logic, consumers, monitoring, and documentation.
85 min · article · 4 pages

Welcome and Learning Objectives

Introduce weak project patterns.

8 min

Weak Project Patterns

Explain common weak capstone mistakes.

20 min

How to Upgrade a Weak Project

Show upgrade strategy.

22 min

Exercise - Weak Project Rewrite

Students rewrite weak pipeline ideas into stronger platform projects.

35 min

Lesson 3: Choosing a Business Domain
Choose a capstone domain and identify realistic sources, consumers, outputs, quality risks, and platform scope.
85 min · article · 4 pages

Welcome and Learning Objectives

Introduce domain choice.

8 min

Approved Studio Domains

List and explain domain options.

22 min

Domain Selection Criteria

Explain how to choose.

20 min

Exercise - Domain Selection Brief

Students choose a capstone domain and identify sources and consumers.

35 min

Lesson 4: Defining Scope and Non-Goals
Define MVP scope, must-have pipelines, nice-to-have features, simulations, build decisions, documentation scope, and overbuild boundaries.
85 min · article · 4 pages

Welcome and Learning Objectives

Introduce scope control.

8 min

MVP Scope for a Data Platform

Explain MVP thinking.

20 min

Must-Haves, Nice-to-Haves and Non-Goals

Teach scope categories.

22 min

Exercise - Capstone v1 Scope and Non-Goals

Students define scope and non-goals.

35 min

Lesson 5: Mini Project 1 - Data Platform Concept Brief
Students define their capstone concept, business value, scope, risks, sources, consumers, and expected outputs.
110 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 2 - Requirements, Architecture and Data Modeling
Convert the capstone idea into requirements, architecture, tooling decisions, source models, warehouse layers, dimensional models, marts, KPIs, and risk register.
3 modules · 13 lessons · 1–2 weeks

Module 1: Data Platform Requirements
Capture business, source, quality, and operational requirements for the capstone platform.
4 lessons

Lesson 1: Business and Analytics Requirements
Define stakeholders, BI requirements, analytics requirements, AI/ML downstream needs, freshness, granularity, and metrics.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 2: Source System Requirements
Document source ownership, schema expectations, data contracts, extraction frequency, reliability, drift risk, and access.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 3: Data Quality Requirements
Define completeness, validity, uniqueness, freshness, consistency, reconciliation, accepted values, and relationships.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 4: Operational Requirements
Define schedule, retries, monitoring, alerts, backfills, reruns, support ownership, and incident response.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Module 2: Architecture Design
Design source-to-consumer architecture, architecture style, tooling matrix, and environment strategy.
4 lessons

Lesson 1: Source-to-Consumer Architecture
Draw source systems, ingestion, raw storage, staging, transformations, warehouse, marts, consumers, and monitoring.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 2: Lake, Warehouse, or Hybrid Architecture
Choose data lake, warehouse, lakehouse concept, serverless SQL, marts, and cost/complexity tradeoffs.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 3: Tooling Decisions
Choose Python, SQL, dbt, Airflow, cloud storage, serverless query, warehouse, BI output, and monitoring tools.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 4: Environment and Deployment Design
Plan local dev, cloud dev, staging/production concepts, environment variables, secrets, service accounts, and reproducible setup.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Module 3: Data Modeling Blueprint
Create source models, warehouse layer models, dimensional models, and marts/KPI designs.
5 lessons

Lesson 1: Source Modeling
Document source tables, files, APIs, schemas, keys, relationships, and source limitations.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 2: Warehouse Layer Modeling
Map the raw, staging, intermediate, curated, and mart layers, along with naming standards and lineage.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 3: Dimensional Modeling
Design facts, dimensions, grain, conformed dimensions, date dimensions, factless facts, and snapshots.
75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

37 min

Practice Activity

Apply the lesson through a guided studio activity.

30 min

Lesson 4: Metric and Mart Design
Define KPIs, aggregate tables, reporting marts, BI-ready outputs, consistency, and dashboard consumers.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 5: Milestone Project 1 - Data Platform Blueprint
Produce the complete blueprint for the capstone platform.
140 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 3 - Build Sprint 1: Ingestion and Raw Layer
Set up the repository, simulate sources, build ingestion pipelines, load raw data, capture metadata, and demonstrate ingestion evidence.
2 modules · 9 lessons · 1–2 weeks

Module 1: Project Setup and Source Simulation
Prepare the repository, source datasets, configuration, secrets, and sprint plan.
4 lessons

Lesson 1: Repository Setup
Set up project structure, src, dags, dbt project, configs, tests, docs, samples, and logs.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 2: Source Data Preparation
Prepare sample files, simulated API, source database, event logs, seed data, and source realism.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 3: Configuration and Secrets
Use config files, .env, credentials, local vs cloud config, safe defaults, and secret safety.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 4: Build Plan and Sprint Board
Create build milestones, tasks, dependencies, blockers, review points, and acceptance criteria.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Module 2: Ingestion Layer
Build file, API, database, and raw loading workflows with metadata, errors, and logs.
5 lessons

Lesson 1: File Ingestion
Implement discovery, naming patterns, batch dates, archive folders, bad file handling, and source metadata.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 2: API or Simulated API Ingestion
Implement REST extraction, pagination, retries, rate limits, incremental extraction, and API logs.
75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

37 min

Practice Activity

Apply the lesson through a guided studio activity.

30 min

Lesson 3: Database Extraction
Implement source database connection, query extraction, incremental extraction, chunked reads, and connection error handling.
75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

37 min

Practice Activity

Apply the lesson through a guided studio activity.

30 min

Lesson 4: Raw Layer Loading
Load raw files/tables using batch ID, loaded_at, source tracking, and the immutable raw layer concept.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 5: Milestone Project 2 - Ingestion and Raw Layer Demo
Demonstrate ingestion from multiple sources and raw layer loading with logs and metadata.
140 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 4 - Build Sprint 2: Staging, Transformations and Warehouse Models
Build staging models, validation, rejected record handling, intermediate models, facts/dimensions, marts, and lineage.
2 modules · 9 lessons · 1–2 weeks

Module 1: Staging Layer
Create source-to-staging models, schema checks, rejected record handling, and staging documentation.
4 lessons

Lesson 1: Source-to-Staging Models
Standardize names, cast types, perform light cleaning, preserve traceability, and follow staging naming conventions.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 2: Schema and Contract Checks
Check required columns, expected types, missing columns, schema drift, and source contract violations.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 3: Rejected Records
Handle invalid rows, rejection reasons, quarantine table/file, bad data review, and reprocessing.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 4: Staging Documentation
Document source mapping, column dictionary, source assumptions, known issues, and owner notes.
65 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Module 2: Transformation Layer
Build intermediate models, dimensional models, marts, dependencies, and lineage.
5 lessons

Lesson 1: Intermediate Models
Build reusable joins, business logic, entity models, and event models so that SQL is not repeated across models.
75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

37 min

Practice Activity

Apply the lesson through a guided studio activity.

30 min

Lesson 2: Dimensional Models
Build fact tables and dimension tables with a defined grain, surrogate keys, a date dimension, and conformed dimensions.
80 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

40 min

Practice Activity

Apply the lesson through a guided studio activity.

32 min

Lesson 3: Mart Models
Build BI-ready tables, KPI tables, aggregate tables, friendly columns, and dashboard consumption outputs.
75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

37 min

Practice Activity

Apply the lesson through a guided studio activity.

30 min

Lesson 4: Lineage and Model Dependencies
Create source-to-target mapping, dbt refs, dependency graph, model order, and downstream impact notes.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 5: Milestone Project 3 - Warehouse Model and Mart Demo
Demonstrate staging, intermediate, dimensional, and mart outputs with lineage and queries.
150 min · article · 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 5 - Build Sprint 3: Quality, Incremental Logic and Orchestration
Add data quality tests, reconciliation, quality reports, incremental logic, safe reruns, backfills, and Airflow orchestration.
3 modules · 13 lessons · 1–2 weeks

Module 1: Data Quality Layer
Design and implement data quality tests, reconciliation, and quality reporting.
4 lessons

Lesson 1: Quality Test Design
Design uniqueness, not-null, accepted values, relationships, row counts, freshness, and reconciliation tests.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 2: dbt Tests and Custom SQL Tests
Implement generic tests, custom tests, severity, warnings vs errors, and test documentation.
75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

37 min

Practice Activity

Apply the lesson through a guided studio activity.

30 min

Lesson 3: Reconciliation
Reconcile source totals, staging totals, mart totals, financial totals, thresholds, and mismatches.
75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

37 min

Practice Activity

Apply the lesson through a guided studio activity.

30 min

Lesson 4: Data Quality Reporting
Generate quality summary, failed tests, warnings, rejected records, run-level metrics, and stakeholder visibility.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Module 2: Incremental Loads, Backfills and Reruns
Implement incremental strategy, safe reruns, and backfill planning.
4 lessons

Lesson 1: Incremental Strategy
Choose append-only, upsert, merge, high-watermark, partition reload, and full-refresh fallback strategies.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 2: Implementing Incremental Models
Implement updated_at filters, unique keys, incremental dbt models, source state tracking, and late-arriving data handling.
75 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

37 min

Practice Activity

Apply the lesson through a guided studio activity.

30 min

Lesson 3: Safe Reruns
Design for idempotency: rerunning failed batches, duplicate prevention, temp/staging swap, and batch IDs.
70 min · article · 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 4: Backfill Plan
Plan historical reload, date windows, dependency order, validation after backfill, and rollback.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Module 3: Airflow Orchestration
Create DAG design, ingestion tasks, dbt orchestration, notifications, and run tracking.
5 lessons

Lesson 1: DAG Design
Design task boundaries, dependencies, scheduling, retries, task groups, and quality gates.
75 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

37 min

Practice Activity

Apply the lesson through a guided studio activity.

30 min

Lesson 2: Orchestrating Ingestion and Loading
Add extract tasks, raw load tasks, metadata logging, failure handling, and retries.
75 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

37 min

Practice Activity

Apply the lesson through a guided studio activity.

30 min

Lesson 3: Orchestrating dbt
Add dbt run, dbt test, dbt docs, source freshness, and failure behavior.
75 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

37 min

Practice Activity

Apply the lesson through a guided studio activity.

30 min

Lesson 4: Notifications and Run Tracking
Add success alerts, failure alerts, run summary, task logs, metadata table, and operational visibility.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 5: Milestone Project 4 - Orchestrated Quality-Controlled Pipeline
Demonstrate quality-controlled, incremental, orchestrated pipeline behavior.
150 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 6 - Cloud, Security, Monitoring and Operations
Design cloud deployment, multi-cloud translation, access/security, governance, monitoring, incidents, cost review, and runbook.
3 modules - 13 lessons - 1–2 weeks

Module 1: Cloud Deployment Design
Map local platform design to cloud storage, query, warehouse, deployment, and multi-cloud architecture.
4 lessons

Lesson 1: Cloud Storage Layout
Design raw, staging, curated, archive zones, partitions, and lifecycle rules.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 2: Cloud Warehouse / Query Layer
Map warehouse schemas, external tables, serverless SQL, marts, BI access, and compute/cost tradeoffs.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 3: Deployment Plan
Document infrastructure choices, environment variables, secrets, service accounts, network considerations, and managed services.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 4: Multi-Cloud Translation
Translate AWS implementation, Azure equivalent, GCP equivalent, tradeoffs, and employer ecosystem.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Module 2: Security and Governance
Create access control, sensitive data handling, governance ownership, and risk notes.
4 lessons

Lesson 1: Access Control
Design least privilege, role-based access, bucket/table permissions, department access, and service accounts.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 2: Sensitive Data Handling
Identify PII, masking, anonymization, retention, deletion, and audit logs.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 3: Data Governance
Define ownership, data definitions, approval process, quality responsibility, documentation standards, and data contracts.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 4: Compliance and Risk Notes
Write privacy expectations, auditability, retention, source agreements, user/customer data risk, and operational risk.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Module 3: Monitoring and Cost Control
Define monitoring metrics, incident response, cost review, and operations handoff.
5 lessons

Lesson 1: Monitoring Metrics
Track pipeline success, freshness, row count anomalies, quality failures, duration, volume, and cost.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 2: Alerts and Incident Response
Respond to failed pipelines, stale marts, failed checks, bad data, cost spikes, and access issues.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 3: Cost Review
Estimate storage, compute, query, orchestration, log costs, and optimization options.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 4: Runbook and Operations Handoff
Document normal run, failed run, manual rerun, backfill, access request, troubleshooting, and ownership.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 5: Milestone Project 5 - Production Readiness Review
Produce a production readiness package for cloud, security, monitoring, cost, and operations.
140 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Phase 7 - BI, Analytics, AI Readiness and Portfolio Packaging
Prepare downstream consumer outputs, documentation, case study, technical evidence, defense answers, and final demo.
2 modules - 8 lessons - 1 week

Module 1: Serving Downstream Consumers
Validate BI, analytics, data science, AI readiness, and consumer-facing documentation.
4 lessons

Lesson 1: BI Readiness
Validate Power BI-ready marts, semantic considerations, dashboard queries, metrics, date dimensions, and friendly columns.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 2: Analytics Readiness
Support self-service, documented datasets, clean joins, query examples, safe filters, and known limitations.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 3: Data Science and AI Readiness
Identify feature-ready datasets, RAG/AI preparation, event data, history, quality expectations, and lineage.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 4: Consumer Documentation
Write a table guide, KPI guide, query examples, dashboard notes, freshness notes, and caveats.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Module 2: Portfolio Case Study and Technical Defense
Prepare the case study, evidence, technical defense, and final demo.
4 lessons

Lesson 1: Writing the Case Study
Draft problem, architecture, sources, transformations, orchestration, checks, cloud plan, outcomes, and limitations.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 2: Showing Technical Evidence
Select architecture diagrams, DAG screenshots, dbt lineage, SQL models, quality report, logs, and outputs.
65 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

32 min

Practice Activity

Apply the lesson through a guided studio activity.

25 min

Lesson 3: Technical Review Preparation
Prepare answers on architecture, model choices, reruns, data quality, failures, and cost controls.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Lesson 4: Final Demo Preparation
Rehearse demo flow, repo walkthrough, DAG walkthrough, dbt walkthrough, marts, operations package, and clarity.
70 min - article - 3 pages

Overview and Learning Objectives

Introduce the lesson and clarify expected outcomes.

8 min

Concepts and Professional Workflow

Explain the concept through a capstone delivery workflow.

35 min

Practice Activity

Apply the lesson through a guided studio activity.

27 min

Phase 8 - Final Capstone and Technical Defense
Build and defend a complete data platform MVP combining ingestion, warehousing, transformations, orchestration, quality, monitoring, cloud readiness, documentation, and downstream consumption.
1 module - 3 lessons - 1 week

Module 1: Data Engineering Capstone
Complete and present the final end-to-end data engineering capstone.
3 lessons

Lesson 1: Capstone Options
Choose or confirm a serious final data platform option.
60 min - article - 1 page

Choose Your Data Engineering Capstone

Review approved capstone options.

60 min

Lesson 2: Final Project - Data Engineering Capstone
Students build and defend a complete data platform MVP.
260 min - article - 2 pages

Project Brief

Explain the project scenario and expected output.

20 min

Review Checklist

Checklist for project quality.

20 min

Lesson 3: Graduation Requirements and Portfolio Outcome
Clarify studio completion requirements and portfolio assets.
60 min - article - 1 page

Requirements and Portfolio Checklist

Summarize graduation requirements and portfolio outcomes.

60 min

Skills and tools

Tools are taught through projects, not isolated checklists.

  • Use Python to automate data movement and processing.
  • Write advanced SQL for transformation and data workflows.
  • Design clean data models for analytics and reporting.
  • Understand how data warehouses are structured.
  • Build ETL and ELT pipelines from raw data to trusted outputs.
  • Orchestrate data workflows with scheduling and dependency thinking.
  • Apply dbt-style transformation, testing, and documentation workflows.
  • Understand cloud data engineering concepts and deployment thinking.
  • Build portfolio-ready data engineering projects.
  • Prepare for junior data engineering roles.

Projects and portfolio

Use Python to automate data movement and processing.

Write advanced SQL for transformation and data workflows.

Design clean data models for analytics and reporting.

Understand how data warehouses are structured.

Build ETL and ELT pipelines from raw data to trusted outputs.

Orchestrate data workflows with scheduling and dependency thinking.

Apply dbt-style transformation, testing, and documentation workflows.

Understand cloud data engineering concepts and deployment thinking.

Build portfolio-ready data engineering projects.

Prepare for junior data engineering roles.

Portfolio outcomes

  • Use Python to automate data movement and processing.
  • Write advanced SQL for transformation and data workflows.
  • Design clean data models for analytics and reporting.
  • Understand how data warehouses are structured.
  • Build ETL and ELT pipelines from raw data to trusted outputs.
  • Orchestrate data workflows with scheduling and dependency thinking.
  • Apply dbt-style transformation, testing, and documentation workflows.
  • Understand cloud data engineering concepts and deployment thinking.
  • Build portfolio-ready data engineering projects.
  • Prepare for junior data engineering roles.

Mentorship

Self-paced learning with feedback options.

TechOga paths are structured for independent progress, with stronger feedback loops available through weekly live-session and premium one-on-one support.

Self-Paced Only: Structured course access for learners who can move independently and want clear lessons, resources, exercises, and portfolio direction.

Self-Paced + Weekly Instructor-Led Live Sessions: A stronger support model with weekly instructor-led live sessions, weekly exercises, instructor reviews, and accountability across a path.

Self-Paced + Premium One-on-One Sessions/Mentorship: Self-paced access plus premium one-on-one sessions/mentorship for learners who want deeper review, private guidance, career assets, and tailored accountability.

Support level

Self-Paced Only

₦490,000

Pay once or split into fixed installments.

Upfront Payment

₦490,000 due today

  • ₦490,000 at enrollment

Two-Part Installment

₦294,000 due today

  • ₦294,000 at enrollment
  • ₦196,000 after 30 days

Access starts after your first confirmed payment.

Most guided

Self-Paced + Weekly Instructor-Led Live Sessions

₦690,000

Pay once or split into fixed installments.

Upfront Payment

₦450,000 due today

  • ₦450,000 at enrollment

Two-Part Installment

₦270,000 due today

  • ₦270,000 at enrollment
  • ₦180,000 after 30 days

Three-Part Installment

₦180,000 due today

  • ₦180,000 at enrollment
  • ₦135,000 after 30 days
  • ₦135,000 after 60 days

Access starts after your first confirmed payment.

Support level

Self-Paced + Premium One-on-One Sessions/Mentorship

₦920,000

Pay once or split into fixed installments.

Upfront Payment

₦920,000 due today

  • ₦920,000 at enrollment

Two-Part Installment

₦552,000 due today

  • ₦552,000 at enrollment
  • ₦368,000 after 30 days

Three-Part Installment

₦368,000 due today

  • ₦368,000 at enrollment
  • ₦276,000 after 30 days
  • ₦276,000 after 60 days

Access starts after your first confirmed payment.

FAQ

Questions about this path.

Path-specific answers keep the enrollment decision practical.

Data engineering is distinct from data analytics. Data analytics focuses on insight and reporting. Data engineering focuses on building the pipelines, warehouses, models, and workflows that make reliable analysis possible.

Start with structure

Become the engineer behind reliable analytics and AI data systems.

Build the skills to move, clean, model, transform, orchestrate, and deliver data that teams can trust for dashboards, reporting, machine learning, and business decisions.

Support model guide