Skip to main content

Introduction to Data Warehouses

A data warehouse is a system specifically designed for reporting and analysis. Unlike an operational database (OLTP) optimized for fast reads and writes of individual records, a warehouse is built for scanning millions of rows and aggregating results quickly.

Placeholder Lesson

This lesson is a work in progress. Content will be expanded with architecture diagrams, hands-on labs, and a video walkthrough.

OLTP vs OLAP

PropertyOLTP DatabaseData Warehouse (OLAP)
PurposeRun the applicationAnalyze the business
Query typeSingle-row lookupsFull-table scans, aggregations
SchemaNormalized (3NF)Denormalized (star schema)
Update frequencyContinuousBatch / streaming
ExamplesPostgreSQL, MySQLSnowflake, BigQuery, Redshift

The Modern Cloud Data Warehouse

Cloud warehouses separate compute from storage. This means you only pay for the queries you run, and you can scale compute independently of how much data you store.

Popular options:

  • Snowflake — multi-cloud, virtual warehouses, time travel
  • BigQuery — serverless, Google Cloud native, per-query pricing
  • Amazon Redshift — provisioned or serverless, AWS native
  • DuckDB — embedded, local-first, great for development and small datasets

Dimensional Modeling Basics

The most common warehouse schema pattern is the star schema:

dim_customers
|
dim_dates — fact_orders — dim_products
|
dim_channels
  • Fact tables store measurable events (orders, sessions, clicks)
  • Dimension tables store descriptive attributes (customer name, product category)

What's Next

In the next lesson you'll choose a warehouse, connect to it, and run your first analytical query.