# Curriculum Guide

This curriculum is designed to move a learner from **zero SQL confidence** to **expert-level analytical maturity**.

The progression is intentional:
- simple
- guided
- practical
- cumulative
- increasingly analytical

---

# Stage 1 — Foundations

## Module 1: SQL Mindset and Relational Thinking
**Goal:** Build conceptual clarity before syntax overload.

### Learner should understand
- what SQL is
- where SQL fits in analytics and data science
- the meaning of table, row, column, record, field
- keys and relationships
- how to inspect a schema

### Why this module matters
If learners skip this conceptual layer, joins and grouped analysis become mechanical memorisation instead of understanding.

---

## Module 2: SELECT, LIMIT, ORDER BY, and Aliases
**Goal:** Retrieve data cleanly and readably.

### Topics
- SELECT column(s)
- FROM table
- LIMIT
- ORDER BY ASC and DESC
- column aliases
- avoiding overuse of `SELECT *`

### Outcome
The learner can inspect data and create basic readable outputs.

---

## Module 3: Filtering with WHERE
**Goal:** Ask focused questions of a dataset.

### Topics
- comparison operators
- AND / OR
- BETWEEN
- IN
- LIKE
- NULL checks
- date filters

### Outcome
The learner can narrow data to relevant records.

---

## Module 4: Calculated Columns, CASE Logic, and NULL Safety
**Goal:** Transform raw data into analytical output.

### Topics
- arithmetic in SQL
- derived columns
- CASE WHEN
- COALESCE
- simple categorisation logic
- rounding

### Outcome
The learner can add business meaning to raw data.

---

# Stage 2 — Analyst Core

## Module 5: Aggregation
**Goal:** Move from row-level inspection to summary-level analysis.

### Topics
- COUNT
- SUM
- AVG
- MIN
- MAX
- row-level versus summary-level thinking

### Outcome
The learner can answer “how many,” “how much,” and “what is the average” questions.

---

## Module 6: GROUP BY and HAVING
**Goal:** Compare segments and categories.

### Topics
- grouping by one column
- grouping by multiple columns
- HAVING versus WHERE
- grouped metrics for reports

### Outcome
The learner can compare categories, cohorts, channels, and departments.

---

## Module 7: Joins for Real Analysis
**Goal:** Connect related tables correctly.

### Topics
- INNER JOIN
- LEFT JOIN
- join keys
- grain awareness
- duplicate risk in joins
- multi-table joins
- self joins

### Outcome
The learner can combine datasets into one analytical result.

---

## Module 8: String, Date, and Numeric Functions
**Goal:** Clean and standardise values.

### Topics
- TRIM
- UPPER / LOWER
- LENGTH
- concatenation
- date extraction
- numeric formatting

### Outcome
The learner can prepare more analysis-ready fields directly in SQL.

---

# Stage 3 — Data Science SQL

## Module 9: Subqueries and CTEs
**Goal:** Handle multi-step analytical thinking.

### Topics
- subqueries in WHERE
- subqueries in FROM
- nested logic
- Common Table Expressions
- readable multi-step query design

### Outcome
The learner can break a complex problem into smaller logical steps.

---

## Module 10: Window Functions and Ranking
**Goal:** Perform advanced row-aware calculations.

### Topics
- ROW_NUMBER
- RANK
- DENSE_RANK
- PARTITION BY
- ORDER BY inside windows
- LAG / LEAD
- running totals

### Outcome
The learner can rank, compare, and track trends without collapsing detail rows.

---

## Module 11: Data Cleaning and Quality Checks in SQL
**Goal:** Improve reliability before reporting or modelling.

### Topics
- duplicate detection
- missing-value checks
- suspicious values
- audit summaries
- quality flags with CASE
- support-risk detection queries

### Outcome
The learner stops trusting data blindly.

---

## Module 12: Exploratory Data Analysis with SQL
**Goal:** Use SQL to investigate rather than merely report.

### Topics
- segment analysis
- cohort analysis
- trend analysis
- high versus low performers
- outlier exploration
- comparing business or learner groups

### Outcome
The learner can use SQL as an exploratory reasoning tool.

---

# Stage 4 — Advanced Analytics Engineering

## Module 13: Feature Engineering with SQL for Machine Learning
**Goal:** Connect SQL directly to model-building workflows.

### Topics
- entity-level feature tables
- recency, frequency, averages, ratios
- behaviour-based features
- wide-table preparation for ML pipelines

### Outcome
The learner can create model-ready variables using SQL.

---

## Module 14: Query Optimisation and Analytics Engineering Foundations
**Goal:** Move from working SQL to maintainable SQL.

### Topics
- query readability
- CTE structure
- indexing concepts
- analytical grain
- fact versus dimension thinking
- naming and documentation

### Outcome
The learner writes cleaner, more maintainable, more scalable SQL.

---

## Module 15: Capstone Labs, Portfolio Projects, and Interview Readiness
**Goal:** Convert learning into evidence.

### Topics
- scenario-based mini-projects
- stakeholder-style explanation
- dashboard-support queries
- interview-style problems
- portfolio packaging

### Outcome
The learner can demonstrate SQL skill publicly and professionally.

---

# Suggested teaching sequence

## Beginner pacing
- 2 to 3 modules per week
- daily short practice
- one review day every week

## Intermediate pacing
- one major module every 4 to 5 days
- stronger focus on joins and grouped analysis
- repeated practice using the built-in query packs

## Advanced pacing
- project-first learning
- challenge completion
- business-question translation practice
- explanation of SQL in plain English

---

# Assessment design suggestion

A strong SQL platform should assess at multiple levels.

## Level 1 — Recall and syntax confidence
Examples:
- select the right columns
- filter correctly
- sort correctly

## Level 2 — Analytical reasoning
Examples:
- group by category
- compare revenue by channel
- find low-performing cohorts

## Level 3 — Multi-step logic
Examples:
- subqueries
- CTEs
- joins with conditions
- derived metrics

## Level 4 — Expert interpretation
Examples:
- explain the grain of a query
- justify why a LEFT JOIN is required
- discuss why a metric may mislead stakeholders

---

# Recommended capstone project tracks

## Track A — Retail analytics
Possible questions:
- Which channel drives the highest revenue?
- Which customers are above average in revenue?
- Which product category performs best?

## Track B — EdTech analytics
Possible questions:
- Which learners are at support risk?
- Which course has the best average score?
- What attendance patterns predict weaker performance?

## Track C — Workforce analytics
Possible questions:
- Which departments pay the most?
- Who reports to whom?
- Who ranks highest by salary or performance within each department?

---

# Why this curriculum works

This curriculum is strong because it is:
- cumulative
- beginner-safe
- practical
- aligned with real data work
- connected to business, education, and data science use cases
- suitable for portfolio building

It does not teach syntax in isolation.
It teaches **thinking with SQL**.
