Subqueries and CTEs: Stop Writing One Giant Query | SQL for Finance Ep5
0views
C
CelesteAI
Description
AtlasParts dataset and atlasparts.db:
https://github.com/GoCelesteAI/excel-powerbi-for-finance
Source code: https://github.com/GoCelesteAI/sql-for-finance
Episode 5 of SQL for Finance. Subqueries and CTEs — and the four-stage monthly P and L pipeline that mirrors what your finance team actually maintains in production.
By Episode 4 you can stitch any two AtlasParts tables together, aggregate at the right grain, and avoid the cardinality traps. That covers most one-shot reporting questions. But the moment a question has two parts — show me customers above the average, or build me a monthly P and L — a single SELECT runs out of room. This episode is about how you break a hard question into stages. There are two tools for that: subqueries (a query inside another query), and CTEs (a WITH clause that names each stage). Both compute the same things. CTEs make the code readable; subqueries make it compact.
What You'll Learn:
- The two-step problem — why one SELECT can't answer questions like "above the average"
- Subqueries — a complete SELECT dropped inside another query, returning one value or one table
- The three positions a subquery can sit in: WHERE (one number), FROM (a table), SELECT (one extra value per row)
- CTEs — WITH name AS (SELECT ...), the named, top-to-bottom stage syntax
- A four-stage monthly P and L pipeline — revenue, COGS, OpEx, net profit — built from gl_journal in one CTE
- The decision rule: when to reach for a subquery vs a CTE
- Why nesting more than 2 levels deep means it's time to refactor
Timestamps:
0:00 - Intro — Subqueries and CTEs
0:40 - The two-step problem
1:30 - Above-average customers (subquery in HAVING)
3:00 - Where subqueries can sit (WHERE / FROM / SELECT)
3:40 - Refactor to a CTE — same answer, top-to-bottom
5:15 - The centerpiece — monthly P and L pipeline (revenue, COGS, OpEx, net)
7:35 - When to use which — decision rules
8:30 - End screen
Key Takeaways:
1. The hardest gear-shift moving from Excel to SQL is realizing that one query can answer questions that feel like two steps. In Excel you put the average in cell B1 and filter against it; in SQL you can't reference a "previous result" — but you can drop a subquery inside another query. The inner SELECT computes the value, the outer SELECT uses it.
2. Subqueries can sit in three places. In the WHERE clause, returning one number to filter against. In the FROM clause, returning a derived table you can join or aggregate. And in the SELECT list, returning one extra computed column per row of the outer query. Same SQL feature, three positions.
3. CTEs (Common Table Expressions) are the WITH-clause syntax for the same idea — but with names. Once you write WITH per_customer AS (SELECT ...), you can reference per_customer like a table for the rest of the query. CTEs read top-to-bottom; subqueries read inside-out. For multi-stage pipelines, CTEs are the only sane choice.
4. The centerpiece query in this episode is a real four-stage monthly P and L. One CTE buckets every gl_journal entry into revenue, COGS, or OpEx using SUM(CASE WHEN ...) on account_type and account_id. The outer SELECT computes gross_profit and net_profit from those columns. The result is a 12-month income statement that tells a story — AtlasParts loses money in January, breaks even in March, and grows to a 188 thousand profit by November.
5. When to reach for which: subquery if the inner query returns one number used in one place and the whole thing fits on a screen. CTE if a stage is reused, the query has more than two stages, or anyone other than you will read the code. The hard rule: nest a subquery more than two levels deep, stop and rewrite as CTEs. Future-you will thank you.
#SQL #SQLite #FinanceAnalytics #DataAnalytics #SQLforFinance #CTE #Subquery #WITHclause #FinancialReporting #ProfitAndLoss #ERPdata #Database #LearnSQL
---
Generated by GoCelesteAI · part of the SQL for Finance series