Window Functions: When Aggregates Shouldn't Collapse Your Rows | SQL for Finance Ep6

0views
C
CelesteAI
Description
AtlasParts dataset and atlasparts.db: https://github.com/GoCelesteAI/excel-powerbi-for-finance Source code: https://github.com/GoCelesteAI/sql-for-finance Episode 6 of SQL for Finance. Window functions — the OVER clause, PARTITION BY, ORDER BY, LAG, LEAD, ROW_NUMBER, RANK, DENSE_RANK — and the 12-month revenue dashboard you can build in one query, no helper columns. By Episode 5 you can stage any reporting query inside a WITH clause and read it top-to-bottom. That covers most of the queries a finance team writes — but not all of them. The moment you need a running total, a month-over-month change, a rank, or each customer's share of their region, GROUP BY runs out of room. GROUP BY collapses rows. One row per group, in. One row per group, out. Window functions let you compute aggregates without collapsing — keep every row, attach the group's total alongside. What You'll Learn: - The collapse problem — why GROUP BY can't answer running-total or prior-period questions - The OVER clause — aggregate without collapsing rows; SUM(rev) OVER () attaches the year total to every row - PARTITION BY — split the window into independent groups; same SUM, different denominators per region - ORDER BY inside OVER — running totals, year-to-date, LAG, and LEAD - ROW_NUMBER vs RANK vs DENSE_RANK — what they do on a tie, when to use which - The top-N-per-group pattern — top 3 customers per region in one query - A 12-month revenue dashboard with prior month, change, running total, and percent of year — all in one SELECT - A note on the frame clause — ROWS BETWEEN for rolling averages Timestamps: 0:00 - Intro - Window Functions 0:40 - The collapse problem - GROUP BY vs window functions 1:30 - SUM OVER () - share of year demo 3:25 - The OVER clause anatomy - three shapes 4:15 - LAG, running totals, the 12-month dashboard 6:50 - ROW_NUMBER, RANK, DENSE_RANK + top-N-per-group 9:15 - When to reach for windows - the four signals 9:55 - End screen Key Takeaways: 1. GROUP BY collapses rows. One row per group, in. One row per group, out. That's the right tool for a flat summary, but it's the wrong tool when you need both the per-row detail AND a group-level aggregate. Window functions fill that gap. Same aggregate, OVER, then a window definition. The aggregate computes against the window but stays attached to each row. 2. The OVER clause has three shapes. Empty parens means the whole result set is the window. PARTITION BY a column means the window splits into independent groups, one per value. ORDER BY a column inside OVER means the window becomes everything from the start through the current row, which gives you running totals for free. 3. LAG and LEAD are the row-to-row operators. LAG of revenue OVER ORDER BY month gives you the previous month's revenue attached to each row. Subtract them and you have month-over-month change. LEAD looks forward instead. Together these are how finance does prior-period reporting in one query, no self-joins. 4. ROW_NUMBER, RANK, and DENSE_RANK all rank rows but differ on ties. ROW_NUMBER assigns unique numbers (arbitrary tiebreak). RANK gives ties the same rank and skips the next number. DENSE_RANK gives ties the same rank but doesn't skip. In the AtlasParts product catalog, two SKUs both sold 581 units — pick the function based on whether ties should truly tie. 5. The top-N-per-group pattern is the killer one. ROW_NUMBER OVER PARTITION BY region ORDER BY revenue DESC restarts the rank for each region; the outer WHERE rn less than or equal to 3 keeps just the top 3 per region. This is the query most finance teams discover late — once you have it, dashboards stop needing helper columns. #SQL #SQLite #FinanceAnalytics #DataAnalytics #SQLforFinance #WindowFunctions #OVER #PARTITIONBY #LAG #RANK #FinancialReporting #ERPdata #Database #LearnSQL --- Generated by GoCelesteAI · part of the SQL for Finance series