JOINs in Depth: Putting the Tables Back Together | SQL for Finance Ep4
0views
C
CelesteAI
Description
๐ AtlasParts dataset and atlasparts.db:
https://github.com/GoCelesteAI/excel-powerbi-for-finance
Source code: https://github.com/GoCelesteAI/sql-for-finance
Episode 4 of SQL for Finance. JOINs in depth โ and the one join bug that ships finance dashboards 4ร wrong.
Every query in this series so far has lived inside one table. Real finance work doesn't. The order amount is in sales_orders, the customer name is in customers, the line-item product is in sales_order_lines, the GL account label is in chart_of_accounts. Joins exist to put them back together. This episode walks through the three joins you'll actually use โ INNER, LEFT, RIGHT โ and then spends the second half on the single most common bug an analyst writes against a real database: the join that silently quadruples your revenue number.
What You'll Learn:
- The AtlasParts schema as an ER diagram โ facts vs dimensions, and the six FK relationships that connect them.
- INNER JOIN โ keep matched rows. The bare JOIN keyword is identical.
- LEFT JOIN โ keep every left-hand row; reveal what didn't match with WHERE right.key IS NULL.
- RIGHT JOIN โ exists, almost no one writes it; flip the table order and use LEFT.
- The wrong-cardinality trap โ joining a fact to a one-to-many child table changes the grain. SUM at the wrong grain inflates totals silently. AtlasParts demo: $11.5M becomes $50M.
- The 30-second join-debug โ COUNT(*) before SUM(...). Always.
- Aggregating at the right grain โ sum line-level columns at the line table; never drag order-level columns into a line-grain query.
Timestamps:
0:00 - Intro โ JOINs in depth
0:40 - The AtlasParts schema (ER diagram)
1:40 - INNER JOIN โ orders ร customers
2:50 - LEFT JOIN โ find what didn't match (19 dropped customers)
4:20 - The wrong-cardinality trap โ $11.5M becomes $50M
6:00 - Fix the grain โ aggregate at the line level
7:15 - Recap and the 30-second debug
8:00 - End screen
Key Takeaways:
1. Joins exist to bring labels next to numbers. Fact tables hold the numbers (sales_orders, gl_journal, ap_invoices). Dimension tables hold the labels (customers, products, vendors, chart_of_accounts, countries). Every report you build at work stitches at least one fact and one dimension together.
2. INNER JOIN drops every row that doesn't match on both sides. That's usually what you want, until it isn't. When a report's totals look low, switch the join to LEFT, filter WHERE right_table.key IS NULL, and look at what got silently discarded. Nineteen of fifty AtlasParts customers fall out of an inner join to the countries table because somebody typed USA instead of US.
3. RIGHT JOIN exists but is rarely written. Anything you can write as A LEFT JOIN B you can write as B RIGHT JOIN A. Reading code is easier when the anchor table is on the left and the lookup is on the right, so default to LEFT every time.
4. The wrong-cardinality trap is the join bug that ships dashboards 4ร wrong. Joining sales_orders to sales_order_lines (which is one-to-many) duplicates each order by its line count. SUM(order_total) after that join inflates the total by the average line count per order. AtlasParts proves it: 11.5 million becomes 50 million. The query runs, the labels look right, the number is wrong.
5. Always COUNT before you SUM. If your join's row count exceeds the left table's row count, your join is one-to-many. The cost of the check is one extra query. The benefit is never publishing a 4ร wrong number to a CFO. The fix is to aggregate at the right grain โ if you want revenue by category, sum line_total from sales_order_lines, not order_total via a join through sales_orders.
#SQL #SQLite #FinanceAnalytics #DataAnalytics #SQLforFinance #JOIN #InnerJoin #LeftJoin #FinancialReporting #ERPdata #Database #LearnSQL
---
Generated by GoCelesteAI ยท part of the SQL for Finance series