Filter, Sort, Pick: WHERE, ORDER BY, LIMIT in SQLite | SQL for Finance Ep2

0views
C
CelesteAI
Description
๐Ÿ“‚ AtlasParts dataset and atlasparts.db: https://github.com/GoCelesteAI/excel-powerbi-for-finance Source code: https://github.com/GoCelesteAI/sql-for-finance Episode 2 of SQL for Finance. Three clauses that turn SELECT * into the slice your CFO actually asked for. The CFO never asks "show me everything." She asks for the biggest twenty orders. Open invoices over thirty days old. Customers in the US sorted by credit limit. Each of those is the same SQL recipe โ€” WHERE filters rows, ORDER BY sorts them, LIMIT keeps the top N. Together they handle most of the work in any finance query, and they do it on the same atlasparts.db you set up in Episode 1. What You'll Learn: - The five-clause query skeleton (SELECT / FROM / WHERE / ORDER BY / LIMIT) and why the order matters. - WHERE with comparison operators (greater than, less than), BETWEEN, IN, LIKE, and combining conditions with AND / OR. - The spelling-drift gotcha: WHERE country = 'US' returns 25 rows but the data has 30 actual US customers (US / USA / us / United States of America). - Why CSV-loaded numbers need CAST(col AS REAL) for any numeric comparison or sum. - ORDER BY ascending vs descending and multi-column sort (status, then amount DESC). - The top-N pattern: ORDER BY x DESC LIMIT 20, and why LIMIT without ORDER BY is a trap. - Four gotchas worth memorizing: = NULL, case sensitivity, CSV-as-TEXT, and unsorted LIMIT. Timestamps: 0:00 - Intro โ€” the CFO never asks "show me everything" 0:35 - The five-clause query skeleton 1:15 - WHERE basics โ€” and the spelling-drift trap 2:35 - WHERE operators โ€” orders over $10K (CAST trap) 3:50 - ORDER BY โ€” sorting and multi-column hierarchies 4:55 - LIMIT โ€” the top-N pattern 5:45 - Four gotchas worth memorizing 7:05 - Recap and what's next 7:45 - End screen Key Takeaways: 1. The five clauses always go in the same order. SELECT, FROM, WHERE, ORDER BY, LIMIT. Read top to bottom and it's almost a sentence โ€” select these columns, from this table, where these things are true, sorted this way, this many rows. SQL won't let you put WHERE after ORDER BY even when you write the words in a different order in your head. Once the skeleton is in your fingers, every query is a variation on the same shape. 2. WHERE tells you exactly what you asked, not what you meant. Filtering customers by country = 'US' returns 25 rows in our data โ€” but the table actually has 30 US-based customers, because the country field has spelling drift (US, USA, us, United States of America). When your number doesn't match the spreadsheet, this is usually why. Real ERP exports have casing drift, leading spaces, and at least one row in Spanish. 3. CSV-loaded columns are TEXT until you fix the schema. Comparing the string '9000' to the string '10000' under text rules says '9000' is bigger (because the character '9' sorts after '1'). Until Episode 7 covers data types properly, every numeric WHERE, ORDER BY, or SUM against a CSV-loaded column needs CAST(col AS REAL). It's verbose and it's the price of one-line CSV imports. 4. The top-N pattern is the most reusable query in finance reporting. SELECT ... ORDER BY amount DESC LIMIT 20 gives you the top twenty of anything โ€” orders, vendors, customers, GL accounts โ€” in one query. In Excel that workflow is sort the range, copy twenty rows, paste into the deck, repeat tomorrow when the data refreshes. In SQL it's a query you save once and re-run forever. 5. LIMIT without ORDER BY is non-deterministic. SQLite is allowed to give you any twenty rows in any order, and which twenty might change between runs. Always pair LIMIT with ORDER BY so the result is reproducible โ€” otherwise tomorrow's report could disagree with today's even though the data didn't change. #SQL #SQLite #FinanceAnalytics #DataAnalytics #SQLforFinance #BeginnerSQL #FinancialReporting #ERPdata #Database #LearnSQL --- Generated by GoCelesteAI ยท part of the SQL for Finance series