SQL for Finance: Filter, Sort, Pick: WHERE, ORDER BY, LIMIT in SQLite
Video: Filter, Sort, Pick: WHERE, ORDER BY, LIMIT in SQLite | SQL for Finance Ep2 by CelesteAI
Download the AtlasParts dataset and
atlasparts.db— the same eight CSVs we used across the Excel & Power BI for Finance series, packaged as a SQLite database file: github.com/GoCelesteAI/excel-powerbi-for-finance. If you set this up in Episode 1, you’re ready to go.
The CFO never asks “show me everything.” She asks for a slice. The biggest twenty orders. Open invoices over thirty days old. Customers in the US, sorted by credit limit. Each of those is the same SQL recipe — three clauses that turn SELECT * into the answer she actually wants.
In this episode we’ll work through WHERE, ORDER BY, and LIMIT. Together they do most of the work in any finance query, and they do it on the same atlasparts.db you set up last time.
The shape of a query with all four clauses
Every query you’ll write for the rest of the series follows the same skeleton:
SELECT columns_you_want
FROM table_name
WHERE conditions
ORDER BY column
LIMIT number;
Read top to bottom and it’s almost a sentence. Select these columns, from this table, where these things are true, ordered by this, give me this many rows. The clauses always go in that order. SQL won’t let you put WHERE after ORDER BY even if you’re reading the words backwards in your head.
WHERE: pick the rows you care about
The simplest WHERE filter matches one column to one value:
SELECT customer_name, country, region
FROM customers
WHERE country = 'US';
Run that against AtlasParts and you get back twenty-five rows. That’s a real, finished answer to “give me the US customers.”
Now for a wrinkle that catches every finance analyst the first week. The customers table has fifty rows total, and your gut says about thirty are based in the US. Try this:
SELECT DISTINCT country FROM customers ORDER BY country;
You’ll see four spellings of the same country: US, USA, us, and United States of America. WHERE is exact. The query above matches only the literal string 'US'. The other five customers slip through because their country field has a different spelling.
This is the first finance lesson of the episode. WHERE clauses tell you exactly what you asked, not what you meant. When your number doesn’t match the spreadsheet, this is usually why. Real ERP exports have spelling drift, casing drift, leading spaces, and at least one row where someone typed the country in Spanish.
More than just equals
WHERE supports the comparison operators you’d expect, plus a handful that finance work leans on heavily:
-- numeric comparisons
WHERE order_total > 10000
WHERE order_total BETWEEN 5000 AND 10000
-- text matching
WHERE country IN ('US', 'USA', 'Canada')
WHERE customer_name LIKE 'A%'
-- combinations
WHERE country = 'US' AND status = 'Open'
WHERE status = 'Shipped' OR status = 'Closed'
IN is the fix for our country problem. LIKE 'A%' matches any customer name starting with the letter A. BETWEEN is inclusive on both ends. AND and OR combine conditions, and parentheses control the order when both appear.
Let’s run a real one:
SELECT order_id, order_date, customer_id, order_total
FROM sales_orders
WHERE CAST(order_total AS REAL) > 10000;
You get back 292 rows — out of 2,242 total orders. That’s the slice your treasurer cares about. The long tail of $50 spare-part orders is irrelevant for the question.
Why CAST? Our data was loaded from CSV, so SQLite stored every column as TEXT. Comparing text with
>works lexicographically, not numerically — under text rules,'9000' > '10000'is true, because the character'9'comes after'1'. CAST forces the comparison to be numeric. When you’re filtering or summing numbers that were loaded from CSV, CAST. Episode 7 covers data types properly; for now, just know to do it.
ORDER BY: put the important rows on top
Filtering gives you the right rows. Sorting brings the most important ones to the front:
SELECT order_id, customer_id, order_total
FROM sales_orders
WHERE CAST(order_total AS REAL) > 10000
ORDER BY CAST(order_total AS REAL) DESC;
The biggest order in AtlasParts is Silverline Service Center’s order SO-00577 for $29,906.06, and now it’s the first row.
ASC is ascending and is the default. DESC is descending. You almost always want DESC for amounts — biggest first.
ORDER BY can sort on multiple columns, applied left to right:
ORDER BY status, CAST(order_total AS REAL) DESC
That groups the rows by status alphabetically (Closed, Open, Shipped), and within each status the biggest order comes first. The ordering is layered, like a pivot table’s row hierarchy.
LIMIT: just give me the top twenty
LIMIT trims the result set to the first N rows. Pair it with ORDER BY and you have the top-N pattern that finance reports run on:
SELECT order_id, customer_id, order_total
FROM sales_orders
ORDER BY CAST(order_total AS REAL) DESC
LIMIT 20;
Top twenty orders, biggest first. That’s the whole query. In Excel, this would be sort the range, copy the top twenty rows, paste into the deck — and re-do the whole thing tomorrow when the data refreshes. In SQL, it’s a query you save once and re-run forever.
LIMIT without ORDER BY is a trap. SQLite is allowed to give you any twenty rows, in any order. They might be the first twenty in the file, or the last, or a mix. Always pair LIMIT with ORDER BY so the result is reproducible.
Four gotchas worth memorizing
These are the surprises that catch finance analysts in their first week of SQL:
= NULLdoesn’t work.NULLmeans unknown, and unknown is never equal to anything — not even another unknown. To test for missing values useIS NULLandIS NOT NULL. WritingWHERE close_date = NULLreturns zero rows even when half your data is missing close dates.- String comparisons are case-sensitive by default.
WHERE country = 'us'returns one row in our customers table, not twenty-five. If you want case-insensitive matching, useLOWER(country) = 'us', orcountry = 'US' COLLATE NOCASE. - CSV-loaded numbers are TEXT. Until Episode 7 teaches us how to fix the schema at load time, every numeric comparison and every SUM needs a CAST. It’s annoying. It’s also the price of one-line CSV imports.
LIMITwithoutORDER BYis non-deterministic. Always sort before you trim.
None of these is hard to fix once you know about it. The trap is finding out at 5pm on a Thursday when the number you sent the CFO turns out to be wrong.
What’s next
Episode 3 is the pivot-table replacement: aggregations and GROUP BY. We’ll move from “show me the orders over $10K” to “show me total revenue by region by quarter, and rank the regions.” Same dataset, same syntax, one more clause.
Code
The four queries from this episode are in atlasparts/sql/ep02_filter_sort_pick.sql on the repo. Run them with sqlite3 atlasparts.db < ep02_filter_sort_pick.sql.
-- 1. Customers in the US (the way you'd write it first)
SELECT customer_name, country, region
FROM customers
WHERE country = 'US';
-- 2. The corrected version that handles spelling drift
SELECT customer_name, country, region
FROM customers
WHERE country IN ('US', 'USA', 'us', 'United States of America');
-- 3. Sales orders over $10K
SELECT order_id, order_date, customer_id, order_total
FROM sales_orders
WHERE CAST(order_total AS REAL) > 10000;
-- 4. Top 20 orders, biggest first
SELECT order_id, customer_id, order_total
FROM sales_orders
ORDER BY CAST(order_total AS REAL) DESC
LIMIT 20;
See you in Episode 3.