Excel & Power BI for Finance: Reading a GL Extract: Literacy Before Formulas
Video: Reading a GL Extract: Literacy Before Formulas | Excel & Power BI S1 Ep2 by CelesteAI
Watch full page →Download the AtlasParts dataset — follow along with the same
gl_journal.csvwe read in this episode: github.com/GoCelesteAI/excel-powerbi-for-finance. The shared dataset for the entire series.
You open the Excel file your ERP just produced. There are twenty-two thousand rows.
Each row has a date. A cryptic four-digit number called an account. Two columns of dollar amounts side by side called Debit and Credit, but exactly one of them is zero on every line. A description that’s terse to the point of unhelpful, and a reference to a source document with a code like SO-01284 or REC-RENT-04.
Where do you start?
Most “Excel for finance” tutorials skip this question. They jump straight into SUMIFS and pivot tables, assuming you already know what you’re looking at. But the half-step that actually separates a junior analyst from someone who knows what they’re doing is reading the file — being able to look at any row, any journal entry, and tell a coherent story about what happened in the business.
That’s what this episode is. No formulas. No pivots. Just learning to read.
The shape of a single row
Every row in the AtlasParts general ledger has nine columns. They are, in order:
- journal_id — a label like
JE-000125that ties this row to a journal entry. A single journal entry usually spans two to six rows. - line_id — a unique label like
GL-000247for this specific row. You’ll rarely use it, but it exists so every line is uniquely addressable. - posting_date — when the entry hit the books. Note that this is not always the same as the underlying business event; an invoice dated December 31 may post on January 2.
- account_id — the four-digit code from the chart of accounts.
1010means Cash – Operating.4020means Sales – Engine. We’ll come back to these ranges. - debit — a dollar amount. Either zero, or a positive number.
- credit — a dollar amount. Either zero, or a positive number.
- description — a short text label. The terseness is intentional. ERP systems don’t write essays.
- source_doc — a reference like
SO-01284(sales order),INV-00037(vendor invoice), orREC-RENT-04(recurring entry, April rent). This is the breadcrumb that lets you trace the GL line back to the original transaction. - cost_center — sometimes filled, sometimes blank. Used when a company tracks expenses by department or location.
Two things to notice. First, on any given row, exactly one of debit or credit is zero. The amount is either a debit or a credit. Never both. Second, the rows are flat. There’s no nested structure, no parent-child relationship between rows. Every row is independent.
This flatness is what makes a GL extract simultaneously approachable and overwhelming. Approachable, because every row is the same shape. Overwhelming, because there are twenty-two thousand of them and they look like cryptic noise until you learn to read.
Double-entry: the secret of the GL
Here’s the rule that explains everything: every journal entry’s debits and credits sum to zero.
Not just the file as a whole — though that’s also true. Every individual journal entry. If you filter the file to just journal_id = JE-000125 and add up the debits and the credits, the totals will match exactly.
This is double-entry bookkeeping, and it’s the same idea Italian merchants invented in the fifteenth century. Every business event has two sides. If cash went out, something else came in: an asset, an expense, a settled debt. If revenue was earned, something else was given: a service, a product, an obligation taken on. The GL records both sides. The two sides balance. Always.
When you look at a GL extract, the unit of meaning is not the row. It’s the journal entry — the group of rows that share a journal_id. That’s the unit you read. A single row by itself is half a sentence.
A worked example: the opening balances
Let’s read a real journal entry from the AtlasParts file. The first one in fiscal year 2025 is JE-OPEN-2025, posted on January 1, and it’s the company’s opening balance sheet. There are about two dozen rows in this single entry.
Pick out a handful:
| journal_id | account | debit | credit | description |
|---|---|---|---|---|
| JE-OPEN-2025 | 1010 | $850,000 | $0 | Opening Cash – Operating |
| JE-OPEN-2025 | 1100 | $1,250,000 | $0 | Opening Accounts Receivable |
| JE-OPEN-2025 | 1210 | $480,000 | $0 | Opening Inventory – Brakes |
| JE-OPEN-2025 | 2000 | $0 | $685,000 | Opening Accounts Payable |
| JE-OPEN-2025 | 3000 | $0 | $1,000,000 | Opening Common Stock |
| JE-OPEN-2025 | 3100 | $0 | $? | Opening Retained Earnings |
The story this entry tells: AtlasParts came into 2025 with $850K in operating cash, $1.25M in customer invoices not yet collected, $480K worth of brake inventory in the warehouse, and $1.85M in similar stuff (engine, electrical, suspension, body, accessories inventory not shown above). It owed $685K to its suppliers and had received $1M in shareholder investment over its history. The remainder — what the company has earned and kept over its prior life — falls into Retained Earnings, that last line, which makes the whole thing balance.
If you sum every debit in this journal entry and every credit, the totals are identical. They have to be, by the rules of double-entry. The opening balance entry is just a snapshot of what the company owns, what it owes, and the difference — equity — packaged as one big balanced journal entry.
Reading this entry teaches you most of what you need to know about reading any other entry. The mechanics are always the same: a group of rows, each tied to an account, each contributing either a debit or a credit, totalling to zero.
A worked example: a sales transaction
Now let’s look at something from the middle of the year. JE-000125 posted on November 4. Customer CUST-007 placed sales order SO-01284 for $18,420 worth of engine and suspension parts. Here’s the journal entry:
| journal_id | account | debit | credit | description |
|---|---|---|---|---|
| JE-000125 | 1100 | $18,420 | $0 | Invoice SO-01284 to CUST-007 |
| JE-000125 | 4020 | $0 | $11,200 | Sales Engine – SO-01284 |
| JE-000125 | 4040 | $0 | $7,220 | Sales Suspension – SO-01284 |
Three rows. Sum of debits: $18,420. Sum of credits: $11,200 + $7,220 = $18,420. Balanced.
The story: AtlasParts shipped engine parts worth $11,200 and suspension parts worth $7,220 to customer CUST-007, billed them via sales order SO-01284, and recorded the resulting Accounts Receivable on account 1100. The customer hasn’t paid yet — when they do, that’s a separate journal entry that debits Cash and credits Accounts Receivable for the same amount.
But there’s a second journal entry that posts at the same time, to record the cost side. Look at JE-000126:
| journal_id | account | debit | credit | description |
|---|---|---|---|---|
| JE-000126 | 5020 | $7,460 | $0 | COGS Engine – SO-01284 |
| JE-000126 | 1220 | $0 | $7,460 | Inventory release Engine |
Now AtlasParts has reduced its engine inventory (account 1220) by $7,460 — the cost of what it shipped — and recognized that same $7,460 as Cost of Goods Sold (account 5020). This is the cost-side mirror to the revenue entry. The same physical event — a shipment going out — produces two journal entries: one for revenue, one for cost.
Reading the GL means seeing both. The pair of entries together is what a finance pro means when they say “this sale”. Take one without the other and you have an incomplete picture.
The chart of accounts — what those numbers mean
You’ve probably noticed: every account has a four-digit ID. They’re not random. The first digit tells you what kind of account it is:
- 1xxx — Assets. Things AtlasParts owns. Cash (1010), Accounts Receivable (1100), Inventory (1210–1260), Equipment (1510), and so on.
- 2xxx — Liabilities. Things AtlasParts owes. Accounts Payable (2000), Accrued Wages (2110), Long-term debt (2400).
- 3xxx — Equity. What’s left over after liabilities — the shareholders’ claim. Common Stock (3000), Retained Earnings (3100).
- 4xxx — Revenue. Money earned. Product Sales (4000s, broken down by category).
- 5xxx — Cost of Goods Sold. The direct cost of what was sold.
- 6xxx and 7xxx — Operating Expenses. Salaries, rent, utilities, marketing, depreciation.
- 8xxx and 9xxx — Other items. Interest, taxes, miscellaneous.
This numbering convention is not a law of physics. Every ERP system implements it slightly differently. Some companies use five or six digits instead of four. SAP famously uses the natural account plus a cost element, and Oracle uses a flexfield. But the shape is always the same: assets at the start, then liabilities, then equity, then income statement accounts. Anywhere in the world, in any industry, the chart of accounts has this structure. Once you’ve seen it once, you can read any of them.
There’s also hierarchy. Account 1200 is Inventory in general. Account 1210 is Inventory – Brakes specifically. The relationship is recorded in a separate parent_account_id column on the chart of accounts master file — not on the GL itself. The GL just records the leaf account; you join to the chart of accounts to discover the parent grouping. This will become important in Episode 3, when we use lookups to roll up GL data into a P&L by category.
The trial balance — proof the file is honest
If every journal entry’s debits equal its credits, and every journal entry is in the file, then the whole file must balance: total debits across all 22,000 rows must equal total credits.
This is the trial balance. It is the most basic sanity check in accounting. If a GL extract doesn’t trial-balance — if total debits and total credits differ by even a cent — something is wrong. Either the export is incomplete, the file is corrupt, or someone tampered with it.
In Excel, you can confirm this by selecting the entire debit column, looking at the SUM in the status bar, and comparing to the same on the credit column. For AtlasParts’ fiscal year 2025, both columns sum to roughly $33.1 million, and the difference is exactly zero. The file is honest.
This is also a useful trick for catching data quality problems early. If you’re given a GL extract and the trial balance is off, stop. Do not proceed with analysis until you understand why. The first question to ask is whether the export filtered out any rows — by date, by company code, by some criterion someone forgot to mention.
Source documents — the breadcrumbs back
Look back at the sales journal entries. The source_doc column on each row says SO-01284. That’s not arbitrary. It’s the sales order number, the original document that triggered the journal entry.
Source documents are how the GL connects to the rest of the business. If a CFO points at one row in the GL and asks “what is this,” the answer is in the source document. Pull up SO-01284 in the ERP and you’ll see who the customer was, what was shipped, when, and at what price. The GL captured the financial impact; the source document captured the operational reality.
In the AtlasParts file, source documents fall into a few patterns. SO-XXXXX is a sales order. INV-XXXXX is an AP invoice. REC-PAY-MM is a recurring payroll entry for month MM. REC-DEP-MM is monthly depreciation. JE-OPEN-2025 is the opening balance entry. Once you know the patterns, you can scan a thousand rows of GL and immediately tell which are sales, which are vendor payments, which are monthly accruals.
This is also the foundation of audit trails. When an auditor asks “show me the supporting documentation for this revenue figure,” the GL extract’s source-doc column tells them which sales orders to pull. No source document, no audit defense.
What you can do now
You can’t yet build a P&L. You can’t roll up sales by region or compute AP aging. Those are the next four episodes.
But you can do things that, half an hour ago, you couldn’t. You can open any row in the AtlasParts GL and tell a story about what happened: what kind of account was hit, what business event triggered it, where to look for more detail. You can identify a balanced journal entry. You can spot a sales transaction versus a recurring monthly entry. You can recognise an opening balance entry by its date and its source-doc tag.
You can read the file.
That is the foundation everything else in this season is built on. From here, the formulas and pivots are mechanical. The question they answer — “how much did we make on engine parts in Q4” — is what you can finally pose now that you know what you’re looking at.
Coming up next
Episode 3: Lookups for Finance. We’ve been talking about joining the GL to the chart of accounts and the customer master. Now we actually do it, with SUMIFS, XLOOKUP, and INDEX/MATCH. You’ll go from twenty-two thousand rows of journal lines to a one-page P&L by product category and customer country.
See you there.