Part of Excel & Power BI for Finance: Power BI (Season 2)

Excel & Power BI for Finance: Star Schema for Finance: Fact, Dimensions & the Calendar Table

Celest KimCelest Kim

Video: Star Schema for Finance: Fact, Dimensions & the Calendar Table | Excel & Power BI S2 Ep3 by CelesteAI

Take the quiz on the full lesson page
Test what you've read · interactive walkthrough

Download the AtlasParts dataset and the Episode 9 Power BI starter file — follow along in your own copy of Power BI Desktop: github.com/GoCelesteAI/excel-powerbi-for-finance.

Episode 8 ended with eight Tables sitting in the Vertipaq model, none of them connected. The Model view canvas was a row of disconnected cards — customers, vendors, products, chart_of_accounts, gl_journal, ap_invoices, sales_orders, sales_order_lines. We loaded the data. We didn’t yet model it.

This episode is about the modelling. By the end, AtlasParts will have a proper data model — one fact Table at the centre, dimension Tables around it, a Calendar Table we’ll build from scratch, and clean one-to-many relationships drawn between them. Then DAX has somewhere sensible to land.

Why your GL extract isn’t enough

The first instinct most finance people have when they move to Power BI is to load one big wide Table and write measures against it. The GL extract feels like the natural primary table — every transaction is in there, with customer_id, product_id, account_code, posting_date, debit_amount, credit_amount all on the same row.

Why not just measure off that? Three reasons.

Filter granularity. When the regional VP filters by region, that filter has to find its way to the GL rows. If region lives only on the customer master, and the GL only has customer_id, Power BI needs a relationship between the two Tables to propagate the filter. No relationship, no filter — the visual silently shows the unfiltered total.

Aggregation correctness. Some metrics are additive across rows (revenue), some are not (account balances, headcount). DAX needs to know which Tables to aggregate over and which Tables to use only as filter sources. The star schema makes that distinction structural rather than something you have to remember in every measure.

Model size and performance. Vertipaq compresses each column independently. A wide Table with denormalised dimension data — every GL row carrying customer_name, customer_country, customer_region, product_name, product_category — explodes the per-column cardinality and ruins compression. A normalised star, with each dimension in its own small Table, compresses ten to fifty times better.

The GL extract is the raw material. The model we build on top of it is the structure. Confusing the two is the first mistake every Power BI newcomer makes.

Fact and dimension Tables

A star schema has exactly two kinds of Tables.

Fact Tables hold the things that happened. A row in a fact Table is a transaction, a posting, an event. For AtlasParts, gl_journal is the obvious fact Table — every row is one journal posting with an amount, a date, and references (foreign keys) to the relevant dimensions.

Fact rows have two kinds of columns: measures (the numbers you sum, average, etc — debit_amount, credit_amount) and foreign keys (the references — customer_id, product_id, account_code, posting_date). That’s it. Anything else — a customer name, a product description, an account category — does not belong on a fact row. It belongs on its dimension.

Dimension Tables hold the descriptions. A row in a dimension Table is one thing — one customer, one product, one account, one date. Each dimension row has a primary key (the value that fact rows reference) and descriptive attributes (name, category, region, hierarchy levels).

For AtlasParts, the dimensions are: customers (one row per customer, key = customer_id), products (one row per product, key = product_id), chart_of_accounts (one row per GL account, key = account_code), and the one we’ll build by hand — a calendar Table (one row per date, key = the date itself).

The visual shape that comes out of this — fact in the centre, dimensions arrayed around it like points on a star — is where the name comes from. The metaphor is more than decorative: it tells you the right-shaped model has short arms (every fact row reaches every dimension in one hop) rather than long chains.

What goes where

Walking through AtlasParts column-by-column makes it concrete. Here’s the GL extract:

journal_id, posting_date, posting_period, company_code, account_code, customer_id, vendor_id, product_id, description, debit_amount, credit_amount, currency

Of those columns, only the IDs (journal_id, account_code, customer_id, vendor_id, product_id, posting_date) and the amounts (debit_amount, credit_amount) are fact material. Everything else either moves to a dimension or stays as a degenerate detail on the fact row.

description — keep on the fact, it’s per-transaction and doesn’t repeat. posting_period — derive from posting_date in the calendar dim, drop from fact. company_code — could become its own tiny dimension if you have multiple companies, or stay as a fact column if you only have one. currency — same call: dim if many currencies and you want symbol/format attributes; fact column if it’s just a label.

The decision rule isn’t theoretical. Ask: will I want to filter or group by this? If yes, dimension. Will I sum or average it? If yes, fact (as a measure column). Is it just a label that travels with the row? Fact column, no relationship needed.

The Calendar Table you have to build

There’s one dimension that doesn’t come in any AtlasParts CSV: the date dimension. You have to build it.

Why? Because posting_date in the GL is a column of dates — but DAX time intelligence functions (year-to-date, month-to-date, prior year, fiscal quarter) need a Table where every date in your reporting range exists, with attributes like Year, Quarter, Month, Month Name, Day of Week, IsWeekend, Fiscal Year, Fiscal Quarter. Even dates with no transactions need rows — otherwise YTD breaks on slow days.

In Power BI, you have two ways to build a calendar.

Power Query — Add Column → Custom Column with List.Dates(StartDate, NumberOfDays, #duration(1,0,0,0)) to generate the spine, then expand into Year, Quarter, Month, etc. Refreshes with the rest of the model.

DAXCalendar = CALENDAR(DATE(2020,1,1), DATE(2026,12,31)) creates a calculated Table at model load. Add columns with YEAR(), MONTH(), FORMAT(). Quicker, all in DAX.

For AtlasParts, we use DAX. Three lines:

Calendar = ADDCOLUMNS(
    CALENDAR(DATE(2020,1,1), DATE(2026,12,31)),
    "Year", YEAR([Date]),
    "Quarter", "Q" & FORMAT(QUARTER([Date]), "0"),
    "Month", MONTH([Date]),
    "MonthName", FORMAT([Date], "MMM"),
    "FiscalYear", IF(MONTH([Date]) >= 4, YEAR([Date]) + 1, YEAR([Date])),
    "DayOfWeek", FORMAT([Date], "dddd"),
    "IsWeekend", IF(WEEKDAY([Date], 2) > 5, "Y", "N")
)

Mark it as the Date Table — Modeling tab → Mark as Date Table → pick the Date column. This tells Power BI “this is the calendar dimension; treat it specially for time intelligence.” Skip this step and DAX functions like TOTALYTD and SAMEPERIODLASTYEAR either silently misbehave or refuse to work.

Drawing the relationships

Switch to Model view. The Calendar Table is now there, alongside the eight loaded Tables.

Relationships are drawn by dragging from a fact-side foreign key to a dimension-side primary key. gl_journal[customer_id]customers[customer_id]. gl_journal[product_id]products[product_id]. gl_journal[account_code]chart_of_accounts[account_code]. gl_journal[posting_date]Calendar[Date].

Power BI auto-detects most of these on Close & Apply if column names match — and they do, by design — but you should always double-check. The Properties pane on each relationship shows the cardinality (almost always one-to-many: dimensions are unique, facts repeat) and the filter direction (almost always single: filters flow from dim to fact, not the other way).

Two pitfalls to watch for.

Many-to-many or undefined cardinality is a smell — it usually means one of your dimension Tables has duplicate keys (a customer ID listed twice in customers) and Power BI is being defensive. Find the dupe and remove it. A clean star has exactly one row per key in every dimension.

Both-direction filters are sometimes necessary — when a single fact filters multiple dimensions that need to filter back through the fact — but they create ambiguity in DAX measures. Default to single direction. Only switch to both when a measure visibly needs it and you’ve reasoned about the implications.

After the four relationships are drawn, the Model view canvas now actually looks like a star. gl_journal floats in the centre. customers, products, chart_of_accounts, Calendar orbit it. Lines with arrows and 1 / * markers connect them.

This shape is doing real work. Every measure you write from here on can rely on it.

Sales as a second fact (when you need two)

A subtle point: AtlasParts has gl_journal and also sales_orders + sales_order_lines. The latter two contain transactions that eventually post to GL, but at a different grain — sales is order-level, GL is posting-level — and during the month they exist before they post.

A real finance model would treat sales_order_lines as a second fact Table, related to the same dimensions as GL (customers, products, calendar). A model with multiple fact Tables sharing dimensions is called a galaxy schema — multiple stars with shared centres.

For this episode we’ll stay with one fact (gl_journal) to keep the picture clean. We’ll come back to the galaxy when we build the AtlasParts dashboard in Episode 6.

Snowflakes vs stars

You may have heard of a snowflake schema — like a star but with normalised dimensions (a Customers dim that points to a Countries dim, instead of carrying country attributes inline). Snowflakes look pretty in a database textbook, but in Power BI they’re worse than stars in every measurable way: more relationships, more potential for ambiguity, slower DAX, harder to reason about.

The Power BI rule: denormalise the dimensions. If customer.country.region is three Tables linked, collapse it to one Table — customers with country and region as columns. We did that in Episode 8 with the Merge step that attached region. That work was preparation for this episode’s clean star.

The exception is where a dimension is genuinely huge and would bloat — a product master with millions of variants might warrant snowflaking out a category Table — but for finance models in the small-millions-of-rows range, always denormalise. Star, not snowflake.

What the model looks like now

Switch to Model view. Five Tables are connected: - gl_journal (fact, ~4.1M rows) at the centre - customers (dim, ~487 rows) connected by customer_id - products (dim, ~1,124 rows) connected by product_id - chart_of_accounts (dim, ~412 rows) connected by account_code - Calendar (dim, ~2,557 rows for 7 years) connected by Date

Three Tables are still floating, unconnected: vendors, ap_invoices, sales_orders, sales_order_lines. They have a place in the bigger picture (vendors will connect via vendor_id on a future AP fact, sales is the second-fact path mentioned above), but for now leave them be. A clean four-dimension star is more useful than a complete-but-tangled diagram.

What’s next

Episode 4 is DAX. Now that the model has shape, we can write our first measures — Total Revenue, Gross Margin, YTD Revenue — and watch them respond correctly to slicers, cross-filters, and drill-downs. Without the star schema, every measure would have a footnote (“works only when filtered by…”). With the star, measures behave the way the formula bar suggests they should.

For homework: open the Episode 9 starter file, run the Calendar table DAX block above, mark the Date column, and draw the four relationships. Verify that filtering by customers[region] actually changes the sum of gl_journal[debit_amount] in a card visual. If the card stays put when the slicer moves, you’re missing a relationship — fix it before Episode 4.

Star schema is the single most important concept in Power BI. Get this episode right and the rest of Season 2 will feel inevitable. Get it wrong and every measure you write will fight you.

See you in Episode 4 — DAX, finally.

Ready? Take the quiz on the full lesson page →
Test what you've learned. Watch the lesson and try the interactive quiz on the same page.