Excel & Power BI for Finance: Power Query at Scale: Loading AtlasParts into Power BI
Video: Power Query at Scale: Loading AtlasParts into Power BI | Excel & Power BI S2 Ep2 by CelesteAI
Download the AtlasParts dataset and the Episode 2 Power BI starter file — follow along in your own copy of Power BI Desktop: github.com/GoCelesteAI/excel-powerbi-for-finance.
Episode 1 was the bridge. We made the case for why Power BI exists, walked through the Desktop window, and named the three engines under the hood: Power Query, Vertipaq, DAX.
This episode is about the first of those three. Power Query. The same M language you already used in Season 1 Episode 5 to clean a customer master. The same Advanced Editor. The same Applied Steps. Same engine.
What’s different is what it feeds. In Excel, Power Query loaded into a worksheet — one table, one tab, ready for a pivot. In Power BI, Power Query loads into a model — many tables at once, each one a citizen of a database that DAX will query later.
The skill transfers one-for-one. The mindset shifts a little. By the end of this episode, the eight AtlasParts CSV files will be loaded as eight Tables in our Power BI model, refresh-clean, parametrised, and ready for the star schema work in Episode 3.
The recap from Season 1
Power Query in Excel did three jobs for us.
It connected to a source. A CSV, a folder, a SharePoint list, a database. The Source step said “here’s where the data lives, go fetch it.”
It transformed the rows. Trim whitespace. Uppercase a country column. Replace “USA” with “US”. Pivot a wide table tall. Each transformation became a step in the Applied Steps panel — a recipe the engine replays every time you click Refresh.
It loaded the result. In Excel, into a worksheet table or directly into the Data Model. The recipe persisted in the workbook; refreshing re-ran every step from Source onward.
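In the Advanced Editor, that three-job loop reads as a chain of named steps. A minimal sketch (the path and column names here are illustrative, not from AtlasParts):

```m
let
    // 1. Connect: the Source step fetches the raw bytes
    Source = Csv.Document(File.Contents("C:\data\export.csv"), [Delimiter = ",", Encoding = 65001]),
    // 2. Transform: each line below is one entry in Applied Steps
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    Typed = Table.TransformColumnTypes(Promoted, {{"amount", type number}, {"posted_on", type date}})
    // 3. Load: whatever the `in` expression returns is what lands in the worksheet or model
in
    Typed
```

Refresh simply replays the chain from Source downward.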
That is exactly what Power Query does in Power BI Desktop. The Connect / Transform / Load loop is identical. What changes is that Power BI’s “Load” target is the Vertipaq columnar database we’ll meet in Episode 3 — and the file we save out (the .pbix) holds many queries, not just one.
If Season 1 Episode 5 made sense to you, Power Query in Power BI will make sense to you. We’re standing on the same engine.
Get Data → Folder
Open Power BI Desktop on a fresh file. The first move: Home → Get Data. The dropdown lists the connectors — CSV, Excel, SQL Server, Web, the lot. Pick More… to see the full catalogue.
For AtlasParts, we don’t want to load eight CSVs one at a time. We want one connector that picks up everything in the folder at once. Get Data → File → Folder.
Point it at AtlasParts/. The dialog shows you everything inside — chart_of_accounts.csv, customers.csv, gl_journal.csv, products.csv, vendors.csv, ap_invoices.csv, sales_orders.csv, sales_order_lines.csv. Eight files, all the AtlasParts data.
Click Combine → Combine & Transform Data. Power BI does something clever here. It looks at the first file, asks you to confirm the format (CSV, comma-delimited, UTF-8), and then generates a function that knows how to parse any file with that shape. It applies the function to every file in the folder and stacks the results.
The Power Query Editor opens with a query named “AtlasParts” and a few helper queries — a sample file, a transform function, a parameter. Don’t be alarmed by the helper queries. They’re the machinery that makes the folder pattern work. Leave them alone.
For our purposes, eight files came in as one combined query — and that’s not what we want. We want eight separate Tables, one per file, each with its own schema. So we delete the auto-combined query and load each file individually, keeping the helper queries for the cases where files genuinely share a shape (we’ll come back to that for sales_orders and sales_order_lines).
The lesson here is that Power BI’s “Get Data → Folder” gives you two patterns out of the box: combine (when files share a schema and you want them stacked) and navigate (when files have different schemas and you want them as separate Tables). For AtlasParts we mostly want the second. Right-click each file in the Navigator → Add as New Query. Eight queries, one per file, each ready to transform.
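In M, the navigate pattern can be sketched as a per-file query: filter the folder listing by Name, then parse that one file. customers.csv stands in for any of the eight:

```m
let
    // List every file in the AtlasParts folder
    Files = Folder.Files("C:\AtlasParts"),
    // Keep just the one file this query is responsible for
    Match = Table.SelectRows(Files, each [Name] = "customers.csv"),
    // Parse its binary Content as a comma-delimited, UTF-8 CSV (codepage 65001)
    Raw = Csv.Document(Match{0}[Content], [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Raw, [PromoteAllScalars = true])
in
    Promoted
```

Eight near-identical queries, differing only in the Name filter.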
Same transformations, different home
With each AtlasParts file loaded as its own query, the per-file work looks exactly like Season 1.
For customers.csv, we trim whitespace on customer_name, uppercase country, and canonicalise the variants — “USA” / “United States of America” → “US”, “Great Britain” → “UK”, “Germany” → “DE”. Same recipe as Episode 5.
For gl_journal.csv, we set the column types (date, decimal, text), trim the description column, and normalise the posting_period field where it’s been typed inconsistently.
For chart_of_accounts.csv, almost nothing — it’s already clean. Promote headers, set types, done.
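As a sketch, the customers.csv recipe comes out in M roughly like this (the variant list mirrors the examples above; Replacer.ReplaceValue matches whole cells, not substrings):

```m
let
    Source = Csv.Document(File.Contents("C:\AtlasParts\customers.csv"), [Delimiter = ",", Encoding = 65001]),
    Promoted = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Trim names, uppercase the country column
    Trimmed = Table.TransformColumns(Promoted, {
        {"customer_name", Text.Trim, type text},
        {"country", Text.Upper, type text}
    }),
    // Canonicalise variants (values are already uppercased by the step above)
    US1 = Table.ReplaceValue(Trimmed, "USA", "US", Replacer.ReplaceValue, {"country"}),
    US2 = Table.ReplaceValue(US1, "UNITED STATES OF AMERICA", "US", Replacer.ReplaceValue, {"country"}),
    UK = Table.ReplaceValue(US2, "GREAT BRITAIN", "UK", Replacer.ReplaceValue, {"country"}),
    DE = Table.ReplaceValue(UK, "GERMANY", "DE", Replacer.ReplaceValue, {"country"})
in
    DE
```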
The transformations are not new. What’s new is that all eight queries live in the same file and refresh together. When you click Home → Refresh in Power BI Desktop, every query re-runs from its Source step. When the report is published to the Power BI Service and a refresh schedule fires at 6am, every query re-runs again, in order, automatically.
That’s the qualitative leap. In Excel, refresh was a button I clicked. In Power BI, refresh is a service that runs on its own. The recipes are the same; the agency is what changed.
Merge: customers + country list
Two of the AtlasParts files describe related things. customers.csv has a country code per customer. A separate countries.csv (we’ll add it for this episode) maps codes to regions: US → North America, UK → Europe, DE → Europe, JP → Asia Pacific. We want every customer row to carry its region, so the regional VPs can filter on it.
In Power Query, this is a Merge. Home → Merge Queries. Pick customers as the left, countries as the right, click country in both, choose “Left Outer” (keep every customer, attach country fields where they exist), click OK.
A new column appears in customers — a column whose cells say “Table” because each cell holds the matching row from countries. Click the expand icon at the column header, tick region, untick everything else, click OK. The Table-cells become a flat region column.
Now every customer row carries customer_id, customer_name, country, and region. The country list never needs to reach the model — it served its purpose at the merge step. Before we Close & Apply, right-click the countries query and untick Enable load (Power BI’s equivalent of Excel’s “Connection only”).
This is the database-style lookup, expressed as a Power Query step instead of a VLOOKUP formula. The advantage isn’t the syntax — it’s that the lookup runs once, at refresh time, not on every recalc. A report built on a hundred merges serves its visuals just as fast as one built on none: the merges run in Power Query during refresh, and only the flattened result lands in Vertipaq’s in-memory columnar storage, so there is no worksheet-style recalc loop to re-run them.
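Behind those clicks, Power Query writes two steps you can read in the Advanced Editor. A sketch, assuming the queries are named customers and countries:

```m
let
    // Left-outer merge: keep every customer, attach the matching countries row
    Joined = Table.NestedJoin(customers, {"country"}, countries, {"country"}, "countries", JoinKind.LeftOuter),
    // Expand just the region field out of the nested Table cells
    WithRegion = Table.ExpandTableColumn(Joined, "countries", {"region"})
in
    WithRegion
```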
Append: combining historical files
Sometimes the same shape of data lives across multiple files. Suppose AtlasParts had monthly journal exports — gl_journal_2024_01.csv, gl_journal_2024_02.csv, twelve files. Same columns. Different rows. We want them as one big journal Table.
In Power Query, this is Append. Home → Append Queries → As New. Or: drop them all in a folder, point Get Data → Folder at it, and use the “Combine & Transform” path I mentioned earlier — that’s the auto-combine pattern, perfect for same-shape files.
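The Append path is a one-liner in M. A sketch, assuming each monthly export is already loaded as its own query:

```m
let
    // Stack same-shape queries top to bottom: the M behind Append Queries
    AllMonths = Table.Combine({gl_journal_2024_01, gl_journal_2024_02, gl_journal_2024_03})
in
    AllMonths
```

Table.Combine matches columns by name, so twelve monthly files with identical headers stack cleanly.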
For AtlasParts in this episode, we don’t have monthly splits — gl_journal.csv is a single file with the year in it. But the technique is worth seeing because it’s how real finance data arrives: the close team posts a file per month, and the analyst stitches twelve of them together each year.
Merge is for joining shapes (customers ⨝ countries, like SQL JOIN). Append is for stacking same-shape files (twelve months of GL, like SQL UNION ALL). Knowing which to reach for is half the Power Query craft once you’re past the typing-and-trimming basics.
Parameters: making the file path portable
One annoying habit of every Power Query script in Season 1 was the hard-coded file path. C:\Users\daryl\AtlasParts\customers.csv. If a colleague opened the file, refresh broke. Power BI gives us a clean fix: Parameters.
Home → Manage Parameters → New Parameter. Name it AtlasFolder, type Text, current value C:\AtlasParts\ (or wherever the folder lives on this machine).
Then go back to each query’s Source step. Open the formula bar — you’ll see something like Folder.Files("C:\AtlasParts"). Replace the literal path with the parameter: Folder.Files(AtlasFolder). Do this for all eight queries.
Now, when the file ships to a colleague (or to Power BI Service), there’s exactly one place where the path is set: the parameter. Change it once, every query picks up the new path on next refresh. This is the same idea as a named range in Excel — the value lives in one named cell, and every formula references it.
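The same parameter works in a single-file Source step too. Here is what the customers query’s formula bar might look like after the swap:

```m
// AtlasFolder is the Text parameter, e.g. "C:\AtlasParts\" (note the trailing slash)
Source = Csv.Document(File.Contents(AtlasFolder & "customers.csv"), [Delimiter = ",", Encoding = 65001])
```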
Parameters extend further. You can parametrise the data type (integer vs decimal), the date range, the environment (dev / prod). Anything that could vary across machines or runs is a parameter candidate. Finance teams commonly use a ReportingPeriod parameter so the same .pbix file can render the January close, the February close, or the YTD view, just by changing the parameter and refreshing.
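As a sketch of that last idea, assuming a ReportingPeriod parameter of type Date and a posting_date column in the journal, the filter is a single step:

```m
// Keep only journal rows posted on or before the close date being reported
Filtered = Table.SelectRows(gl_journal, each [posting_date] <= ReportingPeriod)
```

Change the parameter from the January close date to the February one, refresh, and the same .pbix renders the new period.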
Error handling: when a file refuses
Real CSVs are messy. A negative amount typed as (1,234.56) instead of -1234.56. A date in the format Jan 5, 2024 instead of 2024-01-05. A blank row that Power Query reads as a row of nulls.
Power Query has three patterns for handling these.
Replace Errors. Right-click a column, Replace Errors, type the replacement value. A column that should be a number but has stray text becomes a column of nulls (or zeros, or whatever you choose). Useful for “I know this column is meant to be numeric, just substitute the bad cells.”
Conditional Columns. When the cleanup is rule-based — “if is_active is anything except ‘Y’ or ‘N’, mark it ‘unknown’” — wrap the messy column in a conditional. Add Column → Conditional Column → “If is_active equals ‘Y’ then ‘Y’, else if ‘N’ then ‘N’, else ‘unknown’”. The new column is clean; you can keep or discard the original.
Try / Otherwise. For more complex cases, M’s try ... otherwise lets you wrap an expression that might fail. try Number.From([amount]) otherwise null — if the conversion succeeds, you get the number; if it fails, you get null instead of an error. This is the Power Query equivalent of a try/catch block in any other language.
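All three patterns sit comfortably in one query. A sketch; the source table and its amount and is_active columns are illustrative, not a specific AtlasParts file:

```m
let
    // `Source` stands for whatever prior step produced the messy table
    // 1. Replace Errors: stray text in a numeric column becomes null
    NoErrors = Table.ReplaceErrorValues(Source, {{"amount", null}}),
    // 2. Conditional column: rule-based cleanup of a Y/N flag
    Flagged = Table.AddColumn(NoErrors, "is_active_clean",
        each if [is_active] = "Y" then "Y"
        else if [is_active] = "N" then "N"
        else "unknown", type text),
    // 3. try / otherwise: convert where possible, null where conversion fails
    Converted = Table.TransformColumns(Flagged,
        {{"amount", each try Number.From(_) otherwise null, type nullable number}})
in
    Converted
```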
The shift from Excel mindset is this: errors should be handled at the Power Query layer, not at the report layer. By the time data reaches the model, it should be clean. Visuals and DAX measures should never have to defend against a “value is text” error. Push the cleanup as far upstream as you can.
Close & Apply — what the model looks like now
Once every query is shaped — types right, columns clean, merges done — the closing step is Home → Close & Apply. Power Query writes every step to the .pbix file, runs each query against its source, and lands the resulting Tables in the Vertipaq model.
What does the model look like now? Switch to the Model view (the third icon on the View Switcher rail). Eight Tables, listed by name: customers, vendors, products, chart_of_accounts, gl_journal, ap_invoices, sales_orders, sales_order_lines. Each Table is a card with its column list. None of them have relationships drawn between them yet — that’s Episode 3’s job.
In the Data view (second icon), you can browse the rows that landed. customers has 487 rows. gl_journal has 4.1 million rows. The Vertipaq engine compressed those 4.1 million rows down to about 38 megabytes in memory — versus the 220-megabyte Excel workbook the same data would have produced.
Compression is the headline number, but the qualitative shift is more important: the model is loaded once, queried many times. Every visual you build, every DAX measure you write, every slicer the regional VP drags — they all run against the in-memory tables, not against the source CSVs. The CSVs are only revisited on refresh, which we’ll schedule in the Power BI Service later in the season.
Why this episode looks “easy”
If you’ve been through Season 1, this episode probably feels like familiar ground with a different button placement. That’s deliberate. Episode 1 made the case that Power BI is a different tool. Episode 2 — this one — is about showing that the first of its three engines is already a tool you know.
The hard parts of Power BI are the parts you haven’t met yet. Star schemas (Episode 3) — designing a model so DAX measures behave correctly. DAX itself (Episode 4) — a language with row context and filter context that catch every newcomer at least once. Time intelligence (Episode 5) — the small, specific incantations for YTD, MTD, prior-year. The full dashboard (Episode 6) — putting everything together and shipping a real report.
By starting with Power Query — the engine you already know — we get to load AtlasParts into the model without learning a new skill. Now we have something to point DAX at.
Recap and what’s next
Power Query in Power BI is the same M engine you used in Excel. Get Data → Folder loads many files at once. Merge joins shapes; Append stacks them. Parameters make file paths portable. Error handling belongs at the Power Query layer, not in your visuals. Close & Apply writes the queries to the file and lands the Tables in the Vertipaq model.
Eight Tables loaded. No relationships yet. No measures yet. Model view shows the cards floating, unlinked.
Episode 3 connects them. We’ll talk about why a finance model wants a star schema — one fact Table at the centre (gl_journal), dimension Tables around it (customers, products, chart_of_accounts, plus a calendar Table we’ll build) — and we’ll draw the relationships in Model view. By the end of Episode 3, AtlasParts will have a proper data model. Then DAX has somewhere to land.
For homework: open the AtlasParts starter file, add the countries.csv we discussed, build the merge, and click Close & Apply. Look at the eight Tables in Model view. Notice how empty the canvas feels with no relationships. That’s the gap Episode 3 fills.
Same engine. Bigger context. See you in the next one.