Power Query for Finance: Clean Once, Refresh Forever | Excel & Power BI S1 Ep5

0views
C
CelesteAI
Description
๐Ÿ“‚ Download the AtlasParts dataset, the Episode 5 starter / completed workbooks, and the .pq M-code files: https://github.com/GoCelesteAI/excel-powerbi-for-finance Episode 5 of *Excel & Power BI for Finance*. In Episode 4 we built the same panel as Episode 3 with a pivot โ€” same numbers, no formulas. At the very end we hit the ceiling: pivots are snapshots, they don't run on a schedule, and they break if the source data is dirty. The customer master in AtlasParts has the country `US` spelled five different ways. Our pivot dutifully made four columns for what should be one country. Numbers right. Report wrong. This episode is about Power Query โ€” the transformation layer between raw data and the report. You build the cleanup once, as a recipe of ordered steps. The recipe runs every time you click Refresh, on whatever data is sitting in the source file. Tomorrow's export goes through the same pipeline as today's. The customer master gets canonicalized. The blanks get filled. The header drift gets handled. And the pivot gets clean data without you babysitting it. The difference is between *cleaning your spreadsheet* and *building a process that cleans your spreadsheet*. What You'll Learn: - The dirty-data problem from Episode 3/4 โ€” why a correct pivot can still produce a wrong report when the source has 5 spellings of US - Power Query as a transformation layer โ€” raw data flows in, the recipe runs, clean data flows out, on every Refresh - Loading data from Text/CSV โ€” Get & Transform โ†’ Transform Data (not Load) opens the Power Query Editor - The three regions of the PQ Editor: Queries panel (left), Data Preview + M formula bar (centre), Applied Steps recipe (right) - The cleanup recipe โ€” Trim, Uppercase, Replace Values for canonicalization, Title-case, then re-assert column types - Each transformation is a separate, inspectable, reversible Applied Step โ€” much better than a giant nested IF formula - Close & Load โ€” the cleaned table loads to a new sheet and stays connected to the source - Refresh All โ€” re-runs the entire pipeline on whatever data is in the source file - Group By โ€” collapses 50 customer rows into 5 country totals, baked into the query (the pivot move, queryside) - Merge โ€” Power Query's join. The chained XLOOKUP from Episode 3 becomes two named Merge steps - The week-2 payoff โ€” when next month's export drops in, the recipe handles known dirty values automatically and surfaces new ones (like MEXICO) as obvious outliers instead of silently aggregating wrong - Why the recipe is the asset, not the cleaned file Timestamps: 0:00 - Intro โ€” Power Query 0:24 - What's in this episode 0:56 - The dirty customer master โ€” 5 spellings of US 1:48 - Why Power Query โ€” the pipeline metaphor 2:27 - Opening the Power Query Editor 3:22 - The cleanup recipe โ€” Trim, Uppercase, Replace, Type, Rename 4:42 - Close & Load + Refresh 5:30 - Group By and Merge โ€” the two finance moves 6:29 - The week-2 payoff 7:23 - Recap โ€” three takeaways 8:01 - Up next ยท Episode 6 ยท Robust Spreadsheets Key Takeaways: 1. Pivots can't fix dirty source data. Power Query is the layer that cleans before the pivot ever sees it. Every transformation is an Applied Step โ€” documented, inspectable, reversible. 2. The recipe is the asset, not the cleaned file. Trim, Uppercase, Replace Values, Set Types, Rename โ€” six steps and the customer master goes from 8 country variants to 4. Click Refresh and the recipe re-runs on whatever data is in the source path. 3. Group By and Merge are the two finance-critical operations. Group By collapses N rows into category totals (the pivot move, queryside). Merge replaces chained XLOOKUP โ€” two named steps instead of a nested formula. 4. The week-2 payoff: next month's export drops into the same path, you click Refresh All, the pipeline runs. Existing dirty values get cleaned. New ones show up as obvious outliers โ€” a new MEXICO column, not a silently aggregated mess. 5. Use a Table (Ctrl+T from Episode 4) as the source whenever possible. Set explicit column types instead of relying on inference. Both kill 90% of "the refresh broke" problems. Next episode: *Robust Spreadsheets โ€” Tables, named ranges, dynamic arrays, the discipline that makes a workbook hand-offable*. Closes Season 1 before we move to Power BI in Season 2. Taught by CelesteAI. #exceltutorial #powerbi #financetraining #financeprofessionals #powerquery #datacleanup #etl #excelforaccountants #cfo #excelandpowerbi #excelforfinance #financialanalysis #atlasparts #mcode #appliedsteps