Resampling: Daily to Weekly OHLC, Monthly Returns, Yearly Matrix | Pandas for Finance Ep10
0views
C
CelesteAI
Description
Episode 10 of Pandas for Finance. Time-frequency conversion — the most-used time-series tool in finance.
Source code: https://github.com/GoCelesteAI/pandas-for-finance
df.resample("W").agg({"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}) builds a weekly OHLC bar from daily prices in one line. df["Adj Close"].resample("ME").last().pct_change() gives you monthly returns. And df.groupby("Ticker")["Adj Close"].resample("YE").last().groupby(level=0).pct_change().unstack("Ticker") produces a year-by-year, ticker-by-ticker return matrix you'd recognize from any Bloomberg page. The thing that's brutal in Excel reduces to one chained line in pandas.
What You'll Build:
- resample.py — load the cached prices, set Date as the index, build a weekly OHLC bar from daily, then monthly returns, then a year-by-year return matrix across all 14 tickers.
- The OHLC aggregation pattern: .resample("W").agg({"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}) — each column folds with its own function.
- Monthly returns at month-end: .resample("ME").last().pct_change() — the standard month-over-month report shape.
- A 7-row × 14-column yearly return matrix: groupby + resample + groupby level 0 + pct_change + unstack — the chained pipeline that produces what every annual review prints.
Timestamps:
0:00 - Intro — Episode 10 starts here
0:19 - Preview — daily to weekly to monthly
0:58 - Open nvim, write resample.py
1:14 - Daily to weekly OHLC
1:30 - Monthly returns
1:43 - Yearly returns across 14 tickers
2:16 - Save and run
2:24 - Weekly OHLC tail
2:37 - Monthly returns
2:47 - Yearly leaderboard
3:08 - Recap
3:50 - End screen
Key Takeaways:
1. Resampling needs a DatetimeIndex. The first line of every resampling script: df.set_index("Date").sort_index(). Without that, .resample(...) errors. With it, every .resample(freq) call buckets rows by the frequency and lets you aggregate per bucket. Frequency aliases: D daily, W weekly, ME month-end, QE quarter-end, YE year-end. Pandas 2.2+ wants the uppercase suffix forms (ME not M); old code still works with a deprecation warning.
2. OHLC at any frequency = .resample(freq).agg(dict). Pass {"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"} and each column folds with its own rule. The same pattern produces weekly bars from daily, monthly bars from weekly, or quarterly bars from any of those — change the freq string, the agg dict stays the same.
3. Monthly returns: .resample("ME").last().pct_change(). Last close of each month, then percent change vs the prior month. Two lines for month-over-month — the standard analyst question. For year-over-year on a row-by-row basis, .pct_change(252) looks back ~one trading year. For period-aligned YoY (calendar year vs prior calendar year), .resample("YE").last().pct_change() is cleaner.
4. Multi-ticker: groupby("Ticker") → resample → groupby(level=0) → pct_change. The groupby preserves the Ticker boundary so percent change doesn't bleed across tickers. The result is a MultiIndex Series; .unstack("Ticker") lifts the ticker level to columns and you have a 7-row × 14-ticker matrix in one chained line.
5. pd.Grouper(key="Date", freq="QE") when Date isn't the index. If you don't want to set the index, groupby([pd.Grouper(key="Date", freq="QE"), "Ticker"]) does both bucketing and grouping in one step. Useful for ad-hoc reports where you want the Date column kept around. For most production scripts, set the index up front and .resample directly — fewer keystrokes.
#Pandas #Python #Finance #Resample #OHLC #DataAnalytics #PythonForFinance #LearnPandas #DataFrame #TimeSeries
---
Generated by GoCelesteAI · part of the Pandas for Finance series