Part of Pandas for Finance

Pandas for Finance: Resampling: Daily to Weekly OHLC, Monthly Returns, Yearly Matrix

Celest KimCelest Kim

Video: Resampling: Daily to Weekly OHLC, Monthly Returns, Yearly Matrix | Pandas for Finance Ep10 by CelesteAI

Take the quiz on the full lesson page
Test what you've read · interactive walkthrough

df.resample("ME").last() — month-end last value. df.resample("W").agg({"High":"max","Low":"min","Close":"last"}) — weekly OHLC bars from daily. df.shift(252) for year-over-year. The thing that’s brutal in Excel.

Resampling changes the time frequency: daily → weekly → monthly → yearly, or the other direction. It’s the most-used time-series tool in finance.

Setup: DatetimeIndex

Resampling needs a DatetimeIndex.

import pandas as pd

df = pd.read_parquet("data/prices.parquet")
aapl = (
  df[df["Ticker"] == "AAPL"]
    .copy()
    .sort_values("Date")
    .set_index("Date")
)

Now aapl.index is a DatetimeIndex, and resampling works.

Downsample: daily → weekly

weekly_close = aapl["Close"].resample("W").last()
print(weekly_close.head())
Date
2018-01-07     43.41
2018-01-14     44.27
2018-01-21     44.50
2018-01-28     43.82
2018-02-04     38.59

.resample(freq) buckets rows by frequency. .last() aggregates each bucket. Other aggs:

aapl["Close"].resample("W").mean()    # average close per week
aapl["Close"].resample("W").max()     # weekly high based on close
aapl["Volume"].resample("W").sum()    # total weekly volume

Custom OHLC bar

weekly_ohlc = aapl[["Open", "High", "Low", "Close", "Volume"]].resample("W").agg({
  "Open":   "first",
  "High":   "max",
  "Low":    "min",
  "Close":  "last",
  "Volume": "sum",
})
print(weekly_ohlc.tail())

The classic candle bar at a different frequency. Each .agg(dict) keys says how to fold that column.

Frequency aliases

Alias Meaning
"D" daily
"B" business day
"W" weekly (default Sunday end)
"W-FRI" weekly Friday end
"ME" month end
"MS" month start
"BME" business month end
"QE" quarter end
"BQE" business quarter end
"YE" year end

Pandas 2.2+ uses uppercase suffixes like ME, QE, YE. Older code shows M, Q, Y — still works, deprecation warning.

Monthly returns from daily

monthly_close = aapl["Adj Close"].resample("ME").last()
monthly_return = monthly_close.pct_change()
print(monthly_return.tail())

The “what’s my month-over-month return” calculation in two lines.

For multiple tickers:

df = df.set_index("Date").sort_index()

monthly = (
  df.groupby("Ticker")["Adj Close"]
    .resample("ME").last()
)
print(monthly.head())

groupby + resample is a common combo. Result is a Series with MultiIndex (Ticker, Date).

To compute monthly returns within each ticker:

monthly_ret = (
  df.groupby("Ticker")["Adj Close"]
    .resample("ME").last()
    .groupby(level=0)
    .pct_change()
)

groupby(level=0) reuses the Ticker level for the second groupby.

Upsampling: weekly → daily

Less common in finance but shows up:

weekly = aapl["Close"].resample("W").last()
daily = weekly.resample("D").ffill()    # forward-fill: each day = last week's value

Upsampling needs a fill strategy: - ffill() — forward fill (carry last value). - bfill() — backward fill. - interpolate() — linear interpolation.

For prices, ffill is usually right (no trading on the day → carry last close).

Year-over-year

aapl["YoY"] = aapl["Adj Close"].pct_change(252)

.pct_change(N) looks back N rows. With ~252 trading days per year, pct_change(252) is “today vs ~one year ago.”

For period-aligned YoY (calendar year):

yoy = (
  aapl["Adj Close"]
    .resample("YE").last()
    .pct_change()
)
print(yoy)

Calendar-year-end vs prior calendar-year-end.

Quarter-end snapshots

quarterly = (
  df.groupby("Ticker")["Adj Close"]
    .resample("QE").last()
    .unstack("Ticker")
)
print(quarterly.tail())

A snapshot every quarter — what the press release would quote.

Resampling and grouping in one go

df["Date"] = pd.to_datetime(df["Date"])

# Quarter × Ticker close
q = (
  df.groupby([pd.Grouper(key="Date", freq="QE"), "Ticker"])["Adj Close"]
    .last()
    .unstack("Ticker")
)

pd.Grouper(key="Date", freq="QE") is “groupby this column at this frequency.” Useful when the date isn’t the index.

Custom aggregation

# Annualized volatility, computed monthly
monthly_vol = (
  aapl["Close"].pct_change()
    .resample("ME")
    .agg(lambda r: r.std() * (252 ** 0.5))
)
print(monthly_vol.tail())

Standard deviation per month, scaled to annual. Any function that takes a Series and returns a scalar works.

Try it

resample.py:

import pandas as pd

pd.set_option("display.float_format", lambda x: f"{x:,.4f}")

df = pd.read_parquet("data/prices.parquet")
aapl = df[df["Ticker"] == "AAPL"].copy().set_index("Date").sort_index()

print("=== Weekly OHLC ===")
weekly = aapl[["Open", "High", "Low", "Close", "Volume"]].resample("W").agg({
  "Open":  "first",
  "High":  "max",
  "Low":   "min",
  "Close": "last",
  "Volume": "sum",
})
print(weekly.tail())

print("\n=== Monthly returns ===")
monthly_close = aapl["Adj Close"].resample("ME").last()
monthly_ret = monthly_close.pct_change()
print(monthly_ret.tail(8).map(lambda x: f"{x:+.2%}" if pd.notna(x) else "—"))

print("\n=== Yearly returns (all tickers) ===")
yearly = (
  df.set_index("Date").groupby("Ticker")["Adj Close"]
    .resample("YE").last()
    .groupby(level=0)
    .pct_change()
    .dropna()
    .unstack("Ticker")
)
print(yearly.map(lambda x: f"{x:+.2%}" if pd.notna(x) else "—"))

print("\n=== Year-over-year (rolling 252-day) for AAPL ===")
aapl["YoY"] = aapl["Adj Close"].pct_change(252)
print(aapl["YoY"].tail().map(lambda x: f"{x:+.2%}"))

Common stumbles

Resampling without a DatetimeIndex. Errors. df.set_index("Date") first.

M deprecation. Pandas 2.2+ wants ME (month-end) explicit. Old code shows M and works with a warning. Fix: replace MME, QQE, YYE.

groupby + resample creates MultiIndex. Result has (Ticker, Date) as index. .unstack("Ticker") to spread tickers across columns.

first and last on NaN. They skip NaN. Sometimes you want first non-NaN which is the same; for “first regardless,” use .head(1) per group.

Resampling shifts the bucket label. By default month-end goes to the last day of the month. label="left" to put it on the first.

Forward-fill across tickers. Naïve df["Adj Close"].ffill() fills cross-ticker boundaries. Use df.groupby("Ticker")["Adj Close"].ffill() instead.

What’s next

Ep 11: rolling windows — moving averages, Bollinger bands, drawdown.

Recap

df.resample(freq).agg(func) for time-frequency conversion. "ME" "QE" "YE" "W" are the common targets. Custom OHLC bar via agg({"Open":"first","High":"max","Low":"min","Close":"last"}). For multi-ticker, df.groupby("Ticker")["col"].resample(freq).func() then groupby(level=0).pct_change() for within-ticker returns. pd.Grouper(key="Date", freq="QE") if Date isn’t the index.

Next episode: rolling windows.

Ready? Take the quiz on the full lesson page →
Test what you've learned. Watch the lesson and try the interactive quiz on the same page.