Pandas for Finance: Capstone Backtest: Equal-Weight Portfolio vs SPY in Pandas
Video: Capstone Backtest: Equal-Weight Portfolio vs SPY in Pandas | Pandas for Finance Ep14 by CelesteAI
Equal-weight 5 stocks, monthly rebalanced. Compute the wealth curve, Sharpe, max drawdown, vs SPY. The “I want to learn pandas to build a backtester” moment — assembled from everything in the previous 13 episodes.
This is the capstone: load → clean → compute returns → build a portfolio → benchmark → report. Real, end-to-end.
The setup
import pandas as pd
PORTFOLIO = ["AAPL", "MSFT", "GOOGL", "AMZN", "NVDA"]
BENCHMARK = "SPY"
RF = 0.04 # risk-free rate, for Sharpe
df = pd.read_parquet("data/prices.parquet")
df["Date"] = pd.to_datetime(df["Date"])
df = df.sort_values(["Ticker", "Date"])
Five mega-caps. Equal weight (20% each). Monthly rebalance back to equal weight.
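If you don't have data/prices.parquet on hand, here's a minimal sketch that synthesizes a long-format frame with the same schema (random-walk prices; the drift and volatility numbers are made up, not from the series):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
dates = pd.bdate_range("2018-01-02", "2024-12-31")
tickers = ["AAPL", "MSFT", "GOOGL", "AMZN", "NVDA", "SPY"]
frames = []
for ticker in tickers:
    # Geometric random walk starting at 100 (drift/vol are assumptions)
    steps = rng.normal(0.0005, 0.02, len(dates))
    prices = 100 * np.exp(np.cumsum(steps))
    frames.append(pd.DataFrame({"Date": dates, "Ticker": ticker, "Adj Close": prices}))
df = pd.concat(frames, ignore_index=True)
print(df.shape)
# df.to_parquet("data/prices.parquet")  # needs pyarrow and an existing data/ directory
```

Same three columns as the real file, so every snippet below runs against it unchanged.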
Pivot to wide
For portfolio math, wide format is easier — one column per ticker:
wide = (
df[df["Ticker"].isin(PORTFOLIO + [BENCHMARK])]
.pivot(index="Date", columns="Ticker", values="Adj Close")
.ffill()
)
print(wide.tail(2))

Ticker AAPL AMZN GOOGL MSFT NVDA SPY
Date
2024-12-30 252.20 219.39 191.59 416.76 134.29 586.08
2024-12-31 250.42 219.39 189.30 421.50 134.29 586.08
Each column is a ticker; index is date.
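On a toy frame you can see exactly what pivot does (tickers and prices here are illustrative, not from the dataset):

```python
import pandas as pd

# Toy long-format frame: two tickers, two dates
long_df = pd.DataFrame({
    "Date": pd.to_datetime(["2024-01-02", "2024-01-02", "2024-01-03", "2024-01-03"]),
    "Ticker": ["AAPL", "MSFT", "AAPL", "MSFT"],
    "Adj Close": [185.0, 370.0, 186.0, 372.0],
})
# Four long rows become a 2x2 grid: one row per date, one column per ticker
wide = long_df.pivot(index="Date", columns="Ticker", values="Adj Close")
print(wide)
```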
Daily returns
returns = wide.pct_change().dropna()
print(returns.head())
pct_change() works on each column. Drop the first row (NaN).
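To see the arithmetic pct_change() performs, a tiny hand-checkable series:

```python
import pandas as pd

prices = pd.Series([100.0, 102.0, 99.96])
returns = prices.pct_change()
print(returns)
# NaN, then (102 - 100) / 100 = 0.02, then (99.96 - 102) / 102 = -0.02
```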
Equal-weight portfolio
weights = pd.Series(1 / len(PORTFOLIO), index=PORTFOLIO)
print(weights)
# AAPL 0.2
# AMZN 0.2
# GOOGL 0.2
# MSFT 0.2
# NVDA 0.2
portfolio_returns = (returns[PORTFOLIO] * weights).sum(axis=1)
Multiply each column by its weight (the Series broadcasts across rows), then sum across columns. The result is the portfolio's daily return: long-only, no leverage.
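The broadcast is easy to verify on a toy frame; with equal weights, the weighted sum is just the row-wise mean:

```python
import pandas as pd

returns = pd.DataFrame(
    {"AAPL": [0.01, -0.02], "MSFT": [0.03, 0.00]},
    index=pd.to_datetime(["2024-01-02", "2024-01-03"]),
)
weights = pd.Series(0.5, index=["AAPL", "MSFT"])
port = (returns * weights).sum(axis=1)  # weights broadcast across every row
print(port)
# 0.5 * 0.01 + 0.5 * 0.03 = 0.02, then 0.5 * -0.02 + 0.5 * 0.00 = -0.01
```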
Add the benchmark
benchmark_returns = returns[BENCHMARK]
combined = pd.DataFrame({
"Portfolio": portfolio_returns,
"Benchmark": benchmark_returns,
})
print(combined.head())
Two return streams aligned by date.
Wealth curves
combined["PortfolioWealth"] = (1 + combined["Portfolio"]).cumprod()
combined["BenchmarkWealth"] = (1 + combined["Benchmark"]).cumprod()
print(combined[["PortfolioWealth", "BenchmarkWealth"]].iloc[[0, -1]])
            PortfolioWealth  BenchmarkWealth
Date
2018-01-04           1.0000           1.0000
2024-12-31           5.7124           2.0892
Five-bagger vs roughly doubling. Dramatic — but heavily weighted toward the post-2019 tech rally.
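The cumprod compounding is worth checking by hand once (toy returns, not from the dataset):

```python
import pandas as pd

r = pd.Series([0.10, -0.05, 0.02])
wealth = (1 + r).cumprod()
print(wealth)
# 1.10, then 1.10 * 0.95 = 1.045, then 1.045 * 1.02 = 1.0659
```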
Performance metrics
def stats(returns: pd.Series, rf: float = 0.04) -> dict:
mu = returns.mean()
sigma = returns.std()
ann_return = (1 + mu) ** 252 - 1
ann_vol = sigma * (252 ** 0.5)
sharpe = (ann_return - rf) / ann_vol if ann_vol > 0 else float("nan")
wealth = (1 + returns.fillna(0)).cumprod()
peak = wealth.expanding().max()
drawdown = wealth / peak - 1
max_dd = drawdown.min()
return {
"AnnReturn": ann_return,
"AnnVol": ann_vol,
"Sharpe": sharpe,
"MaxDD": max_dd,
}
port_stats = stats(combined["Portfolio"], rf=RF)
bench_stats = stats(combined["Benchmark"], rf=RF)
report = pd.DataFrame({"Portfolio": port_stats, "Benchmark": bench_stats}).round(4)
print(report)
           Portfolio  Benchmark
AnnReturn     0.2876     0.1132
AnnVol        0.2455     0.1820
Sharpe        1.0100     0.4000
MaxDD        -0.3540    -0.3392
The portfolio crushed the benchmark on return, with ~35% more volatility, and still wins on Sharpe. Max drawdown is ~35% in both (the 2022 bear market).
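The drawdown logic inside stats() is easiest to trust after tracing it on four points (toy wealth values):

```python
import pandas as pd

wealth = pd.Series([1.0, 1.2, 0.9, 1.1])
peak = wealth.expanding().max()   # running high-water mark: 1.0, 1.2, 1.2, 1.2
drawdown = wealth / peak - 1
print(drawdown.min())
# Worst point: 0.9 / 1.2 - 1 = -0.25
```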
Monthly rebalancing
Daily rebalancing back to equal weight is unrealistic. Real portfolios rebalance monthly or quarterly:
# Resample to month-end
monthly_close = wide[PORTFOLIO].resample("ME").last()
monthly_returns_per_stock = monthly_close.pct_change().dropna()
# Each month, equal weights
monthly_port_return = (monthly_returns_per_stock * weights).sum(axis=1)
monthly_wealth = (1 + monthly_port_return).cumprod()
print(monthly_wealth.tail())
Result is a monthly-rebalanced portfolio: at each month-end, sell winners, buy losers, return to 20% each.
In practice, monthly rebalancing modestly underperforms buy-and-hold in trending markets and modestly outperforms it in mean-reverting ones. Over our seven years, the two come out roughly the same.
Buy-and-hold (no rebalance)
# Each stock's wealth curve, equally-weighted on day 1 only
initial_value = 1 / len(PORTFOLIO)
stock_wealth = wide[PORTFOLIO].divide(wide[PORTFOLIO].iloc[0])
buy_hold = stock_wealth.sum(axis=1) * initial_value
print(buy_hold.tail())
Day 1: each stock gets 20% of $1. Then let it ride. Tickers that grew faster end up with larger weights (NVDA dominates by 2024).
Compare to equal-weight monthly:
print("Buy-and-hold final:", buy_hold.iloc[-1])
print("Monthly EW final:", (1 + monthly_port_return).cumprod().iloc[-1])
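You can quantify the weight drift directly from the wealth curves; a two-stock toy example (the 8x and 2x growth numbers are assumptions for illustration):

```python
import pandas as pd

# Toy wealth-per-dollar curves for two buy-and-hold positions (values assumed)
stock_wealth = pd.DataFrame({"NVDA": [1.0, 8.0], "AAPL": [1.0, 2.0]})
end_weights = stock_wealth.iloc[-1] / stock_wealth.iloc[-1].sum()
print(end_weights)
# NVDA drifts from a 50% starting weight to 80% of the portfolio
```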
Per-ticker contribution
Where did the returns come from?
contributions = stock_wealth.iloc[-1] - 1
contributions = contributions / contributions.sum()
print((contributions * 100).round(1))
Spoiler: NVDA, then AAPL, then MSFT.
The full backtest in 50 lines
import pandas as pd
PORTFOLIO = ["AAPL", "MSFT", "GOOGL", "AMZN", "NVDA"]
BENCHMARK = "SPY"
RF = 0.04
# Load + pivot
df = pd.read_parquet("data/prices.parquet")
df["Date"] = pd.to_datetime(df["Date"])
wide = (
df[df["Ticker"].isin(PORTFOLIO + [BENCHMARK])]
.pivot(index="Date", columns="Ticker", values="Adj Close")
.ffill()
)
# Returns
returns = wide.pct_change().dropna()
# Equal-weight portfolio
weights = pd.Series(1 / len(PORTFOLIO), index=PORTFOLIO)
port_ret = (returns[PORTFOLIO] * weights).sum(axis=1)
bench_ret = returns[BENCHMARK]
def stats(r, rf=RF):
mu, sd = r.mean(), r.std()
ann_r = (1 + mu) ** 252 - 1
ann_v = sd * (252 ** 0.5)
wealth = (1 + r.fillna(0)).cumprod()
dd = (wealth / wealth.expanding().max() - 1).min()
return {
"AnnReturn": f"{ann_r:.2%}",
"AnnVol": f"{ann_v:.2%}",
"Sharpe": f"{(ann_r - rf) / ann_v:.2f}",
"MaxDD": f"{dd:.2%}",
}
report = pd.DataFrame({
"Portfolio": stats(port_ret),
"Benchmark (SPY)": stats(bench_ret),
})
print(report)
# Wealth curves to CSV for plotting
out = pd.DataFrame({
"Portfolio": (1 + port_ret).cumprod(),
"Benchmark": (1 + bench_ret).cumprod(),
})
out.to_csv("backtest_wealth.csv")
print("Wrote backtest_wealth.csv")
End-to-end. Load, pivot, weight, aggregate, report, save. The 50 lines that replace 200 lines of Excel and 50 SQL queries.
Try it
backtest.py:
import pandas as pd
PORTFOLIO = ["AAPL", "MSFT", "GOOGL", "AMZN", "NVDA"]
BENCHMARK = "SPY"
RF = 0.04
df = pd.read_parquet("data/prices.parquet")
df["Date"] = pd.to_datetime(df["Date"])
wide = (
df[df["Ticker"].isin(PORTFOLIO + [BENCHMARK])]
.pivot(index="Date", columns="Ticker", values="Adj Close")
.ffill()
)
returns = wide.pct_change().dropna()
weights = pd.Series(1 / len(PORTFOLIO), index=PORTFOLIO)
port_ret = (returns[PORTFOLIO] * weights).sum(axis=1)
bench_ret = returns[BENCHMARK]
def stats(r, rf=RF):
mu, sd = r.mean(), r.std()
ann_r = (1 + mu) ** 252 - 1
ann_v = sd * (252 ** 0.5)
wealth = (1 + r.fillna(0)).cumprod()
dd = (wealth / wealth.expanding().max() - 1).min()
return {
"AnnReturn": f"{ann_r:.2%}",
"AnnVol": f"{ann_v:.2%}",
"Sharpe": f"{(ann_r - rf) / ann_v:.2f}",
"MaxDD": f"{dd:.2%}",
}
report = pd.DataFrame({
"Portfolio": stats(port_ret),
"Benchmark (SPY)": stats(bench_ret),
})
print(report)
out = pd.DataFrame({
"Portfolio": (1 + port_ret).cumprod(),
"Benchmark": (1 + bench_ret).cumprod(),
})
out.to_csv("/tmp/backtest_wealth.csv")
print("Wrote /tmp/backtest_wealth.csv")
Common stumbles
Look-ahead bias. Using day N’s close to “buy at day N’s open” is impossible in live trading. Lag execution by one day if simulating fills.
Survivorship bias. Our universe excludes tickers that delisted/went bankrupt. Real backtests need a survivor-free universe.
Ignoring transaction costs. Daily rebalancing in the real world costs basis points. Add a fee per turnover.
Pivot misalignment. If a ticker doesn’t trade on a date, its price is NaN in wide, so its return is NaN and dropna() silently discards that entire date. ffill() prevents this.
Annualizing wrong. mu * 252 for simple returns is only an approximation; the geometric version, (1 + mu) ** 252 - 1, is what the code above uses.
Starting from 0% return on day 1. First row’s pct_change is NaN. dropna() or fillna(0).
Forgetting the benchmark. Always compare to a passive benchmark. Outperforming SPY may mean you earned alpha (or just took more risk); merely matching it means you paid trading costs to replicate an ETF.
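For the transaction-cost point, a minimal sketch of charging a fee on turnover at each rebalance. Everything here is an assumption for illustration: the 10 bp rate, the two-stock returns, and the "restore equal weights each period" turnover definition:

```python
import pandas as pd

FEE = 0.0010  # assumed cost: 10 bp per unit of turnover

# Toy monthly returns for two stocks held at equal target weights
rets = pd.DataFrame({"A": [0.10, -0.05], "B": [0.00, 0.02]})
target = pd.Series(0.5, index=["A", "B"])

gross = (rets * target).sum(axis=1)
# Weights drift with returns; turnover is the trading needed to restore targets
grown = target * (1 + rets)
drifted = grown.div(grown.sum(axis=1), axis=0)
turnover = (drifted - target).abs().sum(axis=1)
net = gross - FEE * turnover
print(pd.DataFrame({"gross": gross, "turnover": turnover, "net": net}))
```

Costs only ever subtract, so the net return stream sits at or below the gross one; with equal weights the drag is small, but it compounds over years.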
What’s next
Ep 15: pandas + DuckDB at scale. When pandas slows on big universes, hand off SQL to DuckDB.
Recap
End-to-end backtest in pandas: load → pivot wide → daily returns → weighted portfolio return → wealth curve → Sharpe + drawdown. (returns[tickers] * weights).sum(axis=1) for daily portfolio return. (1 + r).cumprod() for wealth. wealth / wealth.expanding().max() - 1 for drawdown. Always benchmark against a passive ETF (SPY, ACWI, AGG) — beating the market is the only meaningful claim.
Next episode: pandas + DuckDB at scale.