Pandas for Finance: Returns: Daily, Wealth Curve, Annualized, Sharpe, Drawdown
Video: Returns: Daily, Wealth Curve, Annualized, Sharpe, Drawdown | Pandas for Finance Ep6 by CelesteAI
df["Close"].pct_change()— daily return.np.log(df["Close"] / df["Close"].shift())— log return.(1 + r).cumprod() - 1— cumulative return / wealth curve. The three return calculations every analyst writes.
A “return” is the percent change in price. Three flavors: simple/daily, log, cumulative. Each has the use case it’s right for.
Daily simple return
import pandas as pd
df = pd.read_parquet("data/prices.parquet")
aapl = df[df["Ticker"] == "AAPL"].copy().sort_values("Date")
aapl["Return"] = aapl["Close"].pct_change()
print(aapl[["Date", "Close", "Return"]].head())
Date Close Return
0 2020-01-02 72.876 NaN
1 2020-01-03 72.167 -0.0097
2 2020-01-06 72.741 0.0080
3 2020-01-07 72.399 -0.0047
4 2020-01-08 73.564 0.0161
pct_change() computes (today - yesterday) / yesterday for the whole column. The first row is NaN (no prior).
For a 5-day return:
aapl["Return5d"] = aapl["Close"].pct_change(5)
pct_change(N) looks back N rows.
Log return
import numpy as np
aapl["LogReturn"] = np.log(aapl["Close"] / aapl["Close"].shift())
shift() lifts the column down one row (so today’s row sees yesterday’s price). Divide, take log.
Simple vs log:
| Simple return | Log return | |
|---|---|---|
| Bound | -100% to ∞ | -∞ to ∞ |
| Multi-period | multiply | add |
| Symmetric | no (–50% then +50% ≠ 0) | yes |
| Standard for | reporting, P&L | math, statistics, vol calc |
For statistical work (vol, Sharpe, etc.), log returns. For “what was the return” reporting, simple. Both come from the same data; pick what fits the context.
Cumulative return
The wealth curve — “what would $1 invested at the start be worth today?”
aapl["CumReturn"] = (1 + aapl["Return"]).cumprod() - 1
(1 + return).cumprod() is the running product of growth factors. Subtract 1 for “return on the original $1.”
For an actual dollar amount:
aapl["Wealth"] = (1 + aapl["Return"].fillna(0)).cumprod()
print(aapl[["Date", "Close", "Wealth"]].tail())
fillna(0) handles the NaN first row (return on day 0 is 0).
Date Close Wealth
1255 2024-12-30 257.16 3.5290
1256 2024-12-31 258.39 3.5460
A dollar in AAPL at start of 2020 is worth ~$3.55 at end of 2024. (Adjusted for splits if you used Adj Close — see “Splits & dividends” below.)
Annualizing
Trading days per year ≈ 252.
mean_daily = aapl["Return"].mean()
std_daily = aapl["Return"].std()
annual_return = (1 + mean_daily) ** 252 - 1
annual_vol = std_daily * (252 ** 0.5)
print(f"Annualized return: {annual_return:.2%}")
print(f"Annualized vol: {annual_vol:.2%}")
Annualized return: 32.45%
Annualized vol: 27.60%
For the Sharpe ratio (risk-adjusted return), assuming a 4% risk-free rate:
sharpe = (annual_return - 0.04) / annual_vol
print(f"Sharpe: {sharpe:.2f}")
The four numbers — annual return, annual vol, max drawdown, Sharpe — are the standard performance summary.
Drawdown
The peak-to-trough decline:
wealth = (1 + aapl["Return"].fillna(0)).cumprod()
peak = wealth.expanding().max()
drawdown = wealth / peak - 1
print(f"Max drawdown: {drawdown.min():.2%}")
print(f"Max DD date: {drawdown.idxmin()}")
expanding().max() is the running peak. wealth / peak - 1 is the percent below peak today. The minimum is the worst.
All tickers at once
df = df.sort_values(["Ticker", "Date"])
df["Return"] = df.groupby("Ticker")["Close"].pct_change()
groupby("Ticker") ensures the percent change resets at each ticker boundary — Apple’s first day’s NaN doesn’t reach back to the previous ticker’s last day.
# Annualized return per ticker
annual = (
df.groupby("Ticker")["Return"]
.agg(lambda r: (1 + r.mean()) ** 252 - 1)
.sort_values(ascending=False)
)
print(annual)
A leaderboard in 4 lines.
Splits & dividends — Adj Close
Yahoo gives you both Close and Adj Close:
- Close — the actual closing price.
- Adj Close — adjusted for splits and dividends, so historical comparisons are valid.
For returns, always use Adj Close. Otherwise a 4:1 split looks like a 75% drop.
df["Return"] = df.groupby("Ticker")["Adj Close"].pct_change()
For our parquet (regenerated by scripts/regenerate-cache.py), Adj Close is preserved.
Try it
returns.py:
import numpy as np
import pandas as pd
pd.set_option("display.float_format", lambda x: f"{x:,.4f}")
df = pd.read_parquet("data/prices.parquet").sort_values(["Ticker", "Date"])
df["Return"] = df.groupby("Ticker")["Adj Close"].pct_change()
aapl = df[df["Ticker"] == "AAPL"].copy()
print("=== Daily returns (head) ===")
print(aapl[["Date", "Adj Close", "Return"]].head())
print("\n=== Wealth curve ===")
aapl["Wealth"] = (1 + aapl["Return"].fillna(0)).cumprod()
print(aapl[["Date", "Wealth"]].tail())
print("\n=== Stats ===")
mean_daily = aapl["Return"].mean()
std_daily = aapl["Return"].std()
annual_return = (1 + mean_daily) ** 252 - 1
annual_vol = std_daily * (252 ** 0.5)
sharpe = (annual_return - 0.04) / annual_vol
print(f"Annualized return: {annual_return:.2%}")
print(f"Annualized vol: {annual_vol:.2%}")
print(f"Sharpe (rf=4%): {sharpe:.2f}")
print("\n=== Drawdown ===")
peak = aapl["Wealth"].expanding().max()
dd = aapl["Wealth"] / peak - 1
print(f"Max drawdown: {dd.min():.2%}")
print(f"Date: {aapl.loc[dd.idxmin(), 'Date']}")
print("\n=== All tickers, annualized return ===")
ann = (
df.groupby("Ticker")["Return"]
.agg(lambda r: (1 + r.mean()) ** 252 - 1)
.sort_values(ascending=False)
)
print(ann)
Common stumbles
pct_change across tickers without groupby. If you don’t groupby, the percent change between Ticker A’s last row and Ticker B’s first row is nonsense. Always df.groupby("Ticker")["Close"].pct_change() for long format.
Forgetting to sort. pct_change follows row order. Sort by Date (and by Ticker before that for long format) before computing.
Using Close instead of Adj Close. Splits and dividends crater your historical returns. Always Adj Close for performance metrics; Close for “what did it actually trade at.”
252 vs 365. 252 trading days per year, not 365. Calendar-day annualization understates vol.
Cumulative return without fillna(0). First row NaN poisons the whole cumprod. fillna(0) (or dropna() first).
Comparing Sharpes across periods. Sharpe assumes normal returns and a stable rf. Use rolling Sharpe for a fairer view.
Geometric vs arithmetic mean. (1+r).mean() ** N - 1 is wrong; you want geometric mean: (1+r).prod() ** (1/N) - 1 or use cumprod tail.
What’s next
Ep 7: groupby on time. Monthly stats, year-over-year, the analyst’s pivot table.
Recap
pct_change() for daily returns; groupby("Ticker").pct_change() for long format. (1 + r).cumprod() - 1 for cumulative return. np.log(close / close.shift()) for log returns (use for math, vol). Annualize with * 252 (return) or * sqrt(252) (vol). Drawdown = wealth / wealth.expanding().max() - 1. Always Adj Close for performance.
Next episode: groupby on time.