Pandas for Finance: Dates and the Trading Calendar: .dt, bdate_range, Calendar Gaps
Video: Dates and the Trading Calendar: .dt, bdate_range, Calendar Gaps | Pandas for Finance Ep9 by CelesteAI
pd.to_datetime(s)parses date strings..dt.year,.dt.month,.dt.day_name()extract pieces.pd.bdate_range(start, end)for business days.pd.tseries.offsets.BDay()to add business days. The calendar quirks that bite analysts.
If your Date column is strings, half of pandas’s time superpowers don’t work. Step one: convert to datetime64[ns].
Convert to datetime
import pandas as pd
df = pd.read_parquet("data/prices.parquet")
print(df["Date"].dtype)
# datetime64[ns] — already converted by parquet
If you read a CSV, you’d often get strings:
csv = pd.read_csv("prices.csv")
csv["Date"] = pd.to_datetime(csv["Date"])
For non-standard formats:
pd.to_datetime(s, format="%d/%m/%Y") # explicit format = fastest
pd.to_datetime(s, errors="coerce") # NaT on failure (default raises)
pd.to_datetime(s, dayfirst=True) # "01/02/2024" → Feb 1
errors="coerce" is golden for messy data — bad rows become NaT (Not a Time) instead of crashing.
.dt accessor
A datetime Series has the .dt accessor for components:
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month
df["Day"] = df["Date"].dt.day
df["Weekday"] = df["Date"].dt.day_name()
df["Quarter"] = df["Date"].dt.quarter
df["DayOfYear"] = df["Date"].dt.dayofyear
df["WeekOfYear"] = df["Date"].dt.isocalendar().week
Common ones:
| Accessor | Result |
|---|---|
.year .month .day |
int |
.hour .minute .second |
int |
.day_name() |
“Monday” |
.month_name() |
“January” |
.weekday |
0–6 (Mon=0) |
.dayofyear |
1–366 |
.quarter |
1–4 |
.is_month_end |
bool |
.is_quarter_end |
bool |
.is_year_end |
bool |
Date arithmetic
from datetime import timedelta
# Tomorrow
df["NextDay"] = df["Date"] + timedelta(days=1)
# 30 days from now
df["In30Days"] = df["Date"] + pd.Timedelta(days=30)
# 30 business days from now
from pandas.tseries.offsets import BDay
df["NextMonthBD"] = df["Date"] + BDay(30)
Timedelta for calendar arithmetic; BDay (or BusinessDay) for trading days.
Business day ranges
biz = pd.bdate_range("2024-01-01", "2024-12-31")
print(len(biz)) # 261 trading days in 2024 (excluding weekends only)
bdate_range skips weekends. For US market holidays:
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
us = CustomBusinessDay(calendar=USFederalHolidayCalendar())
trading = pd.bdate_range("2024-01-01", "2024-12-31", freq=us)
print(len(trading)) # 251 — closer to actual NYSE trading days
For NYSE-precise calendars, install exchange_calendars or pandas_market_calendars. Pandas’s USFederalHoliday is close but not identical (NYSE has different holidays from federal: Good Friday, etc.).
Filtering by date
df = df.set_index("Date")
# January 2024
jan = df.loc["2024-01"]
# Q1 2024
q1 = df.loc["2024-01":"2024-03"]
# Year only
y2023 = df.loc["2023"]
DatetimeIndex partial-string indexing is a superpower — pandas resolves "2024-01" to “all of January 2024.”
Detecting gaps
df["AAPL"] = df["Ticker"] == "AAPL"
aapl = df[df["AAPL"]].sort_index()
# Day-over-day gap
aapl["Gap"] = aapl.index.to_series().diff().dt.days
gaps = aapl[aapl["Gap"] > 3] # > 3 days = weekend + Mon holiday at minimum
print(gaps[["Gap"]].head())
The largest gaps are usually 3-day weekends (Mon holiday) or year-ends.
Time zones
For US market data, dates are usually naive (no TZ). For intraday or multi-market analysis, localize:
df.index = df.index.tz_localize("America/New_York")
df.index = df.index.tz_convert("UTC")
Once localized, comparisons across markets are unambiguous.
For our daily AAPL series, naive is fine.
Period vs Timestamp
| Type | Meaning | Example |
|---|---|---|
| Timestamp | Specific instant | 2024-01-15 09:30:00 |
| Period | A span | 2024-01 (the whole month) |
ts = pd.Timestamp("2024-01-15")
p = pd.Period("2024-01", freq="M")
ts in p # True — Jan 15 is in the January period
p.start_time # 2024-01-01 00:00:00
p.end_time # 2024-01-31 23:59:59
For monthly aggregations, periods are cleaner than dates (“the month of Jan 2024” vs “Jan 1, 2024 specifically”).
Converting between
# Timestamp → Period
df["Date"].dt.to_period("M") # Period('2024-01', 'M')
# Period → Timestamp (start)
period_index.to_timestamp() # back to first-of-month Timestamps
Trading calendar filtering
from pandas.tseries.offsets import BDay
# All Mondays in 2024
mondays = pd.bdate_range("2024-01-01", "2024-12-31").to_series()
mondays = mondays[mondays.dt.weekday == 0]
print(mondays.head())
# All month-ends (last business day)
month_ends = pd.bdate_range("2024-01-01", "2024-12-31", freq="BME")
print(month_ends)
Frequency strings: B (business day), M (calendar month), ME (calendar month end), BME (business month end), Q (quarter), BQE (business quarter end), Y (year), BYE (business year end).
Try it
dates.py:
import pandas as pd
pd.set_option("display.float_format", lambda x: f"{x:,.2f}")
df = pd.read_parquet("data/prices.parquet")
aapl = df[df["Ticker"] == "AAPL"].copy().sort_values("Date")
print("=== Date components ===")
aapl["Year"] = aapl["Date"].dt.year
aapl["Month"] = aapl["Date"].dt.month_name()
aapl["Weekday"] = aapl["Date"].dt.day_name()
aapl["Quarter"] = aapl["Date"].dt.quarter
print(aapl[["Date", "Year", "Month", "Weekday", "Quarter"]].head())
print("\n=== Business days in 2024 ===")
biz = pd.bdate_range("2024-01-01", "2024-12-31")
print(f"{len(biz)} business days")
print("\n=== Average return by weekday ===")
aapl["Return"] = aapl["Close"].pct_change()
by_day = aapl.groupby("Weekday")["Return"].mean()
order = ["Monday","Tuesday","Wednesday","Thursday","Friday"]
print(by_day.reindex(order).map(lambda x: f"{x:+.4%}"))
print("\n=== Detecting calendar gaps ===")
aapl_idx = aapl.set_index("Date")
gaps = aapl_idx.index.to_series().diff().dt.days
big_gaps = gaps[gaps > 3]
print(big_gaps.head())
Common stumbles
.dt on a string Series. df["Date"].dt.year errors. pd.to_datetime(df["Date"]) first.
BDay() ignores holidays. It only skips weekends. Use CustomBusinessDay(calendar=USFederalHolidayCalendar()) for US market days, or exchange_calendars for NYSE-exact.
isocalendar() returns a DataFrame. Three columns: year, week, day. To get just week: df["Date"].dt.isocalendar().week.
Time zone confusion. Mix of naive and TZ-aware Timestamps errors on comparison. Pick one and stick with it.
Locale-dependent month names. .month_name() defaults to your locale. Explicit: .month_name(locale="en_US").
MS vs M vs ME confusion. Pandas 2.2+ deprecated some aliases. Use ME (month-end) or MS (month-start) explicitly.
What’s next
Ep 10: resampling — daily → weekly → monthly OHLC.
Recap
pd.to_datetime(s) for parsing; errors="coerce" for messy data. .dt.year, .dt.month, .dt.day_name(), .dt.is_month_end for components. Date arithmetic with Timedelta (calendar) or BDay (trading). pd.bdate_range for trading-day ranges; add CustomBusinessDay(calendar=...) for holidays. Period for time spans, Timestamp for instants. DatetimeIndex partial-string indexing (df.loc["2024-01"]) is a superpower.
Next episode: resampling.