Dates and the Trading Calendar: .dt, bdate_range, Calendar Gaps | Pandas for Finance Ep9
0views
C
CelesteAI
Description
Episode 9 of Pandas for Finance. The calendar quirks that bite finance scripts.
Source code: https://github.com/GoCelesteAI/pandas-for-finance
If your Date column is a string, half of pandas's time superpowers don't work. Step one: pd.to_datetime. Once the dtype is datetime64[ns], the .dt accessor unlocks year, month, weekday, quarter, day-of-year, and a dozen .is_*_end booleans for free. The episode walks through Apple's weekday returns, the count of business days in 2024, and a sneaky-clean trick — diffing dates and looking for jumps over 3 days reveals every Monday market holiday in the dataset.
What You'll Build:
- dates.py — load AAPL, extract Year/Weekday/Quarter via .dt, then group by Weekday and rank the daily returns Mon-Fri.
- A trading-day count for 2024 with pd.bdate_range (262 weekend-only business days; about 10 fewer once you swap in CustomBusinessDay(calendar=USFederalHolidayCalendar())).
- A holiday-detection trick: index.to_series().diff().dt.days.gt(3) flags every long weekend in the dataset — MLK, Presidents, Good Friday, Memorial, Labor Day all visible.
- The reorder pattern: groupby("Weekday").mean().reindex(["Monday","Tuesday",...]) so the days print in calendar order, not alphabetical.
Timestamps:
0:00 - Intro — Episode 9 starts here
0:19 - Preview — .dt, bdate_range, gaps
0:58 - Open nvim, write dates.py
1:12 - .dt accessor: year, weekday, quarter
1:24 - Average return by weekday
1:38 - Business-day calendar
1:51 - Calendar gaps = Monday holidays
2:04 - Save and run
2:10 - Date components output
2:22 - Weekday returns
2:37 - Trading days in 2024
2:50 - Recap
3:29 - End screen
Key Takeaways:
1. pd.to_datetime is the gateway. .dt.year, .dt.month, .dt.day_name(), .dt.is_month_end — none of it works while the dtype is object/string. Convert first, then everything else flows. For messy dates, errors="coerce" turns parse failures into NaT instead of raising; for non-standard formats, pass format="%d/%m/%Y" for a 5-10x speed boost over the default infer-from-each-value behavior.
2. .dt is the analyst's date-piece extractor. .dt.year, .dt.month, .dt.day_name() for ordinary components. .dt.quarter, .dt.dayofyear, .dt.weekday (0=Mon) for less obvious ones. .dt.is_month_end, .dt.is_quarter_end, .dt.is_year_end for the booleans that drive end-of-period rebalancing logic. Every analyst report that buckets by "month" or "quarter" runs through this accessor.
3. pd.bdate_range(start, end) for business-day ranges. 262 weekdays in 2024. For US market days specifically, swap to CustomBusinessDay(calendar=USFederalHolidayCalendar()) to drop ~10 federal holidays — the result is close to NYSE but not identical (NYSE has Good Friday off, federal doesn't). For exact NYSE/CME calendars install pandas_market_calendars or exchange_calendars. Ten holidays sounds small until your annualization is off by 4%.
4. Calendar gaps reveal market holidays. index.to_series().diff().dt.days.gt(3) — any consecutive trading days more than 3 apart are a Monday holiday (Sat + Sun + Mon = 3 missing days, so Tuesday's diff to Friday is 4). It's a one-line audit that catches missing data too: a 6-day gap means you're missing a session.
5. Period vs Timestamp matter for monthly logic. A Timestamp is a specific instant; a Period is a span. df["Date"].dt.to_period("M") converts each date to "the month it belongs to" (Period('2024-01', 'M')), which is the right type for monthly aggregations and renames. For day-level operations, stick with Timestamp; for span-level, use Period. Mixing them is the sneakiest bug in time-series code.
#Pandas #Python #Finance #Datetime #Calendar #BusinessDays #DataAnalytics #PythonForFinance #LearnPandas #DataFrame
---
Generated by GoCelesteAI · part of the Pandas for Finance series