Part of Pandas for Finance

Pandas for Finance: Pick Tickers and Columns: Brackets, .loc, .iloc, drop, rename

Celest KimCelest Kim

Video: Pick Tickers and Columns: Brackets, .loc, .iloc, drop, rename | Pandas for Finance Ep5 by CelesteAI

Take the quiz on the full lesson page
Test what you've read · interactive walkthrough

df[["Date", "Close"]] selects columns. df.loc[...] selects by label, df.iloc[...] by position. .xs("AAPL", level="Ticker") slices a MultiIndex. df.drop(columns=[...]), df.rename(columns={old: new}), df.reindex(columns=[...]) reorder. The SELECT clause for DataFrames.

Filter rows with [boolean] (Ep 4); pick columns with [["a","b"]] or .loc. Two operations, both essential.

Single column vs DataFrame

import pandas as pd
df = pd.read_parquet("data/prices.parquet")

close = df["Close"]            # Series  — 1D
print(type(close))             # <class 'pandas.core.series.Series'>

close_df = df[["Close"]]       # DataFrame  — 2D, one column
print(type(close_df))          # <class 'pandas.core.frame.DataFrame'>

Single brackets get a Series. Double brackets (a list inside) get a DataFrame. The shape difference matters for some methods.

Multiple columns

clean = df[["Date", "Ticker", "Close", "Volume"]]
print(clean.head())

The order in the list is the order in the result. Use this to drop noise — keep only the columns you need.

.loc — label-based

.loc[rows, cols] selects by labels.

df = df.set_index("Date")

# All columns, January 2024
jan = df.loc["2024-01"]

# Specific columns, January 2024
jan_close = df.loc["2024-01", ["Ticker", "Close"]]

# Range of dates and one column
q1_close = df.loc["2024-01":"2024-03", "Close"]

.loc is your tool when: - You have a meaningful index (dates, tickers). - You want both row-filtering and column-selecting in one expression. - You’re assigning back to the DataFrame (avoids SettingWithCopyWarning).

.iloc — position-based

.iloc[rows, cols] selects by integer position (0-indexed).

# First 5 rows, all columns
print(df.iloc[:5])

# Rows 100-105, columns 0-2
print(df.iloc[100:105, :3])

# Last row
print(df.iloc[-1])

Use .iloc when: - You know “I want rows 100-200” — positional, not labeled. - You want the first N or last N regardless of index. - The index is meaningless (default 0, 1, 2, …).

For prices, .loc is usually what you want.

Drop columns

# Remove Open and High
slim = df.drop(columns=["Open", "High"])

# Remove the Adj Close column
no_adj = df.drop(columns="Adj Close")

drop(columns=[...]) returns a new DataFrame with those columns removed. The original is unchanged unless you reassign.

To drop rows (by label):

df.drop(index="2024-01-02")

Less common — usually you filter rows with boolean indexing.

Rename columns

df = df.rename(columns={"Close": "Price", "Volume": "Vol"})

Pass a dict of {old: new}. Only renames the keys you specify.

To rename everything:

df.columns = ["a", "b", "c", "d", "e", "f"]

By assignment. Order matters; length must match.

Reorder columns

df = df[["Date", "Ticker", "Open", "Close", "High", "Low", "Volume"]]

Just re-select with the order you want.

For “move one column to the front”:

cols = df.columns.tolist()
cols.insert(0, cols.pop(cols.index("Ticker")))
df = df[cols]

A bit clunky; usually re-listing is fine.

.xs — slice a MultiIndex

If your DataFrame still has the wide-format MultiIndex columns from yfinance:

import yfinance as yf
df = yf.download(["AAPL", "MSFT"], start="2024-01-01", end="2024-04-01")

# All AAPL fields
aapl = df.xs("AAPL", level=1, axis=1)

# All Close fields across tickers
all_close = df.xs("Close", level=0, axis=1)

level=0 is the field, level=1 is the ticker (in yfinance’s default layout). axis=1 because the MultiIndex is on columns.

For the rest of this series we use long format (one row per Date, Ticker), so .xs is rare. But it’s invaluable when working with raw yfinance output.

Selecting from a Series

A single column is a Series, and you can slice it:

close = df["Close"]

# By position
close.iloc[:5]

# By date label
close.loc["2024-01-02":"2024-01-05"]

Boolean column selection

Select columns whose names match a pattern:

# Columns containing "Close"
df.loc[:, df.columns.str.contains("Close")]

# Numeric columns only
df.select_dtypes(include="number")

# String columns only
df.select_dtypes(include="object")

.select_dtypes() is great for “give me all the float columns to apply some math to.”

Common patterns

Drop and reset

clean = (
  df.drop(columns=["Adj Close"])
    .rename(columns={"Volume": "Vol"})
    .reset_index(drop=True)
)

Method chaining keeps each step short. reset_index(drop=True) for “give me a clean 0,1,2 index, throw away the old one.”

Long → wide pivot (preview)

wide = df.pivot(index="Date", columns="Ticker", values="Close")

A pivot turns long format back into wide. Date as rows, Ticker as columns, Close in cells. Useful for correlation matrices, side-by-side plots.

We cover this in Ep 7.

Subset for a chart

chart_data = (
  df.query("Ticker == 'AAPL'")
    .set_index("Date")["Close"]
    .loc["2024":]
)
chart_data.plot()

One line: filter, set index, pick column, slice dates, plot. The pipeline analysts run all day.

Try it

pick.py:

import pandas as pd

pd.set_option("display.float_format", lambda x: f"{x:,.2f}")

df = pd.read_parquet("data/prices.parquet")

print("=== Subset of columns ===")
print(df[["Date", "Ticker", "Close", "Volume"]].head())

print("\n=== One ticker, two columns ===")
aapl_close = df[df["Ticker"] == "AAPL"][["Date", "Close"]]
print(aapl_close.head())
print(f"Rows: {len(aapl_close)}")

print("\n=== Date-indexed slice ===")
indexed = df[df["Ticker"] == "AAPL"].set_index("Date")
print(indexed.loc["2024-01", ["Close", "Volume"]].head())

print("\n=== Drop and rename ===")
slim = (
  df.drop(columns=["Adj Close"])
    .rename(columns={"Volume": "Vol"})
)
print(slim.columns.tolist())

print("\n=== Float columns only ===")
floats = df.select_dtypes(include="number")
print(floats.dtypes)

Common stumbles

df["Close"] vs df[["Close"]]. Single brackets → Series. Double brackets → DataFrame. Different shapes; different methods available.

.loc with a string slice. df.loc["2024-01":"2024-03"] — both ends inclusive. (.iloc is exclusive on the end, like Python lists.)

drop not modifying. df.drop(columns=[...]) returns a new DataFrame. Reassign: df = df.drop(...).

rename keeping old name. df.rename(...) returns new DataFrame. Reassign or pass inplace=True (deprecated).

select_dtypes for ambiguous types. object covers strings AND mixed types. For string-only, after pandas 2.0 use include="string" if you’ve cast columns to pd.StringDtype().

Reordering columns and forgetting one. df[["a","b","c"]] drops “d” silently. Use df = df.reindex(columns=[...]) if you want to preserve missing columns as NaN.

What’s next

Ep 6: returns. Daily, log, cumulative. The core finance calculation .pct_change().

Recap

df["col"] for one Series; df[["a","b"]] for a DataFrame. .loc[rows, cols] for label-based; .iloc for position. .xs() slices a MultiIndex. df.drop(columns=[...]), df.rename(columns={old: new}), df.reindex(columns=[...]) for reshaping. .select_dtypes(include="number") to grab all numeric columns at once.

Next episode: returns.

Ready? Take the quiz on the full lesson page →
Test what you've learned. Watch the lesson and try the interactive quiz on the same page.