Part of Python for Beginners

Dataclasses (@dataclass, field, frozen, post_init) - Python Tutorial #30

Sandy Lane

Video: Dataclasses (@dataclass, field, frozen, post_init) - Python Tutorial #30, taught by Celeste AI - AI Coding Coach


Python Dataclasses: @dataclass, field, frozen, post_init

@dataclass generates __init__, __repr__, __eq__ from class annotations. field(default_factory=list) for mutable defaults. frozen=True for immutability. __post_init__ for derived state. The right tool for "data-shaped" classes.

For classes whose main job is to bundle data, writing __init__, __repr__, and __eq__ by hand is tedious. @dataclass does it for you.

A regular class, then a dataclass

# Regular class — verbose
class Point:
  def __init__(self, x: float, y: float):
    self.x = x
    self.y = y

  def __repr__(self):
    return f"Point(x={self.x}, y={self.y})"

  def __eq__(self, other):
    if not isinstance(other, Point):
      return NotImplemented
    return self.x == other.x and self.y == other.y

# Dataclass: same behavior
from dataclasses import dataclass

@dataclass
class Point:
  x: float
  y: float

p = Point(3.0, 4.0)
print(p)              # Point(x=3.0, y=4.0)
print(p == Point(3.0, 4.0))   # True

@dataclass reads the class annotations and generates:

  • __init__ accepting all annotated attributes.
  • __repr__ showing class + fields.
  • __eq__ comparing all fields.

Less code, fewer bugs.

Defaults

@dataclass
class User:
  name: str
  email: str
  age: int = 0
  active: bool = True

u = User("Alice", "alice@example.com")
print(u)
# User(name='Alice', email='alice@example.com', age=0, active=True)

Like regular function parameters, defaults must come after non-defaults.
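
A minimal sketch of that ordering rule, using a hypothetical BadUser class that is not part of the lesson. The decorator rejects the class as soon as it is defined, before any instance exists:

```python
from dataclasses import dataclass

error = None
try:
    @dataclass
    class BadUser:  # hypothetical class, for illustration only
        age: int = 0
        name: str   # non-default field placed after a default field
except TypeError as exc:
    # Raised while the class is being defined, not when it is instantiated.
    error = str(exc)

print(error)
```

Moving name above age, or giving name a default, makes the definition valid.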

The mutable default trap

@dataclass
class Team:
  name: str
  members: list = []     # ERROR: same trap as function defaults

# If this were allowed, every instance would share one list:
# Team("X").members is Team("Y").members  → True

@dataclass catches this when the class is defined and raises ValueError: mutable default <class 'list'> for field members is not allowed: use default_factory.
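
To see when the check fires, here is a small sketch that wraps the class definition itself in try/except. BrokenTeam is a hypothetical name chosen for this illustration:

```python
from dataclasses import dataclass

error = None
try:
    @dataclass
    class BrokenTeam:  # hypothetical class, for illustration only
        name: str
        members: list = []   # mutable default: rejected by @dataclass
except ValueError as exc:
    # No instance was ever created; the decorator raised during definition.
    error = str(exc)

print(error)
```

The same check applies to dict and set defaults.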

The fix: field(default_factory=...):

from dataclasses import dataclass, field

@dataclass
class Team:
  name: str
  members: list[str] = field(default_factory=list)

t1 = Team("Backend")
t1.members.append("Alice")
t2 = Team("Frontend")

print(t1.members)    # ['Alice']
print(t2.members)    # []      — separate list

default_factory is called each time an instance is created without a value for that field, so every instance gets its own fresh list.

field() options

field(default=value)            # plain default
field(default_factory=callable) # dynamic default
field(repr=False)               # exclude from __repr__
field(compare=False)            # exclude from __eq__
field(init=False)               # not in __init__
field(metadata={"key": "val"})  # arbitrary metadata

Example:

@dataclass
class Team:
  name: str
  members: list[str] = field(default_factory=list)
  max_size: int = field(default=5, repr=False)

max_size doesn't show in repr. Useful for noisy defaults.
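
The repr and compare options can be sketched together. The Account class and its field names below are assumptions made up for this example:

```python
from dataclasses import dataclass, field

@dataclass
class Account:  # hypothetical class, for illustration only
    username: str
    password_hash: str = field(repr=False)                  # hidden from __repr__
    last_login: float = field(default=0.0, compare=False)   # ignored by __eq__

a = Account("alice", "abc123", last_login=1.0)
b = Account("alice", "abc123", last_login=2.0)

print(a)        # password_hash is left out of the output
print(a == b)   # True: last_login is not part of the comparison
```

Both fields are still ordinary attributes: a.password_hash and a.last_login can be read and assigned as usual.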

Frozen dataclasses

@dataclass(frozen=True)
class Config:
  host: str
  port: int
  debug: bool = False

cfg = Config("localhost", 8080)
cfg.port = 9090   # FrozenInstanceError

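frozen=True makes instances immutable: assigning to or deleting a field raises dataclasses.FrozenInstanceError.

With the default eq=True, a frozen dataclass also gets a generated __hash__, so instances are hashable as long as every field value is hashable. You can use them as dict keys or in sets: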

configs = {Config("a", 1), Config("b", 2)}

For "value objects" — coordinates, IDs, settings — frozen is the right default.
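
A short sketch of both behaviors, reusing the Config class from above: the rejected assignment, and hashing by field values.

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Config:
    host: str
    port: int
    debug: bool = False

cfg = Config("localhost", 8080)

rejected = False
try:
    cfg.port = 9090
except FrozenInstanceError:
    rejected = True
print(rejected)   # True

# Equal field values give equal hashes, so duplicates collapse in a set.
configs = {Config("a", 1), Config("a", 1), Config("b", 2)}
print(len(configs))   # 2

# A frozen instance works as a dict key; an equal instance finds the entry.
labels = {cfg: "local dev"}
print(labels[Config("localhost", 8080)])   # local dev
```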

post_init: derived attributes

@dataclass
class Rectangle:
  width: float
  height: float
  area: float = field(init=False)

  def __post_init__(self):
    self.area = self.width * self.height

r = Rectangle(5.0, 3.0)
print(r.area)    # 15.0

The generated __init__ calls __post_init__ as its last step, after all fields have been assigned. Use it for:

  • Derived attributes (compute from inputs).
  • Validation (raise if invalid).
  • Type conversions.

field(init=False) says "don't take this as an __init__ argument" — needed for derived fields.
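
The type-conversion use has no example above, so here is one possible sketch. The LogFile class and its fields are hypothetical; the idea is to accept loosely typed input and normalize it once, at construction:

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class LogFile:  # hypothetical class, for illustration only
    path: Path
    max_bytes: int

    def __post_init__(self):
        # Dataclasses do not enforce annotations, so callers may pass strings.
        # Normalize both fields to the annotated types.
        self.path = Path(self.path)
        self.max_bytes = int(self.max_bytes)

log = LogFile("logs/app.log", "1024")
print(log.path.name)   # app.log
print(log.max_bytes)   # 1024
```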

Validation in post_init

@dataclass
class Person:
  name: str
  age: int

  def __post_init__(self):
    if self.age < 0:
      raise ValueError(f"Age must be non-negative, got {self.age}")

Common pattern. For more rigorous validation, look at pydantic — full schema validation with similar syntax.
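
A quick sketch of what the caller sees, using the Person class from above. It also shows one limit of the pattern: __post_init__ runs only during construction, so later assignments are not re-checked.

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

    def __post_init__(self):
        if self.age < 0:
            raise ValueError(f"Age must be non-negative, got {self.age}")

message = None
try:
    Person("Alice", -1)
except ValueError as exc:
    message = str(exc)
print(message)   # Age must be non-negative, got -1

bob = Person("Bob", 30)
bob.age = -5      # accepted: validation only ran when bob was created
print(bob.age)    # -5
```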

The main @dataclass options

@dataclass(
  init=True,         # generate __init__ (default)
  repr=True,         # generate __repr__ (default)
  eq=True,           # generate __eq__ (default)
  order=False,       # generate __lt__, __le__, __gt__, __ge__
  frozen=False,      # immutable
  unsafe_hash=False, # generate __hash__ even when not frozen
  slots=False,       # use __slots__ (Python 3.10+)
)
class C:
  ...

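The list above leaves out a few newer options: match_args and kw_only (both 3.10+) and weakref_slot (3.11+).

order=True generates __lt__, __le__, __gt__, and __ge__, which compare instances as tuples of their fields in definition order: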

@dataclass(order=True)
class Score:
  points: int
  name: str

scores = sorted([Score(80, "Bob"), Score(95, "Alice")])
# [Score(points=80, name='Bob'), Score(points=95, name='Alice')]
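
Because the comparison works like a tuple comparison, later fields only matter as tie-breakers. A small sketch with the same Score class (the extra names are made up for the example):

```python
from dataclasses import dataclass

@dataclass(order=True)
class Score:
    points: int
    name: str

# Compared as (points, name) tuples.
print(Score(80, "Bob") < Score(95, "Alice"))   # True: 80 < 95 decides it
print(Score(80, "Bob") < Score(80, "Carl"))    # True: tie on points, "Bob" < "Carl"

best = max([Score(80, "Bob"), Score(95, "Alice"), Score(95, "Zed")])
print(best.name)   # Zed: tie on 95 points, "Zed" > "Alice"
```

Putting the field you want to sort by first in the class body is therefore part of the design.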

slots=True (Python 3.10+)

@dataclass(slots=True)
class Point:
  x: float
  y: float

Generates __slots__ — fixed attribute set, no per-instance __dict__. Smaller memory, slightly faster attribute access. Useful when you create many instances.

Tradeoff: can't add attributes outside the declared set. Usually a non-issue for data classes.

A product catalog

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class Product:
  sku: str
  name: str
  price: float
  tags: list[str] = field(default_factory=list)
  created_at: datetime = field(default_factory=datetime.now)

  def __post_init__(self):
    if self.price < 0:
      raise ValueError("Price cannot be negative")

@dataclass(frozen=True)
class OrderItem:
  product_sku: str
  quantity: int

@dataclass
class Order:
  customer: str
  items: list[OrderItem] = field(default_factory=list)

  @property
  def total_quantity(self) -> int:
    return sum(item.quantity for item in self.items)

Real-world shape: a few classes, validated, immutable where it matters, with a derived property. Compare that to writing all the dunder methods by hand.
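
A short usage sketch for the order classes. It repeats only the OrderItem and Order definitions so the snippet runs on its own; the customer name and SKU values are made up for the example.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class OrderItem:
    product_sku: str
    quantity: int

@dataclass
class Order:
    customer: str
    items: list[OrderItem] = field(default_factory=list)

    @property
    def total_quantity(self) -> int:
        return sum(item.quantity for item in self.items)

order = Order("Alice")                       # items starts as a fresh empty list
order.items.append(OrderItem("SKU-1", 2))
order.items.append(OrderItem("SKU-2", 3))

print(order.total_quantity)                  # 5
print(OrderItem("SKU-1", 2) in order.items)  # True: __eq__ compares field values
```

total_quantity is recomputed on every access, so it stays correct as items are added.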

Inheritance

@dataclass
class Animal:
  name: str
  age: int

@dataclass
class Dog(Animal):
  breed: str

d = Dog("Rex", 5, "Lab")
print(d)    # Dog(name='Rex', age=5, breed='Lab')

Subclasses inherit fields. Constructor args are parent fields first, then child fields.

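If the parent has fields with defaults, a child field without a default raises TypeError when the child class is defined, because the generated __init__ would put a non-default parameter after a default one. That is the same rule as for regular function parameters. Either give the child fields defaults or use kw_only=True (3.10+).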

When to use a dataclass vs alternatives

  • Plain class — when you need significant logic, multiple constructors, complex behavior.
  • NamedTuple — when you need a lightweight immutable record (and accept tuple semantics).
  • TypedDict — when the data is already a dict and you want type checking.
  • pydantic.BaseModel — when you need rigorous validation, JSON schema, parse/serialize.
  • dataclass — the sweet spot: structured data, optional methods, clean syntax.

Common stumbles

Mutable default without field(default_factory=...). members: list = [] raises immediately. Use the factory.

Non-default field after a default field. name: str after age: int = 0 raises TypeError when the class is defined: "non-default argument follows default argument."

@dataclass without annotations. Fields must have type hints, or they're just class attributes (not constructor args). Even Any works as a hint.

Assigning to frozen. cfg.port = 9090 on frozen=True raises. Create a new instance with dataclasses.replace(cfg, port=9090).

Forgetting __post_init__ for validation. Construction succeeds with bad data; problems show up later.

Inheritance default ordering. A child can't have a non-default field after a parent's default field. Use kw_only=True if you can.
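
Two of the stumbles above, sketched with small classes: an unannotated attribute that never becomes a field, and dataclasses.replace as the way to "change" a frozen instance. Settings is a hypothetical class invented for this example.

```python
from dataclasses import dataclass, fields, replace

@dataclass
class Settings:  # hypothetical class, for illustration only
    host: str
    retries = 3   # no annotation: a plain class attribute, not a field

print([f.name for f in fields(Settings)])   # ['host']
s = Settings("localhost")                   # retries is not a constructor argument
print(s.retries)                            # 3, read from the class

@dataclass(frozen=True)
class Config:
    host: str
    port: int
    debug: bool = False

cfg = Config("localhost", 8080)
new_cfg = replace(cfg, port=9090)   # builds a new instance, cfg is untouched
print(cfg.port, new_cfg.port)       # 8080 9090
```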

What's next

Lesson 31: async / await. Concurrency for I/O-bound work. asyncio, async def, await.

Recap

@dataclass generates __init__, __repr__, __eq__ from class annotations. Use field(default_factory=list) for mutable defaults. frozen=True for immutable, hashable instances. __post_init__ for derived attributes and validation. order=True for comparison methods. slots=True (3.10+) for memory efficiency. The right tool for "data with a few methods" — between plain classes and pydantic.
