SQLAlchemy ORM (Models, Relationships, Aggregates) - Python Tutorial for Beginners #36
Video: SQLAlchemy ORM (Models, Relationships, Aggregates) - Python Tutorial for Beginners #36 by Taught by Celeste AI - AI Coding Coach
Python SQLAlchemy ORM: Map Classes to Tables
pip install sqlalchemy. Define classes; SQLAlchemy creates tables.session.add(obj)to insert,session.query(Cls)to read, attribute changes auto-tracked.relationshipfor foreign-key navigation. Works with SQLite, Postgres, MySQL — same code.
An ORM (Object-Relational Mapper) lets you work with database rows as Python objects. Define a class once; SQLAlchemy handles the SQL.
The basics: define, create, query
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import DeclarativeBase, sessionmaker
engine = create_engine("sqlite:///contacts.db", echo=False)
class Base(DeclarativeBase):
pass
class Contact(Base):
__tablename__ = "contacts"
id = Column(Integer, primary_key=True)
name = Column(String, nullable=False)
email = Column(String, nullable=False)
phone = Column(String)
def __repr__(self):
return f"Contact(id={self.id}, name={self.name!r})"
# Create the table
Base.metadata.create_all(engine)
create_engine(url) connects to the DB. URLs:
sqlite:///path.db— SQLite file (3 slashes for relative path).sqlite:///:memory:— in-memory.postgresql://user:pass@host/dbnamemysql+pymysql://user:pass@host/dbname
echo=True logs every SQL statement — useful for learning and debugging.
DeclarativeBase is the parent of your model classes. The class definition becomes a table.
Sessions: the unit of work
Session = sessionmaker(bind=engine)
session = Session()
alice = Contact(name="Alice Smith", email="alice@example.com", phone="555-0101")
session.add(alice)
session.commit()
print(alice.id) # populated after commit — auto-generated
A Session is the conversation with the database. You add new objects, modify existing ones, delete others; commit() writes everything to the DB in one transaction.
After commit, alice.id has the auto-generated value.
Bulk insert
session.add_all([
Contact(name="Bob", email="bob@example.com", phone="555-0102"),
Contact(name="Charlie", email="charlie@example.com"),
Contact(name="Diana", email="diana@example.com"),
])
session.commit()
add_all queues many objects. commit flushes them in one transaction.
Querying
# All
contacts = session.query(Contact).all()
for c in contacts:
print(c.name, c.email)
# By primary key
alice = session.get(Contact, 1)
# Filter
result = session.query(Contact).filter(Contact.email.like("%example.com")).all()
result = session.query(Contact).filter_by(name="Alice Smith").first()
# Order
sorted_contacts = session.query(Contact).order_by(Contact.name).all()
# Limit
top_three = session.query(Contact).limit(3).all()
# Count
n = session.query(Contact).count()
# Aggregate
from sqlalchemy import func
total = session.query(func.count(Contact.id)).scalar()
filter_by(name=...) for keyword-style equality. filter(Contact.name == ...) for general expressions (>, <, like, in_, etc.).
first() returns the first match or None. one() raises if not exactly one. all() returns a list.
Modern 2.0-style query
SQLAlchemy 2.0 prefers a select() API:
from sqlalchemy import select
stmt = select(Contact).where(Contact.email.like("%example.com")).order_by(Contact.name)
contacts = session.scalars(stmt).all()
session.query(...) still works (it's the legacy API). For new code, select(...) is recommended.
Update and delete
# Update by attribute change
alice.phone = "555-9999"
session.commit()
# Update bulk
session.query(Contact).filter(Contact.email.like("%@old.com")).update(
{"email": "new@example.com"}
)
session.commit()
# Delete
session.delete(alice)
session.commit()
The ORM tracks changes automatically — assign to an attribute, commit, done.
Relationships
from sqlalchemy import ForeignKey
from sqlalchemy.orm import relationship
class Author(Base):
__tablename__ = "authors"
id = Column(Integer, primary_key=True)
name = Column(String, nullable=False)
books = relationship("Book", back_populates="author")
class Book(Base):
__tablename__ = "books"
id = Column(Integer, primary_key=True)
title = Column(String, nullable=False)
year = Column(Integer)
author_id = Column(Integer, ForeignKey("authors.id"))
author = relationship("Author", back_populates="books")
ForeignKey("authors.id") declares the FK column.
relationship(...) adds Python-side navigation:
author.books— the books for an author.book.author— the author of a book.
back_populates="..." keeps both sides in sync — modifying one automatically reflects on the other.
Working with relationships
tolkien = Author(name="J.R.R. Tolkien")
tolkien.books = [
Book(title="The Hobbit", year=1937),
Book(title="The Fellowship of the Ring", year=1954),
]
session.add(tolkien)
session.commit()
Adding tolkien cascades — books are inserted too. Their author_id is set automatically.
hobbit = session.query(Book).filter_by(title="The Hobbit").first()
print(hobbit.author.name) # 'J.R.R. Tolkien' — lazy-loaded
# Add another book
tolkien.books.append(Book(title="The Silmarillion", year=1977))
session.commit()
hobbit.author triggers a lazy query to fetch the author. Convenient — sometimes a perf trap (see N+1 below).
Lazy loading and N+1
authors = session.query(Author).all()
for a in authors:
print(a.name, len(a.books)) # one query per author!
For each author, lazy-loading runs a separate query for a.books. With 100 authors → 101 total queries.
Fix with eager loading:
from sqlalchemy.orm import joinedload
authors = session.query(Author).options(joinedload(Author.books)).all()
for a in authors:
print(a.name, len(a.books)) # one query total (with JOIN)
joinedload does a SQL JOIN; selectinload does a separate SELECT IN. Both fix N+1.
For new code:
stmt = select(Author).options(joinedload(Author.books))
authors = session.scalars(stmt).unique().all()
Many-to-many
from sqlalchemy import Table
book_tag = Table(
"book_tag", Base.metadata,
Column("book_id", ForeignKey("books.id"), primary_key=True),
Column("tag_id", ForeignKey("tags.id"), primary_key=True),
)
class Book(Base):
...
tags = relationship("Tag", secondary=book_tag, back_populates="books")
class Tag(Base):
__tablename__ = "tags"
id = Column(Integer, primary_key=True)
name = Column(String, unique=True)
books = relationship("Book", secondary=book_tag, back_populates="tags")
The link table goes via secondary=. Now you can do:
hobbit.tags.append(Tag(name="fantasy"))
session.commit()
Type annotations (modern 2.0)
from sqlalchemy.orm import Mapped, mapped_column
class Author(Base):
__tablename__ = "authors"
id: Mapped[int] = mapped_column(primary_key=True)
name: Mapped[str]
books: Mapped[list["Book"]] = relationship(back_populates="author")
Mapped[type] integrates with type checkers. Cleaner than the legacy Column(...) style. SQLAlchemy 2.0+.
When to use an ORM vs raw SQL
ORM wins:
- CRUD on simple models — much less boilerplate.
- Type-aware: refactoring Python class names propagates.
- Testable: in-memory SQLite for fast tests.
- Cross-database: same code on SQLite, Postgres, MySQL.
Raw SQL wins:
- Complex analytical queries (window functions, CTEs).
- Bulk operations (millions of rows).
- Performance-critical paths.
- When you genuinely know SQL and the ORM gets in the way.
Most projects use both. ORM for normal CRUD; raw SQL via text("...") for the rest:
from sqlalchemy import text
result = session.execute(text("SELECT name FROM authors WHERE id > :n"), {"n": 5})
for row in result:
print(row.name)
Migrations: Alembic
Schema changes over time. Alembic generates migration scripts:
pip install alembic
alembic init migrations
# edit alembic.ini and env.py to point to your database
alembic revision --autogenerate -m "add phone to contacts"
alembic upgrade head
Alembic compares your models to the live schema and generates ALTER TABLE scripts. Essential for production.
A library example
from sqlalchemy import select
from sqlalchemy.orm import joinedload
# All books published after 1950, with their author
stmt = (
select(Book)
.options(joinedload(Book.author))
.where(Book.year > 1950)
.order_by(Book.year)
)
for book in session.scalars(stmt):
print(f"{book.title} ({book.year}) by {book.author.name}")
Reads like English. Behind the scenes, one SQL query with a JOIN.
Common stumbles
Forgetting commit(). Changes aren't persisted. The session tracks them in memory until commit.
Accessing detached objects. After session.close(), lazy-loading attributes raises. Either keep the session open or load eagerly before closing.
N+1 queries. Iterating relationships in a loop without eager loading. Use joinedload or selectinload.
Mixing legacy query() and modern select(). Both work, but pick one for consistency. New code: select().
Schema drift. Editing models without running migrations. The DB and the code disagree. Use Alembic.
Long-running session. Keeping a session for hours leaks memory. One session per request (web app), or per logical unit.
Sharing session across threads. Sessions aren't thread-safe. Use scoped_session or one per thread.
Forgetting back_populates. Without it, modifying one side doesn't update the other in memory until commit + reload.
What's next
Lesson 37: CLI tools — argparse, Click, Rich. Building command-line interfaces in Python.
Recap
Define classes inheriting from Base; columns via Column(Type, ...) (or Mapped[type] in 2.0). Session is the unit of work — add, commit, query. Use select() for new code; query() is legacy. relationship(...) + ForeignKey for navigation between objects. joinedload to avoid N+1. Alembic for migrations. ORM for CRUD; raw SQL via text(...) for analytics.
Next lesson: argparse, Click, Rich.