List Comprehensions, Dict Comprehensions & Generator Expressions - Python Tutorial #14
Video: List Comprehensions, Dict Comprehensions & Generator Expressions - Python Tutorial #14 by Taught by Celeste AI - AI Coding Coach
Python Comprehensions: list, dict, set, and generator expressions
[expr for x in iterable]: list. {k: v for ... in ...}: dict. {expr for ... in ...}: set. (expr for ... in ...): generator (lazy). Compact, Pythonic, and usually faster than equivalent loops.
Comprehensions are Python's signature feature. Once you internalize them, half the loops in your code disappear.
List comprehension basics
squares = [x ** 2 for x in range(6)]
# [0, 1, 4, 9, 16, 25]
doubles = [x * 2 for x in range(1, 6)]
# [2, 4, 6, 8, 10]
[expression for variable in iterable]. Read left-to-right: "the expression, for each variable in the iterable."
The loop equivalent:
squares = []
for x in range(6):
    squares.append(x ** 2)
Three lines → one. Less ceremony, fewer chances to typo.
With a condition (filter)
evens = [x for x in range(10) if x % 2 == 0]
# [0, 2, 4, 6, 8]
passing = [s for s in [45, 78, 92, 33, 88] if s >= 60]
# [78, 92, 88]
if filters — keep only values where the condition is true. Goes at the end:
[expression for variable in iterable if condition]
With if/else (transform)
labels = [f"{x} even" if x % 2 == 0 else f"{x} odd" for x in range(6)]
# ['0 even', '1 odd', '2 even', '3 odd', '4 even', '5 odd']
grades = ["pass" if s >= 60 else "fail" for s in [85, 42, 90, 55]]
# ['pass', 'fail', 'pass', 'fail']
Note where the condition goes — this is a different kind of if. Filter if goes at the end. Ternary if/else is part of the expression itself, before the for.
# Filter: [x for x in xs if x > 0] — drops negatives
# Transform: [x if x > 0 else 0 for x in xs] — replaces negatives with 0
Both: [a if c else b for x in xs if d].
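As a small sketch of the combined form, the example below uses a made-up `readings` list (not from the lesson). The trailing filter drops the missing entries, and the ternary then clamps negatives to zero:

```python
# Hypothetical sample data: None marks a missing reading.
readings = [3, -1, None, 7, -4, 0]

# The trailing `if` removes None values (filter);
# the ternary before `for` replaces non-positive values with 0 (transform).
cleaned = [r if r > 0 else 0 for r in readings if r is not None]
print(cleaned)  # [3, 0, 7, 0, 0]
```

The filter runs first for each element, so the ternary never sees `None`.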
Nested loops
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
flat = [x for row in matrix for x in row]
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
Multiple for clauses iterate left-to-right (outer to inner):
# Equivalent loop
flat = []
for row in matrix:
    for x in row:
        flat.append(x)
Nested comprehension
grid = [[i * 3 + j for j in range(3)] for i in range(3)]
# [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
A list comprehension whose expression is itself a list comprehension. Builds a 2D matrix.
These are powerful but get hard to read fast. Three nested levels → use a function.
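One possible way to "use a function" is to give the inner comprehension a name. The helper names below (`table_row`, `multiplication_table`) are illustrative assumptions, not part of the lesson:

```python
def table_row(i, size):
    """Return row i of a multiplication table (1-based)."""
    return [i * j for j in range(1, size + 1)]


def multiplication_table(size):
    """Build the full table one named row at a time."""
    return [table_row(i, size) for i in range(1, size + 1)]


table = multiplication_table(3)
print(table)  # [[1, 2, 3], [2, 4, 6], [3, 6, 9]]
```

Each comprehension stays one level deep, and the function names document what each level builds.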
Dict comprehension
words = ["apple", "banana", "cherry"]
lengths = {w: len(w) for w in words}
# {'apple': 5, 'banana': 6, 'cherry': 6}
{key: value for variable in iterable}. Same syntax as list comprehension, but with key: value and curly braces.
With a condition:
scores = {"Alice": 85, "Bob": 42, "Charlie": 91}
passing = {k: v for k, v in scores.items() if v >= 60}
# {'Alice': 85, 'Charlie': 91}
Iterating .items() gives (key, value) pairs — useful for filtering or transforming an existing dict.
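The filtering case is shown above. A transforming case might look like the sketch below, where the 5-point curve capped at 100 is an invented rule used only for illustration:

```python
scores = {"Alice": 85, "Bob": 42, "Charlie": 91}

# Keep every key, but compute a new value for each one.
# The curve (+5, capped at 100) is a hypothetical rule for this example.
curved = {name: min(score + 5, 100) for name, score in scores.items()}
print(curved)  # {'Alice': 90, 'Bob': 47, 'Charlie': 96}
```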
Inverting a dict:
ages = {"Alice": 30, "Bob": 25}
by_age = {age: name for name, age in ages.items()}
# {30: 'Alice', 25: 'Bob'}
(Caveat: if values aren't unique, later entries overwrite earlier ones, so the inverted dict loses entries.)
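To make the caveat concrete, here is a sketch with a duplicated age (the extra "Carol" entry is added for illustration), followed by one way to group names per age instead of overwriting them:

```python
ages = {"Alice": 30, "Bob": 25, "Carol": 30}

# Plain inversion: the later entry wins, so 'Alice' is silently replaced.
by_age = {age: name for name, age in ages.items()}
print(by_age)  # {30: 'Carol', 25: 'Bob'}

# Grouping keeps every name: one list of names per distinct age.
grouped = {
    age: [name for name, a in ages.items() if a == age]
    for age in set(ages.values())
}
print(grouped[30])  # ['Alice', 'Carol']
```

The grouping version rescans the dict once per distinct age, which is fine for small data. For large inputs, a regular loop that appends to lists is likely the better fit.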
Set comprehension
names = ["Alice", "Bob", "Alice", "Charlie", "Bob"]
unique = {n.lower() for n in names}
# {'alice', 'bob', 'charlie'}
Curly braces, but no : — that's what distinguishes it from a dict comprehension. Automatically deduplicates.
For "unique values from this transformation," set comprehension is perfect.
Generator expression
gen = (x ** 2 for x in range(5))
print(type(gen)) # <class 'generator'>
next(gen) # 0
next(gen) # 1
list(gen) # [4, 9, 16] — the rest
Parentheses instead of brackets. Looks like a tuple comprehension, but Python doesn't have those — (...) produces a generator, which is lazy.
Lazy means: nothing is computed until iterated. Memory: O(1) instead of O(n).
# Sum the squares of 0 through 999,999 WITHOUT building a million-element list
total = sum(x ** 2 for x in range(1_000_000))
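Laziness can be observed directly. The sketch below records every call in a `calls` list (a helper introduced only for this demonstration) to show that creating the generator runs nothing until values are requested:

```python
calls = []


def square(x):
    calls.append(x)  # record each time the function actually runs
    return x * x


lazy = (square(x) for x in range(5))
print(calls)          # [] because nothing has been computed yet

first = next(lazy)
print(first, calls)   # 0 [0]

rest = list(lazy)
print(rest, calls)    # [1, 4, 9, 16] [0, 1, 2, 3, 4]
```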
When a generator expression is the only argument to a function call (like sum, max, any), its own parentheses can be omitted:
total = sum(x ** 2 for x in range(10)) # parens optional
mx = max(len(s) for s in strings)
For pipelines — sum, min, max, any, all, join — generator expressions are the right choice.
list vs generator: when to choose which
| Use list comprehension when | Use generator expression when |
|---|---|
| You need to iterate multiple times | You only iterate once |
| You need indexing (result[5]) | Used for accumulation (sum, max, etc.) |
| Result is small | Result is huge or infinite |
| You'll filter/sort the result later | Memory matters |
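# All in memory: squares = [x*x for x in range(10**7)]  (roughly 80 MB for the list's pointers alone on 64-bit CPython, plus the int objects)
# One at a time: gen = (x*x for x in range(10**7))  (a small, constant-size object)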
Default to list for short results; switch to generator when memory or laziness matters.
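A rough way to see the difference is `sys.getsizeof`. This is only a sketch: it measures the container object itself, not the integers it refers to, and the exact numbers depend on the Python build, so none are shown here.

```python
import sys

n = 100_000
as_list = [x * x for x in range(n)]
as_gen = (x * x for x in range(n))

# getsizeof reports the container only, not the int objects it points to.
list_bytes = sys.getsizeof(as_list)  # grows with n
gen_bytes = sys.getsizeof(as_gen)    # small, and does not depend on n

print(f"list: {list_bytes} bytes, generator: {gen_bytes} bytes")
```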
A word analyzer
sentences = [
    "Python is a great language",
    "List comprehensions are powerful",
]
# Flatten all words
words = [w for s in sentences for w in s.split()]
# Word → length dict (deduplicated)
word_lengths = {w: len(w) for w in sorted(set(words))}
# Unique first letters
first_letters = {w[0].upper() for w in words}
# Stats with generator expressions
avg = sum(len(w) for w in words) / len(words)
longest = max(words, key=len)
All four comprehension forms in one example. Each does one transformation, no accumulator variables.
Comprehensions vs map/filter
# These do the same thing:
squared_a = [x ** 2 for x in nums]
squared_b = list(map(lambda x: x ** 2, nums))
passing_a = [s for s in scores if s >= 60]
passing_b = list(filter(lambda s: s >= 60, scores))
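Comprehensions are usually considered more Pythonic. They also tend to be faster than map/filter with a lambda, because the expression is evaluated inline instead of through a function call per element.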
map/filter win only when the function is already a built-in or pre-defined name:
prices = list(map(float, ["1.50", "3.99"])) # cleaner than [float(p) for p in prices]
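The snippets above assume `nums` and `scores` already exist. A self-contained sketch with made-up sample values also shows the case where a single comprehension replaces a chained map and filter:

```python
# Sample data, invented for this example.
nums = [1, 2, 3, 4]
scores = [45, 78, 92, 33, 88]

squared_a = [x ** 2 for x in nums]
squared_b = list(map(lambda x: x ** 2, nums))

passing_a = [s for s in scores if s >= 60]
passing_b = list(filter(lambda s: s >= 60, scores))

# Filter and transform together: one comprehension vs. map wrapped around filter.
even_squares_a = [x ** 2 for x in nums if x % 2 == 0]
even_squares_b = list(map(lambda x: x ** 2, filter(lambda x: x % 2 == 0, nums)))

print(squared_a, passing_a, even_squares_a)
# [1, 4, 9, 16] [78, 92, 88] [4, 16]
```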
Walrus operator (3.8+)
data = [parsed for line in lines if (parsed := parse(line)) is not None]
The walrus := assigns inside an expression — useful when you want to use a computed value in both the condition and the result. Rarely needed in comprehensions; if you reach for it, consider whether a regular loop is clearer.
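A runnable sketch of the pattern, assuming a hypothetical `parse` helper that returns an int or None (the lesson does not define `parse`, so this one is made up for illustration; requires Python 3.8+):

```python
def parse(line):
    """Hypothetical parser: return an int, or None if the line is not a number."""
    try:
        return int(line)
    except ValueError:
        return None


lines = ["10", "oops", " 7 ", "", "-3"]

# parse() runs once per line; the walrus keeps its result for the output expression.
data = [parsed for line in lines if (parsed := parse(line)) is not None]
print(data)  # [10, 7, -3]
```

Without the walrus, the comprehension would have to call `parse(line)` twice per line, once in the condition and once in the result.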
Performance
import timeit
t1 = timeit.timeit("[x*2 for x in range(1000)]", number=10000)
t2 = timeit.timeit("""
result = []
for x in range(1000):
    result.append(x*2)
""", number=10000)
print(t1, t2) # comprehension often ~30% faster; the margin varies by Python version and machine
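Comprehensions are optimized at the bytecode level: CPython appends each result with a dedicated instruction instead of looking up and calling result.append on every iteration. That usually makes them faster than the equivalent loop, and faster than map with a lambda.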
For huge data, generators avoid building the intermediate list at all.
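The map+lambda comparison can be measured the same way. This is a sketch only: the timings depend on the interpreter version and the machine, so no expected numbers are given.

```python
import timeit

setup = "nums = list(range(1000))"

# Same work expressed two ways; only the per-element mechanics differ.
t_comp = timeit.timeit("[x * 2 for x in nums]", setup=setup, number=2000)
t_map = timeit.timeit("list(map(lambda x: x * 2, nums))", setup=setup, number=2000)

print(f"comprehension: {t_comp:.3f}s  map+lambda: {t_map:.3f}s")
```

Run it a few times before drawing conclusions, since single measurements are noisy.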
Common stumbles
Comprehension that should be a generator. sum([x**2 for x in big]) builds a list first. sum(x**2 for x in big) doesn't. Drop the brackets.
Empty {} is a dict. {x for x in []} is an empty set; {} alone is an empty dict. Use set() for an empty set.
Multiple loops vs nested. [x*y for x in xs for y in ys] is the product — every x with every y. If you wanted parallel iteration, use zip: [x*y for x, y in zip(xs, ys)].
Generator exhaustion. A generator can only be iterated once. After that it's empty. If you need to iterate twice, use a list.
Mutating during iteration. A comprehension evaluates its iterable expression once, but it still iterates over the live object, so changing that object mid-loop is unsafe. [d.pop(k) for k in d] resizes the dict being iterated and raises RuntimeError. Snapshot the keys first: [d.pop(k) for k in list(d)].
Reading too clever. Three-level nested comprehension with conditionals → write a function. Cleverness is not a virtue.
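Several of these stumbles are easy to reproduce. The sketch below uses small invented values to show each one:

```python
# Generator exhaustion: the second pass sees nothing.
gen = (x * 2 for x in range(3))
first_pass = list(gen)    # [0, 2, 4]
second_pass = list(gen)   # []

# Two for clauses give the product; zip gives parallel iteration.
xs, ys = [1, 2, 3], [10, 20, 30]
product = [x * y for x in xs for y in ys]    # 9 results
pairwise = [x * y for x, y in zip(xs, ys)]   # [10, 40, 90]

# Empty braces are a dict, not a set.
empty_dict, empty_set = {}, set()

# Popping while iterating: snapshot the keys with list(d) first.
d = {"a": 1, "b": 2}
popped = [d.pop(k) for k in list(d)]         # [1, 2]; d is now empty

print(first_pass, second_pass, len(product), pairwise, popped, d)
```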
What's next
Lesson 15: file handling. open(), read, write, the with statement.
Recap
Four comprehension flavors: list [expr for x in xs], dict {k: v for ...}, set {expr for ...}, generator (expr for ...) (lazy). Filter with if condition. Transform with expr_a if cond else expr_b. Nested for clauses iterate left-to-right. Prefer over map/filter + lambda. Use generators for huge or one-shot iterations. Skip nesting more than two levels — extract a function.