Iterators & Generators (yield, Lazy Evaluation, Data Pipelines) - Python Tutorial for Beginners #24

Sandy Lane

Video: Iterators & Generators (yield, Lazy Evaluation, Data Pipelines) - Python Tutorial for Beginners #24, taught by Celeste AI - AI Coding Coach

Python Iterators and Generators: iter, next, yield

An iterator is anything with __iter__ and __next__. A generator is a function that uses yield instead of return — it pauses, hands back a value, and resumes on the next call. Generators are lazy: they produce values on demand, not all at once.

When you write for x in something, Python is calling iter(something) and then next() repeatedly. Understanding the protocol lets you build your own.

The iterator protocol

An iterator has two methods:

  • __iter__(self) — returns the iterator (usually return self).
  • __next__(self) — returns the next value, or raises StopIteration when done.
class Countdown:
  def __init__(self, start):
    self.start = start

  def __iter__(self):
    self.current = self.start
    return self

  def __next__(self):
    if self.current < 1:
      raise StopIteration
    value = self.current
    self.current -= 1
    return value

for num in Countdown(5):
  print(num)
# 5 4 3 2 1

The for loop calls iter(Countdown(5)) (which calls __iter__), then next() repeatedly until StopIteration.
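As a rough sketch of what the for loop does behind the scenes, the same steps can be written by hand with iter() and next(). The helper name manual_for is made up for this illustration, and a plain list stands in for Countdown so the block runs on its own.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using the protocol directly."""
    results = []
    it = iter(iterable)          # step 1: ask the iterable for an iterator
    while True:
        try:
            item = next(it)      # step 2: request the next value
        except StopIteration:    # step 3: the iterator signals it is done
            break
        results.append(item)     # stands in for the loop body
    return results

print(manual_for([5, 4, 3, 2, 1]))   # [5, 4, 3, 2, 1]
```

The for statement handles the StopIteration for you, which is why you never see that exception in an ordinary loop.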

Manual iteration

counter = Countdown(3)
it = iter(counter)
print(next(it))    # 3
print(next(it))    # 2
print(next(it))    # 1
print(next(it))    # raises StopIteration

iter() and next() are how for works under the hood. Useful for "peeking" or stepwise consumption.

Iterables vs iterators

  • Iterable — has __iter__. Can produce an iterator. Lists, tuples, dicts, strings, ranges, files.
  • Iterator — has both __iter__ and __next__. Tracks position; gets exhausted.
nums = [1, 2, 3]    # iterable, not iterator
it = iter(nums)     # iterator
print(next(it))    # 1

# Lists can be iterated multiple times:
for x in nums: print(x)
for x in nums: print(x)   # works again

# Iterators get exhausted:
for x in it: print(x)
for x in it: print(x)   # nothing — already consumed

A list is iterable but not itself an iterator. Each for loop creates a fresh iterator from it.
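The difference can be checked directly. This small sketch (variable names are illustrative) shows that every iter() call on a list gives a new, independent iterator, while iter() on an iterator just hands back the same object.

```python
nums = [1, 2, 3]

first = iter(nums)
second = iter(nums)

# Each iter() call on a list returns a new, independent iterator.
print(first is second)        # False
print(next(first))            # 1
print(next(second))           # 1  (second starts from the beginning)

# Calling iter() on an iterator returns the same object.
print(iter(first) is first)   # True

# next() needs an iterator; a list on its own is not one.
try:
    next(nums)
except TypeError as exc:
    print("TypeError:", exc)
```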

Generators: yield does it for you

Writing __iter__ + __next__ is verbose. Generators are easier:

def countdown(n):
  while n > 0:
    yield n
    n -= 1

for num in countdown(5):
  print(num)
# 5 4 3 2 1

yield is the magic. Calling countdown(5) doesn't run the function — it returns a generator object. Each next() runs until the next yield, then pauses.

The function's local variables are preserved between yields. When the function returns (or hits the end), StopIteration is raised automatically.
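To see the pausing for yourself, you can step through a generator by hand. The sketch below is a variant of countdown with a print and a return value added purely for illustration. A value given to return inside a generator is attached to the StopIteration exception as its value attribute.

```python
def countdown(n):
    print("started")          # runs on the first next(), not at call time
    while n > 0:
        yield n               # pause here; n is kept until the next resume
        n -= 1
    return "liftoff"          # becomes StopIteration.value

gen = countdown(2)            # nothing is printed yet
print(next(gen))              # prints "started", then 2
print(next(gen))              # 1
try:
    next(gen)
except StopIteration as stop:
    print(stop.value)         # liftoff
```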

Why generators are cheaper

import sys

nums_list = [x * x for x in range(1000)]
nums_gen = (x * x for x in range(1000))

print(sys.getsizeof(nums_list))   # ~8000 bytes
print(sys.getsizeof(nums_gen))    # ~100 bytes

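A list builds every value up front. A generator stores only its current state and computes each value on demand, so its size stays the same no matter how many items it will produce. For a million items, that is megabytes for the list versus a small constant for the generator. Note that sys.getsizeof counts only the list's own array of references, not the integers it points to, so the list's real footprint is larger still.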

For huge sequences or infinite streams, generators are essential.
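A quick way to convince yourself is to measure a few sizes. This is a sketch only: the exact byte counts depend on your Python version and platform, so none are shown here.

```python
import sys

for n in (10, 1_000, 100_000):
    as_list = [x * x for x in range(n)]
    as_gen = (x * x for x in range(n))
    # The list grows with n; the generator object does not.
    print(n, sys.getsizeof(as_list), sys.getsizeof(as_gen))

# Aggregates such as sum() can consume a generator directly,
# so the million squares never exist in memory at the same time.
total = sum(x * x for x in range(1_000_000))
print(total)
```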

Fibonacci as a generator

def fibonacci(limit):
  a, b = 0, 1
  while a <= limit:
    yield a
    a, b = b, a + b

fibs = list(fibonacci(100))
print(fibs)   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]

The state — a, b — lives across yields. The generator pauses with the variables intact.

For an infinite sequence:

def naturals():
  n = 0
  while True:
    yield n
    n += 1

# Take the first 5
from itertools import islice
print(list(islice(naturals(), 5)))   # [0, 1, 2, 3, 4]

Don't call list(naturals()): it never finishes. Always cap an infinite generator with islice, takewhile, or a break condition.
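The capping options can be sketched side by side. naturals() is repeated here so the block runs on its own; the cut-off values are arbitrary.

```python
from itertools import islice, takewhile

def naturals():
    n = 0
    while True:
        yield n
        n += 1

# Cap by position: take the items at positions 10 through 14.
print(list(islice(naturals(), 10, 15)))
# [10, 11, 12, 13, 14]

# Cap by condition: stop at the first value that fails the test.
print(list(takewhile(lambda n: n * n < 50, naturals())))
# [0, 1, 2, 3, 4, 5, 6, 7]

# Cap with an explicit break inside a loop.
seen = []
for n in naturals():
    if n > 3:
        break
    seen.append(n)
print(seen)
# [0, 1, 2, 3]
```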

Generator expressions (recap)

gen = (x * 2 for x in range(10))
print(next(gen))   # 0
print(next(gen))   # 2

Same as a list comprehension but with () instead of []. Lazy. Covered in lesson 14.

Pipelines

Generators chain naturally:

def numbers(n):
  for i in range(1, n + 1):
    yield i

def double(nums):
  for n in nums:
    yield n * 2

def keep_even(nums):
  for n in nums:
    if n % 2 == 0:
      yield n

result = list(keep_even(double(numbers(10))))
print(result)
# [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]   (every doubled value is already even)

Each stage processes one value at a time, with no intermediate lists. The pipeline itself uses O(1) memory regardless of input size; only the final list() call stores all the results.

This is the core idea of streaming pipelines — the same pattern Unix uses with pipes (|).
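To check that values really flow through one at a time, you can record the order in which the stages run. In this sketch the log parameter is an addition for illustration only; it is not part of the pipeline above.

```python
def numbers(n, log):
    for i in range(1, n + 1):
        log.append(f"produce {i}")
        yield i

def double(nums, log):
    for n in nums:
        log.append(f"double {n}")
        yield n * 2

log = []
pipeline = double(numbers(3, log), log)
print(log)              # [] : building the pipeline runs nothing

result = list(pipeline)
print(result)           # [2, 4, 6]
print(log)
# ['produce 1', 'double 1', 'produce 2', 'double 2', 'produce 3', 'double 3']
```

The interleaved log shows that each value passes through every stage before the next value is produced.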

yield from

def inner():
  yield 1
  yield 2

def outer():
  yield "start"
  yield from inner()
  yield "end"

print(list(outer()))
# ['start', 1, 2, 'end']

yield from inner() is shorthand for "yield each value from this iterable in turn." For plain iteration it is equivalent to for x in inner(): yield x.

Useful for delegating to sub-generators or for flattening nested generators.
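As one possible illustration of flattening, here is a small recursive generator. The name flatten and the choice to treat only lists as nested are assumptions made for this sketch.

```python
def flatten(items):
    """Yield the leaf values of arbitrarily nested lists, depth first."""
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)   # delegate to a sub-generator
        else:
            yield item

print(list(flatten([1, [2, [3, 4]], 5])))   # [1, 2, 3, 4, 5]
```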

Sending values into a generator

def echo():
  while True:
    x = yield
    print(f"got {x}")

g = echo()
next(g)             # prime
g.send("hello")     # got hello
g.send(42)          # got 42

g.send(value) resumes the generator with value as the result of the yield expression. This makes generators full coroutines — they can both produce and consume.

In practice, modern Python uses async/await for coroutines (lesson 31). .send() is rarely used directly.
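For completeness, here is a sketch of a generator that both consumes and produces: each send() passes in a number and gets the running average back. The name running_average is hypothetical.

```python
def running_average():
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average     # hand back the average, wait for a value
        total += value
        count += 1
        average = total / count

avg = running_average()
next(avg)                 # prime: run to the first yield
print(avg.send(10))       # 10.0
print(avg.send(20))       # 15.0
print(avg.send(30))       # 20.0
```

The priming next() call is needed because send() can only deliver a value to a generator that is already paused at a yield.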

Generator cleanup

def session():
  print("opening")
  try:
    yield "data"
  finally:
    print("closing")

for x in session():
  print(x)
# opening
# data
# closing

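If the generator is fully consumed, finally runs when the function body finishes. If it is abandoned partway, finally runs when the generator is closed: either you call .close() yourself, or Python calls it when the generator object is garbage-collected. close() raises GeneratorExit at the paused yield, which lets finally clean up.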

This pattern is the foundation of contextlib.contextmanager (lesson 25).
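The abandoned case can be sketched by closing a generator explicitly before it finishes. The events list is an addition for illustration, used here instead of print so the order of steps can be inspected.

```python
def session(events):
    events.append("opening")
    try:
        yield "data"
        yield "more data"
    finally:
        events.append("closing")

events = []
gen = session(events)
print(next(gen))    # data
gen.close()         # raises GeneratorExit at the paused yield
print(events)       # ['opening', 'closing']
```

Calling close() yourself makes the cleanup point explicit instead of leaving it to garbage collection.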

itertools: a toolbox

The itertools module has dozens of generator-based utilities:

from itertools import count, cycle, repeat, chain, islice, accumulate, groupby, takewhile, combinations, permutations, product

# Infinite counter
list(islice(count(10), 5))     # [10, 11, 12, 13, 14]

# Cycle through values
list(islice(cycle("ABC"), 7))   # ['A', 'B', 'C', 'A', 'B', 'C', 'A']

# Chain multiple iterables
list(chain([1, 2], [3, 4]))     # [1, 2, 3, 4]

# Running totals
list(accumulate([1, 2, 3, 4]))  # [1, 3, 6, 10]

# Take while predicate true
list(takewhile(lambda x: x < 5, range(10)))   # [0, 1, 2, 3, 4]

# Combinations and permutations
list(combinations([1,2,3], 2))   # [(1,2), (1,3), (2,3)]
list(permutations([1,2,3], 2))   # [(1,2), (1,3), (2,1), (2,3), (3,1), (3,2)]

Whenever you find yourself writing nested loops or accumulator code, check itertools first.
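Two examples of that advice, as a sketch with made-up sample data: product() replaces a pair of nested loops, and groupby() replaces hand-written grouping code.

```python
from itertools import groupby, product

# Nested loops...
pairs_nested = []
for size in ["S", "M"]:
    for color in ["red", "blue"]:
        pairs_nested.append((size, color))

# ...and the same result with product().
pairs_product = list(product(["S", "M"], ["red", "blue"]))
print(pairs_product == pairs_nested)   # True

# groupby() groups consecutive items that share a key, so sort by that key first.
words = sorted(["kiwi", "fig", "plum", "pear", "yam"], key=len)
by_length = {length: list(group) for length, group in groupby(words, key=len)}
print(by_length)   # {3: ['fig', 'yam'], 4: ['kiwi', 'plum', 'pear']}
```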

Common stumbles

Re-iterating an exhausted generator. Once consumed, it's done. Call the function again to get a fresh one.

Forgetting iter(obj) for the iterator. Looping a list directly works (Python calls iter for you), but if you need manual next(), call iter() first.

list() of an infinite generator. Hangs forever. Use islice or a cap.

Using return instead of yield. A function that only uses return is an ordinary function, not a generator. A generator function can contain both: yield produces a value, while return ends the generator (StopIteration is raised for the caller).

yield inside a list comprehension. Comprehensions can't contain yield. Use a generator function instead.

Side effects in generators. They run at iteration time, not creation time. gen = my_generator() doesn't execute anything yet — surprising if you expect file-opens or DB calls to happen up front.
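Two of these stumbles, exhaustion and delayed side effects, are easy to reproduce. In this sketch load_rows is a hypothetical generator, and appending to log stands in for opening a file or database connection.

```python
def load_rows(log):
    log.append("connecting")      # stand-in for a file open or DB call
    yield "row 1"
    yield "row 2"

log = []
rows = load_rows(log)
print(log)              # [] : creating the generator ran no code

print(list(rows))       # ['row 1', 'row 2']
print(log)              # ['connecting']

print(list(rows))       # [] : the generator is exhausted
rows = load_rows(log)   # call the function again for a fresh one
print(next(rows))       # row 1
```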

What's next

Lesson 25: context managers. with statement, __enter__/__exit__, @contextmanager.

Recap

Iterator protocol: __iter__ returns the iterator, __next__ returns the next value or raises StopIteration. Generators are functions that use yield — Python builds the iterator for you. Lazy evaluation: O(1) memory. Pipelines via chained generators. yield from delegates to sub-generators. itertools has reusable building blocks.

Next lesson: context managers.
