Build AI Apps with Python: The ReAct Agent Pattern — Think Act Observe | Episode 18
Video: Build AI Apps with Python: The ReAct Agent Pattern — Think Act Observe | Episode 18 by Taught by Celeste AI - AI Coding Coach
Student code: github.com/GoCelesteAI/build-ai-apps-python/tree/main/episode18
One name. Three steps. The pattern most "AI agents" you read about are running.
There's a name for the loop you built in Episodes 10 and 11: the ReAct pattern. It comes from a research paper that proposed interleaving reasoning (think) with action (call a tool) and observation (process the tool's result), then repeating. The name is a contraction of "Reason + Act."
You've already got the mechanics. What this episode adds is the language — and a sharper sense of why this loop is the right shape for general-purpose agents.
We'll also tighten one detail. In the previous agent loops, Claude didn't always write why it was making each tool call. ReAct works best when the model produces visible reasoning between actions: "I need the weather in New York. I'll call lookup_weather." That visible thinking helps you debug the agent and helps the model itself stay on plan.
Reason, Act, Observe — what each one is
A ReAct iteration has three stages.
Reason. The model decides what to do next. "I need the weather in two cities and then a comparison. Let me look up New York first." This is internal-but-visible: Claude writes a thought before the action. We display it.
Act. The model takes one action by calling a tool. "Call lookup_weather(city='New York')."
Observe. The action returns a result. "72F, sunny." This goes back into the conversation history.
Then loop. Reason → Act → Observe → Reason again, with the new observation in context. The model uses each observation to refine its plan. The loop ends when reasoning concludes the task is complete and the model writes a final answer.
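The three stages can be sketched without any API call. This is a minimal illustration, not the episode's code: scripted_policy, lookup_weather, and react are hypothetical names, and the policy is a hard-coded stand-in for the model's reasoning.

```python
def scripted_policy(history):
    """Hypothetical stand-in for the model: return (thought, action).

    action is None once the policy decides the task is complete.
    """
    observations = [text for kind, text in history if kind == "observe"]
    if not observations:
        return "I need the weather in New York.", ("lookup_weather", "New York")
    return f"New York is {observations[-1]}. I can answer now.", None


def lookup_weather(city):
    # Canned data for illustration only.
    return {"New York": "72F, sunny"}.get(city, "unknown")


def react(question, policy, tools):
    """Run Reason -> Act -> Observe until the policy returns no action."""
    history = [("question", question)]
    while True:
        thought, action = policy(history)       # Reason
        history.append(("think", thought))
        if action is None:                      # reasoning concluded: stop
            return history
        name, arg = action
        history.append(("act", f"{name}({arg!r})"))
        result = tools[name](arg)               # Act
        history.append(("observe", result))     # Observe, then loop again


trace = react(
    "What is the weather in New York?",
    scripted_policy,
    {"lookup_weather": lookup_weather},
)
for kind, text in trace:
    print(f"{kind}: {text}")
```

Each observation is appended to the history before the next reasoning step, which is what lets the policy (or, in the real agent, the model) refine its plan.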
What we're building
A small ReAct agent with two tools — lookup_weather(city) and calculate(expression) — and two test questions:
- "What is the weather in Tokyo?" — single-step.
- "What is the weather in New York and London? Which city is warmer and by how many degrees?" — multi-step: two lookups and an arithmetic computation.
The second question is the interesting one. The agent has to plan: look up New York, look up London, calculate the difference. Three tool calls in sequence. We don't tell it which tools to call or in what order. The model works that out inside the ReAct loop.
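For reference, here is one way the two tools and their dispatcher might be defined. This is a sketch under assumptions: the weather values are taken from the example output later in this post, the calculate body is the tutorial's eval-based one shown in the eval() section below, and the description strings, the TOOL_FUNCTIONS registry, and the body of run_tool are illustrative. The episode's repository may differ.

```python
# Canned weather data, matching the example traces in this post.
WEATHER = {
    "new york": "72F, sunny",
    "london": "58F, cloudy",
    "tokyo": "80F, humid",
}


def lookup_weather(city):
    return WEATHER.get(city.strip().lower(), f"No weather data for {city}")


def calculate(expression):
    # Tutorial-only: eval() is unsafe for untrusted input.
    try:
        return str(eval(expression))
    except Exception as e:
        return f"Error: {e}"


# Tool definitions in the name / description / input_schema shape
# that the Messages API expects for the `tools` parameter.
tools = [
    {
        "name": "lookup_weather",
        "description": "Get the current weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
    {
        "name": "calculate",
        "description": "Evaluate an arithmetic expression such as '72 - 58'.",
        "input_schema": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    },
]

# Hypothetical registry mapping tool names to Python functions.
TOOL_FUNCTIONS = {"lookup_weather": lookup_weather, "calculate": calculate}


def run_tool(name, tool_input):
    """Dispatch a tool call by name; always return a string."""
    func = TOOL_FUNCTIONS.get(name)
    if func is None:
        return f"Error: unknown tool {name}"
    try:
        return func(**tool_input)
    except TypeError as e:  # missing or unexpected arguments
        return f"Error: {e}"
```

Returning an error string instead of raising keeps the loop alive: the model sees the error as an observation and can try again.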
The agent
def agent(question):
    # Assumes client, tools, and run_tool are defined earlier in the script.
    messages = [{"role": "user", "content": question}]
    step = 1
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=500,
            system="You are a helpful assistant. Think step by step. Use the provided tools to find information. When you have the final answer, respond directly without using any tools.",
            tools=tools,
            messages=messages,
        )
        # No tool call requested: the text blocks are the final answer.
        if response.stop_reason == "end_turn":
            for block in response.content:
                if hasattr(block, "text"):
                    print(f"\nFinal Answer: {block.text}")
            break
        tool_results = []
        for block in response.content:
            # Reason: any text block is the model's visible thinking.
            if hasattr(block, "text") and block.text:
                print(f"\nStep {step} — Think: {block.text}")
            # Act and Observe: run the requested tool and record its result.
            if block.type == "tool_use":
                tool_name = block.name
                tool_input = block.input
                print(f"Step {step} — Act: {tool_name}({tool_input})")
                result = run_tool(tool_name, tool_input)
                print(f"Step {step} — Observe: {result}")
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": result,
                })
                step += 1
        if not tool_results:
            break  # e.g. cut off by max_tokens with no tool call; don't loop forever
        # Append the assistant turn once, then every tool result in one user turn.
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
Same loop as Episode 10, with three differences in the printing:
- We print every text block as Think:. Claude often writes a short reasoning sentence alongside the tool call (or before, in a separate block).
- We label tool calls as Act:.
- We label results as Observe:.
The labels are decorative but they make the agent's behaviour legible. When something goes wrong, you can read the trace and see exactly where the model lost the plot.
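To see why a single response can yield both a Think line and an Act line, it helps to look at the shape of the content list. The sketch below uses SimpleNamespace objects as stand-ins for the content blocks the loop reads (.type, .text, .name, .input, .id); the id value and the split_blocks helper are hypothetical, and real blocks come from the API response.

```python
from types import SimpleNamespace

# Stand-ins shaped like one response's content: a text block, then a tool call.
content = [
    SimpleNamespace(type="text", text="I'll look up the current weather in Tokyo."),
    SimpleNamespace(
        type="tool_use",
        id="toolu_example",  # made-up id for illustration
        name="lookup_weather",
        input={"city": "Tokyo"},
    ),
]


def split_blocks(content):
    """Separate visible reasoning from tool calls within one response."""
    thoughts = [b.text for b in content if b.type == "text" and b.text]
    calls = [(b.id, b.name, b.input) for b in content if b.type == "tool_use"]
    return thoughts, calls


thoughts, calls = split_blocks(content)
for thought in thoughts:
    print(f"Think: {thought}")
for _id, name, args in calls:
    print(f"Act: {name}({args})")
```

Keeping the thoughts as a list also makes them easy to write to a log alongside the tool calls.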
The system prompt that enables ReAct
system="You are a helpful assistant. Think step by step. Use the provided tools to find information. When you have the final answer, respond directly without using any tools."
Two important phrases.
"Think step by step." This is the phrase that does the work. It encourages Claude to write its reasoning out loud rather than jumping straight to a tool call. The "out loud" reasoning isn't just for our benefit: research on chain-of-thought prompting, and our own observation, suggest that models do better on multi-step tasks when they write out their reasoning first. The reasoning tokens become part of the context the model conditions on when it chooses its next action.
"When you have the final answer, respond directly without using any tools." This tells the model the exit condition. Without it, an agent might keep calling tools out of confusion. With it, Claude knows to stop the loop by writing prose.
This system prompt is a template. You'll write variations of it for every agent in the rest of the series.
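One way to treat the prompt as a template is to keep the two load-bearing phrases fixed and vary the rest. The helper below is a hypothetical sketch (build_system_prompt and its parameters are not from the episode), and the variation in the usage example is invented for illustration.

```python
def build_system_prompt(
    role="You are a helpful assistant.",
    tool_hint="Use the provided tools to find information.",
):
    """Compose a ReAct-style system prompt from a role and a tool hint."""
    parts = [
        role,
        "Think step by step.",  # asks for visible reasoning
        tool_hint,
        # Exit condition: tells the model how to end the loop.
        "When you have the final answer, respond directly without using any tools.",
    ]
    return " ".join(parts)


# Default arguments reproduce this episode's prompt.
print(build_system_prompt())

# An illustrative variation for a different agent.
print(build_system_prompt(
    role="You are a research assistant.",
    tool_hint="Use the provided tools to look up facts before answering.",
))
```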
Watching it run
Single-step:
Question: What is the weather in Tokyo?
==================================================
Step 1 — Think: I'll look up the current weather in Tokyo.
Step 1 — Act: lookup_weather({'city': 'Tokyo'})
Step 1 — Observe: 80F, humid
Final Answer: The weather in Tokyo is 80F and humid.
One iteration. Think → Act → Observe → Final Answer. The agent reasoned (briefly), called the tool, and answered.
Multi-step:
Question: What is the weather in New York and London? Which city is warmer and by how many degrees?
==================================================
Step 1 — Think: Let me look up the weather in both New York and London.
Step 1 — Act: lookup_weather({'city': 'New York'})
Step 1 — Observe: 72F, sunny
Step 2 — Think: Now London.
Step 2 — Act: lookup_weather({'city': 'London'})
Step 2 — Observe: 58F, cloudy
Step 3 — Think: New York is 72F, London is 58F. The difference is 72 - 58.
Step 3 — Act: calculate({'expression': '72 - 58'})
Step 3 — Observe: 14
Final Answer: New York is 72F and sunny while London is 58F and cloudy. New York is 14 degrees warmer than London.
Three iterations. The agent decomposed the question without being told. Look up A. Look up B. Compute the difference. Combine into an answer.
This is what people mean when they say "AI agent." A model that, given a high-level goal and a set of capabilities, decides what to do step by step.
ReAct in production
Almost every general-purpose AI agent uses some flavour of ReAct. Take a request like "Read this email, draft a reply, send it." The agent looks at the email (act/observe), plans the reply (reason), drafts it (act), and sends it (act). Done.
The loop's strength is its generality. It doesn't care what the tools are. File system, web search, database, internal APIs — same loop, different tools. The model's job is the same: think about the next step, take it, observe the result, repeat.
The loop's weakness is also its generality. There's no built-in plan the model commits to ahead of time, so a long task can drift. Mid-loop, the model might forget an earlier sub-goal or invent a new one. Production agents add structure: explicit planning steps before execution, sub-task tracking, periodic re-grounding. We won't go that deep in this series, but be aware: ReAct is a good default and a starting point, not a final answer.
The eval() problem
Look at this tool:
def calculate(expression):
try:
return str(eval(expression))
except Exception as e:
return f"Error: {e}"
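eval() runs arbitrary Python. If the model passes expression="__import__('os').system('rm -rf /')", you have a very bad day. For a tutorial we tolerate this; in production, never use eval() for user-or-model-controlled input. Use a restricted expression evaluator instead, for example one that parses the input with the ast module and allows only numeric literals and arithmetic operators. Note that ast.literal_eval on its own is not a calculator: it accepts literals only and rejects an expression like '72 - 58' on current Python versions. Libraries such as sympy or numexpr can evaluate expressions, but check their documentation on untrusted input first; sympy's sympify, for one, relies on eval internally.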
Tool design has to assume the model can pass anything. Even when the model behaves well, the schema and the implementation should make misuse impossible. We covered this from a different angle in Episode 11; it returns here because giving the model a calculate tool is exactly the moment people accidentally turn eval loose.
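A restricted evaluator along those lines might look like the sketch below. It is one possible approach, not the episode's code: safe_calculate, the allowed-operator tables, and the 200-character limit are all assumptions. It parses the expression with ast.parse and evaluates only numeric constants and a small set of arithmetic operators, rejecting everything else.

```python
import ast
import operator

# Allowed operators. Exponentiation is left out on purpose to avoid huge results.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _eval_node(node):
    # Plain int/float literals only (type() check excludes bool).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_eval_node(node.operand))
    # Names, calls, attributes, subscripts, etc. are all rejected.
    raise ValueError("unsupported expression")


def safe_calculate(expression):
    """Evaluate basic arithmetic; return the result or an error as a string."""
    if len(expression) > 200:  # assumed limit to keep inputs small
        return "Error: expression too long"
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_eval_node(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError, RecursionError) as e:
        return f"Error: {e}"


print(safe_calculate("72 - 58"))
print(safe_calculate("__import__('os').system('echo hi')"))
```

The first call prints 14; the second prints an error string, because a function call is not one of the allowed node types and nothing is ever executed.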
Common mistakes
No "think step by step" in the system prompt. The agent jumps to actions without visible reasoning, which works for simple tasks but fails on multi-step ones. Add the phrase.
Step counter without a max. Always have a hard cap on iterations. ReAct can loop indefinitely on a buggy tool.
Treating Think and Act as separate API calls. They're not. One API call returns multiple content blocks. The model decides whether to write a thought, call a tool, or both.
Ignoring the thought blocks. The reasoning is information. Print it. Log it. It tells you whether the model is on the right track.
Mixing ReAct with stateless RAG without context. If your agent has both tools and RAG, give it instructions about when to retrieve vs. when to call a tool. Otherwise it'll get confused.
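The hard cap on iterations mentioned above can be sketched as a bounded for loop in place of while True. This is an illustration only: run_capped, MAX_STEPS, and step_fn are hypothetical, with step_fn standing in for one full Reason/Act/Observe iteration that returns None to continue or a string when it has a final answer.

```python
MAX_STEPS = 10  # assumed limit; tune it to the task


def run_capped(step_fn, max_steps=MAX_STEPS):
    """Call step_fn(step) until it returns an answer or the cap is reached."""
    for step in range(1, max_steps + 1):
        answer = step_fn(step)
        if answer is not None:
            return answer
    # The cap was hit: report it instead of looping forever.
    return f"Stopped after {max_steps} steps without a final answer."


# A run that never finishes, as with a buggy tool.
print(run_capped(lambda step: None, max_steps=3))

# A healthy run that finishes on step 2.
print(run_capped(lambda step: "done" if step == 2 else None))
```

The first call prints "Stopped after 3 steps without a final answer." and the second prints "done".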
What's next
Next episode: agent with memory. ReAct gives the agent reasoning. Memory gives it persistence. We'll add tools that read and write JSON files, so the agent can save research notes during one session and recall them in a later one. Run the script several times and each run can build on the notes saved by the earlier ones.
Recap
What we did today. Wrote a ReAct agent — the same agent loop from Episode 10, now framed explicitly as Think / Act / Observe. Used a system prompt that asked Claude to reason step by step and to stop using tools when the answer is ready. Watched the agent solve a single-step question (one tool call) and a multi-step question (three tool calls planned and executed without instruction).
You haven't built a new agent. You've named the loop, sharpened the prompt, and made the agent's reasoning visible. From now on you'll see the ReAct shape everywhere.
Next episode: agent with memory. See you in the next one.