Part of Python AI Tutorial Series

Build AI Apps with Python: AI Agent Loop — Chain Multiple Tools | Episode 10

Celest Kim

Video: Episode 10 (AI Agent Loop, Chain Multiple Tools), taught by Celeste AI - AI Coding Coach


Student code: github.com/GoCelesteAI/build-ai-apps-python/tree/main/episode10

One task. Many tool calls. Claude runs the loop.

Episode 9 had Claude make one tool call per question. "Create hello.txt" → write_file → done. Useful, but trivial. A real assistant has to handle requests like "create three notes files, then list them, then summarise their contents to disk." That's several tool calls in sequence, and some of them depend on the result of an earlier one.

The pattern that makes that possible is the agent loop. You stop making one round-trip per user request and start running a while True: that keeps cycling through call API → execute tool → return result until Claude says "I'm done — here's the final answer." Once you've written this loop once, you've written every AI agent.

The loop, in plain language

The tool-use dance from Episodes 7–9 was always:

  1. Send question + tool list
  2. Claude returns either text (done) or a tool_use block (call something)
  3. If it's a tool call: execute, send result, get final answer

Step 3 was always "get final answer." That's where we hardcoded the assumption of one tool call per question. The agent loop lifts that assumption:

  1. Send question + tool list
  2. Claude returns either text (done) or a tool_use block
  3. If it's a tool call: execute, send result, go back to step 2

That single change — looping back instead of breaking out — is the difference between "function calling" and "an agent." The model can take as many turns as it needs. It can make a tool call, see the result, decide a different tool is needed next, call that, and so on, until it has enough information to write a final answer.
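To see the loop-back in isolation, here is a minimal sketch that swaps the real API for a scripted stand-in. Everything in it (fake_model, run_loop, the add tool, the dictionary shapes) is hypothetical and only mimics the two outcomes described above: a tool request or a final text.

```python
# Hypothetical stand-in for the API: replies are scripted, not generated.
SCRIPT = [
    {"type": "tool_use", "name": "add", "input": {"a": 2, "b": 3}},
    {"type": "tool_use", "name": "add", "input": {"a": 5, "b": 10}},
    {"type": "text", "text": "The total is 15."},
]

# Hypothetical tool table: name -> Python function.
TOOLS = {"add": lambda a, b: a + b}


def fake_model(history):
    """Return the next scripted reply, based on how many tool results exist."""
    results_so_far = sum(1 for m in history if m["role"] == "tool")
    return SCRIPT[results_so_far]


def run_loop(question):
    history = [{"role": "user", "content": question}]
    while True:
        reply = fake_model(history)          # step 2: text or tool request
        if reply["type"] == "text":
            return reply["text"], history    # text means we are done
        result = TOOLS[reply["name"]](**reply["input"])  # step 3: execute
        history.append({"role": "tool", "content": result})
        # No break here: control returns to step 2.


answer, history = run_loop("Add 2 and 3, then add 10 to the result.")
print(answer)  # → The total is 15.
```

The only structural difference from a one-shot tool call is the missing break after the tool runs.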

What we're building

The same three file-system tools as Episode 9 — read, write, list — wrapped in an agent(task) function that runs the loop. We'll give it two prompts that require multiple steps:

  • "Create a file called notes.txt with three Python tips, then read it back to verify." — write, then read.
  • "List all files in the workspace, read each one, and write a summary.txt with their contents." — list, then read each, then write.

The second prompt is the interesting one. It's not just chained — it's unbounded at design time. Claude has to list the directory, see how many files there are, decide to read each one, then synthesise a summary. The number of tool calls depends on the directory's contents. We don't know it in advance. The loop handles it.
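The Episode 9 tools are not repeated in this post, so the following is only a sketch of what they might look like. The return shapes are inferred from the illustrative output further down; the function bodies and the tool_dispatch layout are assumptions, not the series' actual code.

```python
import os


def read_file(path):
    """Return the file's text, in the shape shown in the sample output."""
    with open(path, encoding="utf-8") as f:
        return {"path": path, "content": f.read()}


def write_file(path, content):
    """Write text to a file and report how much was written."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    # Assumption: "size" is the character count of the content.
    return {"path": path, "status": "written", "size": len(content)}


def list_directory(path="."):
    """List the entries in a directory, sorted for stable output."""
    items = sorted(os.listdir(path))
    return {"path": path, "items": items, "count": len(items)}


# Maps the tool name Claude asks for to the Python function that runs it.
tool_dispatch = {
    "read_file": read_file,
    "write_file": write_file,
    "list_directory": list_directory,
}
```

Each function returns a plain dict, which keeps the later json.dumps(result) call trivial.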

The loop

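import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
# tools (the schema list) and tool_dispatch (name -> function) are the ones from Episode 9.

def agent(task):
    messages = [{"role": "user", "content": task}]
    step = 1

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=512,
            tools=tools,
            system="You are a helpful file assistant. Respond in plain text. No markdown.",
            messages=messages,
        )

        if response.stop_reason == "tool_use":
            # Keep Claude's full turn in the history, then answer every tool_use block in it.
            messages.append({"role": "assistant", "content": response.content})
            tool_results = []
            for tool_block in response.content:
                if tool_block.type != "tool_use":
                    continue
                fn = tool_dispatch[tool_block.name]
                result = fn(**tool_block.input)
                print(f"  Step {step}: {tool_block.name}({tool_block.input})")
                print(f"  Result: {result}")
                tool_results.append(
                    {"type": "tool_result", "tool_use_id": tool_block.id, "content": json.dumps(result)}
                )
                step += 1
            # All results for this turn go back in a single user message.
            messages.append({"role": "user", "content": tool_results})
        else:
            # Any other stop reason ends the loop; print whatever text Claude returned.
            final_text = "".join(b.text for b in response.content if b.type == "text")
            print(f"\n  Final answer: {final_text}\n")
            break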

Three things to watch.

messages grows. Each iteration appends two messages: Claude's tool-use response and our tool-result follow-up. Just like the chat memory pattern from Episode 3 — every iteration sends the full history back. Claude sees its own previous tool calls and their results, which is how it decides what to do next.
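As a rough illustration of that growth, the sketch below builds the history by hand for two tool steps. The helper name record_tool_step, the toolu_... ids, and the result strings are made up for the example; only the message shapes follow the loop above.

```python
def record_tool_step(messages, assistant_content, tool_use_id, result_json):
    """Append the two messages one loop iteration adds to the history."""
    messages.append({"role": "assistant", "content": assistant_content})
    messages.append({
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": result_json,
        }],
    })
    return messages


messages = [{"role": "user", "content": "Create notes.txt, then read it back."}]

# Step 1: a write_file call and its result (ids and payloads are invented).
record_tool_step(
    messages,
    [{"type": "tool_use", "id": "toolu_01", "name": "write_file",
      "input": {"path": "notes.txt", "content": "tip"}}],
    "toolu_01",
    '{"status": "written"}',
)

# Step 2: a read_file call and its result.
record_tool_step(
    messages,
    [{"type": "tool_use", "id": "toolu_02", "name": "read_file",
      "input": {"path": "notes.txt"}}],
    "toolu_02",
    '{"content": "tip"}',
)

print([m["role"] for m in messages])
# → ['user', 'assistant', 'user', 'assistant', 'user']
```

After n tool steps the history holds 1 + 2n messages, and all of them are sent on the next call.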

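The exit condition is stop_reason != "tool_use". Claude stops the loop by not calling a tool. When it has enough information to answer, it writes plain text, the if condition is false, the else branch runs, and we break. Note that any other stop reason, including a response cut off at max_tokens, also lands in the else branch.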
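One refinement worth considering: with max_tokens=512, a long final answer can be cut off, and the API then reports stop_reason "max_tokens" rather than "end_turn". The helper below is a hypothetical sketch (next_action and its return labels are not from the lesson) of how the exit check could tell the cases apart.

```python
def next_action(stop_reason):
    """Map an API stop_reason to what the loop should do next.

    The return labels are invented for this sketch.
    """
    if stop_reason == "tool_use":
        return "run_tools"   # execute the tool(s) and loop again
    if stop_reason == "max_tokens":
        return "truncated"   # the answer was cut off; warn or raise the limit
    return "final"           # e.g. "end_turn": Claude finished on its own


for reason in ("tool_use", "max_tokens", "end_turn"):
    print(reason, "->", next_action(reason))
```

In the loop, the "truncated" case could print a warning before breaking so a half-finished answer is not mistaken for a complete one.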

The step counter is just for logging. It's not part of the protocol — Claude doesn't see it. We track it so the script's output is readable.

Watching it run

Run the script (:!python % from inside Vim) with the first task. Output (illustrative):

Task: Create a file called notes.txt with three Python tips, then read it back to verify.
--------------------------------------------------
  Step 1: write_file({'path': 'notes.txt', 'content': '1. Use list comprehensions...\n2. ...\n3. ...'})
  Result: {'path': 'notes.txt', 'status': 'written', 'size': 154}
  Step 2: read_file({'path': 'notes.txt'})
  Result: {'path': 'notes.txt', 'content': '1. Use list comprehensions...\n2. ...\n3. ...'}

  Final answer: I've created notes.txt with three Python tips and verified its contents...

Two tool calls, one final answer. Notice the order: Claude wrote first, then read to verify. We didn't code that sequence anywhere; the prompt only said "create, then read back," and the model turned that into two tool calls in the right order.

The second task is more interesting:

Task: List all files in the workspace, read each one, and write a summary.txt with their contents.
--------------------------------------------------
  Step 1: list_directory({'path': '.'})
  Result: {'path': '.', 'items': ['hello.txt', 'notes.txt'], 'count': 2}
  Step 2: read_file({'path': 'hello.txt'})
  Result: {'path': 'hello.txt', 'content': 'Hello from Claude!'}
  Step 3: read_file({'path': 'notes.txt'})
  Result: {'path': 'notes.txt', 'content': '1. Use list comprehensions...\n2. ...\n3. ...'}
  Step 4: write_file({'path': 'summary.txt', 'content': '...'})
  Result: {'path': 'summary.txt', 'status': 'written', 'size': 198}

  Final answer: I've created summary.txt summarising the contents of all 2 files in the workspace.

Four tool calls. One was list_directory, two were read_file (one per file the listing returned), one was write_file. The plan emerged from the work: Claude needed to list first to know what to read. It needed to read everything before it could write a summary.

That dependency graph — do A to find out what B should be — is the soul of agent behaviour. Without the loop you'd have to hand-write the orchestration. With the loop, Claude orchestrates itself.

The model is the planner

Pause on what just happened. We did not write list -> for-each(read) -> write. We wrote one prompt and a loop. The model figured out the sequence.

This is the architectural shift that makes AI agents valuable in production. You stop encoding business logic in if/else chains and start encoding it in tool descriptions. "This tool reads a file." "This tool writes a file." "This tool lists a directory." The model composes them.
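For reference, this is roughly what those descriptions look like in the tools list passed to the API. The name / description / input_schema layout is the Messages API tool format; the exact wording of each description here is an assumption, not the text used in the series.

```python
# Sketch of a tools list; the description strings are illustrative.
tools = [
    {
        "name": "read_file",
        "description": "Read a text file and return its contents.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "File to read."},
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Write text to a file, replacing any existing content.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "File to write."},
                "content": {"type": "string", "description": "Text to write."},
            },
            "required": ["path", "content"],
        },
    },
    {
        "name": "list_directory",
        "description": "List the files in a directory.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Directory to list."},
            },
            "required": ["path"],
        },
    },
]

print([t["name"] for t in tools])
# → ['read_file', 'write_file', 'list_directory']
```

The description field is the only place the model learns what a tool is for, so it deserves the same care as a function docstring.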

It also means the bottleneck shifts. Code reviews used to focus on logic. Agent reviews focus on tool design and prompts. Are the descriptions clear? Are the tools the right granularity? Is the system prompt giving the model enough framing to make good choices? Most agent bugs are tool/prompt bugs, not loop bugs.

Termination and runaway loops

What if Claude never stops calling tools? In theory, infinite loops are possible. In practice, Claude is good at noticing when it has enough information, but you should still defend against it.

The simple guard is a step counter:

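step = 1
MAX_STEPS = 20
while step <= MAX_STEPS:
    ...  # call the API, run the tool, break on a final answer
    step += 1
else:
    # while/else: this branch runs only if the loop ended without a break
    print("Hit step limit. Aborting.")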
For our two-prompt demo, 20 is plenty. For an agent talking to a database it might be 50. For a code-writing agent it might be 100. Whatever the number, have one. The cost of a runaway agent is real money — every loop iteration is a paid API call.
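Here is one way the cap could look as a reusable function, sketched with the API call and tool execution passed in as plain callables so it can run without a key. The names run_with_cap, call_model, and run_tool, and the shape of the returned dict, are all hypothetical.

```python
MAX_STEPS = 20


def run_with_cap(call_model, run_tool, max_steps=MAX_STEPS):
    """Loop until the model gives a final answer or max_steps is reached.

    call_model() returns ("tool", payload) or ("final", text).
    run_tool(payload) executes one tool request.
    """
    for step in range(1, max_steps + 1):
        kind, payload = call_model()
        if kind == "final":
            # step - 1 tool calls happened before this final answer.
            return {"status": "done", "steps": step - 1, "answer": payload}
        run_tool(payload)
    # The loop ran out without a final answer: stop paying for API calls.
    return {"status": "step_limit", "steps": max_steps, "answer": None}


# A stand-in model that never stops asking for tools.
calls = []
outcome = run_with_cap(lambda: ("tool", "list_directory"), calls.append, max_steps=5)
print(outcome)
# → {'status': 'step_limit', 'steps': 5, 'answer': None}
```

Returning a status instead of only printing lets the caller decide whether hitting the limit is an error or just a partial result.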

Common mistakes

Forgetting to append assistant and user messages. The history must alternate properly. After each tool call, append the assistant's tool-use response and your tool-result. Skip either and Claude loses track.

Not re-sending the system prompt every iteration. Easy to forget. The system prompt isn't part of messages; it's the system parameter on each call. Pass it every time.

Letting tool errors crash the loop. A read_file for a missing file should return {"error": ...}, not raise. Claude will see the error and adapt. We'll formalise this in Episode 11.
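Until then, a minimal sketch of the idea: wrap each tool call so an exception becomes an {"error": ...} result instead of a crash. The safe_call helper and the error string format are assumptions for illustration, not the approach Episode 11 will necessarily take.

```python
def safe_call(fn, **kwargs):
    """Run a tool; on failure return an error dict instead of raising."""
    try:
        return fn(**kwargs)
    except Exception as exc:
        # Claude receives this as the tool result and can react to it.
        return {"error": f"{type(exc).__name__}: {exc}"}


def read_file(path):
    """Same shape as the lesson's read_file; the body is a sketch."""
    with open(path, encoding="utf-8") as f:
        return {"path": path, "content": f.read()}


# A bad argument name is caught the same way as a missing file would be.
result = safe_call(read_file, wrong_name="x")
print("error" in result)  # → True
```

In the agent loop this would replace the direct fn(**tool_block.input) call with safe_call(fn, **tool_block.input).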

No step cap. A buggy tool that always returns the same error can trap Claude in a retry loop. Always have a maximum number of iterations.

Holding state outside messages. Anything that matters for the next decision needs to be in the conversation history. Don't put critical context in Python globals.

What's next

Next episode: error handling. When a tool fails — file not found, invalid argument, division by zero — what should it return, and how should Claude react? Done right, errors become a form of information that makes the agent more robust. Done wrong, the agent crashes or loops forever.

Recap

What we did today. Wrapped the tool-use dance in a while True: loop that keeps going until Claude returns plain text. Watched Claude plan a multi-step task — list, read each, write a summary — without us encoding the sequence. Saw the conversation history grow linearly with each step. Recognised that this loop is the structural definition of an "agent" — every agent in the rest of the series is a variation on it.

You haven't built a smart agent yet. You've built the engine that any smart agent runs on. The remaining work is making the tools good and the system prompt clear.

See you in the next one.
