Part of Python AI Tutorial Series

Build AI Apps with Python: File System Tools — AI That Reads and Writes Files | Episode 9

Celest Kim

April 18, 2026

Student code: github.com/GoCelesteAI/build-ai-apps-python/tree/main/episode09

Three tools: read, write, list. The first agent that affects the world outside the script.

The last two episodes had Claude calling tools that returned mock data. A pretend exchange rate. A pretend weather report. The function ran, but nothing in your computer changed.

Today the tools start doing things. We give Claude three new functions — read a file, write a file, list a directory — and the model can use them to manipulate real files on disk. The same tool-use loop. The same dispatch pattern. Different category of capability.

This is the line between "AI that talks" and "AI that operates." Once a model can read and write files, it can write code, save its own outputs, organise a project, reference earlier work. Most useful AI agents from here on are some combination of file tools, network tools, and a planning loop. We're building the first of those layers.

What we're building

Three tools, scoped to a ./workspace directory:

read_file(path) — return the contents of a file
write_file(path, content) — create or overwrite a file
list_directory(path) — list the items in a directory

Three test prompts:

"Create a file called hello.txt with the text: Hello from Claude!" — exercises write_file.
"What files are in the workspace?" — exercises list_directory.
"Read the contents of hello.txt" — exercises read_file.

By the end you'll have a ./workspace/hello.txt on your real disk, written by Claude, read back by Claude, listed by Claude.

The workspace boundary

WORK_DIR = "./workspace"
os.makedirs(WORK_DIR, exist_ok=True)

def read_file(path):
    full_path = os.path.join(WORK_DIR, path)
    ...

Notice that every path Claude provides is joined onto WORK_DIR. The intent is that the model only ever names files relative to the workspace and never reads from anywhere else on the disk. The reason is obvious as soon as you write the script: a model with unconstrained file-system access is a very expensive way to reformat your home directory.

This isn't bulletproof. In this naive form, path = "../../etc/passwd" would still escape the directory, and so would an absolute path such as "/etc/passwd", because os.path.join discards WORK_DIR when the second argument is absolute. We'll harden that in Episode 11, but the shape of the safety pattern is here. Define a sandbox. Resolve every path inside it. Don't let the model out.
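
One common way to close the escape routes is to resolve the joined path to its canonical absolute form and then check that it still sits under the workspace. The sketch below is an illustration only: resolve_in_workspace is a hypothetical helper name, and this is not necessarily the hardening Episode 11 uses. Note that an absolute path escapes the naive version too, because os.path.join drops WORK_DIR when the second argument is absolute.

```python
import os

# Resolve the workspace once so later comparisons use a canonical absolute path
WORK_DIR = os.path.realpath("./workspace")

def resolve_in_workspace(path):
    """Return the absolute path for `path` inside WORK_DIR, or None if it escapes."""
    # realpath collapses ".." segments and follows symlinks
    candidate = os.path.realpath(os.path.join(WORK_DIR, path))
    try:
        inside = os.path.commonpath([WORK_DIR, candidate]) == WORK_DIR
    except ValueError:
        # Raised on Windows when the two paths are on different drives
        inside = False
    return candidate if inside else None

def read_file(path):
    full_path = resolve_in_workspace(path)
    if full_path is None:
        # Report the escape attempt as data so the model can react to it
        return {"error": f"Path is outside the workspace: {path}"}
    if not os.path.isfile(full_path):
        return {"error": f"File not found: {path}"}
    with open(full_path, "r") as f:
        return {"path": path, "content": f.read()}
```

Comparing with os.path.commonpath rather than a plain string prefix check avoids accepting a sibling directory such as ./workspace-old, whose name merely starts with the workspace path.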

The principle generalises. Every "tool that touches reality" you ever build should have an explicit blast-radius limit. File-system tools get a workspace. API tools get rate limits and cost ceilings. Database tools get read-only credentials and row-level filters. The model should not be the safety layer.

The script (the new bits)

The dispatch + loop is identical to Episode 8. The new code is the three tools and their schemas:

import os

WORK_DIR = "./workspace"
os.makedirs(WORK_DIR, exist_ok=True)  # create the sandbox once at startup

def read_file(path):
    # Resolve the model-supplied path against the workspace
    full_path = os.path.join(WORK_DIR, path)
    if not os.path.exists(full_path):
        # Report the problem as data so Claude can react to it
        return {"error": f"File not found: {path}"}
    with open(full_path, "r") as f:
        return {"path": path, "content": f.read()}

def write_file(path, content):
    full_path = os.path.join(WORK_DIR, path)
    # Mode "w" creates the file, or overwrites it if it already exists
    with open(full_path, "w") as f:
        f.write(content)
    # size is the number of characters written
    return {"path": path, "status": "written", "size": len(content)}

def list_directory(path="."):
    full_path = os.path.join(WORK_DIR, path)
    if not os.path.exists(full_path):
        return {"error": f"Directory not found: {path}"}
    items = os.listdir(full_path)
    return {"path": path, "items": items, "count": len(items)}

Each tool returns a dict. Each dict either has the success keys (path, content, items) or an error key. Errors as data, not exceptions. Claude can read an error message and react — "the file doesn't exist, let me try a different name" — only if you give it the error in a structured field. If the function raises, your script crashes and Claude never sees it. We'll go deeper on this in Episode 11; for now, just notice the pattern: tools return dicts, errors are values.
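
The same idea can be applied at the dispatch layer, so that a tool that does raise still produces an error dict instead of ending the run. This is a minimal sketch under stated assumptions: TOOL_FUNCTIONS and dispatch are hypothetical names, not the Episode 8 code, and the workspace here is a temporary directory so the example is safe to run.

```python
import os
import tempfile

WORK_DIR = tempfile.mkdtemp()  # throwaway workspace for this sketch

def write_file(path, content):
    full_path = os.path.join(WORK_DIR, path)
    with open(full_path, "w") as f:
        f.write(content)
    return {"path": path, "status": "written", "size": len(content)}

# Hypothetical registry mapping tool names to Python functions
TOOL_FUNCTIONS = {"write_file": write_file}

def dispatch(name, tool_input):
    """Run a tool by name and always return a dict instead of raising."""
    func = TOOL_FUNCTIONS.get(name)
    if func is None:
        return {"error": f"Unknown tool: {name}"}
    try:
        return func(**tool_input)
    except OSError as exc:
        # FileNotFoundError, PermissionError, IsADirectoryError, and similar
        return {"error": f"{type(exc).__name__}: {exc}"}
    except TypeError as exc:
        # Usually means the model sent missing or unexpected arguments
        return {"error": f"Bad arguments for {name}: {exc}"}

ok = dispatch("write_file", {"path": "a.txt", "content": "hi"})
missing_parent = dispatch("write_file", {"path": "no_such_dir/a.txt", "content": "hi"})
bad_args = dispatch("write_file", {"path": "b.txt"})
unknown = dispatch("delete_file", {"path": "a.txt"})
```

Here only the first call succeeds. The other three come back as dicts with an error key, which is the shape the model can read and respond to.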

Schemas with optional defaults

{
    "name": "list_directory",
    "description": "List files and folders in the workspace directory.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string", "description": "Directory path relative to workspace"},
        },
        "required": ["path"],
    },
},

A small thing worth noticing: list_directory(path=".") has a default in Python, but the schema marks path as required. Why? Because the schema describes what Claude should send, not what Python accepts. We always want Claude to think about which directory it's listing — the default exists as a convenience, not as something Claude should rely on.

Default values are easy to forget about and can silently change behavior. Be explicit in the schema about what's required.
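
For comparison, the other two tools could be described in the same shape. The sketch below is an assumption about what those schemas might look like: the descriptions are illustrative wording, the list name FILE_TOOL_SCHEMAS is hypothetical, and the actual schemas in the episode repository may differ.

```python
# Hypothetical schemas for the other two tools, following the list_directory shape
FILE_TOOL_SCHEMAS = [
    {
        "name": "read_file",
        "description": "Read the contents of a file in the workspace directory.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "File path relative to workspace"},
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Create or overwrite a file in the workspace directory.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "File path relative to workspace"},
                "content": {"type": "string", "description": "Text to write to the file"},
            },
            # Both arguments are required: a write without content is ambiguous
            "required": ["path", "content"],
        },
    },
]

# Quick summary of which arguments each tool insists on
required_by_tool = {t["name"]: t["input_schema"]["required"] for t in FILE_TOOL_SCHEMAS}
```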

Running it

Run the script (from Vim, :!python % runs the current file). Three prompts go through:

Q: Create a file called hello.txt with the text: Hello from Claude!
  Tool: write_file
  Result: {'path': 'hello.txt', 'status': 'written', 'size': 18}
  A: Done. I've created hello.txt with the text "Hello from Claude!" (18 bytes).

Q: What files are in the workspace?
  Tool: list_directory
  Result: {'path': '.', 'items': ['hello.txt'], 'count': 1}
  A: There's one file in the workspace: hello.txt.

Q: Read the contents of hello.txt
  Tool: read_file
  Result: {'path': 'hello.txt', 'content': 'Hello from Claude!'}
  A: The contents of hello.txt are: "Hello from Claude!"

After the script exits, look at the disk:

$ ls workspace/
hello.txt
$ cat workspace/hello.txt
Hello from Claude!

That file is real. Claude wrote it. You can edit it, move it, delete it. The model has done something to your computer's state, intermediated only by the function you wrote.

What this category of tool unlocks

File-system tools are the building block of a surprising number of "AI products":

A code generator that writes itself to disk. Tell the agent to scaffold a Python project; it creates the directory tree and writes each file.
A note-taker. "Save this conversation to a markdown file in /notes."
A workspace assistant. "Read every CSV in /data, find the rows where revenue dropped, write a summary to report.md."
A documentation writer. Read source files, generate docs, write back to disk.

Each is the same loop. Different system prompt, different question, different combination of the same three tools.

Episode 10 turns this into something more dramatic: a loop. Right now we let Claude make one tool call per question. Next episode we let it call a tool, see the result, decide what to do next, call another tool, and continue until done. A single user request can trigger a chain of tool calls. That's the shape of an actual agent.

Common mistakes

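Letting the model use absolute paths. Joining onto a sandbox directory is not enough on its own: os.path.join returns the second argument unchanged when it is absolute. Resolve the joined path and check that it is still inside the sandbox. Don't trust the model to stay in scope.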

Returning content too large for the context window. A 1MB file inside tool_result.content will eat tokens and may exceed the model's context. Cap the read size, or summarise long contents before returning them.

Crashing the script on missing files. os.path.exists first, return an error dict if not. Letting open() raise FileNotFoundError kills the loop.

Mixing read and write paths. If read_file and write_file use different base directories by accident, you'll write to one place and read from another. Define the base once.

Skipping the workspace creation. os.makedirs(WORK_DIR, exist_ok=True) once at startup. Otherwise the first run fails because the directory doesn't exist.
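
The "content too large" mistake above can be guarded against by capping how much a read returns and telling the model when the content was cut off. This is a minimal sketch: read_file_capped and MAX_CHARS are hypothetical names, the 10,000-character cap is an arbitrary illustration, and the function takes an already-resolved path to keep the example short.

```python
import os
import tempfile

MAX_CHARS = 10_000  # arbitrary cap for illustration; tune it to your token budget

def read_file_capped(full_path, max_chars=MAX_CHARS):
    """Read at most max_chars characters and report whether the file was cut off."""
    if not os.path.isfile(full_path):
        return {"error": f"File not found: {full_path}"}
    with open(full_path, "r") as f:
        # Read one extra character to tell "exactly at the cap" from "over it"
        content = f.read(max_chars + 1)
    return {
        "path": full_path,
        "content": content[:max_chars],
        "truncated": len(content) > max_chars,
    }

# Demo against a throwaway 25-character file
demo_path = os.path.join(tempfile.mkdtemp(), "big.txt")
with open(demo_path, "w") as f:
    f.write("x" * 25)

small_cap = read_file_capped(demo_path, max_chars=10)
large_cap = read_file_capped(demo_path, max_chars=100)
```

The truncated flag gives the model a structured signal that it is seeing only part of the file, in the same spirit as returning errors as values.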

What's next

Next episode: the agentic loop. Right now we send a question, Claude calls one tool, we return the result, Claude writes a final answer. That's a one-step exchange. A real agent calls multiple tools across multiple steps to accomplish complex tasks: "Create three notes files, then list the directory, then summarise their contents." That requires a loop where Claude keeps calling tools until it's done. We'll build it.

Recap

What we did today. Defined three file-system tools — read, write, list — scoped to a sandbox directory. Returned errors as dict fields rather than letting them raise. Reused the dispatch pattern from Episode 8 and watched Claude pick the right tool for each request. Wrote a real file to a real disk through an AI function call.

You haven't built a coding agent. But you've taken the model from "talks about code" to "writes code to a file you can run." Everything from here is multi-step variations on the same idea.

Next episode: the agent loop. See you in the next one.