Build AI Apps with Python: Product Review Analyzer with JSON | Episode 5
Video: Build AI Apps with Python: Product Review Analyzer with JSON | Episode 5, taught by Celeste AI - AI Coding Coach
Student code: github.com/GoCelesteAI/build-ai-apps-python/tree/main/episode05
A product review goes in. Sentiment, rating, key points, and a summary come out as JSON.
The first four episodes treated Claude like a writer. Ask a question, get a paragraph. That's useful for chatbots, summaries, and explanations — but most software doesn't want paragraphs. Most software wants data.
A customer-support pipeline that classifies tickets needs category, urgency, language — not a sentence about how the customer seems frustrated. A review aggregator needs sentiment, rating, key complaints — not a free-form opinion. A document parser needs fields and values — not a description of what's in the document.
The Claude API can give you any of those. The trick is that you have to ask in the right way: a precise system prompt that defines a JSON schema, a low temperature, and json.loads() on the response. That's the bridge from "AI that talks" to "AI that drives software."
What we're building
A product review analyzer. Pass in any review — positive, negative, neutral — and get back a Python dict with four fields:
{
  "sentiment": "positive",
  "rating": 5,
  "key_points": ["long battery life", "great screen", "handles demanding workloads"],
  "summary": "A high-performing laptop with excellent battery and display."
}
Three reviews, three calls, three structured results. Every call returns the same shape. The downstream code can iterate, filter, count, sort — without ever doing string parsing.
By the end you'll see why this pattern is the workhorse of every "AI feature" you'll ever ship. Most production AI is not chat. Most production AI is a classifier.
The script
import os
import json

from dotenv import load_dotenv
from anthropic import Anthropic

load_dotenv()  # load ANTHROPIC_API_KEY from a local .env file
client = Anthropic()

system_prompt = """You are a product review analyzer.
Respond ONLY with valid JSON in this exact format:
{
  "sentiment": "positive" or "negative" or "neutral",
  "rating": 1-5,
  "key_points": ["point1", "point2", "point3"],
  "summary": "one sentence summary"
}
No other text. Just JSON."""


def analyze_review(review):
    # Send one review and parse the JSON reply into a Python dict.
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=256,
        temperature=0.0,
        system=system_prompt,
        messages=[
            {"role": "user", "content": review}
        ],
    )
    return json.loads(response.content[0].text)


review1 = "This laptop is amazing! The battery lasts all day, the screen is gorgeous, and it runs everything I throw at it. Best purchase I made this year."

print("=== Review 1 ===")
result = analyze_review(review1)
print(json.dumps(result, indent=2))
Three things make this script work: the system prompt, temperature=0.0, and json.loads(). Let's take them in order.
The system prompt as a contract
system_prompt = """You are a product review analyzer.
Respond ONLY with valid JSON in this exact format:
{
  "sentiment": "positive" or "negative" or "neutral",
  "rating": 1-5,
  "key_points": ["point1", "point2", "point3"],
  "summary": "one sentence summary"
}
No other text. Just JSON."""
This is not a description. It's a contract.
Every part of it does work:
Respond ONLY with valid JSON — Claude's default behaviour is to be helpful and conversational, which means wrapping things in "Here's the analysis you requested..." prose. We don't want that. We want pure JSON. The capitalised ONLY is doing real work — it's a strong instruction that overrides the default chatty style.
The schema, written as a literal example. We don't describe the shape in English. We show the actual JSON. Claude will copy the structure: same keys, same nesting, same value types. Showing the shape is far more reliable than describing it.
Constrained value types. "sentiment": "positive" or "negative" or "neutral" — three allowed values, listed inline. The model now knows the cardinality of that field. Same with "rating": 1-5. Without these constraints, you'd get "sentiment": "very positive" and "rating": "five stars" because the model would invent reasonable-sounding values.
No other text. Just JSON. — repeated for emphasis. Two instructions saying the same thing. The repetition is intentional; models pay more attention to constraints that show up multiple times.
This kind of system prompt is the heart of any JSON-output workflow. You'll write dozens of them. The pattern: identify the role, name the output format, show the schema, list the constrained values, explicitly forbid extra text.
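The five-step pattern above can be sketched as a small helper. This is only an illustration: build_system_prompt and its parameters are hypothetical names, not part of the episode's code or of the anthropic library, and the wording it produces simply mirrors the prompt used in this episode.

```python
def build_system_prompt(role, schema_lines):
    """Assemble a JSON-only system prompt: role, format rule, schema, final reminder."""
    # Show the schema as a literal JSON-shaped example, one field per line.
    schema = "{\n" + ",\n".join(f"  {line}" for line in schema_lines) + "\n}"
    return (
        f"You are a {role}.\n"
        "Respond ONLY with valid JSON in this exact format:\n"
        f"{schema}\n"
        "No other text. Just JSON."
    )


review_prompt = build_system_prompt(
    "product review analyzer",
    [
        '"sentiment": "positive" or "negative" or "neutral"',
        '"rating": 1-5',
        '"key_points": ["point1", "point2", "point3"]',
        '"summary": "one sentence summary"',
    ],
)
print(review_prompt)
```

Swapping the role and the field lines gives a new contract without touching anything else.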
Why temperature=0.0
temperature=0.0,
temperature controls how random the model's choices are. Higher temperature → more varied, more creative, less predictable. Lower temperature → more deterministic, more conservative, more boring.
For structured output, you want boring. You want the same review to produce the same JSON every time. You want key_points to be three items because you said three items, not three or five depending on the model's mood. You want sentiment to land on one of the three approved values.
temperature=0.0 doesn't make Claude perfectly deterministic — there's still tie-breaking and slight variability — but it pushes hard toward the most-likely-token-each-step path. For data extraction, classification, JSON output, and structured anything: zero is the default. You go higher only when you want variety: brainstorming, creative writing, generating diverse examples.
The default temperature is 1.0. If you don't think about it, you'll get small variations between calls on the same input. For chatbots that's fine. For a classifier that gets called a million times a day, it's a bug.
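To build intuition for what temperature does, here is a toy sketch of temperature-scaled softmax, the textbook way a sampler turns scores into probabilities. It is not a description of Claude's internals; the scores are made up, and treating zero as "always pick the top option" is an assumption made for the illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Limit case (assumed here): all probability on the highest-scoring option.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (1.0, 0.5, 0.0):
    probs = softmax_with_temperature(scores, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature drops, more of the probability piles onto the top-scoring option, which is why low temperatures give more repeatable output.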
json.loads() and trusting the contract
return json.loads(response.content[0].text)
Claude returns a string. We parse it as JSON. If the contract held, we get back a Python dict that matches the schema. If it didn't, json.loads() raises a JSONDecodeError.
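One caveat: json.loads() only proves the reply is valid JSON, not that the values obey the contract. A minimal sketch of a follow-up check is below. validate_review and ALLOWED_SENTIMENTS are hypothetical names introduced for illustration, and the rules are just the ones stated in this episode's system prompt.

```python
ALLOWED_SENTIMENTS = ("positive", "negative", "neutral")


def validate_review(result):
    """Return a list of contract violations; an empty list means the dict is usable."""
    if not isinstance(result, dict):
        return ["top-level value is not a JSON object"]
    problems = []
    if result.get("sentiment") not in ALLOWED_SENTIMENTS:
        problems.append("sentiment must be positive, negative, or neutral")
    rating = result.get("rating")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(rating, int) or isinstance(rating, bool) or not 1 <= rating <= 5:
        problems.append("rating must be an integer from 1 to 5")
    key_points = result.get("key_points")
    if not isinstance(key_points, list) or not all(isinstance(p, str) for p in key_points):
        problems.append("key_points must be a list of strings")
    if not isinstance(result.get("summary"), str):
        problems.append("summary must be a string")
    return problems


good = {"sentiment": "positive", "rating": 5, "key_points": ["long battery life"], "summary": "Great laptop."}
bad = {"sentiment": "very positive", "rating": "five stars", "key_points": [], "summary": "Great laptop."}
print(validate_review(good))
print(validate_review(bad))
```

An empty list means the dict can be passed downstream; anything else can be logged or used to trigger a retry.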
In production code, you'd wrap the json.loads() call in a try/except and either retry or fall back to a default. For tutorial code, we let it raise: a failure is loud and visible, which is what we want during development.
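A minimal sketch of that retry-or-fall-back idea is below. parse_with_retry is a hypothetical helper, not part of the episode's script. It takes a zero-argument function that returns the reply text, so the demo runs offline with canned replies; in a real app that function would make the API call. It only handles malformed JSON, not network or API errors.

```python
import json


def parse_with_retry(fetch_text, attempts=3, fallback=None):
    """Call fetch_text() until its result parses as JSON, else return fallback."""
    for _ in range(attempts):
        try:
            return json.loads(fetch_text())
        except json.JSONDecodeError:
            continue  # malformed reply: ask again
    return fallback


# Canned replies stand in for API calls: the first is chatty prose, the second is valid JSON.
replies = iter(["Here is the JSON you asked for", '{"rating": 4}'])
result = parse_with_retry(lambda: next(replies))
print(result)  # {'rating': 4}
```

In the episode's script, fetch_text could be something like lambda: client.messages.create(...).content[0].text, with the same arguments analyze_review() already uses.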
A robustness pattern worth knowing: sometimes Claude wraps its JSON in markdown fences (```json ... ```) despite the instructions. If you hit this in your own apps, strip the fences before parsing. Or, better, re-prompt with stronger language and a worked example. The cleaner fix is in the system prompt, not in regex on the output.
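If you do need the fallback, fence stripping can be sketched without any regex. strip_code_fences is a hypothetical helper, and it assumes the whole reply is a single fenced block with an optional language tag on the opening line.

```python
import json

FENCE = "`" * 3  # three backticks, built this way so the example itself stays readable


def strip_code_fences(text):
    """Remove a surrounding markdown fence (with or without a language tag)."""
    text = text.strip()
    if len(text) >= 6 and text.startswith(FENCE) and text.endswith(FENCE):
        inner = text[3:-3]
        # Drop the rest of the opening fence line if it is empty or a tag such as "json".
        first_line, _, rest = inner.partition("\n")
        if first_line.strip() == "" or first_line.strip().isalpha():
            inner = rest
        text = inner
    return text.strip()


fenced = FENCE + 'json\n{"rating": 5}\n' + FENCE
parsed = json.loads(strip_code_fences(fenced))
print(parsed)  # {'rating': 5}
```

Unfenced replies pass through unchanged, so the helper is safe to call on every response before json.loads().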
Running it
Run the script with :!python % (a Vim command that runs the current file with Python). Three reviews go through. Output:
=== Review 1 ===
{
  "sentiment": "positive",
  "rating": 5,
  "key_points": [
    "long battery life",
    "gorgeous screen",
    "handles demanding workloads"
  ],
  "summary": "A high-performing laptop with excellent battery, display, and processing power."
}
=== Review 2 ===
{
  "sentiment": "negative",
  "rating": 1,
  "key_points": [
    "broke after two weeks",
    "mediocre sound quality",
    "unhelpful customer support"
  ],
  "summary": "Headphones with poor durability, disappointing audio, and bad customer service."
}
=== Review 3 ===
{
  "sentiment": "neutral",
  "rating": 3,
  "key_points": [
    "works as expected",
    "no standout features",
    "fair value for the price"
  ],
  "summary": "A functional but unremarkable coffee maker that performs adequately for the price."
}
Three calls. Three identical shapes. The downstream code that consumes this output can be a clean for review in reviews: ratings.append(analyze_review(review)["rating"]) — no parsing, no fragility.
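Here is a small sketch of what that downstream code might look like. The three dicts are trimmed copies of the output above (only sentiment and rating), hard-coded so the example runs without calling the API.

```python
from collections import Counter

# Trimmed copies of the three results shown above.
results = [
    {"sentiment": "positive", "rating": 5},
    {"sentiment": "negative", "rating": 1},
    {"sentiment": "neutral", "rating": 3},
]

sentiment_counts = Counter(r["sentiment"] for r in results)
average_rating = sum(r["rating"] for r in results) / len(results)
low_rated = [r for r in results if r["rating"] <= 2]

print(sentiment_counts["negative"])  # 1
print(average_rating)                # 3.0
print(len(low_rated))                # 1
```

Counting, averaging, and filtering all work on plain dict keys, with no string parsing anywhere.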
Where this pattern goes in real apps
Once you can extract structure from text, you can:
- Classify: send a customer email, return {"category": "billing", "urgency": "high"}
- Extract: send an invoice, return {"vendor": "...", "amount": 1234.56, "due_date": "..."}
- Tag: send a blog post, return {"topics": ["python", "api", "tutorial"], "audience": "beginner"}
- Score: send a piece of writing, return {"clarity": 7, "grammar": 9, "engagement": 5}
Every one of those is the same script with a different system prompt. The Python code, the API call, the parsing — all identical. Only the contract changes.
That's what makes this pattern productive. Once you've written one structured-output script, you can stand up a new classifier in five minutes by writing a new system prompt.
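One way to make the "only the contract changes" idea concrete is a small factory that takes a client and a system prompt and returns an analyzer function. make_analyzer is a hypothetical helper, not code from the episode. The stub client below is a stand-in so the sketch runs without an API key; with the real SDK you would pass Anthropic() instead. The ticket prompt is abbreviated for the demo.

```python
import json
from types import SimpleNamespace


def make_analyzer(client, system_prompt, model="claude-sonnet-4-20250514", max_tokens=256):
    """Return a function that sends text under the given contract and parses the JSON reply."""
    def analyze(text):
        response = client.messages.create(
            model=model,
            max_tokens=max_tokens,
            temperature=0.0,
            system=system_prompt,
            messages=[{"role": "user", "content": text}],
        )
        return json.loads(response.content[0].text)
    return analyze


class StubMessages:
    """Offline stand-in for client.messages; always returns the same canned reply."""
    def create(self, **kwargs):
        self.last_kwargs = kwargs  # remember the arguments so they can be inspected
        reply = '{"category": "billing", "urgency": "high"}'
        return SimpleNamespace(content=[SimpleNamespace(text=reply)])


stub_client = SimpleNamespace(messages=StubMessages())
ticket_prompt = "You are a support ticket classifier. Respond ONLY with valid JSON."  # abbreviated
classify_ticket = make_analyzer(stub_client, ticket_prompt)
ticket = classify_ticket("I was charged twice this month.")
print(ticket)
```

Each new classifier is then one make_analyzer() call with a different system prompt.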
Tool use vs. structured output
Episode 7 onward will introduce tool use — the API's tools parameter, where you describe functions with JSON schemas and Claude either calls them or doesn't. There's overlap with what we did today: tool definitions are also JSON schemas, and tool calls also return structured arguments.
Use the structured-output pattern from this episode when you just want data back. The model's job is one thing: extract or classify and return JSON.
Use tool use when the model has to decide whether to act, which action to pick, or call a function with arguments. The model is a router, not a parser.
Most apps end up using both. Today's pattern is the simpler one — one call, one parse — and it's the one most "AI features" ultimately need.
Common mistakes
Forgetting temperature=0.0. Output drifts between calls. Same input, different JSON. Set the temperature.
Describing the schema in English instead of showing it. "Return a JSON object with a sentiment field which can be positive, negative, or neutral, and a rating field..." — wordier and less reliable than showing the actual shape. Show, don't describe.
Allowing free-form values. "Return a sentiment field describing how the reviewer feels." You'll get every word in the thesaurus. List the allowed values explicitly.
No json.loads() error handling in production. A single malformed response will crash your pipeline. Wrap, retry, or fall back. For tutorial code, we let it raise — but in real apps, expect failures.
Setting max_tokens too low for the schema. max_tokens=256 is enough for our small schema. If your schema is larger and max_tokens is too small, Claude will get cut off mid-JSON and json.loads() will fail. Match the budget to the schema.
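For the truncation case in particular, the response's stop_reason field tells you what happened: it is "max_tokens" when the reply hit the budget. A minimal sketch of checking it before parsing is below. parse_complete_json is a hypothetical helper, and the truncated response is a hand-built stand-in so the example runs offline.

```python
import json
from types import SimpleNamespace


def parse_complete_json(response):
    """Parse the reply only if it was not cut off by the token budget."""
    if response.stop_reason == "max_tokens":
        raise ValueError("reply was cut off at max_tokens; raise the budget and retry")
    return json.loads(response.content[0].text)


# Stand-in for a truncated API response.
truncated = SimpleNamespace(
    stop_reason="max_tokens",
    content=[SimpleNamespace(text='{"sentiment": "positive", "rating"')],
)

try:
    parse_complete_json(truncated)
except ValueError as error:
    message = str(error)
    print(message)
```

This turns a confusing JSONDecodeError into an error that names the real cause.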
What's next
You've now seen five core API behaviours: text generation, system prompts, multi-turn memory, streaming, structured output. That's everything you need to build the front end of an AI app.
Next episode: vision. Sending images to Claude alongside text. Same API, same messages.create(), same response shape — but the message content can now be a list of blocks: an image, then a question. Claude analyses the picture and answers in the language you've been working with all along.
Recap
What we did today. Wrote a system prompt that defined a JSON schema as a contract. Set temperature=0.0 for near-deterministic output. Built a reusable analyze_review() function that runs client.messages.create() and parses the result with json.loads(). Sent three reviews (positive, negative, neutral) and got back three perfectly structured Python dicts.
You haven't built a review analyzer. You've built the pattern every classifier, extractor, and tagger uses. The system prompt changes. The schema changes. The model, the temperature, the parsing — all reusable.
Next episode: vision. See you in the next one.