Building Agents — Talking to Machines

Chapter Six

Building Agents

Design, wire, and deploy an AI agent from scratch.

A chatbot waits for you to speak. An agent doesn't. Give it a goal — "research Mars colonization and write a report" — and it breaks the work into steps, picks the right tools, executes them one by one, checks its own results, and keeps going until the job is done. No hand-holding required.

That distinction — between responding to instructions and pursuing a goal — is the difference between a calculator and a coworker. Chatbots answer questions. Agents solve problems.

This chapter is about how agents actually work under the hood — the loop that drives them, the architecture that makes them reliable, and the failure modes that make them dangerous when built carelessly.

The Agent Loop

Every AI agent — from a simple research bot to a complex coding assistant — runs on the same fundamental loop:

Goal. Receive a high-level objective from the human.

Plan. Break the goal into a sequence of concrete steps.

Execute. Run each step, using the right tool for each task.

Observe. Check the result. Did it work? Is the data good?

Iterate. If something's off, adjust the plan and try again.

The key word is loop. An agent doesn't just plan once and execute blindly. It plans, acts, observes the result, and adjusts. The best agents are the ones that recover gracefully when step three goes sideways.

A chatbot is a single turn. An agent is a whole conversation — with itself, its tools, and the world.
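The loop reads naturally as code. Below is a minimal sketch in Python; `plan`, `execute`, and `evaluate` are hypothetical callbacks standing in for an LLM planner, tool execution, and a result checker — not any particular library's API.

```python
def run_agent(goal, plan, execute, evaluate, max_steps=10):
    """Minimal agent loop: plan, act, observe, iterate until done."""
    memory = []                        # notebook: every step and its result
    steps = plan(goal, memory)         # break the goal into concrete steps
    for _ in range(max_steps):         # hard stop: the loop can never run forever
        if not steps:
            return memory              # plan exhausted: the goal is met
        step = steps.pop(0)
        result = execute(step)         # act: run the step with a tool
        memory.append((step, result))
        if not evaluate(goal, step, result):  # observe: did it work?
            steps = plan(goal, memory)        # something's off: re-plan
    return memory                      # step budget spent: stop anyway
```

Note the two exits: a clean one when the plan is exhausted, and a hard one when the step budget runs out. The second exit is the cheapest insurance an agent can carry.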

Anatomy of an Agent

Under the hood, every agent has five components. You can think of them as the organs of an artificial coworker:

Goal

What is this agent trying to accomplish? Without a clear goal, the agent has no way to decide if it's done. "Research Mars" isn't a goal. "Write a 500-word summary of Mars colonization challenges, cited with sources" is.

Planner

The brain. It takes a goal and decomposes it into a sequence of steps. The best planners create flexible plans that can adapt when a step fails. Bad planners create rigid scripts that break at the first surprise.

Tools

The hands. Search engines, code interpreters, file readers, APIs, calculators — agents don't just think, they act. Each tool has specific inputs and outputs the agent must learn to use correctly.

Memory

The notebook. Agents need to remember what they've done, what they've found, and what went wrong. Without memory, an agent in a loop might repeat the same failing action forever.

Evaluator

The inner critic. After each step, the agent checks: did this work? Is this data reliable? Am I closer to the goal? The evaluator is what separates an agent from a script that just runs and hopes for the best.

Key insight

Most agent failures aren't intelligence failures — they're architecture failures. An agent with great language skills but no evaluator is like a brilliant person who never checks their work.
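The five organs map onto a plain data structure. This is a sketch, not a framework; every field and method name here is illustrative, chosen to mirror the list above.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    goal: str                                   # what "done" looks like, stated precisely
    planner: Callable                           # brain: (goal, memory) -> list of steps
    tools: dict                                 # hands: tool name -> callable
    evaluator: Callable                         # inner critic: (goal, result) -> bool
    memory: list = field(default_factory=list)  # notebook: past actions and results

    def act(self, tool_name: str, *args) -> tuple:
        """Run one tool, record the result, and let the evaluator judge it."""
        if tool_name not in self.tools:         # guard against wrong-tool calls
            raise KeyError(f"unknown tool: {tool_name!r}")
        result = self.tools[tool_name](*args)
        self.memory.append((tool_name, result))
        return self.evaluator(self.goal, result), result
```

Notice that `act` refuses unknown tools and writes to memory on every call: the guardrails live in the architecture, not in the model's intelligence.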


When Agents Fail

Agents don't fail like chatbots. A chatbot gives you a bad answer and waits. An agent gives itself a bad answer, acts on it, then uses the broken result to make the next decision. Failures compound.

There are three classic ways agents go wrong:

Failure Mode 1: The Infinite Loop

The agent gets stuck repeating the same action. A research agent keeps searching for "one more source" and never starts writing. Without a stopping condition, it runs until you pull the plug — or run out of API credits.
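The standard defense is a budget. The sketch below caps both the total number of steps and the number of times the same action may repeat; the names and default numbers are illustrative, not recommendations.

```python
from collections import Counter

def guarded(actions, max_steps=20, max_repeats=3):
    """Yield actions until a budget is hit; cut off an action that keeps repeating."""
    seen = Counter()
    for i, action in enumerate(actions):
        if i >= max_steps:
            break                        # total budget spent: pull the plug
        seen[action] += 1
        if seen[action] > max_repeats:   # same action over and over: likely stuck
            break
        yield action
```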

Failure Mode 2: Wrong Tool

The agent selects a tool that can't do the job. It tries to web-search for a file on your hard drive, or calls a calculator to summarize an essay. It's not that the agent is stupid — it just picked from its toolbox without checking if the tool fits the task.

Failure Mode 3: Hallucinated Actions

The agent invents tools or APIs that don't exist, then "calls" them with fake results. When the fake call fails, it sometimes fabricates the data it was supposed to get. Double hallucination: fake tool, fake output.

The fix is always the same principle: design for the unhappy path. Max iterations, tool validation, output verification, and human checkpoints. The goal isn't to prevent all failure — it's to make failure recoverable.
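Tool validation, the second item on that list, catches both the wrong-tool and the hallucinated-tool cases before anything runs. A sketch, assuming a simple registry that maps each real tool to its required argument names:

```python
def validate_call(call: dict, registry: dict) -> list:
    """Check a proposed tool call against the real registry before running it."""
    errors = []
    name = call.get("tool")
    if name not in registry:
        errors.append(f"unknown tool: {name!r}")      # hallucinated or mistyped tool
        return errors
    for param in registry[name]:                      # required parameter names
        if param not in call.get("args", {}):
            errors.append(f"{name}: missing argument {param!r}")
    return errors
```

An empty error list means the call is at least structurally sound; anything else goes back to the planner instead of being executed.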

Here is what Failure Mode 1 looks like in the wild. A research agent is given the goal "Research renewable energy trends and write a summary report," and its execution log reads:

Step 1: Search "renewable energy trends 2025"
        Found 8 articles on solar, wind, and battery storage trends.
Step 2: Search "solar energy market growth"
        Found 5 more articles. Solar capacity growing 25% YoY.
Step 3: Search "wind energy cost reduction data"
        Found 6 reports on offshore wind cost trends.
Step 4: Search "battery storage technology advances"
        Found 4 papers on solid-state batteries.
Step 5: Search "renewable energy policy changes"
        Found 7 articles on new government incentives.
Step 6: Search "renewable energy investment 2025"
        Found 3 more investment reports... still searching...
Step 7: Search "green hydrogen production"
        STUCK: the agent keeps researching and never starts writing.

The diagnosis: no stopping condition. Every search succeeds, so the evaluator never objects, and the planner never decides that "enough sources" has been reached.

The Power of Many

One agent is useful. Multiple agents, working together, can tackle problems that would overwhelm any single system. The trick is specialization.

Think of it like a newsroom. You don't ask the same person to investigate the story, write the article, and edit it for publication. Each role requires a different mindset, different tools, and different standards.

Multi-agent systems work the same way. A Researcher agent gathers information. A Writer agent turns raw notes into polished prose. An Editor agent reviews, fact-checks, and suggests improvements. Each agent has its own system prompt, its own tools, and its own success criteria.

The critical question in multi-agent design isn't "how smart is each agent?" It's "how good is the handoff?" When the Researcher passes notes to the Writer, are those notes structured clearly? When the Writer passes a draft to the Editor, does the Editor have enough context to give useful feedback?

Key insight

The handoff document is the most important artifact in a multi-agent system. A brilliant writer can't save sloppy research notes. The interface between agents is where quality is won or lost.
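One way to keep the handoff clean is to give it a schema the receiving agent can check. Below is a sketch of a Researcher-to-Writer handoff; the field names are assumptions for illustration, not any standard.

```python
from dataclasses import dataclass, field

@dataclass
class ResearchHandoff:
    """What the Researcher passes to the Writer: structured notes, not a wall of text."""
    topic: str
    key_findings: list          # one claim per entry, plainly stated
    sources: list               # where the claims came from
    open_questions: list = field(default_factory=list)  # gaps the Writer should flag

    def is_usable(self) -> bool:
        """Writer-side check: refuse sloppy notes instead of papering over them."""
        return bool(self.key_findings) and bool(self.sources)
```

The `is_usable` check is the Writer protecting its own output quality: if the notes arrive without findings or sources, the handoff bounces back to the Researcher.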


The Human in the Loop

The most powerful agent architecture isn't fully autonomous. It's the one that knows when to stop and ask a human.

Think about it in terms of stakes. Summarizing an article? Let the agent run. Sending an email to your boss? Maybe ask first. Deleting files or making a purchase? Definitely ask first. The level of autonomy should match the level of consequences.

Low stakes: Let it run

Searching the web, summarizing text, generating ideas

Medium stakes: Confirm before acting

Sending messages, editing shared documents, making API calls

High stakes: Human decides

Deleting data, spending money, publishing content, contacting people

The best agent builders aren't the ones who make their agents do everything autonomously. They're the ones who design thoughtful checkpoints — moments where the agent pauses, shows its work, and lets a human decide whether to continue.
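The stakes ladder maps directly to a gate placed in front of every action. A sketch, assuming a `confirm` callback supplied by the host application; the action names and levels are illustrative.

```python
STAKES = {
    "search_web": "low", "summarize": "low",        # read-only: let it run
    "send_message": "medium", "call_api": "medium", # confirm before acting
    "delete_data": "high", "spend_money": "high",   # human decides
}

def gate(action: str, confirm) -> bool:
    """Return True if the action may run; pause for a human when the stakes rise."""
    level = STAKES.get(action, "high")   # unknown actions default to high stakes
    if level == "low":
        return True                      # low stakes: no checkpoint needed
    return bool(confirm(action, level))  # medium/high: human gets the final say
```

The defaulting matters: an action the builder never classified is treated as high stakes, so forgetting to list a new tool fails safe rather than silently autonomous.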

Key Concepts

Agent Architecture

Goal → Planner → Tools → Memory → Evaluator.

Planning & Decomposition

Good agents break big goals into small tool-using steps.

Error Recovery

Retry, fallback, or ask the human. Designing for the unhappy path.

Multi-Agent Systems

Specialist agents that collaborate: one researches, one writes, one reviews.

An agent is only as good as its architecture. The smartest AI in the world, deployed without guardrails, is just a fast way to make expensive mistakes.

Now you know how agents think, act, fail, and collaborate. In the next chapter, we put all of this into practice with the most powerful coding agent available — Claude Code.