
Inside Claude Code: When One Agent Isn't Enough

Multi-agent coordination — how the coordinator pattern orchestrates parallel AI workers and why coordination is harder than it looks. Part 9 of 10.

One Developer vs. a Team

One developer can build a feature. They hold the entire context in their head, know which files they are editing, and never conflict with themselves. But put five developers on the same feature and suddenly you need a project manager, code reviews, branch strategies, and merge conflict resolution. The work did not get five times harder — the coordination did.

The same is true for AI agents. A single Claude Code session can investigate a bug, edit a file, and run tests. But complex tasks — refactoring 40 files, investigating a cross-layer production incident, implementing a feature with full test coverage — benefit from parallel work. And multiple agents operating simultaneously can conflict: two agents editing the same file, contradicting each other’s changes, or duplicating effort.

Claude Code’s coordinator pattern exists to solve exactly this problem.

The Problem Space

Consider: “Fix the authentication bug and add tests for it.” A developer would research the bug, synthesize findings, implement the fix, then verify it. A single agent can do this sequentially. But research is read-only — you could run multiple agents simultaneously, one reading the auth module, another checking test coverage. After implementation, verification is independent work that does not need the implementer’s context.

The challenge is orchestrating this without agents stepping on each other’s toes.

How Claude Code Solves It: The Coordinator Pattern

Architecture Flow

See the diagram above for a visual overview of this flow.


Claude Code’s solution is a coordinator mode — a special operating mode where the primary Claude instance becomes a project manager rather than an individual contributor. The coordinator does not read files, edit code, or run tests directly. Instead, it spawns worker agents, synthesizes their results, and directs follow-up work.

This mode is activated via the CLAUDE_CODE_COORDINATOR_MODE environment variable and gated behind a feature flag. When enabled, the coordinator gets a specialized system prompt (defined in src/coordinator/coordinatorMode.ts) that rewires its behavior from “do the work” to “delegate the work.”

The Coordinator Workflow

Coordinator Workflow
  1. User Request
  2. Coordinator Analyzes Task
  3. Research (parallel): Worker: Angle 1, Worker: Angle 2, Worker: Angle 3
  4. Coordinator Synthesizes Findings
  5. Worker: Implementation
  6. Worker: Verification
  7. Tests Pass?
       Yes → Coordinator Presents Result
       No → Worker: Fix Tests (loops back to Verification)

The coordinator has a tightly constrained orchestration-only tool set:

  • Agent — spawn a new worker with a self-contained prompt
  • SendMessage — continue an existing worker or send a follow-up to its agent ID
  • TaskStop — halt a running worker (e.g., when the user changes requirements mid-flight)
  • SyntheticOutput — emit structured final output when a schema is required
  • subscribe/unsubscribe_pr_activity — watch GitHub PR events (when available)

That is it. No file reading, no bash execution, no editing. The coordinator is purely an orchestrator.

The Four-Phase Workflow

The system prompt in getCoordinatorSystemPrompt() is remarkably explicit about what the coordinator should and should not do. It defines a four-phase workflow:

Phase          | Who                | Purpose
Research       | Workers (parallel) | Investigate codebase, find files, understand the problem
Synthesis      | Coordinator        | Read findings, craft implementation specs
Implementation | Workers            | Make targeted changes per spec, commit
Verification   | Workers            | Prove changes work

The synthesis phase comes with a pointed anti-pattern callout in the system prompt:

Never write “based on your findings” or “based on the research.” These phrases delegate understanding to the worker instead of doing it yourself.

This is a design decision that separates Claude Code’s approach from simpler multi-agent systems. The coordinator must understand what workers found before directing follow-up work. It cannot just pipe output from one agent to another.
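The four-phase workflow described above can be modeled as a small typed table. This is only an illustrative sketch: the `Phase` type, the `PHASES` array, and the `nextPhase` helper are hypothetical names, since the article describes the workflow as living in the system prompt rather than in code.

```typescript
// Hypothetical model of the four-phase workflow; none of these names
// come from the Claude Code source.
type Owner = "coordinator" | "workers";

interface Phase {
  name: "research" | "synthesis" | "implementation" | "verification";
  owner: Owner;
  parallel: boolean; // true only where the phase table says "(parallel)"
}

const PHASES: readonly Phase[] = [
  { name: "research", owner: "workers", parallel: true },
  { name: "synthesis", owner: "coordinator", parallel: false },
  { name: "implementation", owner: "workers", parallel: false },
  { name: "verification", owner: "workers", parallel: false },
];

// Returns the phase that follows `current`, or undefined after verification.
function nextPhase(current: Phase["name"]): Phase | undefined {
  const i = PHASES.findIndex((p) => p.name === current);
  return PHASES[i + 1];
}
```

Note that synthesis is the only phase owned by the coordinator, which is the point of the anti-pattern callout: understanding cannot be handed off to a worker.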

Worker Lifecycle

Workers are spawned via the Agent tool (defined in src/tools/AgentTool/AgentTool.tsx). Each worker is an autonomous Claude instance with its own tool set, context window, and execution lifecycle:

Worker Lifecycle
  1. Spawn: Coordinator → Agent(prompt, desc, type) → Agent Tool → Worker
  2. Work: Read files, Run bash, Edit code
  3. Complete / Notify: task-notification status=completed → Coordinator reads result, synthesizes
  4. Resume: SendMessage(agentId, …) → Worker continues with full context
  5. Final: task-notification status=completed → Coordinator presents final result

The coordinator decides whether to continue a worker or spawn fresh based on context overlap. High overlap (research found the exact files to edit) means continue. Low overlap (broad research, narrow implementation) means spawn fresh. Verification always spawns fresh — the verifier should see code without implementation assumptions. These heuristics are encoded directly in the coordinator’s system prompt rather than in code.
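To make the continue-versus-spawn heuristic concrete, here is a minimal sketch that treats "context overlap" as the Jaccard similarity between the files a worker has already explored and the files the next step needs. The function name, the `WorkerContext` shape, and the 0.5 threshold are all assumptions for illustration; the real heuristic is prose in the system prompt and is applied by the model's judgment.

```typescript
// Hypothetical encoding of the continue-vs-spawn heuristic.
// In Claude Code this is prompt text, not code.
type NextStep = "research" | "implementation" | "verification";

interface WorkerContext {
  filesExplored: string[]; // files the existing worker has already read
}

function shouldContinueWorker(
  next: NextStep,
  worker: WorkerContext,
  filesNeeded: string[],
  overlapThreshold = 0.5, // assumed cutoff; the prompt gives no number
): boolean {
  // Verification always gets a fresh worker, free of implementation assumptions.
  if (next === "verification") return false;

  const explored = new Set(worker.filesExplored);
  const needed = Array.from(new Set(filesNeeded));
  const shared = needed.filter((f) => explored.has(f)).length;
  const union = explored.size + (needed.length - shared);
  if (union === 0) return false;

  // Jaccard similarity: broad research with a narrow follow-up scores low.
  return shared / union >= overlapThreshold;
}
```

With this measure, research that touched six files followed by an edit to one of them scores about 0.17 and spawns fresh, while research that found exactly the files to edit scores 1.0 and continues.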

The Task System

Underneath the Agent and SendMessage tools sits a task state machine defined in src/tasks/types.ts. The system tracks seven distinct task types:

Task States
Lifecycle: Pending → Running → Completed | Failed | Killed
Killed/Completed → Running via SendMessage (resumes/continues)

7 Task Types (shared lifecycle)
  • DreamTask
  • LocalShellTask
  • LocalAgentTask
  • RemoteAgentTask
  • InProcessTeammateTask
  • LocalWorkflowTask
  • MonitorMcpTask

I found the TaskState union type particularly clean:

export type TaskState =
  | LocalShellTaskState
  | LocalAgentTaskState
  | RemoteAgentTaskState
  | InProcessTeammateTaskState
  | LocalWorkflowTaskState
  | MonitorMcpTaskState
  | DreamTaskState

LocalAgentTask is the primary worker type — a Claude subprocess the coordinator spawns via Agent. LocalShellTask handles background shell commands. RemoteAgentTask runs in remote environments. InProcessTeammateTask powers the swarm model with in-process agents communicating via mailbox-style SendMessage. LocalWorkflowTask and MonitorMcpTask handle workflow orchestration and MCP monitoring. DreamTask is experimental background processing.
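The shared lifecycle can be sketched as a transition table. The status names mirror the lifecycle described above, but the table itself is an assumption for illustration and not the contents of src/tasks/types.ts; in particular, treating Failed as terminal is a guess, since only Killed and Completed are described as resumable.

```typescript
// Illustrative transition table for the shared task lifecycle.
// The statuses follow the article; the allowed transitions are assumed.
type TaskStatus = "pending" | "running" | "completed" | "failed" | "killed";

const TRANSITIONS: Record<TaskStatus, readonly TaskStatus[]> = {
  pending: ["running"],
  running: ["completed", "failed", "killed"],
  completed: ["running"], // resumed via SendMessage
  killed: ["running"], // resumed via SendMessage
  failed: [], // assumed terminal: no resume path is described
};

function canTransition(from: TaskStatus, to: TaskStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}
```

Encoding the resume edge explicitly makes it clear that "completed" is not a dead end here: a finished worker can be pulled back into Running with its context intact.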

The Scratchpad: Shared Knowledge Across Workers

One of the more subtle coordination mechanisms is the scratchpad directory. Gated behind a feature flag, the scratchpad provides a shared filesystem location where workers can read and write without permission prompts:

if (scratchpadDir && isScratchpadGateEnabled()) {
  content += `\nScratchpad directory: ${scratchpadDir}\n` +
    `Workers can read and write here without permission prompts. ` +
    `Use this for durable cross-worker knowledge — structure files ` +
    `however fits the work.`
}

Smart Pattern: This is the closest the current system gets to shared state between workers. It is pragmatic — just a directory that all workers can access. The coordinator tells workers about it in their context, and they can use it to leave notes, intermediate results, or coordination artifacts for other workers.

There is no file locking, no notification when another worker writes to the scratchpad, and no structured way to query what is there. It works because the coordinator serializes write-heavy tasks and parallelizes read-only ones — but it relies on the coordinator making the right scheduling decisions.
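Because the scratchpad has no locking, one convention that sidesteps write collisions is to give every worker its own notes file. The sketch below is a hypothetical naming helper, not something the source describes: `scratchpadNotePath`, the `--` separator, and the `.md` extension are all assumptions, and it presumes worker IDs stay unique after slugging.

```typescript
// Hypothetical convention: each worker writes only to files it owns,
// so concurrent workers never write the same path even without locks.
function scratchpadNotePath(
  scratchpadDir: string,
  workerId: string,
  topic: string,
): string {
  // Keep names filesystem-safe: lowercase alphanumerics joined by dashes.
  const slug = (s: string): string =>
    s.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
  const dir = scratchpadDir.replace(/\/+$/, ""); // drop trailing slashes
  return `${dir}/${slug(workerId)}--${slug(topic)}.md`;
}
```

Other workers can still read any file in the directory; the convention only constrains who writes where, which matches the parallel-reads, serialized-writes rule the coordinator already follows.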

Concurrency in Practice

The coordinator’s system prompt declares: “Parallelism is your superpower.” Multiple Agent tool calls in a single message execute concurrently. Three research agents in one message run simultaneously; three in separate messages run sequentially.

The rules: read-only tasks run in parallel freely, write tasks serialize per file set, and verification waits until implementation completes. The fork subagent feature adds another dimension: forks inherit the parent’s full context and share its prompt cache, making them cheaper than fresh agents for research.
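These scheduling rules can be sketched as a planner that groups tasks into waves, where each wave corresponds to one coordinator message whose Agent calls run concurrently. Everything here (`PlannedTask`, `planWaves`, the wave model) is a hypothetical illustration; in Claude Code the coordinator applies these rules through its own judgment, not through a planner.

```typescript
// Hypothetical planner: reads first (all parallel), then writes in waves
// with disjoint file sets, then verification last.
type Kind = "read" | "write" | "verify";

interface PlannedTask {
  id: string;
  kind: Kind;
  files: string[]; // files the task will modify (ignored for reads)
}

function planWaves(tasks: PlannedTask[]): string[][] {
  const waves: string[][] = [];

  const reads = tasks.filter((t) => t.kind === "read");
  if (reads.length > 0) waves.push(reads.map((t) => t.id));

  const writeWaves: { ids: string[]; files: Set<string> }[] = [];
  for (const task of tasks.filter((t) => t.kind === "write")) {
    // Place the task after the last wave that touches any of its files,
    // so conflicting writes keep their submission order.
    let target = 0;
    writeWaves.forEach((w, i) => {
      if (task.files.some((f) => w.files.has(f))) target = i + 1;
    });
    if (target === writeWaves.length) {
      writeWaves.push({ ids: [], files: new Set<string>() });
    }
    const wave = writeWaves[target];
    wave.ids.push(task.id);
    for (const f of task.files) wave.files.add(f);
  }
  for (const w of writeWaves) waves.push(w.ids);

  const verifies = tasks.filter((t) => t.kind === "verify");
  if (verifies.length > 0) waves.push(verifies.map((t) => t.id));
  return waves;
}
```

Two writes that share a file always land in different waves, while writes on disjoint files can share one, which is the "serialize per file set" rule stated mechanically.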

Worker Tool Sets

The coordinator knows what tools its workers have. The getCoordinatorUserContext() function builds a tool list and injects it into the coordinator’s context:

const workerTools = isEnvTruthy(process.env.CLAUDE_CODE_SIMPLE)
  ? [BASH_TOOL_NAME, FILE_READ_TOOL_NAME, FILE_EDIT_TOOL_NAME]
      .sort().join(', ')
  : Array.from(ASYNC_AGENT_ALLOWED_TOOLS)
      .filter(name => !INTERNAL_WORKER_TOOLS.has(name))
      .sort().join(', ')

In simple mode, workers get only Bash, Read, and Edit. In full mode, they get the complete async agent tool set minus internal coordination tools. Workers never get the coordinator’s tools — they cannot spawn their own sub-workers or send messages back to the coordinator. Communication is one-way: worker completes, notification arrives.

This asymmetry is intentional. It prevents recursive spawning (a worker spawning workers spawning workers) and keeps the coordination graph flat: one coordinator, N workers, no hierarchy below that.

What the Current Architecture Gets Right

Phase-based workflow. The research-synthesis-implementation-verification pipeline maps naturally to how experienced developers work: understand the problem before writing code, and verify the solution independently.

Parallelism for reads, serialization for writes. The coordinator’s system prompt is explicit: read-only tasks run in parallel freely, write-heavy tasks serialize. This is essentially a readers-writer lock pattern, enforced by prompt engineering rather than by code.

Self-contained worker prompts. The requirement that every worker prompt be self-contained — “Brief the agent like a smart colleague who just walked into the room” — prevents a common failure mode in multi-agent systems where agents share implicit context that eventually drifts out of sync.

Worker tool awareness. The coordinator knows what its workers can do. If MCP servers are connected, those tools are included too. This helps the coordinator write better prompts.

Where It Falls Short

An Improved Coordination Protocol

Improved Coordination Protocol
  1. User Request → Coordinator
  2. Research (parallel): Worker 1, Worker 2; write findings → KV Store
  3. Worker 3: Implementation (Acquire lock → Edit → Release lock → Log changes)
  4. Worker 4: Verification (Pass / Fail)
  5. Fail → Retry with Event Log context

Shared State Store
  • File Lock Registry: prevents concurrent edits
  • Key-Value Store: worker findings
  • Event Log: cross-worker audit trail

The diagram above illustrates what a more robust coordination protocol could look like. Here is what is currently missing:

⚠️ Watch Out: Two workers can edit the same file. There is no file locking mechanism, no conflict detection, and no merge resolution. The coordinator’s prompt says to serialize write tasks, but this is enforced by LLM judgment, not by code. A confused coordinator could launch two implementation workers that collide.
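A code-level guard against this collision could be as small as an in-memory lock registry consulted before any edit. The following is a minimal sketch of that idea under stated assumptions: `FileLockRegistry` and its methods are hypothetical, nothing like it is described in the current source, and a real version would also need timeouts for workers that die while holding locks.

```typescript
// Hypothetical in-memory file lock registry for implementation workers.
class FileLockRegistry {
  private owners = new Map<string, string>(); // file path -> worker id

  // Acquire all requested files or none, so a worker never holds a partial set.
  tryAcquire(workerId: string, files: string[]): boolean {
    const blocked = files.some((f) => {
      const owner = this.owners.get(f);
      return owner !== undefined && owner !== workerId;
    });
    if (blocked) return false;
    for (const f of files) this.owners.set(f, workerId);
    return true;
  }

  // Release every file held by the given worker.
  release(workerId: string): void {
    this.owners.forEach((owner, file) => {
      if (owner === workerId) this.owners.delete(file);
    });
  }
}
```

The all-or-nothing acquire matters: if a worker could lock `x.ts` while waiting on `y.ts`, two workers with overlapping file sets could each hold half of what the other needs.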

Opaque worker results. Worker results are opaque to the coordinator until completion. There is no streaming progress, no intermediate checkpoints, no way for the coordinator to course-correct mid-flight. The task-notification arrives as a single XML blob after the worker finishes. If a worker spends 2 minutes going down the wrong path, the coordinator will not know until those 2 minutes are gone.

Independent error handling. Each task type handles errors independently. LocalAgentTask has its own failAsyncAgent() path. InProcessTeammateTask uses abort controllers. RemoteAgentTask has remote-specific error handling. There is no unified failure recovery pattern — no circuit breaker, no retry budget, no escalation protocol that works across task types.
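One shape a unified recovery pattern could take is a retry budget shared by every task type, with escalation once it runs out. This is a sketch under assumptions, not a description of existing code: `RetryBudget`, `runWithBudget`, and the `Outcome` type are hypothetical, and the example is synchronous for brevity where real workers would be asynchronous.

```typescript
// Hypothetical retry budget shared across task types.
class RetryBudget {
  private used = 0;
  constructor(private readonly maxRetries: number) {}

  // Consumes one retry if any remain; false means the caller should escalate.
  tryConsume(): boolean {
    if (this.used >= this.maxRetries) return false;
    this.used += 1;
    return true;
  }

  remaining(): number {
    return this.maxRetries - this.used;
  }
}

type Outcome<T> =
  | { ok: true; value: T }
  | { ok: false; error: string; attempts: number };

// Runs `attempt` until it succeeds or the shared budget is exhausted.
function runWithBudget<T>(attempt: () => T, budget: RetryBudget): Outcome<T> {
  let attempts = 0;
  for (;;) {
    attempts += 1;
    try {
      return { ok: true, value: attempt() };
    } catch (e) {
      if (!budget.tryConsume()) {
        const error = e instanceof Error ? e.message : String(e);
        return { ok: false, error, attempts };
      }
    }
  }
}
```

Because the budget object is shared, a flaky shell task can use up retries that an agent task would otherwise get, which is the cross-task-type behavior the per-type error paths cannot express today.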

No cross-worker event log. The scratchpad provides shared storage but not shared awareness. There is no event log that records “Worker A edited file X at time T” or “Worker B’s tests failed because of change Y.” Each worker operates in its own bubble, and the coordinator is the only entity with a holistic view — but only after each worker completes and reports.
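A cross-worker event log of the kind described here could start as an append-only list with a couple of queries. The sketch below is hypothetical throughout (`WorkerEvent`, `EventLog`, and the three event kinds are invented for illustration) and keeps everything in memory; a real version would need to persist events somewhere all workers can reach.

```typescript
// Hypothetical append-only event log shared by all workers.
interface WorkerEvent {
  seq: number; // 1-based position in the log
  workerId: string;
  kind: "file_edited" | "tests_failed" | "tests_passed";
  target: string; // file path or test name
  timestamp: number; // milliseconds since epoch
}

class EventLog {
  private events: WorkerEvent[] = [];

  append(
    workerId: string,
    kind: WorkerEvent["kind"],
    target: string,
    now: number = Date.now(),
  ): WorkerEvent {
    const event: WorkerEvent = {
      seq: this.events.length + 1,
      workerId,
      kind,
      target,
      timestamp: now,
    };
    this.events.push(event);
    return event;
  }

  // Edits to a file by anyone other than the asking worker.
  editsTo(file: string, excludingWorker?: string): WorkerEvent[] {
    return this.events.filter(
      (e) =>
        e.kind === "file_edited" &&
        e.target === file &&
        e.workerId !== excludingWorker,
    );
  }
}
```

A verification worker whose tests fail could call `editsTo` on the files under test and learn which worker changed them, instead of rediscovering that from the diff.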

The SendMessage Tool: Two Roles

SendMessage serves double duty. In coordinator mode, it continues a previously spawned local agent with full context preserved. In the swarm model, it is the communication backbone: sending to teammates by name, broadcasting to all (to: "*"), handling structured protocol messages (shutdown requests, plan approvals), and routing across sessions via UDS sockets or Remote Control bridges.

The routing priority reveals the design: first check for a local in-process agent (via agentNameRegistry), attempt auto-resume if stopped, then fall through to the team mailbox. Same tool, different interaction patterns depending on mode.
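The routing order just described can be sketched as a single function. The names here (`routeMessage`, `LocalAgent`, `Route`) are hypothetical stand-ins, the registry is modeled as a plain `Map` rather than the real agentNameRegistry, and the sketch assumes auto-resume always succeeds.

```typescript
// Hypothetical sketch of the SendMessage routing priority.
type Route =
  | { via: "local-agent"; resumed: boolean }
  | { via: "team-mailbox" };

interface LocalAgent {
  status: "running" | "stopped";
}

function routeMessage(to: string, registry: Map<string, LocalAgent>): Route {
  const agent = registry.get(to);
  if (agent !== undefined) {
    // A stopped local agent is auto-resumed before delivery (assumed to succeed).
    const resumed = agent.status === "stopped";
    if (resumed) agent.status = "running";
    return { via: "local-agent", resumed };
  }
  // No local in-process agent by that name: fall through to the team mailbox.
  return { via: "team-mailbox" };
}
```

The same call therefore behaves as "continue my worker" in coordinator mode and as "deliver to a teammate" in the swarm model, depending only on what the registry lookup finds.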

The Takeaway

Multi-agent coordination is fundamentally a distributed systems problem. You need consensus on who owns what (file locking), visibility into what others are doing (event logs), and graceful recovery when things go wrong (unified error handling).

Claude Code’s coordinator pattern solves the most important part of this problem: it provides a clear authority (the coordinator) that decomposes work, enforces phase ordering, and synthesizes results. The phase-based workflow — research, synthesis, implementation, verification — maps well to real development practice. And the concurrency model — parallel reads, serialized writes — is a pragmatic application of readers-writer semantics.

What it does not yet have is the infrastructure for mechanical coordination: file locks, conflict detection, cross-worker event logs, and unified failure recovery. Today the rules these mechanisms would enforce live in the prompt rather than in code, which works remarkably well in practice but offers no hard guarantees.

The path forward is clear: move coordination invariants from prompts into code. Detect file conflicts at the tool level. Build a shared event log that all workers can query. Implement retry budgets and circuit breakers that work across task types. The coordinator pattern provides the right architecture — it just needs harder enforcement of the rules it already articulates so well.

Multi-agent systems need explicit coordination protocols — parallelism for reads, serialization for writes, and a shared understanding of who owns what.


This is Part 9 of the “Inside Claude Code” series.
