Giving AI Hands: The Tool System

An AI without tools is like a surgeon who can describe every step of an operation but never picks up a scalpel. It can reason about code, explain algorithms, and draft solutions — but it cannot read a file, run a test, or write a single line to disk. Tools are what turn language models into agents. They are the bridge between “I think you should…” and “Done.”

Claude Code ships with dozens of tools (the exact set varies by feature flags and environment) — from simple file reads to full shell execution. Each tool carries a different risk profile. Reading a file is safe. Running rm -rf / is catastrophic. The system needs to be easy to extend (new tools should take minutes to add) but secure by default (a new contributor should not be able to accidentally ship a tool that skips permission checks).

This post dissects the three-layer architecture that makes that possible: the Tool interface (the contract every tool must satisfy), the buildTool factory (which enforces secure defaults), and the registry pipeline (which assembles, filters, and delivers tools to the AI).

The Problem

An agent platform must answer three hard questions simultaneously:

Capability: How does the AI call external functions with typed inputs and structured outputs?
Safety: How do you prevent a tool from executing dangerous operations without explicit user consent?
Extensibility: How do you add new tools without touching a global configuration or risking regressions in existing ones?

A naive approach — a giant switch statement, or ad-hoc function calls — collapses under the weight of dozens of tools with different permission models, concurrency semantics, and UI rendering requirements. You need a formal contract.

How Claude Code Solves It

Architecture Flow

See the diagram above for a visual overview of this flow.


Layer 1: The Tool Interface

Every tool in Claude Code implements the Tool type defined in src/Tool.ts. This is a large interface (roughly 40 methods and properties), but the critical ones fall into four categories.

Identity and Schema

readonly name: string
readonly inputSchema: Input          // Zod schema for input validation
outputSchema?: z.ZodType<unknown>    // Optional output schema

Every tool declares its name and a Zod schema for its inputs. The schema is not decorative — it is enforced at runtime before call() ever fires. If Claude sends malformed parameters, the tool rejects them before any side effect occurs.

Execution

call(
  args: z.infer<Input>,
  context: ToolUseContext,
  canUseTool: CanUseToolFn,
  parentMessage: AssistantMessage,
  onProgress?: ToolCallProgress<P>,
): Promise<ToolResult<Output>>

The call method is where the real work happens. It receives validated input, a context object (which provides access to application state, the working directory, and abort signals), the canUseTool callback, the parent assistant message, and an optional progress callback for long-running operations.

Safety Declarations

isReadOnly(input: z.infer<Input>): boolean
isConcurrencySafe(input: z.infer<Input>): boolean
isDestructive?(input: z.infer<Input>): boolean
checkPermissions(input: z.infer<Input>, context: ToolUseContext): Promise<PermissionResult>
validateInput?(input: z.infer<Input>, context: ToolUseContext): Promise<ValidationResult>

These methods form the tool’s safety profile. What I found particularly elegant is that they are input-dependent — the same tool can be read-only for one input and destructive for another. BashTool.isReadOnly() returns true for ls -la but false for rm file.txt. This granularity is what makes the permission system practical rather than an annoyance.
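To make the input-dependence concrete, here is a minimal sketch built around a hypothetical HTTP tool. Nothing in it comes from the Claude Code source: HttpInput and the three functions are invented for illustration, and only the idea of computing the safety profile from the input mirrors the interface above.

```typescript
// Hypothetical input type for an illustrative HTTP tool (not a Claude Code tool).
type HttpMethod = 'GET' | 'HEAD' | 'POST' | 'PUT' | 'DELETE'

interface HttpInput {
  method: HttpMethod
  url: string
}

// Read-only depends on the input: only methods that do not change
// server state qualify.
function httpIsReadOnly(input: HttpInput): boolean {
  return input.method === 'GET' || input.method === 'HEAD'
}

// In this sketch, only read-only requests may run in parallel.
function httpIsConcurrencySafe(input: HttpInput): boolean {
  return httpIsReadOnly(input)
}

// DELETE is treated as irreversible in this sketch.
function httpIsDestructive(input: HttpInput): boolean {
  return input.method === 'DELETE'
}
```

The same tool therefore reports three different safety profiles: a GET is read-only and parallelizable, a POST is neither, and a DELETE is additionally flagged as destructive.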

Rendering

renderToolUseMessage(input, options): React.ReactNode
renderToolResultMessage?(content, progressMessages, options): React.ReactNode
mapToolResultToToolResultBlockParam(content, toolUseID): ToolResultBlockParam

Every tool controls how it appears in the terminal UI and how its output is serialized back to the API. mapToolResultToToolResultBlockParam is the critical bridge — it converts the tool’s typed output into the ToolResultBlockParam that Anthropic’s API expects, feeding the result back into Claude’s context for the next reasoning step.
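As a rough illustration of that bridge, the sketch below maps a typed search result to a tool_result block. The ToolResultBlockParam declared here is a simplified local mirror of the API shape (the real SDK type allows richer content), and SearchOutput is a hypothetical output type, not GrepTool's actual one.

```typescript
// Simplified local mirror of the tool_result block; an assumption for this
// sketch, not the SDK's full type.
interface ToolResultBlockParam {
  type: 'tool_result'
  tool_use_id: string
  content: string
  is_error?: boolean
}

// Hypothetical typed output of a search tool.
interface SearchOutput {
  matches: string[]
  truncated: boolean
}

// Serialize the typed output into text the model can read, and tie it
// back to the tool_use block that requested it.
function mapToolResultToToolResultBlockParam(
  content: SearchOutput,
  toolUseID: string,
): ToolResultBlockParam {
  const body =
    content.matches.length === 0 ? 'No matches found' : content.matches.join('\n')
  const suffix = content.truncated ? '\n(results truncated)' : ''
  return { type: 'tool_result', tool_use_id: toolUseID, content: body + suffix }
}
```

The tool_use_id is what lets the API pair each result with the request that produced it, so the mapping function takes it as a separate argument rather than storing it in the output.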

Layer 2: The buildTool Factory

With roughly 40 methods and properties on the Tool interface, implementing every tool from scratch would be brutal. Most tools share the same safe defaults. That is the problem buildTool() solves.

const TOOL_DEFAULTS = {
  isEnabled: () => true,
  isConcurrencySafe: (_input?: unknown) => false,
  isReadOnly: (_input?: unknown) => false,
  isDestructive: (_input?: unknown) => false,
  checkPermissions: (input, _ctx?) =>
    Promise.resolve({ behavior: 'allow', updatedInput: input }),
  toAutoClassifierInput: (_input?: unknown) => '',
  userFacingName: (_input?: unknown) => '',
}

export function buildTool<D extends AnyToolDef>(def: D): BuiltTool<D> {
  return {
    ...TOOL_DEFAULTS,
    userFacingName: () => def.name,
    ...def,
  } as BuiltTool<D>
}

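The factory pattern is simple: spread the defaults first, then spread the developer’s definition on top. If the developer does not explicitly set isConcurrencySafe, it defaults to false (assume not safe). If they do not set isReadOnly, it defaults to false (assume writes). The developer must opt into permissive behavior. Two defaults in the excerpt lean the other way: isDestructive starts at false and checkPermissions starts at allow, so the fail-closed behavior rests on isReadOnly and isConcurrencySafe, together with the global permission rules covered in the execution lifecycle.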

Smart Pattern: This is the single most important security decision in the tool system. It inverts the common pattern where developers forget to add restrictions. Here, if you forget the safety annotations, the tool ends up more locked down, not less.
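A stripped-down version of the pattern can be sketched in a few lines. This is not the real buildTool: the Tool and ToolDef types below are heavily simplified assumptions, and echoTool and searchTool are made-up tools, but the spread order is the same as in the excerpt.

```typescript
// Simplified stand-ins for the real Tool and ToolDef types.
interface Tool<I> {
  name: string
  call(input: I): string
  isReadOnly(input: I): boolean
  isConcurrencySafe(input: I): boolean
  userFacingName(): string
}

interface ToolDef<I> {
  name: string
  call(input: I): string
  isReadOnly?(input: I): boolean
  isConcurrencySafe?(input: I): boolean
  userFacingName?(): string
}

// Restrictive answers for anything the developer does not specify.
const RESTRICTIVE_DEFAULTS = {
  isReadOnly: () => false,
  isConcurrencySafe: () => false,
}

// Defaults are spread first, the definition last, so explicit values win.
function buildTool<I>(def: ToolDef<I>): Tool<I> {
  return { ...RESTRICTIVE_DEFAULTS, userFacingName: () => def.name, ...def }
}

// No safety annotations: the tool is treated as a non-concurrent writer.
const echoTool = buildTool<{ text: string }>({
  name: 'Echo',
  call: input => input.text,
})

// Explicit, greppable opt-in to the permissive flags.
const searchTool = buildTool<{ pattern: string }>({
  name: 'Search',
  call: input => `searching for ${input.pattern}`,
  isReadOnly: () => true,
  isConcurrencySafe: () => true,
})
```

Because the definition is spread last, the only way to obtain a permissive flag is to write it out in the tool's own file, which is where a reviewer will look for it.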

buildTool Factory Flow

Developer provides (name, schema, execute…) → buildTool() applies TOOL_DEFAULTS first → Developer overrides provided?

YES: Dev values win via spread
NO: Restrictive defaults remain

→ Complete Tool Object

Defaults (unless overridden): isReadOnly = false, isConcurrencySafe = false, isDestructive = false

Layer 3: The Registry Pipeline

Tools are defined individually across src/tools/*/, but they are assembled into the final tool pool through a multi-stage pipeline in src/tools.ts.

Registry Pipeline

getAllBaseTools() (dozens of candidates) → Feature-flag filter (isEnabled()) → Deny-rule filter (filterToolsByDenyRules) → MCP tool merge (assembleToolPool) → Deduplication (uniqBy name) → Available to QueryEngine

Stage 1: getAllBaseTools() — Returns every tool that could exist in the current environment. Feature flags gate conditional tools at import time using dead code elimination.

Stage 2: Feature-flag filter — Each tool’s isEnabled() method is called. Tools that are compile-time present but runtime-disabled (e.g., LSP tool when ENABLE_LSP_TOOL is not set) are removed.

Stage 3: Deny-rule filter — filterToolsByDenyRules() removes tools that the user or organization has blanket-denied via permission rules. A deny rule matching mcp__server strips all tools from that MCP server before the model ever sees them.

Stage 4: MCP tool merge — assembleToolPool() combines built-in tools with MCP (Model Context Protocol) tools from external servers. Built-in tools are sorted alphabetically as a contiguous prefix for prompt-cache stability; MCP tools are appended after.

Stage 5: Deduplication — uniqBy('name') ensures that if an MCP tool shares a name with a built-in tool, the built-in wins. This prevents external servers from shadowing core functionality.
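The stages above can be sketched end to end as follows. This is a simplified model, not the code in src/tools.ts. It assumes a deny rule is a plain name prefix, MCP tools are named mcp__<server>__<tool>, and the deny filter is applied to both lists. The source field exists only so the example can show which copy survives deduplication. The built-in wins in Stage 5 because built-ins come first in the merged list and the deduplication here keeps the first occurrence of each name (lodash's uniqBy behaves the same way).

```typescript
interface PoolTool {
  name: string
  source: 'builtin' | 'mcp' // illustration only
  isEnabled(): boolean
}

// Assumed rule format: an exact tool name, or a prefix such as "mcp__server".
function filterToolsByDenyRules(tools: PoolTool[], denyRules: string[]): PoolTool[] {
  return tools.filter(
    t => !denyRules.some(rule => t.name === rule || t.name.startsWith(rule + '__')),
  )
}

// Keeps the first tool seen for each name.
function uniqByName(tools: PoolTool[]): PoolTool[] {
  const seen = new Set<string>()
  return tools.filter(t => {
    if (seen.has(t.name)) return false
    seen.add(t.name)
    return true
  })
}

function assembleToolPool(
  builtIn: PoolTool[],
  mcp: PoolTool[],
  denyRules: string[],
): PoolTool[] {
  // Stage 2: drop runtime-disabled tools.
  const enabled = builtIn.filter(t => t.isEnabled())
  // Stage 3: drop blanket-denied tools.
  const allowedBuiltIn = filterToolsByDenyRules(enabled, denyRules)
  const allowedMcp = filterToolsByDenyRules(mcp, denyRules)
  // Stage 4: built-ins sorted by name as a stable prefix, MCP tools after.
  const sorted = [...allowedBuiltIn].sort((a, b) =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0,
  )
  // Stage 5: first occurrence wins, so a built-in beats an MCP tool of the same name.
  return uniqByName([...sorted, ...allowedMcp])
}
```

Keeping the built-in prefix in a deterministic order means the start of the tool list stays the same from request to request, regardless of which MCP servers are connected.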

Case Study: GrepTool (Clean and Simple)

GrepTool in src/tools/GrepTool/GrepTool.ts is the archetype of a well-behaved read-only tool. Here is how it uses buildTool:

export const GrepTool = buildTool({
  name: GREP_TOOL_NAME,
  searchHint: 'search file contents with regex (ripgrep)',
  maxResultSizeChars: 20_000,
  strict: true,

  // Safety: explicitly opts into permissive flags
  isConcurrencySafe() { return true },
  isReadOnly() { return true },

  // Validation: checks path exists before execution
  async validateInput({ path }): Promise<ValidationResult> {
    if (path) {
      const absolutePath = expandPath(path)
      // SECURITY: Skip UNC paths to prevent NTLM credential leaks
      if (absolutePath.startsWith('\\\\') || absolutePath.startsWith('//')) {
        return { result: true }
      }
      try {
        await fs.stat(absolutePath)
      } catch (e) {
        if (isENOENT(e)) {
          return { result: false, message: `Path does not exist: ${path}` }
        }
        throw e
      }
    }
    return { result: true }
  },

  // Execution: delegates to ripgrep
  async call({ pattern, path, ... }, { abortController, getAppState }) {
    // args and absolutePath are built from the input (elided in this excerpt)
    const results = await ripGrep(args, absolutePath, abortController.signal)
    // ... format and return
  },
} satisfies ToolDef<InputSchema, Output>)

Key observations:

Explicit safety opt-in: isConcurrencySafe() and isReadOnly() both return true. Without these overrides, buildTool would default to false for both — correct from a safety standpoint, but terrible for usability.
Input validation before execution: validateInput checks that the path exists before ripgrep runs. This gives the model a clean error message rather than a cryptic stack trace.
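UNC path guard: Even a read-only search tool checks for Windows UNC paths (\\server\share). The guard skips the fs.stat call for such paths, because merely touching a UNC path can make Windows attempt to authenticate against the remote server and leak NTLM credentials. Defense in depth.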
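The validation logic in the excerpt can be isolated into a small, testable sketch. The isUncPath and validatePath names are hypothetical, and the existence check is injected as a function so the example runs without touching the filesystem; the real tool calls fs.stat directly.

```typescript
type ValidationResult = { result: true } | { result: false; message: string }

// Same prefix test as the excerpt: backslash or forward-slash UNC form.
function isUncPath(p: string): boolean {
  return p.startsWith('\\\\') || p.startsWith('//')
}

// `exists` stands in for the fs.stat call in the real tool.
function validatePath(
  path: string | undefined,
  exists: (p: string) => boolean,
): ValidationResult {
  if (!path) return { result: true }
  // Never probe a UNC path: skip the existence check entirely.
  if (isUncPath(path)) return { result: true }
  if (!exists(path)) {
    return { result: false, message: `Path does not exist: ${path}` }
  }
  return { result: true }
}
```

Note that the guard does not reject UNC paths. It only declines to probe them during validation, which is the step where the filesystem would otherwise be touched before execution.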

Case Study: BashTool (Security as Architecture)

BashTool is the most complex tool in the system. It can do anything a shell can do, which means its security surface is enormous. The tool is backed by an entire directory of specialized modules:

Module	Responsibility
`bashSecurity.ts`	Command substitution detection, dangerous pattern blocking
`bashPermissions.ts`	Permission rule matching, classifier integration
`readOnlyValidation.ts`	Determines if a command is read-only
`pathValidation.ts`	Validates file path constraints
`sedValidation.ts`	Special handling for sed edit commands
`shouldUseSandbox.ts`	Decides whether to sandbox execution
`destructiveCommandWarning.ts`	Warns on irreversible operations

The buildTool call does not override isReadOnly to true. Instead, it computes it per-input:

export const BashTool = buildTool({
  name: BASH_TOOL_NAME,

  isReadOnly(input) {
    const compoundCommandHasCd = commandHasAnyCd(input.command)
    const result = checkReadOnlyConstraints(input, compoundCommandHasCd)
    return result.behavior === 'allow'
  },

  isConcurrencySafe(input) {
    return this.isReadOnly?.(input) ?? false
  },
  // ...
})

isReadOnly delegates to checkReadOnlyConstraints, which parses the shell command and checks every subcommand against known-safe patterns. isConcurrencySafe is tied directly to isReadOnly — if the command writes, it cannot run concurrently.
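The "check every subcommand" idea can be illustrated with a deliberately conservative sketch. It is far simpler than checkReadOnlyConstraints and makes several assumptions: the allowlist is invented, the command is split with a regex rather than a real shell parser, and anything that is not plain words (quotes, redirects, variables, a lone &) is treated as not read-only.

```typescript
// Hypothetical allowlist of commands considered read-only.
const READ_ONLY_COMMANDS = new Set(['ls', 'cat', 'pwd', 'grep', 'head', 'tail', 'wc'])

// Only letters, digits, underscores, dots, slashes, and dashes, separated by
// spaces. Anything else (quotes, $, >, &, backticks) fails the check.
const PLAIN_WORDS = /^[\w.\/-]+( +[\w.\/-]+)*$/

// Naive split on &&, ||, ; and |. A real implementation needs a shell parser.
function splitSubcommands(command: string): string[] {
  return command
    .split(/&&|\|\||;|\|/)
    .map(part => part.trim())
    .filter(part => part.length > 0)
}

// The whole command is read-only only if every subcommand is.
function isReadOnlyCommand(command: string): boolean {
  const parts = splitSubcommands(command)
  if (parts.length === 0) return false
  return parts.every(part => {
    if (!PLAIN_WORDS.test(part)) return false
    const [name = ''] = part.split(/ +/)
    return READ_ONLY_COMMANDS.has(name)
  })
}

// Mirrors the BashTool excerpt: concurrency safety follows read-only status.
function isConcurrencySafeCommand(command: string): boolean {
  return isReadOnlyCommand(command)
}
```

The sketch fails closed: a command it cannot classify, such as one containing a redirect or a quoted argument, is reported as not read-only, which at worst costs an extra permission prompt.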

The security module (bashSecurity.ts) blocks an extensive list of shell injection vectors:

Substitution and expansion patterns: command substitution $(), parameter expansion ${}, legacy arithmetic expansion $[], and process substitution <() and >()
Zsh-specific dangers: zmodload (loads dangerous modules), emulate -c (eval equivalent), ztcp (network exfiltration)
PowerShell comment syntax (<#) as defense-in-depth against future changes
Unicode whitespace, control characters, backslash-escaped operators

This is not a simple blocklist. It is a layered defense with 23+ distinct security check categories, each with a numeric identifier for analytics tracking.
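A small sketch shows how checks with numeric identifiers might be organized. The IDs, names, and regular expressions below are invented for illustration and cover only a handful of the vectors listed above; they are not the contents of bashSecurity.ts.

```typescript
interface SecurityCheck {
  id: number // hypothetical identifier, e.g. for analytics
  name: string
  pattern: RegExp
}

// Illustrative subset; none of these use the g flag, so test() is stateless.
const SECURITY_CHECKS: SecurityCheck[] = [
  { id: 1, name: 'command_substitution', pattern: /\$\(|`/ },
  { id: 2, name: 'process_substitution', pattern: /[<>]\(/ },
  { id: 3, name: 'zsh_dangerous_builtin', pattern: /\b(zmodload|ztcp)\b/ },
  { id: 4, name: 'unicode_whitespace', pattern: /[\u00A0\u2000-\u200B\u3000]/ },
  { id: 5, name: 'control_character', pattern: /[\u0000-\u0008\u000E-\u001F\u007F]/ },
]

// Returns every check the command trips, so each hit can be reported by ID.
function findViolations(command: string): SecurityCheck[] {
  return SECURITY_CHECKS.filter(check => check.pattern.test(command))
}

// A command is blocked if any check fires.
function isBlocked(command: string): boolean {
  return findViolations(command).length > 0
}
```

Returning all matching checks rather than stopping at the first makes it possible to log which categories fire most often, which is the kind of use a numeric identifier lends itself to.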

The Execution Lifecycle

When Claude decides to use a tool, the request flows through a strict pipeline before any side effect occurs:

Execution Lifecycle

Claude emits tool_use block
↓
Gate 1: Input Validation (Zod). INVALID: error to Claude. VALID ↓
Gate 2: validateInput() (tool-specific). FAIL: error to Claude. PASS ↓
Gate 3: checkPermissions(). DENIED: permission denied. ASK USER: user prompt (DENY: denied, APPROVE ↓). ALLOWED ↓
Execute call()
↓
Result: ToolResult
↓
mapToolResultToToolResultBlockParam()
↓
Fed back to Claude API

Three gates stand between Claude’s intent and real-world side effects:

Schema validation (Zod) — Rejects malformed input at the type level.
Semantic validation (validateInput) — Rejects structurally valid but contextually wrong input (file does not exist, path outside project).
Permission check (checkPermissions + global permission rules) — Enforces user consent for write operations, dangerous commands, and first-time tool use.

Only after all three gates pass does call() execute. The result flows back through mapToolResultToToolResultBlockParam, which serializes it into the format Claude’s API expects, completing the loop.
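The three gates can be sketched as a single function. This is a simplified, synchronous model (the real methods return promises), with a hand-written parse function standing in for the Zod schema. The 'allow' behavior appears in the TOOL_DEFAULTS excerpt; the 'ask' and 'deny' names, the GatedTool interface, and deleteTool are assumptions made for the example.

```typescript
// Stand-in for a schema parse result; the real system uses Zod.
type ParseResult<I> = { success: true; data: I } | { success: false }

type PermissionResult =
  | { behavior: 'allow' }
  | { behavior: 'ask' }
  | { behavior: 'deny'; message: string }

type Outcome = { ok: true; output: string } | { ok: false; error: string }

interface GatedTool<I> {
  parse(raw: unknown): ParseResult<I>
  validateInput(input: I): string | undefined // error message, or undefined if fine
  checkPermissions(input: I): PermissionResult
  call(input: I): string
}

function runTool<I>(
  tool: GatedTool<I>,
  raw: unknown,
  askUser: (input: I) => boolean,
): Outcome {
  // Gate 1: schema validation.
  const parsed = tool.parse(raw)
  if (!parsed.success) return { ok: false, error: 'Invalid input' }
  // Gate 2: tool-specific semantic validation.
  const problem = tool.validateInput(parsed.data)
  if (problem !== undefined) return { ok: false, error: problem }
  // Gate 3: permission check, possibly deferring to the user.
  const permission = tool.checkPermissions(parsed.data)
  if (permission.behavior === 'deny') return { ok: false, error: permission.message }
  if (permission.behavior === 'ask' && !askUser(parsed.data)) {
    return { ok: false, error: 'Permission denied by user' }
  }
  // Only now may side effects happen.
  return { ok: true, output: tool.call(parsed.data) }
}

// Hypothetical tool used to exercise the gates; `deleted` records side effects.
const deleted: string[] = []
const deleteTool: GatedTool<{ path: string }> = {
  parse(raw) {
    if (typeof raw === 'object' && raw !== null && 'path' in raw) {
      const path = (raw as { path: unknown }).path
      if (typeof path === 'string') return { success: true, data: { path } }
    }
    return { success: false }
  },
  validateInput: input =>
    input.path.startsWith('project/') ? undefined : 'Path outside project',
  checkPermissions: () => ({ behavior: 'ask' }),
  call(input) {
    deleted.push(input.path)
    return `Deleted ${input.path}`
  },
}
```

Calling runTool(deleteTool, { path: 'project/a.txt' }, () => false) returns an error outcome and leaves deleted empty, which is the property the pipeline exists to guarantee: a failure at any gate produces an error for the model and no side effect.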

What Could Be Better: Modular Registries

The current registry lives in a single file: src/tools.ts. The getAllBaseTools() function is a long array literal containing all built-in tool entries. This works, but it has scaling problems:

Adding a tool means editing a shared file that every other tool implicitly depends on.
Feature-flag conditionals are scattered through the array as spread expressions.
Circular dependency issues require lazy require() calls for some tools.

Current vs Improved Registry

Current: Single Registry
tools.ts: getAllBaseTools() → dozens of tool imports in one array

Better: Domain Registries
FileTools: Read, Edit, Write
SearchTools: Grep, Glob, WebSearch
AgentTools: Agent, Task, TeamCreate
MCPTools: ListResources, ReadResource
ShellTools: Bash, PowerShell
All five feed into assembleToolPool().

A modular approach would group tools by domain (file operations, search, agent orchestration, shell execution, MCP integration). Each domain registry would own its feature flags and conditional imports. assembleToolPool would merge domain registries the same way it currently merges built-in and MCP tools.
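One possible shape for such domain registries is sketched below. It is a proposal under stated assumptions, not existing code: DomainRegistry, the flag name "powershell", and assembleFromDomains are all hypothetical, and tools are reduced to a name.

```typescript
interface RegisteredTool {
  name: string
}

// Each domain decides which of its tools exist for a given set of flags.
interface DomainRegistry {
  domain: string
  getTools(flags: Set<string>): RegisteredTool[]
}

const fileTools: DomainRegistry = {
  domain: 'file',
  getTools: () => [{ name: 'Read' }, { name: 'Edit' }, { name: 'Write' }],
}

const shellTools: DomainRegistry = {
  domain: 'shell',
  // The registry owns its own feature flag ("powershell" is a made-up name).
  getTools: flags => {
    const tools: RegisteredTool[] = [{ name: 'Bash' }]
    if (flags.has('powershell')) tools.push({ name: 'PowerShell' })
    return tools
  },
}

// Merging domains needs no knowledge of any individual tool or flag.
function assembleFromDomains(
  registries: DomainRegistry[],
  flags: Set<string>,
): RegisteredTool[] {
  const all: RegisteredTool[] = []
  for (const registry of registries) {
    all.push(...registry.getTools(flags))
  }
  return all
}
```

With this layout, adding a tool touches only its own domain file, and the feature-flag conditionals live next to the tools they gate instead of inside one shared array.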

The Design Principle

The tool system’s core insight is a security pattern that applies far beyond AI agents:

Default to restrictive, opt into permissive.

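buildTool() assumes every tool writes to disk and is not safe for concurrent execution. The developer must explicitly say “this tool only reads” or “this tool is safe to run in parallel.” If they forget, the tool is locked down, not wide open. The one flag that does not follow this rule in the defaults shown earlier is isDestructive, which starts at false, so a tool that performs irreversible operations still has to flag itself.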

This pattern works because the cost of a false restriction (an unnecessary permission prompt) is annoying but recoverable. The cost of a false permission (executing rm -rf without consent) is catastrophic. Asymmetric risk demands asymmetric defaults.

When building any system where plugins or tools interact with the real world, do not trust contributors to remember security annotations. Build a factory that makes the safe path the default path, and require explicit, visible opt-in for anything permissive. The code reviewer can then grep for isReadOnly() { return true } and ask: “Are you sure this tool never writes?”

This is Part 3 of the “Inside Claude Code” series.

← Part 2: The Query Engine | Part 4: Streaming →
