
Inside Claude Code: What Happens When You Type 'claude'

How Claude Code boots in under 200ms using parallel prefetch, lazy loading, and startup optimization. Part 1 of a 10-part series reverse-engineering Anthropic's AI coding tool.

What Happens When You Type ‘claude’?

The Restaurant Kitchen That Opens Before Sunrise

Walk into a professional kitchen at 5 AM and you’ll find prep cooks already at work. Stocks are simmering. Mise en place is being portioned. The bread oven is preheating. None of this is for a specific order — the first customer won’t arrive for hours. But when that first ticket does come in, the kitchen can fire a complex dish in minutes instead of an hour.

Claude Code works the same way. Before you see a prompt, before you type a single character, a carefully orchestrated startup sequence has already loaded authentication credentials, read enterprise policies, fetched feature flags, parsed settings from five different sources, and registered dozens of tools. And it does all of this in under 200 milliseconds.

I spent a few weeks reverse-engineering the boot sequence, and what I found is a masterclass in startup optimization. This post examines how it works.

The Problem: A Cold Start Is a Dead Product

CLI tools live and die by perceived speed. Research from the Nielsen Norman Group puts the threshold for “instantaneous” at 100ms and the limit for keeping a user’s flow of thought uninterrupted at about 1 second. In practice, a CLI that takes more than 500ms to start feels laggy. Users will reach for a different tool or, worse, add a shell alias that avoids yours entirely.

But Claude Code isn’t a simple utility. Before it can accept your first message, it needs to:

  • Authenticate by reading OAuth tokens from the macOS Keychain (or a legacy API key), which requires spawning the macOS security command-line tool as a subprocess.
  • Load enterprise policy via Mobile Device Management (MDM), which on macOS means shelling out to plutil to read managed preferences.
  • Fetch feature flags from GrowthBook so the right capabilities are enabled for your account and environment.
  • Parse settings from a 5-source hierarchy: user settings, project settings, local project settings, CLI flag settings, and managed policy settings (plus schema defaults).
  • Register dozens of tools (file editing, bash execution, web search, MCP servers, etc.) and a very large Commander command surface.
  • Set up telemetry including OpenTelemetry spans, metrics, and log providers — a dependency tree that pulls in ~400KB of SDK code.

If you did all of this sequentially (spawn keychain subprocess, wait, spawn plutil, wait, fetch GrowthBook, wait, parse settings, register tools), you’d easily blow past the 500ms mark. On a cold boot or a slow corporate network, you’d be looking at over a second.

Claude Code solves this with three strategies: parallel prefetch, lazy loading, and eager settings resolution.

How Claude Code Solves It

Architecture Flow

See the diagram above for a visual overview of this flow.


The Critical First 20 Lines

Open src/main.tsx and look at the very top of the file. Before the bulk of the imports are evaluated, three things happen:

// src/main.tsx -- lines 1-20 (simplified)
import { profileCheckpoint } from './utils/startupProfiler.js';
profileCheckpoint('main_tsx_entry');

import { startMdmRawRead } from './utils/settings/mdm/rawRead.js';
startMdmRawRead();                  // Fire MDM subprocess at T+0

import { startKeychainPrefetch } from './utils/secureStorage/keychainPrefetch.js';
startKeychainPrefetch();            // Fire keychain reads at T+0

import { feature } from 'bun:bundle'; // Compile-time feature flags
// ... remaining ~130 imports follow

This is the key insight: side effects before imports. The startMdmRawRead() and startKeychainPrefetch() calls spawn subprocesses immediately — before the remaining 130+ import statements are even evaluated. Those imports take roughly 135ms as Bun resolves and evaluates the module graph. By the time imports finish, the MDM and keychain subprocesses have already completed in the background.

What’s interesting here is how the code comments say it plainly: the keychain prefetch “fires both macOS keychain reads (OAuth + legacy API key) in parallel” so that later code doesn’t pay the ~65ms cost of synchronous subprocess spawns.
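To make the overlap concrete, here is a minimal sketch of the same shape, not the actual Claude Code implementation. Timers stand in for the plutil and keychain subprocesses and for module evaluation, the delays are borrowed from the timeline figures in this post, and every function name (slowRead, loadHeavyModules, bootWithPrefetch, bootSequentially) is hypothetical.

```typescript
// Hypothetical stand-ins: timers simulate subprocess reads and module loading.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Simulates a slow out-of-process read (keychain, plutil).
async function slowRead(label: string, ms: number): Promise<string> {
  await sleep(ms);
  return `${label}:ok`;
}

// Simulates the time spent evaluating the heavy module graph.
async function loadHeavyModules(ms: number): Promise<void> {
  await sleep(ms);
}

interface BootResult {
  results: string[];
  elapsedMs: number;
}

async function bootWithPrefetch(): Promise<BootResult> {
  const start = Date.now();
  // 1. Fire the slow reads first and keep the promises. Do not await yet.
  const mdm = slowRead("mdm", 40);
  const keychain = slowRead("keychain", 55);
  // 2. Do the unavoidable work while the reads are in flight.
  await loadHeavyModules(135);
  // 3. Await only when the results are needed; both are already settled.
  const results = await Promise.all([mdm, keychain]);
  return { results, elapsedMs: Date.now() - start };
}

async function bootSequentially(): Promise<BootResult> {
  const start = Date.now();
  // Each step waits for the previous one, so the delays add up.
  const mdm = await slowRead("mdm", 40);
  const keychain = await slowRead("keychain", 55);
  await loadHeavyModules(135);
  return { results: [mdm, keychain], elapsedMs: Date.now() - start };
}
```

In the prefetch version the total is bounded by the slowest of the three operations; in the sequential version it is their sum. The only structural difference is where the await sits.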

The Full Boot Timeline

Here’s what the startup sequence looks like end to end:

Boot Timeline (approximate elapsed time from T+0 when each step completes)

Parallel Prefetch (T+0)
  • MDM raw read (plutil): 40ms
  • Keychain prefetch: 55ms
  • Module imports: 135ms

Sequential Setup
  • Eager-load settings: 145ms
  • Commander setup: 155ms
  • Await prefetch: 158ms
  • init() configs/env: 190ms
  • REPL mount: 200ms

The prefetch operations (MDM and keychain) run entirely in parallel with module evaluation. By the time the sequential phase begins, they’ve been complete for 80+ milliseconds. The await Promise.all([ensureMdmSettingsLoaded(), ensureKeychainPrefetchCompleted()]) call in the preAction hook resolves nearly instantly because there’s nothing left to wait for.

The Orchestration Sequence

Let’s trace the full sequence of control flow:

Orchestration Sequence

  • CLI Entry → Parallel Prefetch: startMdmRawRead() + startKeychainPrefetch() fired at T+0
  • (~135ms of module imports; prefetch runs in background)
  • CLI Entry → Settings Loader: eagerLoadSettings(), settings parsed and cached
  • CLI Entry → Commander: new CommanderCommand(), register 93+ commands
  • (User types ‘claude’, the default command)
  • Commander → Prefetch: await Promise.all([mdm, keychain]), already resolved (nearly free)
  • Commander → init(): enableConfigs(), applySafeConfigEnvironmentVariables(), applyExtraCACertsFromConfig()
  • Commander → REPL: mount interactive REPL, ready for input

I found it particularly clever that eagerLoadSettings() is called before Commander is even constructed. This function parses the --settings and --setting-sources CLI flags early, ensuring the settings hierarchy is fully resolved before any code attempts to read configuration. Without this, an ordering bug could let code read settings before the user’s overrides are loaded.
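The early flag scan can be sketched as follows. This is an illustration of the idea rather than the real eagerLoadSettings(): the flag names come from the source, but the accepted syntaxes (space-separated and = forms), the comma-separated source list, and the EarlyFlags shape are all assumptions.

```typescript
// Hypothetical result shape; the real function's return type is not shown in this post.
interface EarlyFlags {
  settingsPath?: string;
  settingSources?: string[];
}

// Scans raw argv for a flag without constructing a full command parser.
// Accepts both "--flag value" and "--flag=value" (an assumption).
function readFlagValue(argv: string[], name: string): string | undefined {
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === name) return argv[i + 1];
    if (arg.startsWith(`${name}=`)) return arg.slice(name.length + 1);
  }
  return undefined;
}

function parseEarlyFlags(argv: string[]): EarlyFlags {
  const settingsPath = readFlagValue(argv, "--settings");
  const sources = readFlagValue(argv, "--setting-sources");
  return {
    settingsPath,
    // Assumed format: a comma-separated list such as "user,project".
    settingSources: sources
      ?.split(",")
      .map((s) => s.trim())
      .filter((s) => s.length > 0),
  };
}
```

Because this is a plain scan over argv, it can run before the command parser exists, which is what lets every later reader see the user's overrides.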

The Three-Tier Module Loading Strategy

Not everything needs to be loaded at startup. Claude Code uses a three-tier strategy:

Module Loading Strategy

Eager (loaded at boot)
  • main.tsx core imports
  • Commander and CLI parsing
  • Settings hierarchy
  • Auth prefetch results

Lazy (first use)
  • OpenTelemetry SDK (~400KB)
  • gRPC transport (~700KB)
  • Analytics and event logging

Deferred (after REPL)
  • MCP server connections
  • Skill/plugin discovery
  • Session history indexing

The telemetry subsystem is a perfect example of lazy loading. The init.ts file contains the comment: “initialize1PEventLogging is dynamically imported to defer OpenTelemetry sdk-logs/resources.” The OpenTelemetry SDK alone accounts for ~400KB of JavaScript. Loading it eagerly would add 50-80ms to startup for a subsystem that isn’t needed until the first API call. By deferring it behind a dynamic import(), that cost is paid later — and only if telemetry is actually enabled.
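A minimal sketch of the lazy tier, assuming a memoized loader. The lazy() helper, the Telemetry interface, and onFirstApiCall() are hypothetical; in a real tool the loader body would be a dynamic import() of the telemetry module rather than the inline object used here to keep the example self-contained.

```typescript
// Wraps an expensive loader so it runs at most once, on first use.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) cached = load();
    return cached;
  };
}

// Hypothetical telemetry surface, for illustration only.
interface Telemetry {
  events: string[];
  record(event: string): void;
}

let loadCount = 0;

// In a real tool this loader would be something like:
//   () => import("./telemetry.js")
const getTelemetry = lazy<Telemetry>(async () => {
  loadCount += 1; // counts how many times the expensive load actually ran
  const events: string[] = [];
  return {
    events,
    record: (event: string): void => {
      events.push(event);
    },
  };
});

// Nothing is loaded at startup; the cost is paid on the first call.
async function onFirstApiCall(): Promise<void> {
  const telemetry = await getTelemetry();
  telemetry.record("api_call");
}
```

Caching the promise rather than the resolved value means concurrent first callers share one load instead of racing to start several.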

Compile-time feature flags via bun:bundle take this further. The feature() function is resolved at build time, not runtime. When a feature is disabled, Bun’s bundler eliminates the entire code path — including its imports — from the output:

const coordinatorModeModule = feature('COORDINATOR_MODE')
  ? require('./coordinator/coordinatorMode.js')
  : null;

If COORDINATOR_MODE is off at build time, the require() call and the entire coordinator module tree are dead-code-eliminated. Zero bytes shipped, zero time spent parsing.

What’s Smart About This Design

Parallel prefetch is essentially free performance. Per the code comments, the two keychain reads alone cost ~65ms when run as synchronous subprocess spawns, and the MDM read adds its own subprocess latency on top. By starting them before imports, they run in a window that would otherwise be dead time (waiting for module evaluation). Using the timeline numbers above, that takes roughly 95ms (40ms + 55ms) of waiting off the critical path, which on a 200ms budget is the difference between “instant” and “noticeable.” The framing is the same one behind Amdahl’s Law: the serial portion of startup (module evaluation) sets the floor on wall-clock time, so the best you can do is hide as much I/O as possible behind it.

Compile-time feature flags eliminate entire code paths. Runtime feature flags — the kind you fetch from LaunchDarkly or GrowthBook — still require the gated code to be parsed and loaded. The branch just isn’t executed, but the module graph is still resolved. Compile-time flags via bun:bundle are fundamentally different: the feature() function is evaluated during the build step, and Bun’s dead code elimination removes unreachable branches entirely from the output bundle. This means disabled features have truly zero cost: no parsing, no evaluation, no memory allocation.

Eager settings resolution prevents ordering bugs. The eagerLoadSettings() call before Commander setup is a subtle but important design choice. CLI tools commonly have a bug where configuration is read before all sources are loaded — especially when settings come from multiple levels (user, project, enterprise, flags). By front-loading settings parsing into a single synchronous call, Claude Code ensures every subsequent read sees the complete, merged configuration.
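Here is a rough sketch of what a single synchronous merge might look like. The five source names mirror the hierarchy described earlier, but the precedence order (managed policy wins) and the shallow merge are my assumptions, not something confirmed by the source.

```typescript
type Settings = Record<string, unknown>;

// Lowest to highest precedence. This ordering is an assumption for illustration.
const SOURCE_ORDER = ["user", "project", "local", "flag", "policy"] as const;
type Source = (typeof SOURCE_ORDER)[number];

// Merges every layer once and freezes the result, so all later reads
// see the same complete configuration.
function mergeSettings(layers: Partial<Record<Source, Settings>>): Readonly<Settings> {
  const merged: Settings = {};
  for (const source of SOURCE_ORDER) {
    // Later sources overwrite earlier ones key by key (shallow merge).
    Object.assign(merged, layers[source] ?? {});
  }
  return Object.freeze(merged);
}
```

Freezing the merged object is a cheap way to turn "someone mutated config after boot" into an immediate error (in strict mode) instead of a silent ordering bug.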

The preAction hook defers init to command execution. When a user runs claude --help, they don’t need authentication, telemetry, or MCP connections. The preAction hook ensures init() only runs when a command is actually being executed, not when help text is being printed. This matters more than it might seem: init() calls enableConfigs(), applySafeConfigEnvironmentVariables(), and applyExtraCACertsFromConfig() — each of which reads files, merges configuration, and may trigger TLS certificate store updates.
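The deferral idea can be shown without any dependencies. MiniCli below is a hypothetical, much-reduced stand-in for a command parser with a pre-action hook; it is not Commander's API and only illustrates where the expensive init sits relative to help output.

```typescript
type Action = () => void | Promise<void>;

// Hypothetical mini CLI: registered commands plus one pre-action hook.
class MiniCli {
  private actions = new Map<string, Action>();
  private preAction: Action = () => {};

  command(name: string, action: Action): this {
    this.actions.set(name, action);
    return this;
  }

  onPreAction(hook: Action): this {
    this.preAction = hook;
    return this;
  }

  async run(argv: string[]): Promise<string> {
    if (argv.includes("--help")) {
      // Help is answered from registered metadata only; the hook never fires.
      return `commands: ${Array.from(this.actions.keys()).join(", ")}`;
    }
    const name = argv[0] ?? "default";
    const action = this.actions.get(name);
    if (!action) return `unknown command: ${name}`;
    // Expensive init runs here, only once a command will really execute.
    await this.preAction();
    await action();
    return `ran: ${name}`;
  }
}
```

The point of the shape is that registering commands stays cheap, and everything costly hangs off the one hook that help and unknown-command paths skip.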

What Could Be Better

main.tsx carries too many responsibilities. At several thousand lines in this snapshot, main.tsx is simultaneously the process entry point, the prefetch orchestrator, the settings pre-loader, and the Commander configuration. This makes it difficult to reason about boot order because the sequence is encoded in file position (line 16 runs before line 20 because of JavaScript evaluation order), not in an explicit dependency graph. A dedicated bootstrap/ module with named phases would make the init contract clearer and easier to test.

Boot order depends on implicit module evaluation order. The correctness of the prefetch strategy relies on the fact that startMdmRawRead() is called on line 16 and the remaining imports begin on line 22. If someone reorders the imports or adds a heavy synchronous import between lines 12 and 20, the prefetch advantage silently degrades. There’s no mechanism — no assertion, no lint rule, no integration test — that enforces “these calls must happen before heavy imports.”

There are no startup performance regression tests. The startup profiler records named checkpoints and defines phase durations, and it can produce a phase-by-phase report. But this instrumentation is diagnostic, not preventive — there’s no CI gate that fails if boot time regresses past a threshold. Performance budgets enforced in CI are the standard solution.
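A budget gate does not need much machinery. The sketch below assumes the profiler can export its checkpoints as name and elapsed-time pairs; the Checkpoint shape, the findBudgetViolations() name, and the budget values are all hypothetical.

```typescript
// Hypothetical export format: a checkpoint name and elapsed ms since process start.
interface Checkpoint {
  name: string;
  atMs: number;
}

// Compares each phase (time since the previous checkpoint) against its budget.
// Phases without a budget entry are not checked.
function findBudgetViolations(
  checkpoints: Checkpoint[],
  budgetsMs: Record<string, number>,
): string[] {
  const violations: string[] = [];
  let previous = 0;
  for (const cp of checkpoints) {
    const duration = cp.atMs - previous;
    const budget = budgetsMs[cp.name];
    if (budget !== undefined && duration > budget) {
      violations.push(`${cp.name}: ${duration}ms > ${budget}ms budget`);
    }
    previous = cp.atMs;
  }
  return violations;
}
```

A CI job would run the binary a few times, feed the recorded checkpoints into a check like this, and fail the build when the returned list is non-empty.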

The prefetch pattern doesn’t compose. Adding a fourth parallel prefetch requires editing the top of main.tsx and adding another ensure*Completed() call to the preAction hook. There’s no registry of “things to prefetch” or a declarative way to say “this initialization can run in parallel with imports.” A prefetch registry that collects operations and runs them via Promise.all() would scale better.
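One possible shape for such a registry, as a sketch under my own assumptions: PrefetchRegistry, register(), startAll(), and ensureAll() are hypothetical names, and error handling is left out for brevity.

```typescript
type PrefetchTask = () => Promise<unknown>;

class PrefetchRegistry {
  private tasks = new Map<string, PrefetchTask>();
  private inFlight = new Map<string, Promise<unknown>>();

  register(name: string, task: PrefetchTask): void {
    if (this.tasks.has(name)) throw new Error(`duplicate prefetch: ${name}`);
    this.tasks.set(name, task);
  }

  // Fires every registered task that has not started yet. Call as early as possible.
  startAll(): void {
    this.tasks.forEach((task, name) => {
      if (!this.inFlight.has(name)) this.inFlight.set(name, task());
    });
  }

  // Single await point, replacing one ensure*Completed() call per prefetch.
  async ensureAll(): Promise<Map<string, unknown>> {
    this.startAll();
    const names = Array.from(this.inFlight.keys());
    const values = await Promise.all(names.map((n) => this.inFlight.get(n)));
    return new Map(names.map((n, i) => [n, values[i]] as [string, unknown]));
  }
}
```

With this in place, adding a fourth prefetch is one register() call next to the code that owns it, and the entry point and the hook never change.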

If You Were Building This From Scratch

The core insight — start slow operations before you need their results — is universally applicable. But you can make it more maintainable with an explicit initialization framework:

Improved Boot Phases

Phase 1: Prefetch (parallel, no deps)
  • Auth credentials
  • Enterprise policy
  • Feature flags

Phase 2: Configure (depends on: prefetch)
  • Merge settings
  • Apply env vars

Phase 3: Register (depends on: configure)
  • Register tools
  • Register commands

Phase 4: Connect (depends on: register)
  • MCP servers
  • Telemetry providers

Phase 5: Ready (depends on: connect)

Each phase declares its dependencies explicitly. The boot framework resolves them, runs independent operations in parallel, and guarantees ordering for dependent ones. Adding a new prefetch operation means adding an entry to Phase 1 — no need to carefully position a line of code before other imports.
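A small runner is enough to get that behavior. This is a sketch of the idea, not a finished framework: the Phase shape and runPhases() are hypothetical, and a production version would also want timeouts and per-phase error reporting.

```typescript
interface Phase {
  name: string;
  dependsOn: string[];
  run: () => Promise<void>;
}

// Starts each phase as soon as its dependencies finish.
// Returns the phase names in completion order.
async function runPhases(phases: Phase[]): Promise<string[]> {
  const byName = new Map(phases.map((p) => [p.name, p] as [string, Phase]));
  const started = new Map<string, Promise<void>>();
  const finished: string[] = [];

  const start = (name: string, path: string[]): Promise<void> => {
    // A phase that appears in its own ancestor chain means a cycle.
    if (path.includes(name)) {
      throw new Error(`dependency cycle: ${[...path, name].join(" -> ")}`);
    }
    const existing = started.get(name);
    if (existing) return existing;
    const phase = byName.get(name);
    if (!phase) throw new Error(`unknown phase: ${name}`);
    // Dependencies run first (in parallel with each other), then this phase.
    const promise = Promise.all(phase.dependsOn.map((d) => start(d, [...path, name])))
      .then(() => phase.run())
      .then(() => {
        finished.push(name);
      });
    started.set(name, promise);
    return promise;
  };

  await Promise.all(phases.map((p) => start(p.name, [])));
  return finished;
}
```

Declaration order no longer matters: the five phases above can be listed in any order and still complete as prefetch, configure, register, connect, ready, while phases with no dependency between them overlap automatically.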

Key takeaways for your own AI tool harness:

  • Parallel initialization is free performance. Identify what can run concurrently and start it from day one. Don’t wait for a performance crisis to add prefetching.
  • Separate the entry point from the orchestrator. The file that receives control from the OS should do one thing: call the boot framework. All sequencing logic belongs in a dedicated module.
  • Gate startup cost behind actual use. If a user runs --help, don’t initialize auth. If telemetry is disabled, don’t load the SDK. The preAction hook pattern is a simple and effective way to defer work.
  • Test your boot time in CI. Record startup duration as a metric and fail builds that regress past a threshold. Performance is a feature; treat it like one.

This boot phase pattern feeds directly into Part 10’s blueprint, where we’ll assemble a complete initialization framework for a custom AI coding harness.


This is Part 1 of the “Inside Claude Code” series — a deep dive into Anthropic’s AI coding tool, reverse-engineered from its leaked source code.

Next up: Now that the CLI is running and the REPL is mounted, what happens when you actually send a message? In Part 2: How an AI Agent Thinks in a Loop, we’ll trace a single user message through the query engine, the model API call, tool execution, and back — the core loop that makes Claude Code an agent and not just a chatbot.