Inside Claude Code: What Happens When You Type 'claude'
How Claude Code boots in under 200ms using parallel prefetch, lazy loading, and startup optimization. Part 1 of a 10-part series reverse-engineering Anthropic's AI coding tool.
What Happens When You Type ‘claude’?
The Restaurant Kitchen That Opens Before Sunrise
Walk into a professional kitchen at 5 AM and you’ll find prep cooks already at work. Stocks are simmering. Mise en place is being portioned. The bread oven is preheating. None of this is for a specific order — the first customer won’t arrive for hours. But when that first ticket does come in, the kitchen can fire a complex dish in minutes instead of an hour.
Claude Code works the same way. Before you see a prompt, before you type a single character, a carefully orchestrated startup sequence has already loaded authentication credentials, read enterprise policies, fetched feature flags, parsed settings from five different sources, and registered dozens of tools. And it does all of this in under 200 milliseconds.
I spent a few weeks reverse-engineering the boot sequence, and what I found is a masterclass in startup optimization. This post examines how it works.
The Problem: A Cold Start Is a Dead Product
CLI tools live and die by perceived speed. Research from the Nielsen Norman Group puts the threshold for “instantaneous” at about 100ms and the limit for keeping a user’s train of thought uninterrupted at about 1 second. In practice, a CLI that takes more than about 500ms to start already feels laggy. Users will reach for a different tool, or worse, add a shell alias that avoids yours entirely.
But Claude Code isn’t a simple utility. Before it can accept your first message, it needs to:
- Authenticate by reading OAuth tokens from the macOS Keychain (or a legacy API key), which requires spawning a security subprocess.
- Load enterprise policy via Mobile Device Management (MDM), which on macOS means shelling out to plutil to read managed preferences.
- Fetch feature flags from GrowthBook so the right capabilities are enabled for your account and environment.
- Parse settings from a 5-source hierarchy: user settings, project settings, local project settings, CLI flag settings, and managed policy settings (plus schema defaults).
- Register dozens of tools (file editing, bash execution, web search, MCP servers, etc.) and a very large Commander command surface.
- Set up telemetry including OpenTelemetry spans, metrics, and log providers — a dependency tree that pulls in ~400KB of SDK code.
If you did all of this sequentially (spawn keychain subprocess, wait, spawn plutil, wait, fetch GrowthBook, wait, parse settings, register tools), you’d easily blow past the 500ms mark. On a cold boot or a slow corporate network, you’d be looking at over a second.
Claude Code solves this with three strategies: parallel prefetch, lazy loading, and eager settings resolution.
How Claude Code Solves It
The Critical First 20 Lines
Open src/main.tsx and look at the very top of the file. Before the bulk of the imports is evaluated, three things happen:
// src/main.tsx -- lines 1-20 (simplified)
import { profileCheckpoint } from './utils/startupProfiler.js';
profileCheckpoint('main_tsx_entry');
import { startMdmRawRead } from './utils/settings/mdm/rawRead.js';
startMdmRawRead(); // Fire MDM subprocess at T+0
import { startKeychainPrefetch } from './utils/secureStorage/keychainPrefetch.js';
startKeychainPrefetch(); // Fire keychain reads at T+0
import { feature } from 'bun:bundle'; // Compile-time feature flags
// ... remaining ~130 imports follow
This is the key insight: side effects before imports. The startMdmRawRead() and startKeychainPrefetch() calls spawn subprocesses immediately — before the remaining 130+ import statements are even evaluated. Those imports take roughly 135ms as Bun resolves and evaluates the module graph. By the time imports finish, the MDM and keychain subprocesses have already completed in the background.
The code comments say it plainly: the keychain prefetch “fires both macOS keychain reads (OAuth + legacy API key) in parallel” so that later code doesn’t pay the ~65ms cost of synchronous subprocess spawns.
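The fire-early, await-later shape described above can be sketched in a few lines. This is a minimal illustration, not the actual implementation: the function names, the Credentials type, and the 20ms timer standing in for a subprocess are all assumptions made for the example.

```typescript
// Hypothetical names throughout; the timer stands in for a keychain subprocess.
type Credentials = { token: string };

let pending: Promise<Credentials> | null = null;
let spawnCount = 0;

// Simulated slow read. A real version would spawn a subprocess here.
function readCredentialsSlowly(): Promise<Credentials> {
  spawnCount += 1;
  return new Promise((resolve) => {
    setTimeout(() => resolve({ token: 'example-token' }), 20);
  });
}

// Call as early as possible. Starts the work once and returns the shared promise.
function startCredentialPrefetch(): Promise<Credentials> {
  if (pending === null) {
    pending = readCredentialsSlowly();
  }
  return pending;
}

// Call where the result is needed. If the prefetch already finished, this
// resolves almost immediately; if nobody started it, it starts it now.
function ensureCredentialPrefetchCompleted(): Promise<Credentials> {
  return startCredentialPrefetch();
}

// Top of the entry file: kick off the read, then carry on with other work.
startCredentialPrefetch();
```

Storing the promise rather than the value is the important choice: callers never need to know whether the work is finished, still running, or not yet started.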
The Full Boot Timeline
Here’s what the startup sequence looks like end to end:
[Figure: Boot Timeline]
The prefetch operations (MDM and keychain) run entirely in parallel with module evaluation. By the time the sequential phase begins, they’ve been complete for 80+ milliseconds. The await Promise.all([ensureMdmSettingsLoaded(), ensureKeychainPrefetchCompleted()]) call in the preAction hook resolves nearly instantly because there’s nothing left to wait for.
The Orchestration Sequence
Let’s trace the full sequence of control flow:
[Figure: Orchestration Sequence]
I found it particularly clever that eagerLoadSettings() is called before Commander is even constructed. This function parses the --settings and --setting-sources CLI flags early, ensuring the settings hierarchy is fully resolved before any code attempts to read configuration. Without this, a race condition could cause settings to be read before the user’s overrides are loaded.
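To make the idea of parsing a flag before the CLI framework exists concrete, here is a small sketch of an early argv scan. The helper name eagerFlagValue and the example values are hypothetical, and the real eagerLoadSettings() may handle these flags differently.

```typescript
// Hypothetical helper: read one flag's value before the full CLI parser is built.
// Supports both "--flag value" and "--flag=value".
function eagerFlagValue(argv: string[], flag: string): string | undefined {
  for (const [i, arg] of argv.entries()) {
    if (arg === '--') return undefined; // everything after "--" is positional
    if (arg === flag) {
      const next = argv[i + 1];
      // A following token that looks like another flag is not a value.
      return next !== undefined && !next.startsWith('-') ? next : undefined;
    }
    if (arg.startsWith(flag + '=')) return arg.slice(flag.length + 1);
  }
  return undefined;
}

// Example values are made up for illustration.
const exampleArgv = ['--settings', './team.json', '--setting-sources=user,project', 'chat'];
const settingsArg = eagerFlagValue(exampleArgv, '--settings');
const sourcesArg = eagerFlagValue(exampleArgv, '--setting-sources');
```

A scan like this is deliberately dumb. It only needs to answer one question early, and the full parser still validates everything later.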
The Three-Tier Module Loading Strategy
Not everything needs to be loaded at startup. Claude Code uses a three-tier strategy:
[Figure: Module Loading Strategy]
The telemetry subsystem is a perfect example of lazy loading. The init.ts file contains the comment: “initialize1PEventLogging is dynamically imported to defer OpenTelemetry sdk-logs/resources.” The OpenTelemetry SDK alone accounts for ~400KB of JavaScript. Loading it eagerly would add 50-80ms to startup for a subsystem that isn’t needed until the first API call. By deferring it behind a dynamic import(), that cost is paid later — and only if telemetry is actually enabled.
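A deferred load of this kind usually comes down to a memoized loader. The sketch below assumes a hypothetical Telemetry interface and replaces the real dynamic import with an in-memory stand-in so the example runs on its own.

```typescript
// Wrap a loader so the expensive module is loaded at most once, on first use.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) cached = loader();
    return cached;
  };
}

// Hypothetical module shape. A real loader would look like
// () => import('./telemetry.js') so the bundler can keep it off the hot path.
interface Telemetry {
  logEvent(name: string): void;
}

let loads = 0;
const events: string[] = [];
const getTelemetry = lazy<Telemetry>(async () => {
  loads += 1; // counts how often the "module" is actually loaded
  return { logEvent: (name) => { events.push(name); } };
});

async function onApiCall(telemetryEnabled: boolean): Promise<void> {
  if (!telemetryEnabled) return; // disabled: the module is never loaded
  const telemetry = await getTelemetry();
  telemetry.logEvent('api_call');
}
```

One caveat of this simple version: if the loader rejects, the rejected promise stays cached, so a production version would probably want to clear the cache on failure.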
Compile-time feature flags via bun:bundle take this further. The feature() function is resolved at build time, not runtime. When a feature is disabled, Bun’s bundler eliminates the entire code path — including its imports — from the output:
const coordinatorModeModule = feature('COORDINATOR_MODE')
? require('./coordinator/coordinatorMode.js')
: null;
If COORDINATOR_MODE is off at build time, the require() call and the entire coordinator module tree are dead-code-eliminated. Zero bytes shipped, zero time spent parsing.
What’s Smart About This Design
Parallel prefetch is essentially free performance. The MDM and keychain operations would have taken ~65ms each if run synchronously and sequentially. Because they start before the imports, they run in a window that would otherwise be dead time (waiting for module evaluation). The measured savings are 200ms+ on macOS, which is the difference between “instant” and “noticeable.” This is latency hiding, and it follows the logic of Amdahl’s Law: the serial portion of startup (module evaluation) sets the floor on wall-clock time, so the win comes from overlapping as much I/O as possible with it.
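The overlap argument is easy to check with a back-of-the-envelope model. The sketch below plugs in the approximate figures quoted in this post (~135ms of module evaluation, ~65ms per subprocess read) as illustrative inputs. It counts only those two reads and assumes they do not compete with module evaluation for CPU, so it is not an attempt to reproduce the measured savings.

```typescript
// Illustrative model only; inputs are the rough figures quoted in the post.
interface StartupCosts {
  moduleEvaluationMs: number; // time to evaluate the import graph
  prefetchMs: number[];       // independent I/O operations
}

// Everything on the critical path, one after another.
function sequentialMs(c: StartupCosts): number {
  return c.prefetchMs.reduce((sum, ms) => sum + ms, c.moduleEvaluationMs);
}

// Prefetches start at T+0 and run alongside module evaluation, so the
// wall-clock time is whichever piece finishes last.
function overlappedMs(c: StartupCosts): number {
  return Math.max(c.moduleEvaluationMs, ...c.prefetchMs);
}

const costs: StartupCosts = { moduleEvaluationMs: 135, prefetchMs: [65, 65] };
console.log(sequentialMs(costs));                       // 265
console.log(overlappedMs(costs));                       // 135
console.log(sequentialMs(costs) - overlappedMs(costs)); // 130
```

As long as every prefetch is shorter than module evaluation, the overlapped total equals module evaluation alone, which is why the prefetched work looks free.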
Compile-time feature flags eliminate entire code paths. Runtime feature flags — the kind you fetch from LaunchDarkly or GrowthBook — still require the gated code to be parsed and loaded. The branch just isn’t executed, but the module graph is still resolved. Compile-time flags via bun:bundle are fundamentally different: the feature() function is evaluated during the build step, and Bun’s dead code elimination removes unreachable branches entirely from the output bundle. This means disabled features have truly zero cost: no parsing, no evaluation, no memory allocation.
Eager settings resolution prevents ordering bugs. The eagerLoadSettings() call before Commander setup is a subtle but important design choice. CLI tools commonly have a bug where configuration is read before all sources are loaded — especially when settings come from multiple levels (user, project, enterprise, flags). By front-loading settings parsing into a single synchronous call, Claude Code ensures every subsequent read sees the complete, merged configuration.
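Here is a minimal sketch of resolving layered settings in one synchronous pass. The precedence order (later sources win, managed policy last) and the shallow merge are assumptions for illustration; the post only establishes that several sources are merged before anything reads them.

```typescript
type Settings = Record<string, unknown>;

// Assumed precedence: later entries override earlier ones.
const SOURCE_ORDER = ['defaults', 'user', 'project', 'local', 'flags', 'policy'] as const;
type SourceName = (typeof SOURCE_ORDER)[number];

// Merge every source once, up front, and hand out a frozen snapshot so
// later readers all see the same complete configuration.
function resolveSettings(
  sources: Partial<Record<SourceName, Settings>>,
): Readonly<Settings> {
  const merged: Settings = {};
  for (const name of SOURCE_ORDER) {
    Object.assign(merged, sources[name] ?? {}); // shallow merge, for brevity
  }
  return Object.freeze(merged);
}

// Keys and values below are made up for the example.
const resolved = resolveSettings({
  user: { theme: 'dark', model: 'a' },
  project: { model: 'b' },
  flags: { verbose: true },
  policy: { model: 'c' },
});
```

Freezing the result is a cheap way to turn "someone mutated config mid-boot" into an immediate error in strict mode instead of a subtle ordering bug.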
The preAction hook defers init to command execution. When a user runs claude --help, they don’t need authentication, telemetry, or MCP connections. The preAction hook ensures init() only runs when a command is actually being executed, not when help text is being printed. This matters more than it might seem: init() calls enableConfigs(), applySafeConfigEnvironmentVariables(), and applyExtraCACertsFromConfig() — each of which reads files, merges configuration, and may trigger TLS certificate store updates.
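The gating effect is easier to see in a stripped-down model. The class below is a hypothetical, synchronous stand-in for a CLI framework with a pre-action hook. It is not Commander’s API, which also supports async hooks.

```typescript
// Minimal stand-in for a CLI framework with a pre-action hook.
type Action = () => void;

class TinyCli {
  private hooks: Action[] = [];
  private commands = new Map<string, Action>();
  readonly output: string[] = [];

  preAction(hook: Action): void {
    this.hooks.push(hook);
  }

  command(name: string, action: Action): void {
    this.commands.set(name, action);
  }

  run(argv: string[]): void {
    if (argv.includes('--help')) {
      // Help is answered directly, so hooks never run.
      this.output.push('usage: tiny <command>');
      return;
    }
    const action = this.commands.get(argv[0] ?? '');
    if (!action) {
      this.output.push('unknown command');
      return;
    }
    for (const hook of this.hooks) hook(); // expensive init happens only here
    action();
  }
}

let initCalls = 0;
const cli = new TinyCli();
cli.preAction(() => { initCalls += 1; }); // stands in for init(): auth, config, certs
cli.command('chat', () => { cli.output.push('chatting'); });

cli.run(['--help']); // initCalls is still 0
cli.run(['chat']);   // initCalls is now 1
```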
What Could Be Better
main.tsx carries too many responsibilities. At several thousand lines in this snapshot, main.tsx is simultaneously the process entry point, the prefetch orchestrator, the settings pre-loader, and the Commander configuration. This makes it difficult to reason about boot order because the sequence is encoded in file position (line 16 runs before line 20 because of JavaScript evaluation order), not in an explicit dependency graph. A dedicated bootstrap/ module with named phases would make the init contract clearer and easier to test.
Boot order depends on implicit module evaluation order. The correctness of the prefetch strategy relies on the fact that startMdmRawRead() is called on line 16 and the remaining imports begin on line 22. If someone reorders the imports or adds a heavy synchronous import between lines 12 and 20, the prefetch advantage silently degrades. There’s no mechanism — no assertion, no lint rule, no integration test — that enforces “these calls must happen before heavy imports.”
There are no startup performance regression tests. The startup profiler records named checkpoints and defines phase durations, and it can produce a phase-by-phase report. But this instrumentation is diagnostic, not preventive — there’s no CI gate that fails if boot time regresses past a threshold. Performance budgets enforced in CI are the standard solution.
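Turning checkpoints into a gate does not take much. The sketch below is a hypothetical profiler plus a budget check; apart from main_tsx_entry, the checkpoint names, the budgets, and the timings are invented for the example.

```typescript
// Hypothetical profiler; the clock is injectable so tests can be deterministic.
class StartupProfiler {
  private marks: Array<{ name: string; at: number }> = [];
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  checkpoint(name: string): void {
    this.marks.push({ name, at: this.now() });
  }

  // Milliseconds elapsed between two named checkpoints.
  phaseMs(from: string, to: string): number {
    const start = this.marks.find((m) => m.name === from);
    const end = this.marks.find((m) => m.name === to);
    if (!start || !end) throw new Error(`missing checkpoint: ${from} or ${to}`);
    return end.at - start.at;
  }
}

interface PhaseBudget {
  from: string;
  to: string;
  maxMs: number;
}

// Returns one message per exceeded budget. A CI script would exit non-zero
// when the returned list is not empty.
function checkBudgets(profiler: StartupProfiler, budgets: PhaseBudget[]): string[] {
  const violations: string[] = [];
  for (const b of budgets) {
    const took = profiler.phaseMs(b.from, b.to);
    if (took > b.maxMs) {
      violations.push(`${b.from} -> ${b.to} took ${took}ms (budget ${b.maxMs}ms)`);
    }
  }
  return violations;
}

// Usage with a fake clock; the timings are made up.
let fakeNow = 0;
const profiler = new StartupProfiler(() => fakeNow);
profiler.checkpoint('main_tsx_entry');
fakeNow = 135;
profiler.checkpoint('imports_done');
fakeNow = 180;
profiler.checkpoint('repl_ready');

const violations = checkBudgets(profiler, [
  { from: 'main_tsx_entry', to: 'repl_ready', maxMs: 200 },
  { from: 'main_tsx_entry', to: 'imports_done', maxMs: 100 },
]);
```

In a real pipeline the numbers would come from several runs on a fixed machine, since a single noisy sample makes for a flaky gate.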
The prefetch pattern doesn’t compose. Adding a fourth parallel prefetch requires editing the top of main.tsx and adding another ensure*Completed() call to the preAction hook. There’s no registry of “things to prefetch” or a declarative way to say “this initialization can run in parallel with imports.” A prefetch registry that collects operations and runs them via Promise.all() would scale better.
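One possible shape for such a registry is sketched below. All names are hypothetical, and the timers stand in for subprocess or network calls.

```typescript
type PrefetchTask = () => Promise<unknown>;

class PrefetchRegistry {
  private tasks = new Map<string, PrefetchTask>();
  private started = new Map<string, Promise<unknown>>();

  register(name: string, task: PrefetchTask): void {
    if (this.tasks.has(name)) throw new Error(`duplicate prefetch: ${name}`);
    this.tasks.set(name, task);
  }

  // Kick off every registered task without awaiting any of them.
  startAll(): void {
    for (const [name, task] of this.tasks) {
      if (!this.started.has(name)) this.started.set(name, task());
    }
  }

  // Await everything in one place, for example from a pre-action hook.
  async ensureAllCompleted(): Promise<Map<string, unknown>> {
    this.startAll();
    const names = [...this.started.keys()];
    const values = await Promise.all(names.map((n) => this.started.get(n)));
    const results = new Map<string, unknown>();
    names.forEach((n, i) => results.set(n, values[i]));
    return results;
  }
}

// Simulated slow operation standing in for a subprocess or network call.
function delay<T>(ms: number, value: T): Promise<T> {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms));
}

const registry = new PrefetchRegistry();
registry.register('mdm', () => delay(10, { managed: true }));
registry.register('keychain', () => delay(5, { token: 'example-token' }));
registry.register('flags', () => delay(8, { newUi: false }));

registry.startAll(); // top of the entry file
// ... heavy imports would be evaluated here ...
const prefetched = registry.ensureAllCompleted(); // awaited once, later
```

Adding a fourth prefetch is then a single register() call. Note that Promise.all rejects as soon as any task fails, so a real registry would need a policy for optional versus required prefetches.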
If You Were Building This From Scratch
The core insight — start slow operations before you need their results — is universally applicable. But you can make it more maintainable with an explicit initialization framework:
[Figure: Improved Boot Phases]
Each phase declares its dependencies explicitly. The boot framework resolves them, runs independent operations in parallel, and guarantees ordering for dependent ones. Adding a new prefetch operation means adding an entry to Phase 1 — no need to carefully position a line of code before other imports.
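As a rough sketch of what such a framework could look like, the runner below starts each step as soon as its declared dependencies finish and lets unrelated steps run concurrently. The step names and the BootStep shape are hypothetical, and the timers stand in for real work.

```typescript
interface BootStep {
  name: string;
  dependsOn: string[];
  run: () => Promise<void> | void;
}

// Runs each step once all of its dependencies have finished.
// Steps with no dependency relationship run concurrently.
async function runBoot(steps: BootStep[]): Promise<void> {
  const byName = new Map<string, BootStep>();
  for (const s of steps) byName.set(s.name, s);
  const running = new Map<string, Promise<void>>();
  const visiting = new Set<string>();

  const start = (name: string): Promise<void> => {
    const existing = running.get(name);
    if (existing) return existing;
    const step = byName.get(name);
    if (!step) throw new Error(`unknown boot step: ${name}`);
    if (visiting.has(name)) throw new Error(`dependency cycle at: ${name}`);
    visiting.add(name);
    const deps = step.dependsOn.map(start);
    visiting.delete(name);
    const promise = Promise.all(deps).then(() => step.run());
    running.set(name, promise);
    return promise;
  };

  await Promise.all(steps.map((s) => start(s.name)));
}

const bootLog: string[] = [];
const wait = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Declaration order does not matter; only dependsOn does.
const bootFinished = runBoot([
  { name: 'tools', dependsOn: ['init'], run: () => { bootLog.push('tools'); } },
  { name: 'init', dependsOn: ['mdm', 'keychain', 'settings'], run: () => { bootLog.push('init'); } },
  { name: 'mdm', dependsOn: [], run: async () => { await wait(10); bootLog.push('mdm'); } },
  { name: 'keychain', dependsOn: [], run: async () => { await wait(5); bootLog.push('keychain'); } },
  { name: 'settings', dependsOn: [], run: () => { bootLog.push('settings'); } },
]);
```

Because ordering comes from dependsOn rather than file position, reordering the array or adding a new independent step cannot silently change the boot sequence.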
Key takeaways for your own AI tool harness:
- Parallel initialization is free performance. Identify what can run concurrently and start it from day one. Don’t wait for a performance crisis to add prefetching.
- Separate the entry point from the orchestrator. The file that receives control from the OS should do one thing: call the boot framework. All sequencing logic belongs in a dedicated module.
- Gate startup cost behind actual use. If a user runs --help, don’t initialize auth. If telemetry is disabled, don’t load the SDK. The preAction hook pattern is a simple and effective way to defer work.
- Test your boot time in CI. Record startup duration as a metric and fail builds that regress past a threshold. Performance is a feature; treat it like one.
This boot phase pattern feeds directly into Post 10’s blueprint, where we’ll assemble a complete initialization framework for a custom AI coding harness.
This is Part 1 of the “Inside Claude Code” series — a deep dive into Anthropic’s AI coding tool, reverse-engineered from its leaked source code.
Next up: Now that the CLI is running and the REPL is mounted, what happens when you actually send a message? In Part 2: How an AI Agent Thinks in a Loop, we’ll trace a single user message through the query engine, the model API call, tool execution, and back — the core loop that makes Claude Code an agent and not just a chatbot.