On March 31, 2026, security researcher Chaofan Shou discovered that in the Anthropic npm release of the Claude Code package, the source map file was not stripped.
As a result, the full TypeScript source code of Claude Code, 512,000 lines across 1,903 files, was exposed on the public internet.
Of course, I couldn't possibly read through so much code in just a few hours, so I approached this source code with three questions:
1. What is the fundamental difference between Claude Code and other AI programming tools?
2. Why does its code-writing "feel" just seem better than others?
3. What exactly is hidden in 510,000 lines of code?
After reading through it, my initial reaction was: this is not just an AI programming assistant; this is an operating system.
I. Let's Start with a Story: If You Were to Hire a Remote Programmer
Imagine you hired a remote programmer and gave them remote access to your computer.
How would you handle it?
If you were Cursor: you would have them sit next to you, and every time they need to type a command, you would glance over and click "allow." It's straightforward, but you have to keep an eye on them at all times.
If you were a GitHub Copilot Agent: you would give them a brand-new virtual machine to play around in. After they finish, they submit the code, you review it, then merge it. It's secure, but they can't see your local environment.
If you were Claude Code:
You would let them use your computer directly, but you would equip it with an extremely sophisticated security system: what they can do, what they cannot do, which actions require your approval, and which they may take on their own. Even an rm -rf must pass through 9 levels of review before execution.
Here are three completely different security philosophies:

Why did Anthropic choose the hardest path?
Because only this way can the AI work with your real terminal, your real environment, your real configuration. That is what "truly helping you code" means, as opposed to "writing a piece of code in a clean room and copying it over".
But what is the cost? They wrote 510,000 lines of code for this.
II. Your Perception of Claude Code vs Actual Claude Code
Most people think AI programming tools work like this:
User Input → Call LLM API → Get Result → Show to User
The Actual Claude Code works like this:
User Input
→ Dynamically assemble 7 layers of system prompts
→ Inject Git state, project conventions, historical memory
→ 42 tools each come with a manual
→ LLM decides which tool to use
→ 9 layers of security review (AST parsing, ML classifiers, sandbox checks...)
→ Permission race resolution (local keyboard/IDE/hook/AI classifier all racing simultaneously)
→ 200ms anti-fatigue delay
→ Execute tool
→ Return results in a streaming fashion
→ Is the context approaching the limit? → Three-stage compression (micro-compression → auto-compression → full compression)
→ Need parallelism? → Generate a swarm of sub-agents
→ Loop until task completion
I suspect you're curious about every step above. Don't worry, let's unpack them one by one.
III. The First Secret: Prompts are not written, they are "assembled"
Open src/constants/prompts.ts, and you will see this function:
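The function body itself is not reproduced in this excerpt, so here is a minimal sketch of its shape. The SYSTEM_PROMPT_DYNAMIC_BOUNDARY identifier comes from the source; the helper structure, field names, and the exact boundary string are my illustrative assumptions:

```typescript
// Hypothetical sketch of the prompt assembler; names besides
// SYSTEM_PROMPT_DYNAMIC_BOUNDARY are illustrative, not the real source.
export const SYSTEM_PROMPT_DYNAMIC_BOUNDARY = "<!-- DYNAMIC_BOUNDARY -->";

interface PromptContext {
  gitBranch: string; // current Git branch
  claudeMd: string;  // contents of the project's CLAUDE.md
  memories: string[]; // previously stored user preferences
}

// Everything above the boundary is byte-identical on every call, so the
// Claude API can cache it; everything below changes on every turn.
const STATIC_RULES = [
  "You are Claude Code, a CLI coding assistant.",
  "Follow the project's conventions.",
].join("\n");

export function buildSystemPrompt(ctx: PromptContext): string {
  const dynamic = [
    `Current branch: ${ctx.gitBranch}`,
    `Project config:\n${ctx.claudeMd}`,
    ...ctx.memories.map((m) => `Memory: ${m}`),
  ].join("\n");
  return `${STATIC_RULES}\n${SYSTEM_PROMPT_DYNAMIC_BOUNDARY}\n${dynamic}`;
}
```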

Notice that SYSTEM_PROMPT_DYNAMIC_BOUNDARY?
This is a cache delimiter. The content above the delimiter is static and can be cached by the Claude API to save on token costs. The content below the delimiter is dynamic — your current Git branch, your CLAUDE.md project configuration, your previously provided preference memories... each interaction is unique.
What does this mean?
Anthropic treats prompts as compiler output to be optimized. The static part is the "compiled binary," and the dynamic part is the "runtime parameters." The benefits of this approach are:
1. Cost-saving: The static part is cached, avoiding redundant charges
2. Speed: Cache hits directly skip processing those tokens
3. Flexibility: The dynamic part allows each interaction to be aware of the current environment
Each tool has its own "User Manual"
What's even more striking is that each tool directory contains a prompt.ts file: a user manual written specifically for the LLM.
Look at the BashTool's (src/tools/BashTool/prompt.ts, around 370 lines):
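The roughly 370-line file is not reproduced here, so below is a heavily compressed, paraphrased sketch of the kind of rules it contains. The wording is mine, not the verbatim source:

```typescript
// Paraphrased sketch of src/tools/BashTool/prompt.ts; not the verbatim text.
// The real file runs ~370 lines of rules like these.
export const BASH_TOOL_PROMPT = `
Before executing a command:
- NEVER use \`git push --force\` on a shared branch.
- NEVER run destructive commands (rm -rf, git reset --hard) without explicit user approval.
- Quote file paths that contain spaces.
- Prefer dedicated tools (e.g. the file-read tool) over \`cat\` for reading files.
`.trim();
```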

This is not a document for humans, it is a code of conduct for AI behavior. Every time Claude Code starts, these rules are injected into the system prompts.
This is why Claude Code never runs git push --force on its own while some tools might. It's not that the model is smarter; it's that the prompts have already spelled out the rules.
Plus, Anthropic's internal version is different from what you're using
The code has numerous branches like this:
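The actual branch is not shown in this excerpt; here is an illustrative sketch of what such a check might look like. The isAnt field name is my assumption, while the internal-only guidelines quoted in it come from the article:

```typescript
// Illustrative sketch of an internal/external branch; the real flag and
// structure in the source are not reproduced here.
interface UserConfig {
  isAnt: boolean; // true for Anthropic internal staff
}

function styleGuidelines(user: UserConfig): string[] {
  const base = ["Keep answers concise."];
  if (user.isAnt) {
    // Internal-only, more aggressive instructions.
    base.push(
      "Don't write comments unless the WHY is not obvious.",
      "Use inverted-pyramid writing: conclusion first.",
    );
  }
  return base;
}
```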

ant refers to Anthropic internal staff. Their version has more detailed code style guidelines ("Don't write comments unless WHY is not obvious"), a more aggressive output strategy ("Inverted Pyramid Writing"), and some experimental features still in A/B testing (Verification Agent, Explore & Plan Agent).
This illustrates that Anthropic is Claude Code's biggest user. They are using their own product to develop their own product.
IV. The Second Secret: 42 Tools, But You've Only Seen the Tip of the Iceberg
Open src/tools.ts, and you will see the tool registry:
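The registry itself is not reproduced in this excerpt; a hypothetical sketch of its shape, with lazy loading for rarely-used tools, might look like this (all names except ToolSearchTool's role are illustrative):

```typescript
// Hypothetical sketch of a tool registry with lazy loading.
interface Tool {
  name: string;
  description: string;
}

type ToolLoader = () => Tool;

// Core tools are always present in the system prompt.
const coreTools: Tool[] = [
  { name: "Bash", description: "Run shell commands" },
  { name: "FileRead", description: "Read a file" },
  { name: "FileEdit", description: "Edit a file" },
];

// Rarely-used tools are registered as loaders and only materialized
// when the LLM asks for them (in the real system, via ToolSearchTool).
const lazyTools = new Map<string, ToolLoader>([
  ["CronScheduler", () => ({ name: "CronScheduler", description: "Schedule jobs" })],
]);

function resolveTool(name: string): Tool | undefined {
  const core = coreTools.find((t) => t.name === name);
  if (core) return core;
  return lazyTools.get(name)?.();
}
```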

42 tools, but most of them you have never seen directly. That's because many tools are lazily loaded: they are injected on demand via the ToolSearchTool only when the LLM needs them.
Why is this done?
Because every additional tool adds another description to the system prompt, and every extra token costs money. If you just want Claude Code to change one line of code, it doesn't need to load the "Cron Task Scheduler" or the "Team Collaboration Manager".
There is an even smarter design:
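The check might look roughly like this. Only the CLAUDE_CODE_SIMPLE variable and the three surviving tools come from the article; the function and the full tool list are illustrative:

```typescript
// Sketch of the minimalist mode; the env var name comes from the article,
// the surrounding code is illustrative.
function activeToolNames(env: Record<string, string | undefined>): string[] {
  const all = ["Bash", "FileRead", "FileEdit", "Glob", "Grep", "WebFetch"];
  if (env["CLAUDE_CODE_SIMPLE"] === "true") {
    // Minimal toolset: run commands, read files, modify files.
    return ["Bash", "FileRead", "FileEdit"];
  }
  return all;
}
```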

Set CLAUDE_CODE_SIMPLE=true, and Claude Code will be left with only three tools: Bash, Read File, Modify File. This is a backdoor for minimalists.
All Tools Come from the Same Factory
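A sketch of what such a factory looks like. The two field names, isConcurrencySafe and isReadOnly, come from the source; the rest is illustrative:

```typescript
// Sketch of a fail-closed tool factory; field names follow the article.
interface ToolSpec {
  name: string;
  isConcurrencySafe?: boolean;
  isReadOnly?: boolean;
}

interface RegisteredTool {
  name: string;
  isConcurrencySafe: boolean;
  isReadOnly: boolean;
}

function createTool(spec: ToolSpec): RegisteredTool {
  return {
    name: spec.name,
    // Fail-closed: anything undeclared is treated as unsafe and writable.
    isConcurrencySafe: spec.isConcurrencySafe ?? false,
    isReadOnly: spec.isReadOnly ?? false,
  };
}
```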

Pay attention to those default values: isConcurrencySafe defaults to false, isReadOnly defaults to false.
This is called fail-closed design—if a tool's author forgets to declare the safety attributes, the system will assume it is 'unsafe and writable'. It is better to be overly cautious than to miss a single risk.
The Iron Law of 'Read Before Write'
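The enforcement code is not reproduced in this excerpt; a minimal sketch of the idea, with an assumed session-state shape, could look like this:

```typescript
// Sketch of the read-before-write check; the session-state shape is assumed.
class Session {
  private readFiles = new Set<string>();

  // Called whenever the file-read tool succeeds.
  recordRead(path: string): void {
    this.readFiles.add(path);
  }

  // The file-edit tool refuses to touch a file that was never read.
  editFile(path: string, _newContent: string): void {
    if (!this.readFiles.has(path)) {
      throw new Error(`Must read ${path} before editing it`);
    }
    // ...apply the edit...
  }
}
```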

The FileEditTool will check if you have already read this file using the FileReadTool. If not, it will throw an error directly and not allow modification.
This is why Claude Code won't "magically write a code snippet that overwrites your file" the way some tools do: it is required to understand a file before modifying it.
V. The Third Secret: Memory System—Why It Can "Remember You"
Anyone who has used Claude Code has a feeling: it seems to really know you.
You tell it "do not mock the database in tests," and it won't mock on the next interaction. You tell it "I'm a backend engineer, React newbie," and it will explain front-end code using backend analogies.
Behind this is a complete memory system.
Using AI to Retrieve Memories

Claude Code uses another AI (Claude Sonnet) to decide which memories are relevant to the current conversation.
Not keyword matching, not vector search: it has a small model rapidly scan the titles and descriptions of all memory files, select at most the 5 most relevant, and inject their full content into the current conversation context.
The strategy is precision over recall: better to miss a potentially useful memory than to inject an irrelevant one and pollute the context.
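The selection step can be sketched as follows. Since the real model call and prompt are not reproduced here, the ranking is abstracted into a pluggable function; only the cap of 5 and the title/description/content split come from the article:

```typescript
// Sketch of memory selection; the model call is mocked out as a ranking
// function, since the real prompt and API call are not reproduced here.
interface MemoryFile {
  title: string;
  description: string;
  content: string; // injected in full once selected
}

type Ranker = (query: string, candidates: MemoryFile[]) => MemoryFile[];

// "Precision over recall": take at most 5, inject the full content of each.
function retrieveMemories(query: string, files: MemoryFile[], rank: Ranker): string[] {
  return rank(query, files).slice(0, 5).map((m) => m.content);
}
```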
KAIROS Mode: Nighttime "Dreaming"
This is the most sci-fi part for me.
There's a feature flag in the code called KAIROS. In this mode, memories from long conversations aren't stored in structured files but in date-appended log-like entries. Then, there's a /dream skill that runs during "nighttime" (low activity) distilling these raw logs into structured thematic files.

The AI organizes memories while "sleeping." This is no longer just engineering; it's biomimicry.
VI. The Fifth Secret: It's Not an Agent, It's a Swarm
When you have Claude Code perform a complex task, it might quietly do this:

It generates a sub-Agent.
And the sub-Agent has a strict "self-awareness" injection to prevent it from recursively generating more sub-Agents:
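The injected text is not reproduced in this excerpt; a paraphrased sketch of the guard, with illustrative wording and names, might look like this:

```typescript
// Paraphrased sketch of the sub-agent guard; wording and names are
// illustrative, not the verbatim source.
const SUBAGENT_PREAMBLE =
  "You are a sub-agent spawned for a single task. " +
  "Do NOT spawn further sub-agents; complete the work yourself.";

function spawnSubagent(task: string, currentDepth: number): string {
  if (currentDepth >= 1) {
    // Hard stop: a sub-agent never gets to create another sub-agent.
    throw new Error("Sub-agents may not spawn further sub-agents");
  }
  return `${SUBAGENT_PREAMBLE}\n\nTask: ${task}`;
}
```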

This piece of code is saying: "You are a worker, not a manager. Don't think about hiring more people, do the work yourself."
The Coordinator Pattern: A Pure Manager
In the coordinator pattern, Claude Code becomes a pure task orchestrator, not doing the work itself, just delegating:
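One way such delegation could be scheduled is sketched below. The task shape and function are my assumptions; the parallel-reads/serial-writes rule they encode is described in the source:

```typescript
// Sketch of the coordinator's scheduling rule; the task shape is assumed.
interface AgentTask {
  id: string;
  readOnly: boolean;
  fileGroup?: string; // tasks writing the same files share a group
}

// Read-only tasks form one parallel batch; write tasks are serialized
// per file group so two agents never edit the same files at once.
function planBatches(tasks: AgentTask[]): AgentTask[][] {
  const parallel = tasks.filter((t) => t.readOnly);
  const writes = tasks.filter((t) => !t.readOnly);
  const groups = new Map<string, AgentTask[]>();
  for (const t of writes) {
    const key = t.fileGroup ?? t.id;
    groups.set(key, [...(groups.get(key) ?? []), t]);
  }
  const batches: AgentTask[][] = parallel.length ? [parallel] : [];
  for (const group of groups.values()) {
    for (const t of group) batches.push([t]); // serial: one write per batch
  }
  return batches;
}
```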

The core principle is written in the code comments: "Parallelism is your superpower."
Read-only research tasks run in parallel; file-writing tasks run serially within each file group to avoid conflicts.
Optimization of Prompt Cache to the Extreme
To maximize the cache hit rate across sub-agents, the tool results of all forked sub-agents use the same placeholder text:
"Fork started—processing in background"
Why? Because Claude's API prompt cache is based on byte-level prefix matching. If the prefix bytes of 10 sub-Agents are identical, then only the first one needs a "cold start," the remaining 9 directly hit the cache.
This is an optimization that saves a few cents per call, but at scale, it can save a significant amount of cost.
VII. The Sixth Secret: Three-Layer Compression, So Conversations Never Exceed the Limit
Every LLM has a context window limit. The longer a conversation runs, the more history it accumulates, until it eventually exceeds that limit.
Claude Code's answer is three layers of compression:
Layer 1: Micro-compression — Minimal Cost

Micro-compression touches only old tool-call results, replacing the content of that 500-line file read 10 minutes ago with [Old tool result content cleared].
The prompts and the conversation thread itself are fully retained.
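A minimal sketch of this layer, assuming a simple message shape (the real message format is not reproduced here):

```typescript
// Sketch of micro-compression; the message shape is assumed.
interface Message {
  role: "user" | "assistant" | "tool";
  content: string;
  timestamp: number; // ms since epoch
}

const CLEARED = "[Old tool result content cleared]";

// Only old tool results are cleared; user/assistant turns are untouched.
function microCompress(history: Message[], now: number, maxAgeMs: number): Message[] {
  return history.map((m) =>
    m.role === "tool" && now - m.timestamp > maxAgeMs
      ? { ...m, content: CLEARED }
      : m,
  );
}
```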
Layer 2: Auto-compression — Proactive Shrink
Triggered automatically when token consumption approaches 87% of the effective context window (the window size minus a 13,000-token buffer). There is also a circuit breaker: after 3 consecutive compression failures it stops trying, to avoid an infinite loop.
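One way to read those numbers as code (a sketch; the exact formula in the source may differ):

```typescript
// Sketch of the auto-compression trigger; the 13,000-token buffer, 87%
// threshold, and 3-failure circuit breaker are the numbers given above.
const BUFFER_TOKENS = 13_000;
const MAX_FAILURES = 3;

function shouldAutoCompress(
  usedTokens: number,
  windowTokens: number,
  consecutiveFailures: number,
): boolean {
  if (consecutiveFailures >= MAX_FAILURES) return false; // circuit breaker
  return usedTokens >= 0.87 * (windowTokens - BUFFER_TOKENS);
}
```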
Layer 3: Full Compression — AI Summarization
Have the AI generate a summary of the entire conversation, then replace all historical messages with that summary. Summary generation comes with a strict constraint:

Why so strict? Because if the AI made additional tool calls during summarization, it would consume even more tokens, the opposite of the goal. This prompt is essentially saying: "Your task is to summarize. Do nothing else."
Compressed Token Budget:
· File Recovery: 50,000 tokens
· Per File Cap: 5,000 tokens
· Skill Content: 25,000 tokens
These numbers are not arbitrary — they represent a balance point between "retaining enough context to continue working" and "freeing up enough space to receive new messages."
VIII. What I Learned After Reading This Source Code
90% of the AI Agent's Work Is Outside "AI"
Within 510,000 lines of code, the portion actually calling the LLM API is likely less than 5%. What about the remaining 95%?
· Security Checks (18 files just for a single BashTool)
· Permission System (a four-way decision: allow/deny/ask/passthrough)
· Context Management (Three-layer compression + AI Memory Retrieval)
· Error Recovery (Circuit Breaker, Exponential Backoff, Transcript Persistence)
· Multi-Agent Coordination (Swarm Orchestration + Mailbox Communication)
· UI Interaction (140 React Components + IDE Bridge)
· Performance Optimization (Prompt Cache Stability + Parallel Prefetch on Startup)
If you're building an AI Agent product, these are the real problems you need to solve. It's not about how smart your model is; it's about how robust your scaffolding is.
Good Prompt Engineering is System Engineering
It's not just about crafting a nice prompt. Claude Code's prompts include:
· 7-Layer Dynamic Assembly
· Each tool comes with a standalone user manual
· Cache boundaries are precisely delineated
· Internal and external versions have different instruction sets
· Tool ordering is fixed to maintain cache stability
This is engineered prompt management, not craftsmanship.
Design for Failure
Every external dependency has a corresponding failure policy:
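The actual policy table is not reproduced in this excerpt; an illustrative sketch of the pattern, with generic policy names and hypothetical dependency keys, might look like this:

```typescript
// Illustrative sketch of per-dependency failure policies; the dependency
// names and policy labels are generic patterns, not the real configuration.
type FailurePolicy = "retry-with-backoff" | "circuit-breaker" | "fallback" | "fail-fast";

const failurePolicies: Record<string, FailurePolicy> = {
  llmApi: "retry-with-backoff",          // transient errors: exponential backoff
  contextCompression: "circuit-breaker", // stop after repeated failures
  ideBridge: "fallback",                 // degrade to plain terminal UI
  sandboxCheck: "fail-fast",             // security checks never degrade silently
};

function policyFor(dep: string): FailurePolicy {
  return failurePolicies[dep] ?? "fail-fast"; // fail-closed by default
}
```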

Anthropic treats Claude Code as an operating system
· 42 tools = system calls
· Permission system = user permission management
· Skill system = app store
· MCP protocol = device drivers
· Agent swarm = process management
· Context compression = memory management
· Transcript persistence = file system
This is not a "chatbot plus a few tools"; this is an operating system with LLM at its core.
Summary
510,000 lines of code. 1,903 files. 18 security files just for a single Bash tool.
9 layers of scrutiny just to safely have AI help you type a command.
This is Anthropic's answer: To make AI truly useful, you can't lock it in a cage or let it run wild. You have to build a complete trust framework around it.
And the cost of this trust system is 510,000 lines of code.
