Claude Code’s real power emerges when treated as a 4-layer architecture — memory, skills, hooks, and agents — rather than a single prompt interface. Each layer handles a different type of concern, and the system compounds in capability over time.
Overview
The dominant mistake developers make with Claude Code is treating it as a sophisticated autocomplete: open the terminal, type a request, hope for the best. Practitioners who’ve extracted serious productivity gains — like Boris Cherny, Anthropic’s Head of Claude Code, who reports shipping 10-30 PRs daily without hand-editing code — describe a fundamentally different approach. They configure Claude Code as a layered system where each layer handles a specific type of concern.
Prakash Sharma and Shraddha Bharuka independently arrived at the same framework: four distinct layers that transform Claude Code from a reactive tool into a proactive engineering partner.
Layer 1: Memory (CLAUDE.md)
The foundation layer. A CLAUDE.md file provides persistent project context — purpose, file map, rules, and conventions. This is the “who am I and what do I care about” layer. Without it, every session starts from zero. With it, Claude operates with accumulated project knowledge.
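As a sketch, a minimal CLAUDE.md covering purpose, file map, and rules might look like the following; the project name, paths, and rules are illustrative placeholders, not prescribed content:

```markdown
# Project: acme-api

## Purpose
REST API for order management. Python 3.12, FastAPI, Postgres.

## File map
- `src/api/` — route handlers
- `src/models/` — database models
- `tests/` — pytest suite

## Rules
- Run `make test` before proposing any commit.
- Never hand-edit files under `migrations/`.
- Prefer small, single-purpose functions; match existing naming conventions.
```

Because this file is loaded into every session, it pays to keep it terse: rules and pointers, not full documentation.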
Memory isn’t limited to a single file. Local CLAUDE.md files in subdirectories add module-specific context, and tools like QMD (referenced by Tom Crawshaw) make past sessions searchable, giving Claude a form of episodic memory across conversations.
Layer 2: Skills (reusable expert modes)
Skills are packaged instructions for specific workflows — code review, refactoring, release management, debugging, documentation generation. They live in .claude/skills/ and are invoked on demand rather than loaded into every interaction.
The distinction matters for context efficiency. A skill for “production release checklist” might be 50 lines of detailed procedure. Loading that into every interaction wastes tokens. Keeping it as a skill means it’s available when needed and invisible when not.
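A sketch of such a skill, assuming the SKILL.md-with-frontmatter layout used by Claude Code skills (the checklist content itself is illustrative):

```markdown
---
name: release-checklist
description: Step-by-step production release procedure. Use when the user asks to cut or verify a release.
---

# Production release checklist

1. Confirm the working tree is clean and CI is green on `main`.
2. Run the full test suite and the type checker.
3. Bump the version and update the changelog.
4. Tag the release and verify the artifact builds reproducibly.
5. After deploy, watch error rates and roll back on regression.
```

The `description` field is what lets Claude decide when to pull the skill in; the body stays out of context until then.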
Nick Spisak makes a critical architectural point: individual skills are just fancy prompts. The real leverage comes from building connected skill systems. Three patterns define a working skill architecture: shared context files that create unified voice across a pipeline, output-as-input chaining where one skill’s output feeds the next, and plugin composition where skills compound on each other. A well-designed 4-layer production pipeline can reduce manual orchestration from 30-40 minutes of hands-on work to 15-20 minutes of review.
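The chaining pattern is easiest to see as plain function composition: each stage reads a shared context and emits output that becomes the next stage's input. The sketch below is generic Python illustrating the structure, not Claude Code's API; every function name is a hypothetical stand-in for a skill invocation:

```python
# Hypothetical sketch of output-as-input skill chaining with shared context.
# Each "skill" is modelled as a function; in Claude Code the equivalent
# would be separate skill invocations sharing a context file.

def load_shared_context() -> dict:
    # Stand-in for a shared context file that keeps voice and terminology
    # consistent across every stage of the pipeline.
    return {"tone": "concise", "project": "acme-api"}

def research_skill(topic: str, ctx: dict) -> str:
    return f"[{ctx['project']}] notes on {topic}"

def draft_skill(notes: str, ctx: dict) -> str:
    return f"{ctx['tone']} draft based on: {notes}"

def review_skill(draft: str, ctx: dict) -> str:
    return f"reviewed ({ctx['tone']}): {draft}"

def run_pipeline(topic: str) -> str:
    ctx = load_shared_context()
    # Output-as-input chaining: each stage consumes the previous stage's
    # output, while the shared context threads through every stage.
    result = topic
    for skill in (research_skill, draft_skill, review_skill):
        result = skill(result, ctx)
    return result

print(run_pipeline("rate limiting"))
```

The point of the shared context is visible in the output: every stage carries the same tone and project vocabulary without restating them per prompt.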
Layer 3: Hooks (deterministic guardrails)
Hooks are automated checks that run at specific trigger points — before a commit, after a file save, before a deployment. Unlike skills (which Claude chooses to invoke), hooks fire automatically and deterministically. They’re the safety net layer.
Common hooks include auto-formatting on save, running the test suite before any commit, blocking changes to protected directories (auth modules, infrastructure code, production configs), and validating that generated code meets type-checking requirements.
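A sketch of a `.claude/settings.json` fragment wiring up two such hooks: formatting after any edit, and screening edits to protected paths before they happen. The event names and matcher follow the hooks configuration format, but the commands and script paths are illustrative, and the exact schema may vary by version:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "./scripts/format-changed.sh" }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "./scripts/block-protected-paths.sh" }
        ]
      }
    ]
  }
}
```

A pre-tool command that exits non-zero blocks the action, which is what makes hooks deterministic guardrails rather than suggestions.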
Shraddha Bharuka frames hooks as the answer to a fundamental trust problem: you want Claude to move fast, but you need hard stops on certain actions. Hooks let you grant autonomy within boundaries.
Layer 4: Agents (sub-workers)
For complex tasks, Claude can spawn sub-agents — child processes that handle specific pieces of work while the main agent coordinates. This keeps the primary context window clean and allows parallel work streams.
The agent layer enables patterns like: main agent plans the approach, spawns a research agent to gather context, spawns an implementation agent for the code, then reviews and integrates the results. Kshitij Mishra describes this as delegating large tasks to sub-agents to prevent context pollution in the main session.
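Schematically, that delegation looks like the sketch below. This is generic Python showing the shape of the pattern, not Claude Code's actual agent API; `spawn_agent` and the rest are hypothetical stand-ins:

```python
# Schematic of the plan -> research -> implement -> review pattern.
# Each sub-agent runs in its own isolated context, so research
# transcripts never pollute the main session's window.

def spawn_agent(role: str, task: str) -> str:
    # Stand-in for spawning a sub-agent with a clean context window;
    # only its final summary returns to the caller.
    return f"{role} result for: {task}"

def main_agent(feature: str) -> str:
    # Main agent plans, then delegates the heavy work.
    plan = f"plan for {feature}"
    research = spawn_agent("research", plan)
    implementation = spawn_agent("implementation", research)
    # Main agent reviews and integrates what came back.
    return f"integrated: {implementation}"

print(main_agent("rate limiting"))
```

The key property is that the main session only ever holds the plan and the sub-agents' summaries, not their full working transcripts.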
Tom Dörr’s collection of 51 agent personalities for Claude Code illustrates the breadth: specialised agents for security review, performance optimisation, documentation, test writing, and more — each with tuned instructions for their domain.
Performance optimisations
Beyond the architectural layers, specific technical settings dramatically affect Claude Code’s effectiveness. Om Patel highlights the ENABLE_LSP_TOOL flag, which connects Claude to Language Server Protocol servers (the same technology powering VS Code’s “go to definition” feature). The difference is stark: default text-based grep takes 30-60 seconds and often finds the wrong files, while LSP-based navigation resolves in ~50ms, traces actual call hierarchies, and catches type errors immediately after edits.
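Assuming the flag is switched on via an environment variable before launching a session (the flag name comes from the source; the exact mechanism for setting it is an assumption and may differ by version):

```shell
# Hypothetical: enable LSP-backed code navigation for subsequent sessions.
export ENABLE_LSP_TOOL=1
```

This presumes an LSP server for the project's language is installed and discoverable on the machine.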
The Boris Cherny benchmark
In a 90-minute interview on Lenny’s Podcast (summarised by Anish Moonka), Boris Cherny — who leads Claude Code development at Anthropic — shared several principles from internal use:
- Coding itself is largely solved as a bottleneck; the next frontier is AI deciding what to build by scanning signals across Slack, bug trackers, and telemetry.
- Anthropic has seen 200% productivity increases per engineer, which Cherny calls unprecedented in dev-tooling history.
- Give models tools and goals, not rigid step-by-step orchestration.
- General models beat specialised pipelines over time; teams should build for model capabilities six months out rather than today's constraints.
Perhaps most provocatively: the job title “software engineer” is shifting toward “builder,” with generalists who can orchestrate AI systems outperforming narrow specialists.
Sources
- @BharukaShraddha — 4-layer architecture, hooks as trust boundaries, local CLAUDE.md pattern
- @PrakashS720 — 4-layer system framing, system-first over prompt-first
- @AnishA_Moonka — Boris Cherny interview summary, 200% productivity, builder role shift
- @NickSpisak_ — connected skill systems, output-as-input chaining, pipeline composition
- @om_patel5 — ENABLE_LSP_TOOL flag, LSP vs grep performance
- @tomcrawshaw01 — QMD persistent memory, session searchability
- @tom_doerr — 51 AI agent personalities for Claude Code