ClawGraph
Language TS / BunvsRust
Codebase 512K LOCvslean CLI
Tools 60+vs~6 core
Sys Prompt ~76 KBvsshorter
Context 200K tokvscached
Sandbox permissionvsOS-native
Starting build...
Explore the graphs below while your repo builds
claude-code
Arch
Architecture Comparison
How these harnesses differ under the hood
Agentic Loop
Claude Code
  • Single-threaded while-loop
  • h2A async dual-buffer queue
  • Reads parallel (×10), writes serial
  • 3 recovery attempts on token limit
Codex
  • HTTP POST to /responses
  • App Server wraps loop (JSON-RPC 1.0)
  • Client-side state in prompt
  • Shared across CLI, VS Code, desktop
Same skeleton — both loop until no tool calls, inject invisible continuation on token limits.
Claude Code runs in query.ts / QueryEngine.ts (~47K lines) with a flat “nO” message history. The h2A queue lets users inject instructions mid-execution. Codex wraps its loop in an App Server (JSON-RPC over stdio/WebSocket) so every surface shares the exact same harness.
Models
Claude Code
  • Sonnet 4.6 default
  • Opus 4.6 via --model flag
  • Haiku for metadata tasks
  • Dual-model architecture
Codex
  • GPT-5.3-Codex default
  • GPT-5.4 recommended upgrade
  • Spark fast variant for Pro
  • Hours-long autonomous runs
Both use tiered model selection — premium for coding, lightweight for metadata.
Claude Code’s dual-model architecture uses Haiku for metadata tasks like summarization and sub-agent exploration, reserving Sonnet/Opus for coding. Codex’s GPT-5.3 uses “medium” reasoning effort by default and supports hours-long autonomous execution with GPT-5.4.
System Prompt
Claude Code
  • ~76 KB dynamically assembled
  • Modular cached sections
  • Environment context appended
  • Single system field in API
Codex
  • Shorter prompt.md in repo
  • Role-based: system / dev / user
  • Server-injected model preambles
  • config.toml + AGENTS.md layers
Both share the same ethos: be concise, technically accurate, keep going until done.
Claude Code splits its 76KB prompt by a SYSTEM_PROMPT_DYNAMIC_BOUNDARY marker into cacheable static portions and per-session dynamic portions. Codex structures prompts through the Responses API’s role-based assembly, letting OpenAI inject model-specific preambles server-side.
Tool Arsenal
Claude Code
  • 60+ tools, ~18 deferred-loaded
  • ToolSearchTool for on-demand load
  • Search-and-replace file editing
  • Built-in LSP (9 operations)
Codex
  • ~6 core tools, always available
  • RL-trained apply_patch (signature)
  • Unified diff patch editing
  • Shell tools for code navigation
Breadth vs depth — Claude covers more surface, Codex trains deeper on fewer tools.
Claude Code sorts tools alphabetically for prompt cache hits and defers ~18 tools via ToolSearchTool. Each tool conforms to a Tool<Input, Output, Progress> interface with Zod validation. Codex’s apply_patch is its signature — models are RL-trained to produce unified diffs, making large edits more precise.
Context Management
Claude Code
  • 200K tokens (1M beta)
  • MicroCompact + AutoCompact
  • Triggers at ~83.5% utilization
  • 60-80% reduction via summaries
Codex
  • Prompt caching (prefix match)
  • Encrypted /responses/compact
  • Static content at prompt start
  • AGENTS.md capped at 32 KB
Claude Code summarizes to compress; Codex caches prefixes and encrypts compaction tokens.
AutoCompact triggers at ~83.5% utilization, generating up to 20K-token summaries with a 13K-token buffer, achieving 60-80% reduction. Codex’s /responses/compact returns encrypted type=compaction items preserving the model’s latent understanding without full history — a fundamentally different approach.
🛡 Sandboxing
Claude Code
  • Risk levels: low / med / high
  • 5 permission modes
  • Glob pattern rules
  • Priority cascade across configs
Codex
  • macOS Seatbelt, Linux bwrap
  • seccomp syscall filtering
  • Network off by default
  • .git always read-only
Permission prompts vs OS isolation — trust-the-user vs trust-the-sandbox.
Claude Code classifies actions as LOW/MEDIUM/HIGH risk across five modes: default, plan, acceptEdits, bypassPermissions, dontAsk. Codex uses Apple Seatbelt on macOS and Bubblewrap+seccomp on Linux with network disabled by default — every spawned subprocess inherits sandbox policies.
API Structure
Claude Code
  • Anthropic Messages API
  • Adaptive extended thinking
  • Client attestation hashes
  • Single system prompt field
Codex
  • OpenAI Responses API
  • Role separation: sys / dev / user
  • SSE streaming + response chain
  • Server-side model optimization
Role separation lets OpenAI inject model-specific preambles server-side.
Claude Code sends to the Messages API with adaptive extended thinking, client attestation hashes, and session IDs for proxy aggregation. Codex uses the Responses API’s role separation (system/developer/user) with previous_response_id chaining and Server-Sent Events streaming.
Converging Designs, Diverging Bets
SHARED FOUNDATION agent loop · markdown memory · permissions subagent spawning · auto-compaction ANTHROPIC “Prompt Breadth” 60+ tools · 76KB prompt Multi-layer compression TypeScript / Bun OPENAI “Model Depth” RL-trained tools · OS sandbox Encrypted compaction Rust binary
OpenAI Codex