<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[INTERNALS.md]]></title><description><![CDATA[I write about what happens inside the systems we build on. Distributed systems, AI, software engineering — one deep dive at a time.

INTERNALS.md by Laxmena.]]></description><link>https://internals.laxmena.com</link><image><url>https://substackcdn.com/image/fetch/$s_!R_0n!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3628c610-2c13-46f8-b6ad-429271a5ddc9_1280x1280.png</url><title>INTERNALS.md</title><link>https://internals.laxmena.com</link></image><generator>Substack</generator><lastBuildDate>Sun, 28 Jun 2026 12:47:50 GMT</lastBuildDate><atom:link href="https://internals.laxmena.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Lax Meiyappan]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[internalsmd@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[internalsmd@substack.com]]></itunes:email><itunes:name><![CDATA[Lax Meiyappan]]></itunes:name></itunes:owner><itunes:author><![CDATA[Lax Meiyappan]]></itunes:author><googleplay:owner><![CDATA[internalsmd@substack.com]]></googleplay:owner><googleplay:email><![CDATA[internalsmd@substack.com]]></googleplay:email><googleplay:author><![CDATA[Lax Meiyappan]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Why Claude Code’s Agent Loop Is Over 1,400 Lines]]></title><description><![CDATA[INTERNALS.md #4 &#183; The nine failure modes that turned a simple loop into query.ts.]]></description><link>https://internals.laxmena.com/p/why-claude-codes-agent-loop-is-over</link><guid isPermaLink="false">https://internals.laxmena.com/p/why-claude-codes-agent-loop-is-over</guid><dc:creator><![CDATA[Lax Meiyappan]]></dc:creator><pubDate>Wed, 03 Jun 2026 14:42:47 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UIHY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a 
class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UIHY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UIHY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png 424w, https://substackcdn.com/image/fetch/$s_!UIHY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png 848w, https://substackcdn.com/image/fetch/$s_!UIHY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png 1272w, https://substackcdn.com/image/fetch/$s_!UIHY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UIHY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png" width="1400" height="700" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:700,&quot;width&quot;:1400,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73459,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/200399844?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UIHY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png 424w, https://substackcdn.com/image/fetch/$s_!UIHY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png 848w, https://substackcdn.com/image/fetch/$s_!UIHY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png 1272w, https://substackcdn.com/image/fetch/$s_!UIHY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe27f8a64-edd6-420d-b138-032dbda137ae_1400x700.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>When you type a prompt in Claude Code, a loop starts. Nine conditions can keep it running automatically, without asking you for input. Most of them have nothing to do with whether your task is finished.</p><p>The loop lives in <code>query.ts</code>, Claude Code&#8217;s core agent logic file, exposed via an npm source map in March 2026 and since analyzed by several engineers in the community. One <code>while(true)</code> spanning over 1,400 lines of production TypeScript.</p><p>This post opens that loop.</p><div><hr></div><p><strong>A note on sources.</strong> The implementation details here are based on the Claude Code v2.1.88 source, exposed via an npm source map in March 2026. <code>query.ts</code> is not publicly documented. Everything in this post is community analysis, cross-referenced against the official Claude Code docs. 
Anthropic ships frequently: the architecture is stable, but specific line numbers are not.</p><div><hr></div><p><strong>The Blueprint:</strong></p><ol><li><p><strong>Start here:</strong> the naive version, and what breaks it</p></li><li><p><strong>The architecture:</strong> why an async generator, why single-threaded, and what the choice costs</p></li><li><p><strong>Before you type anything:</strong> what loads at session start and what it costs in tokens</p></li><li><p><strong>The nine conditions:</strong> each failure mode that grew the loop past 1,400 lines</p></li></ol><div><hr></div><h2>Start here: the naive version</h2><p>Every agent tutorial gives you this:</p><pre><code><code>while True:
    response = call_llm(messages)
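    messages.append(response)  # keep the assistant turn in the history
    if response.tool_calls:
        results = execute_tools(response.tool_calls)
        messages.append(results)  # feed tool output back to the model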
    else:
        break</code></code></pre><p>That works in a controlled demo, where the API is stable and the task is trivial.</p><p>Then production happens. The API times out mid-response. The context window fills after 20 tool calls. A bash script hangs with no exit condition and freezes the terminal. A governance hook needs to review the output before the session ends. The laptop sleeps. The session is gone.</p><p>Each of those is a code path in <code>query.ts</code>. The production version handles all of them. The rest of this post explains how.</p><div><hr></div><h2>The loop is not the smart part</h2><p>Worth stating before anything else.</p><p><code>query.ts</code> is a state machine. It calls an API, reads the response, executes whatever tools the model requested, feeds the results back, and calls the API again. The intelligence is entirely in the model. The loop keeps things correct when production introduces conditions the model cannot handle on its own.</p><p>That framing matters for every design decision that follows.</p><div><hr></div><h2>The architecture</h2><p>Claude Code is built on a JavaScript runtime (Bun, per the v2.1.88 source analysis) with TypeScript in strict mode. The UI is Ink, a React renderer that draws to terminal cells. The agent logic lives in <code>query.ts</code>.</p><p>The loop is declared as an async generator:</p><pre><code><code>export async function* query(
  params: QueryParams,
): AsyncGenerator&lt;
  | StreamEvent
  | RequestStartEvent
  | Message
  | TombstoneMessage
  | ToolUseSummaryMessage,
  Terminal
&gt;
</code></code></pre><p>Two decisions are embedded in this signature.</p><p><strong>Typed events, not buffered output.</strong> The loop yields events as they happen: text tokens, tool calls, tool results, compaction markers. Each one renders to the terminal immediately. The result is that the user sees output character by character, with no buffering between the API and the screen.</p><p><strong>A typed exit reason.</strong> The loop returns <code>Terminal</code>, not <code>void</code>. When it exits, it tells the caller exactly why: task complete, context limit hit, user interrupted, budget exhausted. Session recovery, Remote Control reconnection, and resume behavior all depend on this.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6Huw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6Huw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png 424w, https://substackcdn.com/image/fetch/$s_!6Huw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png 848w, https://substackcdn.com/image/fetch/$s_!6Huw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png 1272w, 
https://substackcdn.com/image/fetch/$s_!6Huw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6Huw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png" width="1200" height="753" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:753,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:92564,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/200399844?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6Huw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png 424w, https://substackcdn.com/image/fetch/$s_!6Huw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png 848w, 
https://substackcdn.com/image/fetch/$s_!6Huw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png 1272w, https://substackcdn.com/image/fetch/$s_!6Huw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F40b63724-0497-4db0-b0f5-3f9d29475570_1200x753.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The loop is single-threaded: one instance, one thread, one loop. 
There is no shared mutable state between concurrent operations, no locks, and no race conditions within a session.</p><p>This is a deliberate bet. Parallelism adds throughput, but it also adds shared state between concurrent tool executions. Shared state introduces a class of bugs that single-threaded designs cannot have.</p><p>In other words, the design trades the throughput of parallel tool execution for deterministic, single-threaded correctness.</p><p>The cost of that bet is specific. A bash tool that runs a tight loop, a synchronous operation that blocks for minutes, a script with no exit condition: none of these can be interrupted from within the loop. The event loop freezes until the operation completes or the user hits Ctrl+C. That is the production risk this architecture accepts.</p><blockquote><p><strong>&#8595; INTERNALS: Architectural Tradeoffs 1</strong></p><p>Early versions of Claude Code used recursion. The query function called itself. In long sessions with hundreds of tool calls, the call stack grew until it overflowed. The current design carries mutable state in a <code>State</code> object between iterations. Each <code>continue</code> is a state transition, and a <code>transition</code> field records the reason, so error recovery and compaction logic know what happened in the previous turn.</p></blockquote><blockquote><p><strong>&#8595; INTERNALS: Architectural Tradeoffs 2</strong></p><p>The async model is cooperative multitasking, not preemptive. Tasks yield control voluntarily at <code>await</code> and <code>yield</code> points. While the loop waits on an API response, the same thread handles terminal rendering, keypress events, and JSONL session writes. Nothing runs simultaneously; everything is interleaved. 
This keeps the terminal responsive during long tool executions, provided the tool yields at an <code>await</code> instead of blocking synchronously.</p></blockquote><div><hr></div><h2>Before you type anything</h2><p>Before the prompt cursor appears, Claude Code runs a ten-step initialization.</p><p>It loads settings in priority order, with managed policies as overrides that lower levels cannot change, then walks up the directory tree looking for project-level rules, memory files, and MCP configurations.</p><p>Only after aggregating that state does it assemble the system prompt.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Vv_y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Vv_y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png 424w, https://substackcdn.com/image/fetch/$s_!Vv_y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png 848w, https://substackcdn.com/image/fetch/$s_!Vv_y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png 1272w, https://substackcdn.com/image/fetch/$s_!Vv_y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Vv_y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png" width="1200" height="732" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:732,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:95231,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/200399844?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Vv_y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png 424w, https://substackcdn.com/image/fetch/$s_!Vv_y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png 848w, https://substackcdn.com/image/fetch/$s_!Vv_y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png 1272w, https://substackcdn.com/image/fetch/$s_!Vv_y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F87c17dde-8632-4c85-ad67-f716745d89ab_1200x732.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The cost before you type a single character: approximately 8,000 to 12,000 tokens. On a 200K context window that is 4 to 6%. On a 32K window it is roughly 25 to 38%, which constrains what tasks are feasible before the first token of actual work.</p><p>One detail worth noting: CLAUDE.md is injected as a user message, not as part of the system prompt. The model treats system prompt content as configuration and user message content as context. 
That distinction can affect how strictly the model follows CLAUDE.md instructions.</p><blockquote><p><strong>&#8595; INTERNALS</strong></p><p>The system prompt is not a single string. Claude Code assembles it from conditional fragments based on your environment, settings, and active context. The base system role definition is approximately 2,900 tokens. Tool definitions add roughly 3,000 more for 18+ tools. CLAUDE.md content, injected alongside it as a user message, adds between 500 and 2,000. The assembled prompt is a dense technical document: behavioral rules, tool usage philosophy, coding style guidelines. The line &#8220;three similar lines of code is better than a premature abstraction&#8221; appears in the system prompt. This is the model&#8217;s operating manual.</p></blockquote><div><hr></div><h2>Two models, one pipeline</h2><p>Every reasoning turn runs on an Opus-class model (claude-opus-4-6 per the v2.1.88 analysis, though this may shift across versions). Opus sees the full system prompt, the complete tool catalog, and the conversation history. It decides what to do next.</p><p>A smaller, faster model handles exactly two background tasks.</p><p>A warmup request fires at session start: one intentionally truncated API call whose response comes back with <code>stop_reason: max_tokens</code>. The response content is irrelevant. This is a health check, verifying the API is reachable and the quota is valid.</p><p>File path extraction runs after bash commands produce output, identifying which file paths appear in the results. The prompt is concise and single-purpose (179 words in v2.1.88).</p><p>Everything else goes through Opus.</p><blockquote><p><strong>&#8595; INTERNALS</strong></p><p>Claude Code uses aggressive prompt caching. Content blocks carry <code>cache_control</code> breakpoints. The first pass writes to a server-side cache at 1.25x input cost. Subsequent requests with matching prefixes pay approximately 10% of the normal input cost for the cached portion. 
In observed session traces, 90% or more of tokens per turn are served from cache. Without caching, the economics of long sessions would be brutal. The model is stateless on every API call; it &#8220;remembers&#8221; the session only because the full history is resent each turn, and prompt caching is what makes that resending affordable.</p></blockquote><div><hr></div><h2>The nine conditions</h2><p>The loop exits only when it reaches a terminal condition. Nine other conditions cause it to run another iteration automatically, without asking the user. Each started as a bug report.</p><p><strong>&#9312; Tool result received.</strong> The model calls a tool, which executes and injects its result back into context. The loop continues so the model can decide what to do with it. This is the core of the agentic pattern.</p><p><strong>&#9313; API error or network failure.</strong> When the request fails, the loop retries with exponential backoff, up to 10 times. The user sees nothing.</p><p><strong>&#9314; Max output tokens mid-response.</strong> The model hit its output limit before finishing. The API signals this with <code>stop_reason: max_tokens</code>, and the loop continues to let it complete.</p><p><strong>&#9315; Token budget warning injected.</strong> As the session approaches the context limit, the loop injects a warning telling the model to start wrapping up, then continues so the model can actually read it.</p><p><strong>&#9316; Proactive compaction triggered.</strong> When context is close to full, one or more of four compaction strategies run and the loop continues with the compressed result. The pipeline is covered in the next section.</p><p><strong>&#9317; Reactive compact after 413.</strong> The API rejected the request with a 413: context too large. The proactive strategies ran and were not enough. Reactive compact runs a full context collapse. The full mechanics are in the next section.</p><p><strong>&#9318; Sub-agent Task tool completes.</strong> When a child agent spawned via the Task tool finishes, the parent loop continues. 
Sub-agents are implemented as tools, so the parent loop cannot distinguish between waiting on a child agent and waiting on a file read. The tradeoff: the parent receives only a summary of the child&#8217;s work (estimated at 1,000 to 2,000 tokens in the source analysis), so intermediate steps and decisions inside the sub-agent are invisible to it.</p><p><strong>&#9319; Stop hook signals continue.</strong> When the model signals it is done, a user-defined stop hook can override that and keep the loop running. Stop hooks are a governance layer the model cannot override.</p><p><strong>&#9320; Pre-tool hook blocks execution.</strong> A hook intercepted a tool call before execution and denied it. The denial injects as a tool result. The loop continues so the model can reason about the denial.</p><div><hr></div><h2>The compaction pipeline</h2><p>When the context window fills, the loop does not crash. It compacts.</p><p>Four proactive strategies run in order, cheapest first. <strong>Tool Result Budget</strong> truncates oversized tool outputs with no model call needed. <strong>Snip Compaction</strong> removes older tool results from history, also without generation. <strong>Microcompaction</strong> summarizes middle conversation turns one at a time using a scoped model call. <strong>Autocompact</strong> fires at a usage threshold, first attempting Session Memory Compaction (extracting key facts into a compact object) and falling back to full conversation compaction if that is not enough.</p><p>The ordering is not arbitrary. 
You exhaust the cheap options before paying for the expensive ones.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Uoiz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Uoiz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png 424w, https://substackcdn.com/image/fetch/$s_!Uoiz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png 848w, https://substackcdn.com/image/fetch/$s_!Uoiz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png 1272w, https://substackcdn.com/image/fetch/$s_!Uoiz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Uoiz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png" width="1200" height="766" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:766,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:95200,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/200399844?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Uoiz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png 424w, https://substackcdn.com/image/fetch/$s_!Uoiz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png 848w, https://substackcdn.com/image/fetch/$s_!Uoiz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png 1272w, https://substackcdn.com/image/fetch/$s_!Uoiz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F994e0761-9198-4683-bb5a-a850cf52bb00_1200x766.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Reactive compact is separate. It fires only after the API returns 413, which means the proactive strategies have already run and were not enough. The loop collapses the full context and continues. The <code>hasAttemptedReactiveCompact</code> flag acts as a circuit breaker, so this fires only once; a second 413 is a terminal condition.</p><p>Think of it as a two-stage safety system: proactive compaction keeps the context under the limit during normal operation, and reactive compaction is the one-shot fallback when a request overflows anyway.</p><blockquote><p><strong>&#8595; INTERNALS</strong></p><p>Sub-agents are context-efficient for a specific reason. A sub-agent runs its own full loop internally, potentially consuming 100K or more tokens on a complex task, then returns only a compact summary to the parent. 
The parent pays for the summary, not the full work. This is why the Task tool is the right primitive for parallelizing work in a single-threaded system: each sub-agent has its own isolated context window.</p></blockquote><div><hr></div><h2>Session persistence</h2><p>Every message, tool call, and result is written to a JSONL file under <code>~/.claude/projects/</code> in real time, as each event happens rather than at session end.</p><p>Three things depend on this.</p><p><code>claude --resume</code> reconstructs the exact conversation state at the point of interruption. <code>--fork-session</code> copies history into a new session at any chosen point, leaving the original unchanged. Remote Control reconnects automatically if the laptop sleeps and wakes, because session state is on disk rather than in memory.</p><p>The <code>TombstoneMessage</code> type in the generator signature connects to this log. When compaction removes messages, their tombstones stay in the JSONL so the replay log stays consistent after compaction.</p><div><hr></div><h2>How other loops handle the same problems</h2><p><strong>The naive loop</strong> handles none of the nine failure conditions. Context overflow crashes it. Network failures terminate it. It is the right starting point, not a production architecture.</p><p><strong>Hermes Agent</strong> (Nous Research, MIT license) takes a different position on parallelism. When the model requests multiple tools, Hermes executes them through a thread pool rather than sequentially. The throughput gain is real. So is the exposure: two tools executing in parallel can write to the same file at the same time. Race conditions in tool output are a category of bug that Claude Code&#8217;s single-threaded model cannot produce.</p><p><strong>LangGraph</strong> is not an agent loop. It is a framework for constructing them. Human-in-the-loop pauses are explicit: <code>interrupt()</code> stops execution, <code>Command(resume=value)</code> continues it. 
Claude Code&#8217;s permission system is implicit, handled internally with no graph definition required. The explicit approach is more debuggable; the implicit one requires less setup. Hermes is model-agnostic across providers; Claude Code is tied to Anthropic&#8217;s API.</p><div><hr></div><p>The loop is not the smart part of Claude Code. The model is.</p><p><code>query.ts</code> exists to keep the model correct when production introduces conditions the model cannot handle on its own: lost connections, context limits, governance hooks, failed tool calls, slow networks.</p><p>Every line beyond that naive version exists because something failed in production.</p><p>If your agent loop is shorter, you have not hit those failures yet.</p><div><hr></div><p><em>INTERNALS.md is a technical series on how production systems actually work. No tutorials. No framework evangelism. Just the layer beneath.</em></p><p><em>If this was useful, the best thing you can do is share it with one engineer who would care.</em></p><div><hr></div><p><strong>Sources</strong></p><ul><li><p>Claude Code official docs: <a href="https://code.claude.com/docs/en/how-claude-code-works">how-claude-code-works</a>, <a href="https://platform.claude.com/docs/en/agent-sdk/agent-loop">agent-sdk/agent-loop</a>, <a href="https://code.claude.com/docs/en/remote-control">remote-control</a>, <a href="https://code.claude.com/docs/en/channels">channels</a></p></li><li><p>Harrison Guo: <a 
href="https://harrisonsec.com/blog/claude-code-deep-dive-query-loop/">Claude Code Deep Dive Part 2 &#8212; The 1,421-Line While Loop</a></p></li><li><p>Jonas Kim: <a href="https://bits-bytes-nn.github.io/insights/agentic-ai/2026/03/31/claude-code-architecture-analysis.html">Claude Code Architecture Analysis</a> (v2.1.88 source analysis)</p></li><li><p>Zain Hasan: <a href="https://zainhas.github.io/blog/2026/inside-claude-code-architecture/">Inside Claude Code &#8212; An Architecture Deep Dive</a></p></li><li><p>Alex (AgenticLoops): <a href="https://agenticloopsai.substack.com/p/disassembling-ai-agents-part-2-claude">Disassembling AI Agents Part 2 &#8212; Claude Code</a></p></li><li><p>VILA-Lab: <a href="https://github.com/VILA-Lab/Dive-into-Claude-Code">Dive into Claude Code</a></p></li><li><p>Arize AI: <a href="https://arize.com/blog/how-hermes-implements-open-source-agent-harness-architecture/">How Hermes Implements an Open-Source Agent Harness Architecture</a></p></li><li><p>Ken Huang: <a href="https://kenhuangus.substack.com/p/chapter-1-the-harness-paradigm-claude">Chapter 1 &#8212; The Harness Paradigm</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Your embedding model doesn’t understand your data]]></title><description><![CDATA[INTERNALS.md #3 &#183; It never did. 
Here&#8217;s what it actually does, and why that matters for every RAG system you&#8217;ll ever build.]]></description><link>https://internals.laxmena.com/p/your-embedding-model-doesnt-understand</link><guid isPermaLink="false">https://internals.laxmena.com/p/your-embedding-model-doesnt-understand</guid><dc:creator><![CDATA[Lax Meiyappan]]></dc:creator><pubDate>Tue, 19 May 2026 14:31:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!BOgj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BOgj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BOgj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!BOgj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!BOgj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!BOgj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BOgj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1469306,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/197957601?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BOgj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!BOgj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!BOgj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!BOgj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb1fed45-c9dd-4c37-83f1-2aff341ecd73_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><p>Here&#8217;s a bug that doesn&#8217;t show up in your logs.</p><p>You ship a RAG system. 
Users ask questions about internal data: support tickets, product docs, sales notes. Cosine similarity scores come back at 0.74, 0.81, 0.78. The LLM generates a confident, fluent answer.</p><p>While the resulting answers aren&#8217;t always obviously broken, they fail in a specific, repeatable pattern.</p><ul><li><p>Someone asks about &#8220;pipeline&#8221; and gets sales documents when they meant data infrastructure. </p></li><li><p>Someone asks about &#8220;incident&#8221; and gets both engineering postmortems and customer support tickets, randomly. </p></li><li><p>Someone asks about &#8220;ARR attainment&#8221; and gets a document about spreadsheet formulas.</p></li></ul><p>You tune the prompts. You adjust chunk sizes. The results are still wrong.</p><p>Prompt engineering and chunking strategies fail here because the root cause lies in how the foundational vector space is created.</p><div><hr></div><p><em>A note before we start. This post assumes you&#8217;ve shipped or worked on a RAG system and have firsthand experience with it underperforming on domain-specific queries. If you need a foundation first, <a href="https://www.pinecone.io/learn/retrieval-augmented-generation/">this primer</a> is a solid 10 minutes. 
You&#8217;ll get more from this post with that context.</em></p><div><hr></div><h4><strong>The Blueprint:</strong></h4><ul><li><p><strong>The Illusion:</strong> Why your embedding model is a map of the internet, not an understanding machine.</p></li><li><p><strong>The Geometry:</strong> How high-dimensional space pathology compresses your similarity scores.</p></li><li><p><strong>The Failure Modes:</strong> How to diagnose and fix the 4 silent bugs killing domain retrieval (including <em>Hubness</em> and <em>Concept Collision</em>).</p></li><li><p><strong>The Playbook:</strong> A 3-step engineering roadmap to evaluate, adapt, and fine-tune your space.</p></li></ul><div><hr></div><h2>What you think is happening</h2><p>Most engineers picture an embedding model as a kind of understanding machine.</p><p>You feed it text. It reads the text, grasps the meaning, and produces a number that represents that meaning. </p><p>Two pieces of text with similar meanings get similar numbers. You compare numbers. You find meaning.</p><p>This mental model feels right. It explains why &#8220;cat&#8221; and &#8220;feline&#8221; end up close together. 
It explains why the system works at all.</p><p>But this mental model is wrong, and that&#8217;s exactly why many production RAG pipelines fail.</p><div><hr></div><h2>Maps, not minds</h2><p>An embedding model doesn&#8217;t understand anything. It has no semantic comprehension. It operates strictly as a coordinate system: a map of language drawn by learning how words and phrases appear together on the internet.</p><p>Every piece of text you feed it gets assigned a location on that map. Texts that appeared together constantly, in the same articles, answering the same kinds of questions, end up near each other. The model never gets to redraw the map when it sees your internal data. It just projects everything onto the existing one, using whatever surface patterns and statistical echoes it recognizes.</p><p><strong>The crucial part: the map was drawn by reading the internet.</strong></p><p>Billions of web pages, Wikipedia articles, Reddit threads, news posts, academic papers. It&#8217;s a dense, detailed map of how language is used on the open web. This is why it works well for general questions. &#8220;Cat&#8221; and &#8220;feline&#8221; appeared near each other constantly. &#8220;Paris&#8221; and &#8220;capital of France&#8221; showed up together in thousands of articles.</p><p>But your company&#8217;s specific use of &#8220;pipeline&#8221;, &#8220;incident&#8221;, &#8220;P0&#8221;, or &#8220;ARR attainment&#8221;? Those meanings were never on the original map. The model does the only thing it can: it finds the nearest coordinates it <em>does</em> have. It always returns something. There is no &#8220;I don&#8217;t know&#8221;.</p><p>Here is the part that makes this dangerous: <strong>the model never warns you.</strong> It returns a confident-looking coordinate and a plausible similarity score regardless of whether your data falls within the model&#8217;s training distribution or in an unmapped domain. 
A 0.79 similarity score looks identical for both a perfectly relevant retrieval and a catastrophic silent failure. </p><p>The cosine score only tells you distance on the map. It doesn&#8217;t tell you whether the map covers your territory.</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!t2w5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!t2w5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png 424w, https://substackcdn.com/image/fetch/$s_!t2w5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png 848w, https://substackcdn.com/image/fetch/$s_!t2w5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png 1272w, https://substackcdn.com/image/fetch/$s_!t2w5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!t2w5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png" width="728" height="386.75" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:680,&quot;width&quot;:1280,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:3487954,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/197957601?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!t2w5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png 424w, https://substackcdn.com/image/fetch/$s_!t2w5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png 848w, https://substackcdn.com/image/fetch/$s_!t2w5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png 1272w, https://substackcdn.com/image/fetch/$s_!t2w5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1645e03-0b3d-4a40-9dff-a9d00b6f7abe_1280x680.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg 
role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="callout-block" data-callout="true"><p><strong>&#8595; Internals</strong></p><p>The formal name for this is the <em>distributional hypothesis</em>, stated by linguist J.R. Firth in 1957: &#8220;you shall know a word by the company it keeps&#8221;. Modern embedding models are this hypothesis at scale, with a neural network as the function approximator.</p><p>The model learns: text &#8594; a point in &#8477;&#8319; (768 dimensions for BERT-base, 1536 for OpenAI&#8217;s text-embedding-3-small). Positions are determined entirely by co-occurrence patterns in the training corpus. A concept that appeared with insufficient frequency or in the wrong context distribution gets placed at unreliable coordinates. 
Not missing, just wrong.</p></div><div><hr></div><h2>The map has a geometry problem</h2><p>Even when you&#8217;re asking about something the model did learn from the internet, the similarity scores can mislead you. And the reason has nothing to do with meaning.</p><p>When you compare two embeddings, you&#8217;re computing the angle between them. Small angle = similar. Large angle = different. This is called cosine similarity.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2wDf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2wDf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png 424w, https://substackcdn.com/image/fetch/$s_!2wDf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png 848w, https://substackcdn.com/image/fetch/$s_!2wDf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png 1272w, https://substackcdn.com/image/fetch/$s_!2wDf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!2wDf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png" width="1280" height="680" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:680,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3487954,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/197957601?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2wDf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png 424w, https://substackcdn.com/image/fetch/$s_!2wDf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png 848w, https://substackcdn.com/image/fetch/$s_!2wDf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png 1272w, https://substackcdn.com/image/fetch/$s_!2wDf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5b1abcf-ce3c-44a5-9dcd-1686c3607885_1280x680.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>In 2D, this works well. Arrows can point in many directions. Two random arrows have a wide variety of angles, so you can clearly tell similar from different.</p><p>Now go to 768 dimensions. Something counterintuitive happens.</p><p>In raw high-dimensional spaces, the expected angle between random vectors concentrates near 90&#176;. Modern contrastive objectives (the training method used by OpenAI, Cohere, BGE, and similar models) push useful variance into the tails and partially fix this. 
But when your data distribution differs from what the model was trained on, the collapse returns in practice: your domain queries end up in a poorly calibrated corner of the space, and the scores stop being informative.</p><p>There&#8217;s a second problem on top of this.</p><p>Research from 2021 (Timkey &amp; van Schijndel) measured exactly which dimensions drive cosine similarity scores in pre-contrastive BERT-style models. Their finding: <strong>just 1 to 3 dimensions out of 768 dominate the entire similarity calculation.</strong> The remaining 765 or more barely matter.</p><p>These are dimensions whose values have unusually high variance across the corpus. Because cosine similarity is computed from a dot product, the dimensions with the widest spread contribute the most to the score. The good news: simple post-processing (subtracting the mean, standardizing) largely mitigates rogue dimensions, and modern contrastive models have improved significantly. The caveat: when you&#8217;re running a general model on out-of-distribution domain data, these geometric pathologies re-emerge in subtler forms. 
You won&#8217;t know unless you measure.</p><p></p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dYxI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dYxI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png 424w, https://substackcdn.com/image/fetch/$s_!dYxI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png 848w, https://substackcdn.com/image/fetch/$s_!dYxI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png 1272w, https://substackcdn.com/image/fetch/$s_!dYxI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dYxI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png" width="1280" height="680" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:680,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3487954,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/197957601?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dYxI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png 424w, https://substackcdn.com/image/fetch/$s_!dYxI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png 848w, https://substackcdn.com/image/fetch/$s_!dYxI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png 1272w, https://substackcdn.com/image/fetch/$s_!dYxI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82395891-b2ce-48cd-977c-1adf1f7d46e8_1280x680.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><div class="callout-block" data-callout="true"><p><strong>&#8595; Internals</strong></p><p>This is called <em>anisotropy</em>. Pre-trained transformer models produce spaces where vectors cluster into a narrow cone rather than spreading evenly. When you compute cosine similarity in that cone, angular distances compress. Everything looks vaguely similar.</p><p>You can verify this on any model. Compute cosine similarity between 1,000 random document pairs from your corpus and plot the distribution. A standard deviation below 0.1 means a compressed space. A well-calibrated contrastive model shows a standard deviation above 0.2 on genuinely diverse documents.</p><p>The geometric fix is contrastive training, using (query, positive, hard negative) triplets that push vectors apart more uniformly. 
This is what separates modern embedding models from older BERT-style ones.</p></div><div><hr></div><h2>Your data is a different country</h2><p>Now put both problems together.</p><p>The map was drawn from the internet. The map&#8217;s geometry loses resolution in high dimensions. And your users are querying for domain-specific contexts that never existed in the base model&#8217;s training corpus.</p><p>In practice: your model confidently navigates general topics. Common business language, standard industry terms, broad concepts. The map is detailed there. But when your users ask about your company&#8217;s specific vocabulary, the model has no coordinates for those meanings. It falls back to surface patterns: which words the document contains, what those words meant on the internet.</p><p>A 2024 study tested seven state-of-the-art embedding models on financial domain text: SEC filings, earnings calls, analyst reports. Every model performed significantly worse on domain text than on the general benchmark. More striking: <strong>a model&#8217;s general benchmark score had almost no correlation with its domain performance.</strong> The rankings reshuffled completely.</p><p>The general benchmark tells you how well the map covers the internet. 
It says nothing about your territory.</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HArL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HArL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png 424w, https://substackcdn.com/image/fetch/$s_!HArL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png 848w, https://substackcdn.com/image/fetch/$s_!HArL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png 1272w, https://substackcdn.com/image/fetch/$s_!HArL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HArL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png" width="1280" height="680" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:680,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3487954,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/197957601?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HArL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png 424w, https://substackcdn.com/image/fetch/$s_!HArL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png 848w, https://substackcdn.com/image/fetch/$s_!HArL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png 1272w, https://substackcdn.com/image/fetch/$s_!HArL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0ea5a2a9-0458-4354-b4df-32e72bf25ab9_1280x680.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><div class="callout-block" data-callout="true"><p><strong>&#8595; Internals</strong></p><p>Domain failure isn&#8217;t just missing vocabulary. Most domain terms exist somewhere in the training corpus. Words like &#8220;ARR&#8221;, &#8220;churn&#8221;, and &#8220;pipeline&#8221; all appear in web text. The problem is the <em>context distribution</em>.</p><p>The word &#8220;Churn&#8221; in training data appears near &#8220;customer attrition&#8221;, &#8220;SaaS metrics&#8221;, &#8220;subscription&#8221;. In your network operations runbook, it appears near &#8220;packet loss&#8221;, &#8220;network degradation&#8221;. Same word, completely different neighborhood. The model maps it to web-scale coordinates. 
Your users&#8217; queries navigate to the wrong neighborhood.</p></div><div><hr></div><h2>Four ways the wrong map shows up</h2><p>These failure patterns look different on the surface. Underneath, they&#8217;re all the same problem.</p><div><hr></div><p><strong>1. All your similarity scores look the same</strong></p><p>You query for something very specific. You query for something very general. Either way, the scores come back in the same narrow range (say, 0.62 to 0.79) and don&#8217;t separate relevant documents from irrelevant ones.</p><p><em>What&#8217;s happening:</em> your embedding space is compressed. All vectors crowd into the same corner. The metric can&#8217;t discriminate between meaningfully similar and meaningfully different.</p><p><em>Diagnostic:</em> compute cosine similarity on 1,000 random document pairs. Standard deviation below 0.1 = compressed space.</p><p><em>Fix:</em> switch to a model trained with contrastive objectives. This is a geometry problem. A configuration change won&#8217;t fix it.</p><div class="callout-block" data-callout="true"><p><strong>&#8595; Internals: The Vector Database Is Just the Index</strong></p><p>A common point of confusion for engineers building RAG stacks is the division of labor between the embedding model and the vector database (OpenSearch, Pinecone, pgvector, etc.).</p><p>Think of your vector database as a high-performance bookshelf. Its only job is to index, store, and look up coordinates as fast as possible using approximate nearest neighbor (ANN) algorithms like HNSW or IVF-PQ. It doesn&#8217;t generate the coordinates, and it doesn&#8217;t judge their semantic quality.</p><p>If your embedding model suffers from anisotropy, you are feeding the bookshelf broken coordinates. You can tune your database&#8217;s <code>ef_search</code> parameters, payload indexes, or distance metrics all day, but you are ultimately just optimizing how fast the system retrieves nearby noise. 
The database is doing the math correctly; it&#8217;s the underlying geometry that is wrong.</p></div><div><hr></div><p><strong>2. The same word retrieves documents from the wrong category</strong></p><p>&#8220;Pipeline&#8221; returns sales content when the user asked about data infrastructure. &#8220;Incident&#8221; mixes engineering postmortems and CoEs with support tickets. Domain terms with one specific meaning in your organization keep returning documents from a different context.</p><p><em>What&#8217;s happening:</em> concept collision. The model placed these terms at their web-scale coordinates, where several meanings overlap. Your organization uses them narrowly; the training data used them broadly.</p><p><em>Diagnostic:</em> for your five most domain-specific terms, manually inspect the top 10 retrieved documents. If a single-meaning term consistently returns two or more semantic categories, you have concept collision.</p><p><em>Fix:</em> fine-tune with (query, relevant document) pairs from your corpus. The training process will encounter these collision cases naturally and learn to separate them.</p><div><hr></div><p><strong>3. The same few documents appear in every result set</strong></p><p>Across many different queries, five or ten documents keep showing up in your results. Not always obviously relevant. I once watched a model return the same three runbooks for every &#8220;incident&#8221; query. Scores looked great, all above 0.75. Turned out those runbooks were long, covered every keyword, and sat geometrically near the center of the cluster. Classic hub. One line of code fixed half the pain.</p><p><em>What&#8217;s happening:</em> hubness. In high-dimensional compressed spaces, some vectors land near the geometric center of the distribution. These documents become close neighbors to almost everything. 
Not because they&#8217;re relevant, but because of where they sit geometrically.</p><p><em>Diagnostic:</em> log which document IDs appear in top-10 results across 100 diverse queries. If a handful of documents each show up in more than 20 of those 100 result sets, you have a hub problem.</p><p><em>Quick fix:</em> center the embeddings before indexing.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;add36fb8-b554-4c6d-b246-d06a2a76dc85&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># Center in place: subtract the per-dimension mean of the (n_docs, dim) float array
embeddings -= embeddings.mean(axis=0)</code></pre></div><p>This pushes hubs away from the center and often produces immediate improvement. Subtract the same corpus mean from query vectors at search time so queries and documents share one coordinate frame.</p><p><em>Proper fix:</em> switch to or fine-tune a model with better geometry.</p><div><hr></div><p><strong>4. Scores look fine but answers are wrong</strong></p><p>Cosine scores are in a reasonable range. The LLM generates coherent text. But the answers are consistently, subtly wrong in ways your retrieval scores never flag.</p><p>This is the hardest one to diagnose because everything in your pipeline <em>looks</em> healthy.</p><p><em>What&#8217;s happening:</em> the model navigated confidently to the nearest coordinates it knows, which aren&#8217;t where you needed to go. The score is high because these really are the nearest neighbors on the map. They&#8217;re just neighbors on the wrong map.</p><p><em>Diagnostic:</em> build 50 to 100 (query, relevant document, irrelevant document) triplets, manually curated. Measure NDCG@10. Below 0.5 on domain queries is a clear signal.</p><p><em>Fix:</em> domain adaptation. Fine-tuning, or switching to a purpose-built model for your vertical.</p><div class="callout-block" data-callout="true"><p><strong>&#8595; Internals: The Telemetry of a Blind Spot</strong></p><p>When your embedding model drops the LLM into an unmapped &#8220;blank space&#8221; on the map, the LLM is forced to rely on its own pre-trained web priors to bridge the gap. 
This is the exact failure mode that RAGAS isolates.</p><p>While your vector database registers a confident 0.79 cosine similarity score, RAGAS measures <strong>Faithfulness</strong> (checking if the generated response is strictly inferable <strong>only</strong> from the retrieved text). If your embedding space is suffering from a massive domain shift, your cosine scores stay high but your RAGAS Faithfulness scores fall sharply. A high similarity score on a broken map is just a measurement of nearby noise.</p></div><div><hr></div><h2>Drawing a better map</h2><p>Three steps. In order. Every time.</p><p><strong>Step one: your own labeled eval set is the only reliable compass.</strong></p><p>Build 100 to 200 labeled triplets from your corpus: real queries, real relevant documents, real irrelevant documents, curated by someone who understands your domain. Run your current model. Measure NDCG@10. (Several Python libraries compute NDCG out of the box; scikit-learn&#8217;s <code>ndcg_score</code> is one.)</p><p>This number is your baseline. Not MTEB. Not cosine scores. This number, on your data. If NDCG@10 is above 0.6, your embedding model probably isn&#8217;t the bottleneck. Check chunking strategy, query preprocessing, and reranking first.</p><div class="callout-block" data-callout="true"><p>If you want to automate this end to end rather than calculating NDCG manually, frameworks like <strong>RAGAS</strong> compute metrics such as <em>Context Recall</em> and <em>Context Precision</em> programmatically, so these mapping failures surface without manual scoring.</p></div><p><strong>Step two: check if a domain-specific model exists.</strong></p><p>For finance, biomedical, and legal: purpose-built models exist and have been empirically validated. FinBERT, PubMedBERT, legal-BERT variants. Start there and measure on your data.</p><p>For everything else: look at the MTEB <em>retrieval</em> sub-score specifically, not the overall average. 
The overall average blends in clustering and classification tasks that don&#8217;t predict RAG performance. For RAG, a model that scores 55 on retrieval and 70 overall is a better starting point than one scoring 45 on retrieval and 75 overall.</p><p><strong>Step three: when to fine-tune.</strong></p><p>Fine-tune if: NDCG@10 is below 0.5, no domain model exists, and you have roughly 1,000 to 5,000 (query, relevant document) pairs available.</p><p>No labeled pairs? Generate them synthetically: prompt an LLM to produce 3 to 5 realistic queries per document in your corpus, then mine hard negatives using your current embedding model. Teams using this approach on specialized domains consistently see 10 to 30% NDCG improvement.</p><p>Don&#8217;t fine-tune if: you have fewer than 500 training pairs, or your measurement says the problem is elsewhere.</p><p>One strong opinion: if your domain has heavy internal jargon, a proprietary ontology, or vocabulary that means something different inside your company than it does on the web, start with fine-tuning or a domain model. </p><p>Attempting to bridge this gap purely with prompt engineering and general-purpose embeddings introduces tech debt that grows with your data complexity.</p><div><hr></div><p><strong>A few gotchas before you go.</strong></p><p>High-volume, repetitive data (think: millions of similar support tickets) makes hubness worse even after fine-tuning. The &#8220;same few documents everywhere&#8221; problem is structural when your corpus has too many near-duplicates.</p><p>Multilingual or code-heavy domains are harder than they look. A fine-tune on English financial text won&#8217;t help your Japanese runbooks or Python error traces.</p><p>And if your domain evolves fast (a startup changing its product every quarter, for example), fine-tuned embeddings drift quickly. Re-training cadence and embedding versioning become a day-2 problem many teams underestimate. 
Weigh that cost before committing.</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dX48!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dX48!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png 424w, https://substackcdn.com/image/fetch/$s_!dX48!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png 848w, https://substackcdn.com/image/fetch/$s_!dX48!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png 1272w, https://substackcdn.com/image/fetch/$s_!dX48!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dX48!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png" width="1280" height="680" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:680,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3487954,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/197957601?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dX48!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png 424w, https://substackcdn.com/image/fetch/$s_!dX48!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png 848w, https://substackcdn.com/image/fetch/$s_!dX48!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png 1272w, https://substackcdn.com/image/fetch/$s_!dX48!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F72a65358-70ae-432e-8fb2-84671dcd57c0_1280x680.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><blockquote><p><strong>&#8595; Internals: how fine-tuning reshapes the map</strong></p><p>Fine-tuning uses <em>contrastive training</em>: you give the model (query, relevant document, hard negative) triplets and train it to pull relevant pairs closer while pushing hard negatives apart.</p><p>The objective function is called InfoNCE. It frames learning as: given a query and a batch of documents, identify the correct one. The <em>temperature</em> parameter (&#964;, typically 0.05) controls how sharply the model focuses on near-misses, the documents that look like the right answer but aren&#8217;t. Low temperature = sharp focus on the hardest confusions in your corpus.</p><p>Hard negatives from your corpus are what make fine-tuning work. 
They teach the model the exact distinctions your users encounter, not generic web-scale distinctions. This is why in-domain hard negatives produce dramatically better results than random negatives.</p><p>The geometry side effect: contrastive training spreads vectors more uniformly across the space, reducing anisotropy. A good fine-tune with in-domain hard negatives fixes three of the four failure modes simultaneously: score compression, concept collision, and hubness.</p></blockquote><div><hr></div><h2>What survives</h2><ul><li><p>Embedding models are coordinate systems, not reasoning engines. The map was drawn from the training corpus. It won&#8217;t redraw itself for your data.</p></li><li><p>Your cosine score measures distance on that map. A confident-looking score only tells you those two points are close together. It says nothing about whether the map covers the territory your users are asking about.</p></li><li><p>Your own labeled eval set, built from your data, is the only reliable compass. Every other decision follows from that measurement.</p></li></ul><div><hr></div><p><em>INTERNALS.md is a technical series on how production AI systems actually work. No tutorials. No framework evangelism. 
Just the layer beneath.</em></p><p><em>If this was useful, the best thing you can do is share it with one engineer who would care.</em></p><div><hr></div><h2>Sources</h2><ul><li><p><a href="https://arxiv.org/abs/2109.04404">All Bark and No Bite: Rogue Dimensions in Transformer Language Models</a>, Timkey &amp; van Schijndel, 2021</p></li><li><p><a href="https://aclanthology.org/2021.emnlp-main.552/">SimCSE: Simple Contrastive Learning of Sentence Embeddings</a>, Gao, Yao &amp; Chen, EMNLP 2021</p></li><li><p><a href="https://arxiv.org/abs/2005.10242">Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere</a>, Wang &amp; Isola, 2020</p></li><li><p><a href="https://lilianweng.github.io/posts/2021-05-31-contrastive/">Contrastive Representation Learning</a>, Lilian Weng</p></li><li><p><a href="https://arxiv.org/abs/2409.18511">Do We Need Domain-Specific Embedding Models?</a>, Tang et al. 
(FinMTEB), 2024</p></li><li><p><a href="https://arxiv.org/abs/2311.18364">Hubness Reduction Improves Sentence-BERT Semantic Spaces</a>, 2023</p></li><li><p><a href="https://weaviate.io/blog/fine-tune-embedding-model">Why, When and How to Fine-Tune a Custom Embedding Model</a>, Weaviate</p></li><li><p><a href="https://www.sbert.net/docs/package_reference/util/hard_negatives.html">Hard Negative Mining</a>, sentence-transformers documentation</p></li></ul>]]></content:encoded></item><item><title><![CDATA[What you're actually writing when you write a SKILL.md]]></title><description><![CDATA[INTERNALS.md #2 &#183; Skills are programs, not prompts. 
How the skills runtime actually loads, and why the architecture is everything.]]></description><link>https://internals.laxmena.com/p/what-youre-actually-writing-when</link><guid isPermaLink="false">https://internals.laxmena.com/p/what-youre-actually-writing-when</guid><dc:creator><![CDATA[Lax Meiyappan]]></dc:creator><pubDate>Thu, 30 Apr 2026 14:30:30 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kikC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kikC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kikC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png 424w, https://substackcdn.com/image/fetch/$s_!kikC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png 848w, https://substackcdn.com/image/fetch/$s_!kikC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png 1272w, https://substackcdn.com/image/fetch/$s_!kikC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kikC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png" width="1456" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:338043,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/195955352?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kikC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png 424w, https://substackcdn.com/image/fetch/$s_!kikC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png 848w, https://substackcdn.com/image/fetch/$s_!kikC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png 1272w, 
https://substackcdn.com/image/fetch/$s_!kikC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57dcbe69-b354-4bfb-b20f-897b352cd71e_3200x1680.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>A skill is a small program. </p><p>It has three execution stages: (1) what loads every turn, (2) what loads on invocation, and (3) what loads on demand. 
Because a skill is a program, it suffers from typical software rot: environment drift, version sensitivity, and silent, non-reproducible failures.</p><p>You&#8217;ll see these failures in specific shapes. A skill that cost 20% of your context window, silently, before the agent did any work. A skill that worked perfectly until you shared it with a teammate, and it ran the build in the wrong directory. A skill tuned carefully on one model, producing worse output the moment you upgraded to a better one.</p><p>These aren&#8217;t separate bugs. They&#8217;re three faces of the same misunderstanding: treating a loader specification like a prompt.</p><p>This post is about what skills actually are underneath, and why understanding the runtime changes everything you do at the surface.</p><p>A note on scope. Skills aren&#8217;t a Claude-only thing anymore. Anthropic <a href="https://agentskills.io/">published the SKILL.md format as an open standard</a> in December 2025, and the same files now work across Claude Code, Kiro, Cursor, Codex CLI, and others. The mental model in this post applies to all of them. I&#8217;ll use <em>Claude</em> as shorthand for the agent harness reading the skill. Swap in your runtime of choice.</p><div><hr></div><h2>What skills are not</h2><p>The first time I wrote a skill, I thought I was writing a long prompt the agent would consult.</p><p>I wrote one big SKILL.md. Maybe 1,200 lines. Workflow at the top, a map of every module in our codebase, example code, message contracts between services, framework-specific patterns, and at the bottom a list of every gotcha I knew. It worked. It also consumed about 20% of the context window before the agent did any actual work.</p><p>I rewrote it. Same instructions, same output, different architecture: a 180-line SKILL.md that pointed at three reference files and one helper script. The new version cost 7%.</p><p>The instructions didn&#8217;t change. The architecture did. 
That&#8217;s where the 3&#215; difference lived, and it was the first sign that I was not, in fact, writing a long prompt.</p><p>A prompt is static text. You write it, you ship it, the model reads all of it on every turn. Skills don&#8217;t work like that. Skills are a <em>loader specification</em>. You&#8217;re describing what should be in context, when, and at what cost. The text matters, but the structure decides what survives the trip into the model&#8217;s working memory.</p><p>That reframe is the whole post. Everything else falls out of it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aNlX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aNlX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png 424w, https://substackcdn.com/image/fetch/$s_!aNlX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png 848w, https://substackcdn.com/image/fetch/$s_!aNlX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png 1272w, https://substackcdn.com/image/fetch/$s_!aNlX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!aNlX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png" width="1456" height="707" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:707,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:187206,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/195955352?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aNlX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png 424w, https://substackcdn.com/image/fetch/$s_!aNlX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png 848w, https://substackcdn.com/image/fetch/$s_!aNlX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png 1272w, https://substackcdn.com/image/fetch/$s_!aNlX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6cc1e5d9-6440-4fef-864c-1a16f3c72d04_2800x1360.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p><em>A real skill restructure. Same task, same model, same output. The 3&#215; difference came from where the instructions lived, not what they said.</em></p><div><hr></div><h2>The runtime</h2><p>Skills run on a principle Anthropic calls <em>progressive disclosure</em>. The official documentation defines it plainly:</p><blockquote><p>Skills can contain three types of content, each loaded at different times.</p></blockquote><p>This is why two skills with identical instructions can behave completely differently. 
One loads 180 lines on demand; the other dumps 1,200 lines every turn.</p><p>Anthropic built these levels to protect your context window. If a skill front-loads everything, it crowds out the conversation history and tool outputs. By using progressive disclosure, you stop paying for &#8220;just in case&#8221; instructions and only pay for &#8220;just in time&#8221; execution.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sG51!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sG51!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png 424w, https://substackcdn.com/image/fetch/$s_!sG51!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png 848w, https://substackcdn.com/image/fetch/$s_!sG51!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png 1272w, https://substackcdn.com/image/fetch/$s_!sG51!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sG51!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png" width="1456" height="898" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:898,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:188099,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/195955352?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sG51!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png 424w, https://substackcdn.com/image/fetch/$s_!sG51!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png 848w, https://substackcdn.com/image/fetch/$s_!sG51!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png 1272w, https://substackcdn.com/image/fetch/$s_!sG51!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2fc8e5f5-b367-4e8e-9ae4-1024ba992ee4_2800x1726.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>Each level loads at a different time, with a different cost. Most authors put everything at Level 2.</em></p><p><strong>Level 1: Metadata.</strong> The <code>name</code> and <code>description</code> from YAML frontmatter. Always loaded, every turn. The <a href="https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview">official docs</a> put this at roughly 100 tokens per skill installed. The agent uses the description to decide whether the skill is relevant. It&#8217;s a routing decision, not a usage decision. This is the most important level to get right. If the description is wrong, nothing else matters.</p><p><strong>Level 2: SKILL.md body.</strong> The procedural instructions. Loaded only when the agent decides the skill applies, by reading the file via bash. 
Anthropic&#8217;s <a href="https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices">best practices documentation</a> puts the recommended ceiling at 500 lines. This is where most people pile on content they shouldn&#8217;t.</p><p><strong>Level 3: References and scripts.</strong> Bundled files referenced from SKILL.md. References are markdown the agent reads only when the body points to them. Scripts are executable code the agent <em>runs</em>: output enters context, the source code does not. Because none of it is loaded until it is needed, the amount you can bundle at this level is effectively unlimited.</p><p>The Anthropic engineering team (Barry Zhang, Keith Lazuka, and Mahesh Murag) described it in <a href="https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills">their October 2025 announcement</a> as: <em>&#8220;Like a well-organized manual that starts with a table of contents, then specific chapters, and finally a detailed appendix, skills let Claude load information only as needed.&#8221;</em></p><p>Get the architecture right and your skill costs almost nothing until it earns its place. 
Get it wrong and you pay every turn.</p><div><hr></div><h2>Mental model</h2><p>Picture a kitchen during dinner service.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-ur6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-ur6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png 424w, https://substackcdn.com/image/fetch/$s_!-ur6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png 848w, https://substackcdn.com/image/fetch/$s_!-ur6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png 1272w, https://substackcdn.com/image/fetch/$s_!-ur6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-ur6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png" width="1456" height="770" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:770,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:152770,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/195955352?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-ur6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png 424w, https://substackcdn.com/image/fetch/$s_!-ur6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png 848w, https://substackcdn.com/image/fetch/$s_!-ur6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png 1272w, https://substackcdn.com/image/fetch/$s_!-ur6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd624a878-3b09-46e6-a741-dc0c707de3d3_2800x1480.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>The chef&#8217;s attention is the scarce resource. Same as the agent&#8217;s context window.</em></p><p>There&#8217;s a pinboard on the wall with recipe titles and one-line summaries. <em>Pasta Carbonara: Italian classic, use when guest wants creamy pasta with bacon.</em> The chef glances at it constantly. It&#8217;s small enough to hold in peripheral vision. That&#8217;s frontmatter.</p><p>When a guest orders, the chef picks the matching card and pulls down the full recipe. Ingredients, steps, technique notes. The recipe is not on the wall. It would be too cluttered, too distracting, too much to scan during service. It comes down only when needed. That&#8217;s SKILL.md.</p><p>The recipe sometimes says <em>for the sauce, see Sauce Reference, page 47</em>. The chef walks to the binder, opens to page 47, reads only that page. 
Doesn&#8217;t read the whole binder. That&#8217;s <code>references/</code>.</p><p>In the corner, a stand mixer. The recipe says <em>use the mixer for three minutes</em>. The chef does not read the mixer&#8217;s circuit diagram. The chef hands it ingredients, presses a button, gets output. That&#8217;s <code>scripts/</code>.</p><p>The metaphor holds under pressure, which is the only test of a metaphor. Every failure mode I hit in my own skills traces back to violating the kitchen.</p><div><hr></div><h2>The Antipattern Ledger</h2><p>When I first started migrating my workflows to the <code>SKILL.md</code> format, I treated the runtime like a smart intern who could &#8220;just figure it out.&#8221;</p><p>I was wrong. Because the skills runtime is a deterministic loader, minor architectural choices, like where you put a single line of YAML, can silently break the agent&#8217;s reasoning. These aren&#8217;t just bugs; they are <strong>antipatterns</strong>. 
Each one below represents a moment where I violated the &#8220;Kitchen&#8221; logic and paid for it in context drift, high latency, or hallucinated outputs.</p><h3>Frontmatter on reference files</h3><p>The first thing I got wrong, before I understood progressive disclosure existed.</p><p>I added YAML frontmatter to my reference files because SKILL.md had it, and the references felt important enough to deserve metadata. I didn&#8217;t realize what frontmatter actually does.</p><p>Frontmatter is what gets loaded into the system prompt at startup. Every file with frontmatter contributes its <code>name</code> and <code>description</code> to the always-loaded set. The pinboard. Adding frontmatter to a reference file pins it to the wall as if it were a top-level skill. It isn&#8217;t. Now the pinboard shows fifty entries instead of five, most of them sub-pages that were never meant to be visible at routing time.</p><p>In practice: the agent would occasionally trigger a reference file directly instead of the parent skill. Instructions out of context, without the skill body that gave them meaning. The output was subtly wrong and I couldn&#8217;t figure out why, because the reference file looked fine in isolation. I didn&#8217;t realize it had been promoted to skill-level visibility.</p><p>The fix was one line per file: delete the frontmatter from references. They&#8217;re not skills. They&#8217;re chapters that other skills point to.</p><h3>One monolithic skill</h3><p>This is the 20%-to-7% story.</p><p>When I built a skill to capture context across multiple modules and message systems, I put everything in one SKILL.md. It seemed cleaner. One file, one source of truth. Easy to read, easy to edit.</p><p>It also meant that every time the skill triggered, the agent loaded the entire 1,200-line file. Module map, contracts, patterns, and gotchas. 
Even when the task only needed two of those four.</p><p>Splitting it into a 180-line spine with three reference files dropped context consumption from 20% to 7%. Same task, same output, same model.</p><p>This compounds. A skill that costs 7% instead of 20% means you can install three of them in the same context budget, run longer sessions before compaction, hit fewer cliffs on long-horizon tasks. The savings aren&#8217;t local. They show up everywhere downstream.</p><h3>Hardcoded workspace paths</h3><p>I shared a skill with a teammate and it ran the build command in the wrong directory.</p><p>My instructions said something like <em>navigate to </em><code>modules/web</code><em> and run the build</em>. That worked in my repo. My teammate&#8217;s repo had four modules. <code>modules/web</code> didn&#8217;t exist; they had <code>packages/frontend/web</code>. The skill silently picked the wrong directory and produced output in the wrong place. No error. Just wrong output.</p><p>The fix was to write instructions that ask the agent to <em>discover</em> the right path rather than declare it. Search for the build configuration. Identify the module by its <code>package.json</code>. Read the workspace structure before assuming. The skill became more abstract, but it became portable.</p><p>This is the failure mode that doesn&#8217;t appear until you share. If you only ever run a skill on your own machine, you can hardcode anything and it will work. The moment another engineer runs it, every implicit assumption surfaces as a bug.</p><h3>Missing gotchas</h3><p>My monorepo uses Turborepo. The build command has to run from the repo root for the configuration to resolve correctly. If you run <code>build</code> from inside a module directory, the build still runs. 
But the cache misses, the dependency graph gets misread, and the output is subtly wrong.</p><p>The agent&#8217;s default was reasonable: <em>I&#8217;m working in the </em><code>web</code><em> module, so I&#8217;ll run the build from the </em><code>web</code><em> module.</em> That&#8217;s correct in 90% of repos. It was wrong in this one.</p><p>No amount of &#8220;explain the why&#8221; in the instructions would have prevented it. The wrongness wasn&#8217;t conceptual; it was environmental. The agent&#8217;s prior was correct on average. My environment wasn&#8217;t average.</p><p>The fix was a single line in a Gotchas section: <em>Always run </em><code>turbo build</code><em> from the repository root, never from inside a module.</em> One line. The next time the agent reached for the build command, it consulted the gotcha and ran correctly.</p><p>This is what Gotchas are for. The agent has reasonable defaults. Your environment isn&#8217;t average. That gap is the whole job of the Gotchas section, and it&#8217;s why mature skills treat it as the most important section to maintain over time.</p><h3>Not knowing why the skill worked at all</h3><p>The deepest mistake. I didn&#8217;t write evals.</p><p>I built a writing skill for my personal Claude desktop. It was based on <a href="https://laxmena.com/the-day-you-became-a-better-writer">Scott Adams&#8217; writing principles</a>: short sentences, active voice, front-loaded points, one idea per paragraph. I tuned it on Sonnet 4.6. It worked exactly the way I wanted: drafts came out clean, direct, in my voice.</p><p>Then I upgraded to Opus. Better model, I assumed. Better output.</p><p>The output was worse. Every sentence ran 5 to 7 words. Technically short. But choppy. No rhythm, no flow, nothing that read like me. The writing felt like bullet points dressed as prose.</p><p>What happened is subtle. 
Sonnet read &#8220;write short sentences&#8221; and applied judgment: short where brevity sharpened the point, longer where the rhythm needed it. It understood the spirit. Opus read the same instruction and followed it literally. Every sentence, hard constraint, no exceptions.</p><p>The more capable model has stronger priors about what &#8220;good writing&#8221; looks like. Its version of clear prose is the statistical center of good writing on the internet. My voice isn&#8217;t the statistical center. Opus pulled hard toward its own aesthetic, and away from mine.</p><blockquote><p><em>A skill tuned on one model is calibrated to that model&#8217;s compliance characteristics, not just its capabilities.</em></p></blockquote><p>A more capable model isn&#8217;t automatically a better fit. Sometimes it&#8217;s worse, because it reads the same instructions differently from the model you tuned them on.</p><p>I had no evals. No way to know how much had drifted, which instructions were being over-applied, or what a passing output even looked like quantitatively. I&#8217;d never defined what &#8220;sounds like me&#8221; meant in terms a test could check.</p><p>Anthropic&#8217;s <code>skill-creator</code>, the tool the team uses to build their own skills internally, has an explicit eval methodology. The core move is <em>paired runs</em>: for every test prompt, run the agent twice. Once with the skill, once without. You&#8217;re not measuring whether the output is good. You&#8217;re measuring whether it&#8217;s <em>better than baseline</em>, and by how much.</p><p>For a writing skill, not all assertions are scriptable. But some are: output length, sentence count, average sentence length, readability score. The rest is structured human review, with the previous output alongside the new one and a notes field. 
That&#8217;s what Anthropic&#8217;s <code>eval-viewer</code> in <code>skill-creator</code> produces.</p><p>I now keep a small 'Golden Set' per skill&#8212;a practice we&#8217;ll dissect in an upcoming post on automated skill validation&#8212;to ensure my voice doesn't drift when the underlying model changes. Three or four realistic prompts. Rerun the suite on every model bump, every skill edit. Check the deltas.</p><p><em><strong>It worked when I tested it</strong></em><strong> is not evidence. It&#8217;s the absence of measurement.</strong></p><div><hr></div><h2>What survives the post</h2><p>Four things should stick.</p><p><strong>Skills are loader specifications, not prompts.</strong> Frontmatter is a routing mechanism. SKILL.md is a triggered payload. References and scripts are deferred chapters. Once you see the architecture, every authoring decision becomes a question of <em>which level does this content belong at?</em></p><p><strong>Architecture decides cost.</strong> The same instructions, in the wrong shape, can consume 3&#215; the context window. That penalty compounds across every skill installed and every turn taken. The fix is structural, not prose-level.</p><p><strong>The agent has reasonable priors. Your environment doesn&#8217;t.</strong> Gotchas exist because the model&#8217;s defaults are correct on average and your situation isn&#8217;t average. Workspace paths, build systems, team conventions: none of it lives in the model&#8217;s training. It has to live in the skill.</p><p><strong>A model upgrade is not free.</strong> A skill tuned on one model is calibrated to that model&#8217;s compliance characteristics. A more capable model interprets your instructions instead of following them, and for skills that encode personal or organizational voice, that interpretation is the failure. 
The only way to know if an upgrade helped or hurt is to measure it.</p><div><hr></div><p><em>INTERNALS.md is a technical series on how production AI and Data systems actually work. No tutorials. No framework evangelism. Just the layer beneath.</em></p><p><em>If this was useful, the best thing you can do is share it with one engineer who&#8217;d care.</em></p><div><hr></div><h2>Sources</h2><ul><li><p><a href="https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview">Agent Skills overview</a>, Claude API documentation</p></li><li><p><a href="https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices">Agent Skills best practices</a>, Claude API documentation</p></li><li><p><a href="https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills">Equipping agents for the real world with Agent Skills</a>, Barry Zhang, Keith Lazuka, Mahesh Murag, Anthropic Engineering, October 2025</p></li><li><p><a href="https://github.com/anthropics/skills/blob/main/skills/skill-creator/SKILL.md">skill-creator/SKILL.md</a>, Anthropic skills repository</p></li><li><p><a href="https://agentskills.io/">Agent Skills open standard</a>, December 2025</p></li><li><p><a href="https://laxmena.com/the-day-you-became-a-better-writer">The Day You Became a Better Writer</a>, Lakshmanan Meiyappan</p></li><li><p><a 
href="https://web.archive.org/web/20240302003157/https://dilbertblog.typepad.com/the_dilbert_blog/2007/06/the_day_you_bec.html">Scott Adams&#8217; original post</a>, via Internet Archive</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Most people misunderstand LangGraph. Here’s what it actually is]]></title><description><![CDATA[Internals.md #1 - A breakdown of how LangGraph works under the hood&#8212;and how to think about it in real systems.]]></description><link>https://internals.laxmena.com/p/langgraph-internals-how-production</link><guid isPermaLink="false">https://internals.laxmena.com/p/langgraph-internals-how-production</guid><dc:creator><![CDATA[Lax Meiyappan]]></dc:creator><pubDate>Sat, 18 Apr 2026 21:30:55 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!NfQt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NfQt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NfQt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!NfQt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!NfQt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!NfQt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NfQt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2222345,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/194642827?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NfQt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png 424w, 
https://substackcdn.com/image/fetch/$s_!NfQt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!NfQt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!NfQt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffda0e620-0d59-4bc9-be95-d09ebab8bbdd_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Your agent worked yesterday. Today it&#8217;s returning wrong answers. No stack trace. No failed tool call. No exception. Just quietly incorrect output, shipping to production.</p><p>Here&#8217;s what probably happened. You added a node that writes to the same state key as an existing one, and LangGraph&#8217;s execution engine ran both in parallel. One write clobbered the other. The state is corrupt. Your agent looks fine because technically, it <em>is</em> running. It&#8217;s just running on a lie.</p><p>This post is about the engine that makes that possible and the one decision that prevents it.</p><div><hr></div><h2>Why Graphs Won the Agent Runtime</h2><p>DAGs (Directed Acyclic Graphs) are elegant when data flows in one direction. The moment an agent needs to retry, re-plan, or loop through tool calls until some condition holds, they fall apart. A ReAct loop is not a DAG. It&#8217;s a cycle.</p><p>And cycles are what production agents actually do.</p><p>LangGraph&#8217;s answer is to model agents as cyclic graphs with typed state. Nodes compute. Edges, including conditional ones, decide where execution goes next. State carries everything across steps. This isn&#8217;t a convenience; it&#8217;s the minimum structure that lets an agent loop without losing what it learned on the last pass.</p><p>The tradeoff is explicitness. You declare the schema up front. You declare the edges. The graph is compiled before it runs. In return, you get something most frameworks can&#8217;t give you: a runtime that pauses, resumes, replays, and parallelizes cleanly because the engine knows, precisely, what the graph is.</p><div><hr></div><h2>The Real Primitives: Actors and Channels</h2><p>The public API shows you <code>StateGraph</code>, nodes, and edges. 
The engine underneath doesn&#8217;t work that way.</p><p>Before we go technical, here&#8217;s the mental model that makes everything else click.</p><blockquote><h4>&#128161; Mental Model: The Autonomous Pancake House</h4><p>Stop thinking of a graph as a Boss shouting orders at Employees (function calling). Instead, imagine a kitchen that runs entirely on mailboxes (message passing).</p><ul><li><p><strong>The Channels (Mailboxes):</strong> There is a &#8220;Batter&#8221; mailbox and a &#8220;Plates&#8221; mailbox. They aren&#8217;t just boxes; they have rules written on the lid.</p></li><li><p><strong>The Actors (Specialized Cooks):</strong> The Fryer cook doesn&#8217;t wait for a command. She sits by the &#8220;Batter&#8221; mailbox. The moment batter appears, she wakes up, cooks it, and drops the result into the &#8220;Plates&#8221; mailbox.</p></li><li><p><strong>The Reducer (The Lid Rule):</strong> What if two cooks drop a pancake onto the same plate at once?</p><ul><li><p><em>No reducer:</em> The plate shatters; it only expects one item. (<code>InvalidUpdateError</code>)</p></li><li><p><em>With reducer (</em><code>operator.add</code><em>):</em> The rule on the lid says &#8220;Stack them.&#8221; Both pancakes land, and breakfast is saved.</p></li></ul></li></ul><p>In this kitchen, no one talks to each other. They only talk to the mailboxes. 
This is why LangGraph can pause, resume, or run ten cooks at once; the state is in the mailboxes, not in the cooks&#8217; heads.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VfVk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VfVk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png 424w, https://substackcdn.com/image/fetch/$s_!VfVk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png 848w, https://substackcdn.com/image/fetch/$s_!VfVk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png 1272w, https://substackcdn.com/image/fetch/$s_!VfVk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VfVk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png" width="1456" height="1294" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1294,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:290019,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/194642827?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VfVk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png 424w, https://substackcdn.com/image/fetch/$s_!VfVk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png 848w, https://substackcdn.com/image/fetch/$s_!VfVk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png 1272w, https://substackcdn.com/image/fetch/$s_!VfVk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6149cb6-75ed-4095-a322-5769dc6c2907_2700x2400.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>That&#8217;s the mental model. Here&#8217;s what it maps to in the engine.</p><p>Under the hood, LangGraph runs on a model borrowed from <a href="https://15799.courses.cs.cmu.edu/fall2013/static/papers/p135-malewicz.pdf">Google&#8217;s Pregel paper</a>. Its real primitives are <strong>actors</strong> and <strong>channels</strong>. Actors, called <code>PregelNode</code> internally, subscribe to channels, read from them, and write to them. Channels hold values. Reducers decide how those values update when multiple writes arrive in the same step.</p><p>Here&#8217;s the reframing that matters: a state key in your <code>StateGraph</code> <em>is</em> a channel. A reducer is that channel&#8217;s update function. When you annotate a field with <code>operator.add</code>, you&#8217;re configuring the channel to append on update. 
When you leave it unannotated, the channel overwrites - and if two actors write to it in the same step, the engine throws.</p><p>So &#8220;state&#8221; is not a dictionary that nodes mutate. State is a set of channels, each with its own update semantics. Nodes don&#8217;t call each other; they publish to channels. Other nodes subscribe. This is message passing, not function calling.</p><div><hr></div><h2>Supersteps: The Execution Model</h2><p>LangGraph executes in discrete steps called <strong>supersteps</strong>, and each one has three phases:</p><ol><li><p><strong>Plan</strong>: the engine inspects channel state and selects which actors to run</p></li><li><p><strong>Execute</strong>: selected actors run in parallel</p></li><li><p><strong>Update</strong>: writes are merged into channels through the reducers</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SgKb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SgKb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png 424w, https://substackcdn.com/image/fetch/$s_!SgKb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png 848w, https://substackcdn.com/image/fetch/$s_!SgKb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png 1272w, 
https://substackcdn.com/image/fetch/$s_!SgKb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SgKb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png" width="1456" height="1294" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1294,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:276367,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/194642827?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SgKb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png 424w, https://substackcdn.com/image/fetch/$s_!SgKb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png 848w, 
https://substackcdn.com/image/fetch/$s_!SgKb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png 1272w, https://substackcdn.com/image/fetch/$s_!SgKb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F643fb44c-32c3-4b80-9d53-32cb8f853c5d_2700x2400.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The crucial property is that a superstep is <strong>transactional</strong>. 
If any actor in the step raises an exception, the entire step&#8217;s writes are discarded. None of the parallel results land. This isn&#8217;t a bug - it&#8217;s the guarantee that makes checkpointing meaningful. You never observe a half-applied superstep.</p><p>Selection is deterministic, too. An actor fires only when a channel it subscribes to has new data. The engine loops until no actor has pending work, or a step limit is hit. This is Bulk Synchronous Parallel - the same model that powers Apache Spark&#8217;s graph processing layer.</p><p>Two consequences fall out of this design, and both matter at 2am in production.</p><p><strong>First, parallelism is free when nodes are independent.</strong> If two nodes subscribe to the same input channel and write to different channels, they run in the same superstep with no extra configuration. The engine figures it out.</p><p><strong>Second, concurrent writes to the same channel need a reducer.</strong> Without one, the engine has no way to know which write wins - so it refuses to guess.</p><p>That second consequence is where most production bugs live.</p><div><hr></div><h2>The Silent Corruption Problem</h2><p>Here&#8217;s the error the runtime throws when you get it wrong:</p><pre><code><code>InvalidUpdateError: At key 'todos':
Can receive only one value per step.
Use an Annotated key to handle multiple values.
</code></code></pre><p>This error is loud. It fires immediately. You see it, you fix it.</p><p>The silent version is worse.</p><p>If your graph has a single path today and no concurrent writes, the overwrite default works fine. You never see the error. Then, weeks later, you add a parallel branch. A fan-out pattern, a subagent, a retry. Suddenly two nodes land writes in the same step. If the bug fires only intermittently, you get something uglier than an exception: wrong answers in production, with no trace of why.</p><p>The fix is one line:</p><pre><code><code>import operator
from typing import Annotated, TypedDict

class Todo(TypedDict):  # placeholder item type; substitute your own
    content: str

class AgentState(TypedDict):
    messages: Annotated[list, operator.add]  # append, do not overwrite
    todos: Annotated[list[Todo], operator.add]  # concurrent writes are concatenated
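
# Illustrative sketch (an addition, not LangGraph's code): what a reducer-backed
# channel does with the writes that arrive in one superstep. merge_writes is a
# hypothetical helper in plain Python. It assumes the engine folds each write
# into the current value with the reducer, and it does not call LangGraph itself.
def merge_writes(current, writes, reducer=operator.add):
    for write in writes:
        current = reducer(current, write)
    return current

# Two parallel nodes each return a partial update for the same key.
writes_from_step = [
    [{"content": "plan"}],    # returned by node A
    [{"content": "search"}],  # returned by node B
]
merged_todos = merge_writes([], writes_from_step)  # both items survive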
</code></code></pre><p>This isn&#8217;t theoretical. It&#8217;s still happening in production frameworks. As of November 2025, running the official research example in <a href="https://github.com/langchain-ai/deepagentsjs/issues/65">deepagentsjs</a> throws this exact error - the <code>todos</code> state key has no reducer, so the underlying channel falls back to <code>LastValue</code>, which refuses concurrent updates. The fix is the same in every case: annotate the channel with a reducer like <code>operator.add</code> so concurrent writes append instead of collide. CopilotKit shipped the identical patch for their LangGraph integration in August 2025 (<a href="https://github.com/CopilotKit/CopilotKit/pull/2276">PR #2276</a>).</p><blockquote><p>&#9889; <strong>If you remember one thing from this post:</strong> every state key that could receive concurrent writes needs a reducer. The question isn&#8217;t whether you have parallel execution today. It&#8217;s whether you might add it next sprint.</p></blockquote><div><hr></div><h2>Compilation: The Step Most Writers Skip</h2><p><code>graph.compile()</code> is not a convenience call - it&#8217;s where LangGraph turns your declarative graph into an executable plan.</p><p>During compilation, the engine does four things:</p><ol><li><p>Validates that every conditional edge routes to a declared node</p></li><li><p>Builds the channel topology from your state schema and reducers</p></li><li><p>Constructs the actual <code>PregelNode</code> objects from your node functions</p></li><li><p>Freezes the graph - the compiled object is immutable</p></li></ol><p>What <code>compile()</code> returns is a <code>Pregel</code> instance. <em>That&#8217;s</em> what you invoke, stream from, and checkpoint. The <code>StateGraph</code> you built was a blueprint. The <code>Pregel</code> is the machine.</p><p>This matters because compilation errors are caught before a single token of inference runs. 
Return a value from a conditional edge that isn&#8217;t in the edge map, and compilation throws immediately &#8212; something like:</p><pre><code><code>ValueError: Expected one of ['tools', 'end'], got 'tool'
Check your conditional edge return values.
</code></code></pre><p>Typo in an edge name. Zero LLM calls wasted. That&#8217;s the contract <code>compile()</code> offers - and most agent frameworks can&#8217;t give you anything close to it.</p><div><hr></div><h2>Checkpointing Is Write-Ahead Logging</h2><p>Every database engineer already knows how checkpointing works, even if the docs don&#8217;t put it that way. LangGraph&#8217;s checkpointer is write-ahead logging applied to agent state.</p><p>After every superstep, the full channel state is serialized and persisted. On resume, the engine reads the latest checkpoint and continues from the superstep boundary. Simple idea, enormous consequences.</p><p>The persisted object is a <code>Checkpoint</code> - a versioned snapshot containing <code>channel_values</code>, <code>channel_versions</code>, and any pending writes from nodes that succeeded while a sibling was failing. The <code>thread_id</code> is the namespace key: one agent session, one thread. Different threads can&#8217;t see each other&#8217;s state.</p><p>One primitive, four use cases - really the same capability wearing different clothes:</p><ul><li><p><strong>Durable execution</strong> - crash mid-run, resume from the last checkpoint, lose nothing</p></li><li><p><strong>Human-in-the-loop</strong> - interrupt before a node, serialize, wait for approval, resume</p></li><li><p><strong>Time travel</strong> - load any historical checkpoint, replay from there, fork alternate paths</p></li><li><p><strong>Partial failure recovery</strong> - if one node fails mid-step, the completed parallel writes get stored as pending; on resume, only the failed node re-runs</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xfcb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xfcb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png 424w, https://substackcdn.com/image/fetch/$s_!xfcb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png 848w, https://substackcdn.com/image/fetch/$s_!xfcb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png 1272w, https://substackcdn.com/image/fetch/$s_!xfcb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xfcb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png" width="1456" height="1294" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1294,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:421929,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/194642827?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xfcb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png 424w, https://substackcdn.com/image/fetch/$s_!xfcb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png 848w, https://substackcdn.com/image/fetch/$s_!xfcb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png 1272w, https://substackcdn.com/image/fetch/$s_!xfcb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2b3caf3-e229-46f3-aebc-7bf7cc5d94df_2700x2400.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>The Postgres checkpointer shows the mechanics clearly. It maintains four tables. <code>checkpoints</code> holds state snapshots as JSONB. <code>checkpoint_blobs</code> holds large values as binary. <code>checkpoint_writes</code> logs pending writes from mid-step failures. <code>checkpoint_migrations</code> tracks schema versions. Each superstep is an insert; resume is a read of the latest row for a given <code>thread_id</code>.</p><p>The trap is write amplification at scale. A long-running agent with a growing message list writes the full state every superstep. 
If messages accumulate unboundedly and checkpoints fire every step, each checkpoint grows linearly with conversation length, so the total bytes written grow quadratically.</p><p>One developer <a href="https://blog.lordpatil.com/posts/langgraph-postgres-checkpointer">documented four-second reads</a> on long threads in their production chatbot - the pickled message history sitting in <code>checkpoint_blobs</code> had grown large enough that loading a conversation meant a database query, a binary download, and a full pickle deserialization before the UI could render.</p><p>The fix is to evict. Summarize old messages, offload large tool results, keep the hot state small. DeepAgents handles this with a <code>SummarizationMiddleware</code> that compacts conversation history once token usage crosses a threshold. That&#8217;s the pattern in one sentence: checkpointing is cheap only if the state stays bounded.</p><div><hr></div><h2>Subgraphs and the State Boundary Problem</h2><p>Every production agent eventually outgrows a single graph. A planner calls an executor. A router dispatches to specialists. A retrieval pipeline feeds into a synthesis step. LangGraph handles this natively - a compiled graph is itself callable as a node inside another graph.</p><p><strong>The mental model:</strong> a subgraph is a reusable piece of a graph. Think of it like a function - compile once, invoke from anywhere. Same runtime, same checkpointer, same supersteps. When the parent reaches the subgraph node, execution descends into the child, runs to completion, and returns to the parent&#8217;s next superstep.</p><h4>The boundary is where it gets interesting</h4><p>Parent and child have separate state schemas. They have to - otherwise every subgraph would inherit every key from every possible parent, and the schemas would explode.</p><p>When execution crosses the boundary, state must be mapped. LangGraph does this in one of two ways:</p><ul><li><p><strong>If the schemas share keys</strong>, state flows through those keys automatically. 
The parent&#8217;s <code>messages</code> channel wires to the child&#8217;s <code>messages</code> channel. Updates propagate.</p></li><li><p><strong>If the schemas don&#8217;t share keys</strong>, you pass state explicitly at invocation and transform the child&#8217;s output back into the parent&#8217;s shape. This is the boundary that silently breaks.</p></li></ul><p><strong>The failure mode:</strong> you write a subgraph with state key <code>documents</code>. Your parent has state key <code>retrieved_docs</code>. The subgraph runs, writes to <code>documents</code>, returns. The parent&#8217;s <code>retrieved_docs</code> is still empty. No error. No stack trace. Just a synthesis step running on zero documents, producing a confident-sounding but ungrounded answer.</p><p>The fix is explicit mapping:</p><pre><code><code>def call_retrieval(state: ParentState) -&gt; dict:
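    # Pass only the keys the child schema declares; nothing else crosses the boundary.
    result = retrieval_subgraph.invoke({"query": state["user_question"]})
    # Map the child's "documents" key onto the parent's "retrieved_docs" key.
    return {"retrieved_docs": result["documents"]}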
</code></code></pre><p>Treat subgraph boundaries like API contracts. Declare them explicitly. Validate the output shape.</p><div><hr></div><blockquote><h3>&#9888;&#65039; Subgraphs &#8800; Subagents</h3><p>Don&#8217;t confuse structural organization with behavioral autonomy.</p><ul><li><p><strong>Subgraphs are about code hygiene.</strong> Nested graphs that keep your main graph from becoming a spaghetti monster. They share the same execution engine and flow state through explicit channel mappings.</p></li><li><p><strong>Subagents are about context isolation.</strong> An autonomous loop with its own context window. It doesn&#8217;t just share state - it filters it, preventing <em>context pollution</em> where the messy reasoning of one specialist confuses the planner.</p></li></ul><p><strong>You use a subgraph when you want to repeat a pattern. You use a subagent when you want to delegate a problem.</strong></p></blockquote><div><hr></div><h2>DeepAgents: The Harness Around the Graph</h2><p>LangGraph gives you the runtime. DeepAgents gives you the architecture.</p><p>A vanilla ReAct loop fails in four predictable ways: shallow planning, context overflow, context pollution, state bloat. DeepAgents is a direct answer to each.</p><p>It&#8217;s an open-source harness from LangChain - a <code>CompiledStateGraph</code> wrapped with four middleware layers, each solving one failure mode:</p><ul><li><p><strong>TodoListMiddleware</strong> - <code>write_todos</code> forces the agent to decompose the task and write the plan into context before acting</p></li><li><p><strong>FilesystemMiddleware</strong> - <code>ls</code>, <code>read_file</code>, <code>write_file</code>, and friends let the agent offload large content to a virtual filesystem. The prompt holds pointers, not pages.</p></li><li><p><strong>SubAgentMiddleware</strong> - <code>task</code> spawns an isolated subagent with its own context window. Only the final result returns. 
The messy middle stays hidden.</p></li><li><p><strong>SummarizationMiddleware</strong> - watches token usage, compacts history when it crosses a threshold. Keeps checkpoints cheap.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!feVE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!feVE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png 424w, https://substackcdn.com/image/fetch/$s_!feVE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png 848w, https://substackcdn.com/image/fetch/$s_!feVE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png 1272w, https://substackcdn.com/image/fetch/$s_!feVE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!feVE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png" width="1456" height="1294" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1294,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:389505,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://internals.laxmena.com/i/194642827?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!feVE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png 424w, https://substackcdn.com/image/fetch/$s_!feVE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png 848w, https://substackcdn.com/image/fetch/$s_!feVE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png 1272w, https://substackcdn.com/image/fetch/$s_!feVE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F720e5369-ce0f-4b84-9733-5d1af4997283_2700x2400.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>Notice what DeepAgents <em>doesn&#8217;t</em> invent. Every capability sits on top of LangGraph primitives - channels, reducers, checkpointing, subgraphs. DeepAgents isn&#8217;t a different framework. It&#8217;s a set of opinionated patterns for using LangGraph well at scale.</p><p>That distinction matters more than it looks. If you understand LangGraph&#8217;s internals, DeepAgents reads as a library of workable defaults. If you don&#8217;t, it reads as magic. And magic becomes impossible to debug the first time something breaks.</p><div><hr></div><h2>Five Failure Modes You Will See in Production</h2><p>Every one of these is documented - in GitHub issues, official docs, or production writeups.</p><div><hr></div><p><strong>1. 
Concurrent write without a reducer.</strong> <code>InvalidUpdateError: At key 'X': Can receive only one value per step.</code> Two actors wrote to the same channel in one superstep without an annotated reducer. The engine refuses to guess which write wins. Fix: <code>Annotated[list, operator.add]</code> or a custom reducer. Sources: <a href="https://docs.langchain.com/oss/python/langgraph/errors/INVALID_CONCURRENT_GRAPH_UPDATE">Official docs</a> &#183; <a href="https://github.com/langchain-ai/deepagentsjs/issues/65">deepagentsjs #65</a></p><p>This is the first bug every multi-agent system ships.</p><div><hr></div><p><strong>2. Empty update from a node.</strong> <code>InvalidUpdateError: Must write to at least one of [...]</code> A node returned nothing. Happens most often with conditional routing nodes that forget to return a payload on certain branches. Fix: Always return an explicit dict - even <code>{"messages": []}</code> satisfies the engine. Sources: <a href="https://github.com/langchain-ai/langgraph/issues/740">langgraph #740</a> &#183; <a href="https://github.com/langchain-ai/langgraph/issues/2644">langgraph #2644</a></p><p>The engine is strict by design - it would rather throw than guess.</p><div><hr></div><p><strong>3. State bloat at checkpoint time.</strong> Checkpoint reads in seconds, not milliseconds. Unbounded message lists, large tool results stored inline. Every superstep serializes the full channel state. Messages that grow without a cap grow the checkpoint linearly. Fix: Summarize or offload. Keep the hot state small. Source: <a href="https://blog.lordpatil.com/posts/langgraph-postgres-checkpointer">lordpatil, July 2025</a> - four-second reads on a production chatbot, traced to <code>checkpoint_blobs</code>.</p><p>Checkpointing is only free if you treat state as precious. Most people don&#8217;t, until this.</p><div><hr></div><p><strong>4 &amp; 5. 
Subgraph state silently not flowing / Serialization failures.</strong></p><p>These appear less as sudden crashes and more as slow-burn confusion. Schema mismatch at a subgraph boundary (failure 4) produces no error - just a parent state that never updates, and an agent that silently runs on stale data. Fix: explicit key mapping at the invocation site. (<a href="https://langchain-ai.github.io/langgraph/how-tos/subgraph/">Official subgraph docs</a>)</p><p>Serialization failures (failure 5) fire during <code>put_writes</code> or resume: <code>TypeError: Object of type X is not JSON serializable</code>. The <code>JsonPlusSerializer</code> uses ormsgpack; anything it can&#8217;t encode breaks the checkpoint. Fix: keep only serializable objects in state, or use <code>JsonPlusSerializer(pickle_fallback=True)</code> for DataFrames and custom types. (<a href="https://github.com/langchain-ai/langgraph/issues/3441">langgraph #3441</a> &#183; <a href="https://github.com/langchain-ai/langgraph/issues/5769">langgraph #5769</a>)</p><div><hr></div><p>Each of these is the kind of bug you ship without noticing. Each has a one-line fix once you see it. 
The difference between a senior engineer and a principal one on agent code is knowing which to check first.</p><div><hr></div><h2>What to Take Away</h2><p>Three things that should survive this post.</p><p><strong>LangGraph is a message-passing runtime with transactional supersteps - not a graph of function calls.</strong> Once you see channels and reducers as the real primitives, everything else follows: parallelism, checkpointing, subgraphs, even DeepAgents.</p><p><strong>Every state key that might receive concurrent writes needs a reducer.</strong> The overwrite default is safe until it isn&#8217;t: concurrent writes in one superstep raise <code>InvalidUpdateError</code>, and a later write without a reducer silently replaces the earlier value.</p><p><strong>DeepAgents is a library of patterns for managing the context budget.</strong> Planning, filesystem offloading, subagent isolation, summarization - four answers to the same underlying question: how do you keep the hot state small while the task stays long?</p><div><hr></div><p><em>Next issue: <strong>Embeddings Internals.</strong> Why cosine similarity gets weird in high dimensions. What contrastive training actually learns. Why a general-web embedding model will silently fail on your domain data - and how to know when to fine-tune versus pick a better base model.</em></p><div><hr></div><p><em>INTERNALS.md is a technical series on how production AI systems actually work. No tutorials. No framework evangelism. Just the layer beneath.</em></p><p><em>If this was useful, the best thing you can do is share it with one engineer who&#8217;d care.</em></p><div><hr></div><p></p>]]></content:encoded></item></channel></rss>