Model Indirection Layer

Type
Concept
Published
2026-04-19
Aliases
model abstraction, indirection layer, model proxy, application layer
Brief definition

An architectural layer between application code and the model API that decouples business logic from any single provider, prompt format, or model version — turning model swaps and prompt changes into configuration rather than code rewrites.

What it is

The simplest version of an LLM application calls a model API directly: client.complete(prompt) lives inside the same function that handles the user’s request. Santiago Valdarrama describes this as the single largest architectural mistake practitioners make — and proposes the fix as “the simplest trick you can use to improve the architecture of whatever you are building”: don’t talk directly to a model. Add one intermediate layer between application code and the model.

That layer typically owns prompt assembly (combining system instructions, retrieved context, user input), provider routing (Claude vs GPT-4 vs an open model, chosen by task or cost), version pinning, response parsing and validation, and observability hooks. The application calls something like agent.respond(intent, payload); the layer below handles every detail of how that maps onto a specific model call. When the model API changes, when a new provider becomes preferable, or when prompts need iterating, only the layer changes — application logic stays untouched.

The same architectural impulse explains why earlier monolithic LLM tooling has aged poorly. Sam Hogan catalogues a long list — “RAG, GraphRAG, Multi Agent Orchestration, ReAct frameworks, prompt management tools, LLMOps tooling, eval tools, gateways, fine-tuning libraries” — that he argues were obsoleted by the agentic-coding shift. The frameworks that bound applications tightly to their assumptions broke when the assumptions changed; lightweight indirection survived.

Why it matters

The indirection layer is the practical mechanism behind the harness mindset. It is also the difference between a prototype that ships and a system that adapts: a model call buried in business logic costs hours to migrate; the same call behind an interface costs minutes. For teams running production agents, indirection is what makes provider competition usable rather than disruptive.