RLMs, Context Rot, and Recursive Orchestrators

Part of the Agent Engineering Playbook.
There is a limit to how far you can get by shoving more text into a model and hoping the reasoning stays sharp.
Vendors keep making context windows larger, and that absolutely helps. But anyone who has worked long enough with very large sessions knows the deeper problem: even when the model can technically fit the context, its performance still starts to decay. Relevant details get ignored. Local salience wins over global structure. The session becomes heavy and vaguely stale.
Alex Zhang calls this "context rot," and it is the right term for the phenomenon.
What an RLM Is Trying to Change
In Zhang's blog post and the official RLM repository, a Recursive Language Model is presented as a thin wrapper around a normal model call. Instead of always sending the full prompt and context directly into one completion, the system places the context into an environment and lets the model inspect it, manipulate it, and recursively call itself over subsets of that context.
The official repo describes this as replacing llm.completion(prompt, model) with rlm.completion(prompt, model). That is the important conceptual shift. The model no longer has to swallow the whole world at once; it can treat the context as an external environment and recurse over it.
In the authors' implementation, the environment is a REPL. The root model can write code, inspect variables, partition large context, and launch sub-calls against smaller pieces. The context becomes something the model works over, not something it must always fully ingest.
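To make that shape concrete, here is a minimal sketch, with loud caveats: llm_completion, ContextEnv, and rlm_completion are names invented for this illustration (the repo's actual interface is simply rlm.completion in place of llm.completion), and the hard-coded chunk-and-recurse strategy stands in for decisions the real system leaves to the model inside the REPL.

```python
# Illustrative sketch only, not the authors' implementation. The partitioning
# here is fixed; in the real RLM setup the root model itself decides how to
# inspect and split the context from inside a REPL environment.

def llm_completion(prompt: str, model: str = "base-model") -> str:
    """Stand-in for a normal single-shot model call."""
    raise NotImplementedError("wire up a real provider here")


class ContextEnv:
    """Holds the large context outside the root prompt."""

    def __init__(self, text: str, chunk_size: int = 4_000):
        self.chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    def peek(self, i: int, n: int = 200) -> str:
        """The kind of cheap inspection a REPL environment can expose."""
        return self.chunks[i][:n]


def rlm_completion(prompt: str, env: ContextEnv, model: str = "base-model", depth: int = 2) -> str:
    """Recurse over slices of the context instead of ingesting it all at once."""
    if depth == 0 or len(env.chunks) <= 1:
        # Base case: the remaining slice is small enough to answer over directly.
        return llm_completion(f"{prompt}\n\nContext:\n{''.join(env.chunks)}", model)

    # Recursive case: answer per slice, then synthesize the partial answers.
    partials = [
        rlm_completion(prompt, ContextEnv(chunk), model, depth - 1)
        for chunk in env.chunks
    ]
    return llm_completion(
        f"{prompt}\n\nPartial answers from context slices:\n" + "\n---\n".join(partials),
        model,
    )
```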
That is a much bigger idea than it first appears.
Why This Matters for Agent Systems
Most orchestration discussions focus on task decomposition across agents. RLMs point to a second axis: context decomposition inside inference itself.
Those are related, but not the same.
A planner can break a software task into subproblems. An RLM-style system can break a massive context into subqueries and route the reasoning over smaller slices. The first problem is organizational. The second is cognitive.
That distinction matters because many agent failures that look like planning problems are really context problems. The agent is not always "bad at orchestration." Sometimes it is just thinking through a polluted or overstuffed window.
If You Came Looking for an "RLM Orchestrator"
The useful thing to copy is the architecture, not the label.
I did not find a single canonical system officially named rlm-orchestrator that you can treat as a standard product. What does exist, and what is worth studying, is the RLM approach itself: store large context outside the main prompt, let the model query or transform it through an environment, and make recursive calls part of the inference strategy.
That means a practical "RLM orchestrator" is less likely to be a branded framework than a design pattern (sketched in code after this list):
- keep massive context out of the root model's hot path
- give the model structured ways to inspect and partition that context
- let it recurse on smaller slices
- record the trajectory so the process is debuggable
That is the part worth building toward.
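To make that pattern concrete, here is a rough first cut. Everything in it is hypothetical scaffolding: the ContextStore class, the grep_context / read_slice / recurse tools, and the trajectory format are my own illustration of the four bullets above, not any framework's API.

```python
# Hypothetical scaffolding for the design pattern above; names and formats
# are invented for illustration, not taken from any existing framework.

import json
import re
from dataclasses import dataclass, field


@dataclass
class ContextStore:
    """Keeps the massive context out of the root model's hot path."""

    text: str
    trajectory: list = field(default_factory=list)

    def _log(self, op: str, **kwargs) -> None:
        # Record every interaction so the whole run can be replayed and debugged.
        self.trajectory.append({"op": op, **kwargs})

    def grep_context(self, pattern: str, window: int = 80) -> list:
        """Structured inspection: small snippets around matches, never the whole text."""
        hits = [
            self.text[max(m.start() - window, 0): m.end() + window]
            for m in re.finditer(pattern, self.text)
        ]
        self._log("grep", pattern=pattern, n_hits=len(hits))
        return hits[:10]

    def read_slice(self, start: int, end: int) -> str:
        """Partitioning: hand back one bounded slice at a time."""
        self._log("read_slice", start=start, end=end)
        return self.text[start:end]

    def recurse(self, prompt: str, start: int, end: int, call_model) -> str:
        """Recursion hook: run a sub-call over a single slice only."""
        self._log("recurse", start=start, end=end)
        return call_model(f"{prompt}\n\nContext slice:\n{self.text[start:end]}")

    def dump_trajectory(self) -> str:
        """Debuggability: the full history of how the context was navigated."""
        return json.dumps(self.trajectory, indent=2)
```

The specific tools matter less than the property they share: every inspection, partition, and sub-call is bounded and logged, which is what makes the process debuggable later.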
Why This Is More Than Retrieval
It is tempting to hear all this and think: fine, so this is just fancy retrieval.
Not quite.
Retrieval systems decide what small set of context to pull into a model. RLMs give the model more freedom to decide how to inspect, partition, and recurse over the context using an environment. Zhang's write-up is explicit about this difference. The model is not just handed chunks; it can manipulate the context and call sub-queries recursively.
That matters for tasks where the decomposition is not obvious ahead of time.
If you already know exactly which documents matter, retrieval may be enough. If the model has to discover the relevant structure while solving the task, a recursive environment can become much more powerful.
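The difference shows up clearly in pseudocode. In the sketch below, retriever, env, and llm are assumed interfaces rather than real libraries: the retrieval version fixes its slices before the model sees anything, while the recursive version lets the model keep choosing which slice to open next and decide when it has enough.

```python
# Hypothetical interfaces: retriever.top_k, env.outline / env.read_slice, and
# llm.complete / llm.next_action are assumptions made for this comparison.

def answer_with_retrieval(question: str, retriever, llm) -> str:
    # The decomposition is decided before the model sees anything.
    chunks = retriever.top_k(question, k=5)
    return llm.complete(question + "\n\nContext:\n" + "\n".join(chunks))


def answer_with_recursive_env(question: str, env, llm, max_steps: int = 8) -> str:
    # The model discovers the decomposition while solving the task: at each
    # step it can open another slice of the context or stop and answer.
    notes = []
    for _ in range(max_steps):
        action = llm.next_action(question, notes, env.outline())
        if action["type"] == "answer":
            return action["text"]
        notes.append(env.read_slice(action["start"], action["end"]))
    return llm.complete(question + "\n\nGathered notes:\n" + "\n".join(notes))
```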
Where I Think This Leads
Today, most coding agents still behave like wide-context chat systems with tool use bolted on. Some of them add subagents. Some add ledgers. Some add queues. But the inference model underneath is usually still "keep the thread going and compact when you have to."
I do not think that will be the long-term shape.
I think the systems that win on deep, long-horizon work will start combining two ideas:
First, explicit orchestration of tasks, roles, and handoffs. Second, explicit orchestration of context itself. The model should not merely receive context. It should navigate it.
That is why RLMs feel important even in their early form. They are not just another benchmark trick. They suggest a different substrate for long-running agent systems.
A Useful Standard of Skepticism
At the same time, it is worth staying honest. The RLM work is early. The results are exciting, but they come from a particular research setup, and the engineering tradeoffs are still real. Recursive systems add latency, tracing complexity, and new failure modes. Environments need to be safe. Recursive calls need budgets. Debugging can get harder before it gets easier.
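Even a naive budget guard, sketched here with made-up limits, hints at the plumbing a recursive setup needs before it is safe to run unattended:

```python
# Illustrative only: the RecursionBudget class and its default limits are
# invented here to show the kind of guardrail recursive calls require.

import time


class RecursionBudget:
    def __init__(self, max_calls: int = 50, max_depth: int = 3, max_seconds: float = 120.0):
        self.max_calls = max_calls
        self.max_depth = max_depth
        self.deadline = time.monotonic() + max_seconds
        self.calls = 0

    def charge(self, depth: int) -> None:
        """Call this before every sub-call; raises once any limit is exceeded."""
        self.calls += 1
        if self.calls > self.max_calls:
            raise RuntimeError("recursive call budget exhausted")
        if depth > self.max_depth:
            raise RuntimeError("recursion depth limit exceeded")
        if time.monotonic() > self.deadline:
            raise RuntimeError("wall-clock budget exhausted")
```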
So I would not read RLMs as "the answer is here." I would read them as a strong clue about the direction of the answer.
If you want to see what happens when the orchestration side of this story gets pushed very far in practice, read What Gas Town Is Really Building. Gas Town does not implement RLMs, but it is one of the clearest examples of somebody trying to industrialize multi-agent work instead of merely talking about it.