A Plumbline production

EP006 — A Small Loop That Acts Like a Deep Model (Looped Reasoning)

Episode 6 · May 4, 2026 · 18 min

The dominant recipe for building capable language models is depth: more layers, more parameters, more distinct transformer blocks. A new mechanistic interpretability paper from Nam, Gromov, Yaida, and colleagues asks what happens when you take a much smaller model and run the same block over and over in a loop. Inside, the internal state traces cyclic trajectories that settle into fixed points; the attention heads stay consistent across iterations; each pass through the loop does the work that a separate layer would do in a traditional deep model. The cross-domain parallel: an assembly line versus a solo sculptor walking around the same piece of marble many times. The forward question is where capability actually lives: in the parameters, in the computation graph, or in the trajectory through state-space.
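
For concreteness, here is a minimal PyTorch sketch of what "running the same block in a loop" means: one weight-tied transformer block applied repeatedly, with a per-pass measure of how far the hidden state moves. Everything here is illustrative rather than drawn from the paper; the pre-norm block structure, the names `LoopedBlock` and `run_loop`, and the hyperparameters are assumptions, and a randomly initialized block will generally not settle the way a trained looped model does.

```python
# Minimal sketch of a looped (weight-tied) transformer block.
# Assumed structure: a standard pre-norm block; not the paper's architecture.
import torch
import torch.nn as nn

class LoopedBlock(nn.Module):
    """One transformer block whose weights are reused on every iteration."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out                  # residual update from attention
        x = x + self.mlp(self.norm2(x))   # residual update from the MLP
        return x

def run_loop(block, x, n_iters=12, tol=1e-4):
    """Apply the same block repeatedly, tracking how far each pass moves
    the hidden state; shrinking steps suggest approach to a fixed point."""
    with torch.no_grad():
        for t in range(n_iters):
            x_next = block(x)
            step = (x_next - x).norm() / x.norm()
            print(f"iteration {t:2d}: relative update {step:.4f}")
            x = x_next
            if step < tol:  # state has (approximately) stopped moving
                break
    return x

block = LoopedBlock()
tokens = torch.randn(1, 16, 64)  # (batch, sequence, d_model)
final_state = run_loop(block, tokens)
```

The contrast with a traditional deep model is in the first line of the loop: a 12-layer model would instantiate 12 distinct blocks, each visited once, whereas here a single block's parameters are visited 12 times and the hidden-state trajectory carries all the intermediate progress.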

Cross-domain connection

Picture a stonemason carving a sculpture by walking around the same block of marble many times, taking off a sliver each pass, never doing the whole thing in one cut. The traditional deep model is the assembly line: a different worker at each station, each performing one specialized operation, and the piece moves through once and emerges finished. The looped model is the solo sculptor: one craftsperson, the same hands going around the piece many times, each pass refining what the previous one set up. The analogy holds in that the results are comparable, the paths differ in resource use, the iterative process does real work even though no individual pass looks transformative, and both converge to a recognizable finished form. It breaks on episodic memory: the sculptor remembers the previous pass and adjusts deliberately, while the looped model has no memory of its own iterations, only the running internal state. The forward question for the show: if doing more by repeating the same thing is competitive with doing more by being deeper, where does capability live in a neural network? In the parameters, in the computation graph, or in the trajectory through state-space?

Concepts introduced

Weight-tied (looped) transformer blocks; fixed-point convergence of the hidden state; iteration-consistent attention heads; depth versus iteration as competing sources of capability.

Source paper

Yoonsoo Nam, Andrey Gromov, Sho Yaida, Adriano Pinto-Reichelt, Sara Solla, and Tianlin Liu, *A Mechanistic Analysis of Looped Reasoning Language Models* (arXiv:2604.11791, 25 April 2026)