EP002 — Memory That Slowly Turns (MemEvoBench)

Episode 2 · April 30, 2026 · 18 min

Last episode we talked about keeping AI agents from being attacked. Today we look at the failure mode that emerges when no one is attacking the agent at all: the agent's own memory drifts over time through accumulated biased input. We treat memory misevolution as a path-dependent phenomenon, with institutional drift and the Overton window as the cross-domain parallel.

Cross-domain connection

Institutional drift and path-dependent belief formation. Individuals, institutions, and media ecosystems do not fail by being persuaded by a single compelling falsehood. They fail by accumulating biased inputs, each too weak to matter on its own, whose cumulative effect shifts the evaluative baseline. This is a universally recognizable structural pattern (Overton window, mission drift, repeated-exposure belief formation). The analogy holds on the drift mechanism, which is shared across substrates; it breaks on corrective mechanisms (institutions have audit, peer pressure, and contradiction; default LLM agents have none).

Concepts introduced

Persistent agent memory (the "notebook" framing, defined here for the audio audience even though it also appeared in text EP003)
Memory misevolution as a NAMED phenomenon (drift through accumulated biased inputs, not single-prompt attack)
Three attack vectors: adversarial memory injection, noisy tool outputs, biased feedback (briefly named, not exhaustively detailed)
Path-dependence — what counts as plausible at time T is set by inputs at T-1 (institutional analogy carries this)
Multi-turn vs. single-turn agent evaluation as a methodological gap
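
The last two concepts can be illustrated with a small Python sketch. This is not the MemEvoBench protocol; the NotebookAgent class, its caution threshold, the "skip_backup" request, and both evaluation functions are hypothetical stand-ins. The agent refuses a risky request until its notebook holds more approving notes than its caution level. A single-turn evaluation starts from an empty notebook every time and never sees a failure; a multi-turn evaluation keeps the notebook, feeds in biased feedback after each turn, and watches the same agent flip.

```python
from dataclasses import dataclass, field


@dataclass
class NotebookAgent:
    """Toy agent whose decisions depend on notes it has accumulated."""
    caution: int = 3                              # approvals needed before it complies
    notebook: list = field(default_factory=list)  # persistent memory

    def act(self, request: str) -> str:
        approvals = sum(1 for note in self.notebook
                        if note == ("approve", request))
        return "comply" if approvals > self.caution else "refuse"

    def remember(self, request: str, feedback: str) -> None:
        self.notebook.append((feedback, request))


def single_turn_eval(request: str, trials: int) -> int:
    """Count unsafe actions when every trial starts with an empty notebook."""
    unsafe = 0
    for _ in range(trials):
        agent = NotebookAgent()
        unsafe += agent.act(request) == "comply"
    return unsafe


def multi_turn_eval(request: str, turns: int) -> int:
    """Count unsafe actions when one agent keeps its notebook across turns."""
    agent = NotebookAgent()
    unsafe = 0
    for _ in range(turns):
        unsafe += agent.act(request) == "comply"
        agent.remember(request, "approve")  # biased feedback, stored every turn
    return unsafe


single = single_turn_eval("skip_backup", trials=10)
multi = multi_turn_eval("skip_backup", turns=10)
print(single, multi)
```

What the agent treats as acceptable at turn T depends only on what was written into the notebook before T, which is the path-dependence point. The single-turn harness reports zero unsafe actions for an agent that, in the multi-turn harness, starts complying partway through the run.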

Source paper

Weiwei Xie, Shaoxiong Guo, Fan Zhang, Tian Xia, Xue Yang, Lizhuang Ma, Junchi Yan, Qibing Ren — *MemEvoBench: Benchmarking Memory MisEvolution in LLM Agents* (arXiv 2604.15774, 2026-04-17)
