EP002 — Memory That Slowly Turns (MemEvoBench)
Last episode we talked about keeping AI agents from being attacked. Today we look at the failure mode that emerges when no one is attacking the agent at all: the agent's own memory drifts over time through accumulated biased input. We treat memory misevolution as a path-dependent phenomenon, with institutional drift and the Overton window as the cross-domain parallel.
Cross-domain connection
Institutional drift and path-dependent belief formation. Individuals, institutions, and media ecosystems do not usually fail by being persuaded by a single compelling falsehood. They fail by accumulating biased inputs, each too weak to matter on its own, whose cumulative effect shifts the evaluative baseline. This is a widely recognizable structural pattern (Overton window, mission drift, repeated-exposure belief formation). Where the analogy holds: the drift mechanism is shared across substrates. Where it breaks: corrective mechanisms (institutions have audit, peer pressure, and contradiction; default LLM agents have none).
Concepts introduced
- Persistent agent memory (the "notebook" framing, defined here for the audio audience even though it also appeared in text EP003)
- Memory misevolution as a NAMED phenomenon (drift through accumulated biased inputs, not single-prompt attack)
- Three attack vectors: adversarial memory injection, noisy tool outputs, biased feedback (briefly named, not exhaustively detailed)
- Path-dependence — what counts as plausible at time T is set by inputs at T-1 (institutional analogy carries this)
- Multi-turn vs. single-turn agent evaluation as a methodological gap
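
A toy sketch of the drift and path-dependence ideas above, for anyone who wants to see the mechanism in miniature. Everything here is an assumption made for illustration: the class DriftingMemory, the function drift_until_accepted, and the numeric "baseline" standing in for an agent's stored beliefs are hypothetical and are not taken from MemEvoBench or any agent framework. The memory rejects any input too far from its current baseline, but each accepted input moves the baseline a little.

```python
from dataclasses import dataclass


@dataclass
class DriftingMemory:
    """Hypothetical toy model: one number stands in for what the agent 'believes'."""
    baseline: float = 0.0   # current evaluative baseline
    tolerance: float = 1.0  # inputs further than this from the baseline are rejected
    rate: float = 0.5       # fraction of the gap closed when an input is accepted

    def observe(self, value: float) -> bool:
        """Accept the input only if it looks plausible relative to the current baseline."""
        if abs(value - self.baseline) > self.tolerance:
            return False  # implausible at this point in time: no update
        self.baseline += self.rate * (value - self.baseline)
        return True


def drift_until_accepted(memory: DriftingMemory, target: float,
                         nudge: float, max_steps: int = 1000) -> int:
    """Count how many small biased inputs it takes before `target` is accepted."""
    steps = 0
    while not memory.observe(target):
        if steps >= max_steps:
            raise RuntimeError("target never became plausible")
        # Each nudge sits inside the tolerance window, so it is accepted
        # and shifts the baseline slightly toward the target.
        memory.observe(memory.baseline + nudge)
        steps += 1
    return steps


if __name__ == "__main__":
    memory = DriftingMemory()
    print("single-shot accepted:", memory.observe(5.0))
    print("nudges needed:", drift_until_accepted(memory, target=5.0, nudge=0.5))
    print("baseline afterwards:", memory.baseline)
```

The single large input is rejected outright, while the same endpoint is reached through a run of individually acceptable nudges: what counts as plausible at time T depends on what was accepted at T-1. A single-turn evaluation would only ever exercise the first call, which is the methodological gap the last bullet points at.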