EP001 — Securing Agents That Use Tools (ClawGuard)
AI agents that can use tools — browse the web, read files, call APIs — face a serious vulnerability called indirect prompt injection. Today we look at ClawGuard, a runtime security framework that takes a different approach than training models to refuse: it sits between the AI and its tools, checking each action against rules the user has pre-approved. The shift from behavioral to architectural security, with hospital surgical safety checklists as the cross-domain parallel.
Cross-domain connection
Hospital surgical safety checklists / Atul Gawande / WHO Safe Surgery Saves Lives. Architectural intervention (checklist enforces verification at execution boundaries) over behavioral intervention (training surgeons more).
Concepts introduced
- Indirect prompt injection (named, with web/files/APIs scoping)
- Behavioral vs. architectural security (the show's running frame, established here at the outset)
- Tool-boundary access control as architectural pattern
- Why training models to resist manipulation doesn't reliably work