The Productive Half-Life of AI Agents
Leaders keep asking a version of the same question: how long can an agent run before a human needs to step in? Call that interval the productive half-life: the period during which an AI agent stays useful, safe, and aligned without direct oversight. The framing has solid research behind it. METR’s time-horizon studies estimate, across domains, the length of task (measured by how long it takes a human) that an agent completes with 50% reliability. OSWorld, RealWebAssist, TOOLATHLON, and SWE-bench Verified test long, messy task chains rather than isolated steps. Research on trust in automation (Lee & See; the CHI 2019 Guidelines for Human–AI Interaction) and on mixed-initiative interaction (Horvitz; Hearst; Bradshaw) helps teams decide when people should re-enter the loop.
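
To make the 50% reliability idea concrete, here is a minimal sketch of how a time horizon could be estimated from your own agent runs. It assumes a simple logistic fit of success against the log of task length, which is one plausible reading of the time-horizon approach rather than METR's exact procedure. The function name `fit_time_horizon` and the run data are hypothetical and made up for illustration.

```python
import math

# Hypothetical data: (task length in human-minutes, did the agent succeed?)
HYPOTHETICAL_RUNS = [
    (1, True), (2, True), (4, True), (8, True),
    (15, False), (15, True), (30, True), (30, False),
    (60, False), (60, True), (120, False), (240, False), (480, False),
]

def fit_time_horizon(runs, steps=5000, lr=0.1):
    """Fit P(success) = sigmoid(a + b * (log2(minutes) - mean)) by plain
    gradient ascent on the log-likelihood, then return the task length
    (in minutes) where the fitted P(success) equals 0.5, plus the slope b."""
    xs = [math.log2(minutes) for minutes, _ in runs]
    ys = [1.0 if ok else 0.0 for _, ok in runs]
    mean_x = sum(xs) / len(xs)
    centered = [x - mean_x for x in xs]  # centering keeps the fit well conditioned
    a, b = 0.0, 0.0
    n = len(runs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(centered, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += y - p
            grad_b += (y - p) * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    # P(success) = 0.5 where a + b * (log2(t) - mean_x) = 0
    horizon = 2 ** (mean_x - a / b)
    return horizon, b

horizon_minutes, slope = fit_time_horizon(HYPOTHETICAL_RUNS)
print(f"Estimated 50% time horizon: {horizon_minutes:.0f} minutes")
```

A negative slope means success becomes less likely as tasks get longer, and the returned horizon is the task length at which the fitted curve crosses 50%. With real logs you would want many more runs per duration bucket and some measure of uncertainty before acting on the number.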
