Research

SentinelBench tests whether AI agents can wait well

SentinelBench is a 100-task benchmark for monitoring agents. Use it to test patience, latency and tool spend before you ship.

Lars Cornelissen · Jun 6, 2026