Beyond Reactivity: Measuring Proactive Problem Solving in LLM Agents

This research introduces PROBE, a benchmark that decomposes proactive problem solving in LLM agents into latent state inference, bottleneck detection, and autono...

Level: advanced

By Gil Pasternak and 6 other authors

Category: research