Act or Escalate? Evaluating Escalation Behavior in Automation with Language Models

This research investigates how language models balance acting autonomously against escalating to a human, revealing critical flaws in their self-assessment of corre...

Level: advanced

By Matthew DosSantos DiSorbo, Harang Ju

Category: research