When Agents Fail to Act: A Diagnostic Framework for Tool Invocation Reliability in Multi-Agent LLM Systems
This research introduces a comprehensive 12-category error taxonomy for diagnosing tool invocation failures in multi-agent LLM systems, offering a scalable f...