Pinpointing the Culprit: Automated Failure Attribution in Multi-Agent LLM Systems
The Growing Complexity of Multi-Agent Systems
Large language model (LLM) multi-agent systems are now widely used to tackle complex tasks collaboratively. By breaking a problem into subtasks and assigning them to specialized agents, these systems can achieve strong results. That complexity also makes them fragile. A single misstep, whether an agent misinterprets instructions, fails to share critical information, or produces an erroneous output, can derail the entire task. Developers then have to sift through long interaction logs to find the root cause, much like looking for a needle in a haystack. This manual debugging is time‑consuming and depends heavily on the developer’s expertise, which slows system iteration and optimization.

Why Failures Are Hard to Diagnose
The autonomous nature of agent collaboration leads to long information chains where mistakes can propagate silently. Even when a system fails visibly, the exact moment and responsible agent may remain hidden. For example, a planning agent might pass an ambiguous goal to a research agent, which then retrieves incomplete data, causing a synthesis agent to generate a flawed report. Without automated tools, developers must replay the entire sequence to spot the error.
Introducing Automated Failure Attribution
To address this challenge, researchers from Penn State University and Duke University, in collaboration with Google DeepMind, the University of Washington, Meta, Nanyang Technological University, and Oregon State University, have proposed a novel research problem: automated failure attribution. The goal is to automatically determine which agent caused a failure and when in the process the mistake occurred. To study the problem, the team built Who&When, the first benchmark dataset dedicated to this task, and evaluated multiple attribution methods on it.
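Because the task has two parts (who and when), a prediction can be right about the agent yet wrong about the step. The sketch below shows one natural way to score both parts separately. The `Attribution` type, the `score` function, and the metric names are illustrative assumptions, not the benchmark's actual evaluation code.

```python
from typing import Dict, List, NamedTuple


class Attribution(NamedTuple):
    """A (who, when) answer; both field names are illustrative."""
    agent: str  # name of the agent blamed for the failure
    step: int   # index of the step where the mistake occurred


def score(predictions: List[Attribution], labels: List[Attribution]) -> Dict[str, float]:
    """Return the fraction of logs where the agent, and the step, were identified."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one labelled log")
    n = len(labels)
    agent_hits = sum(p.agent == y.agent for p, y in zip(predictions, labels))
    step_hits = sum(p.step == y.step for p, y in zip(predictions, labels))
    return {"agent_accuracy": agent_hits / n, "step_accuracy": step_hits / n}


# Two labelled failures; the first prediction names the right agent
# but misses the step by one.
labels = [Attribution("planner", 2), Attribution("researcher", 5)]
predictions = [Attribution("planner", 3), Attribution("researcher", 5)]
result = score(predictions, labels)
print(result)  # {'agent_accuracy': 1.0, 'step_accuracy': 0.5}
```

Reporting the two numbers separately makes it visible when a method finds the responsible agent far more reliably than the exact step.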
The Who&When Benchmark
The benchmark, named Who&When, consists of multi‑agent interaction logs with labeled failure points. It encompasses diverse scenarios such as code generation, question answering, and data analysis. Each log records the sequence of agent actions, messages, and intermediate outputs. The research team manually annotated each log with the responsible agent and the step at which the failure occurred. The dataset serves as a standardized testbed for developing and comparing automated attribution techniques.
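To make the shape of such a record concrete, here is a minimal sketch of an annotated log as a Python data structure. The class and field names (`Step`, `AnnotatedLog`, `failure_agent`, `failure_step`) are hypothetical and chosen for illustration; the published dataset may use a different schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    index: int    # position in the interaction log
    agent: str    # which agent acted at this step
    content: str  # message, action, or intermediate output


@dataclass
class AnnotatedLog:
    task: str
    steps: List[Step] = field(default_factory=list)
    failure_agent: Optional[str] = None  # human label: who caused the failure
    failure_step: Optional[int] = None   # human label: when it happened

    def labelled_step(self) -> Step:
        """Return the step the annotators marked as the failure point."""
        if self.failure_step is None:
            raise ValueError("log has no failure annotation")
        return self.steps[self.failure_step]


# Toy log: an ambiguous plan leads to incomplete retrieval and a flawed report.
log = AnnotatedLog(
    task="Summarise recent work on a topic",
    steps=[
        Step(0, "planner", "Goal: look into the topic"),
        Step(1, "researcher", "Found 2 of 10 relevant sources"),
        Step(2, "synthesizer", "Report based on 2 sources"),
    ],
    failure_agent="planner",
    failure_step=0,
)
culprit = log.labelled_step()
print(culprit.agent, culprit.index)  # planner 0
```

Keeping the labels next to the raw steps means an attribution method can be run on `steps` alone and then compared against the two annotation fields.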
Research Contributions and Impact
The paper, accepted as a Spotlight presentation at the top‑tier machine learning conference ICML 2025, presents several key contributions:
- Problem formulation – Defining automated failure attribution as a distinct research challenge in multi‑agent systems.
- Benchmark creation – The Who&When dataset provides a foundational resource for the community.
- Method evaluation – The authors evaluated several LLM‑based attribution methods (judging the whole log at once, checking it step by step, and binary search over the log), revealing that even state‑of‑the‑art models struggle, particularly at pinpointing the exact failure step.
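As an illustration of what LLM‑based attribution can look like, the sketch below walks a log one step at a time and asks a judge whether the latest step introduced the decisive mistake. This is a simplified sketch under stated assumptions, not the authors' implementation: the `Judge` signature is hypothetical, and `stub_judge` is a keyword check standing in for a real model call.

```python
from typing import Callable, Optional, Sequence, Tuple

# One log entry: (agent name, message). This shape is an assumption.
Entry = Tuple[str, str]
# Hypothetical judge: given the task and the log up to and including the
# current step, return True if the latest step contains the decisive mistake.
Judge = Callable[[str, Sequence[Entry]], bool]


def attribute_step_by_step(
    task: str, log: Sequence[Entry], judge: Judge
) -> Optional[Tuple[str, int]]:
    """Return (agent, step index) for the first step the judge flags, else None."""
    for i in range(len(log)):
        if judge(task, log[: i + 1]):
            agent, _ = log[i]
            return agent, i
    return None


def stub_judge(task: str, prefix: Sequence[Entry]) -> bool:
    """Stand-in for an LLM call: flags a step whose message mentions 'incomplete'."""
    _, message = prefix[-1]
    return "incomplete" in message.lower()


log = [
    ("planner", "Plan: collect sources, then summarise"),
    ("researcher", "Retrieved data (incomplete: 2 of 10 sources)"),
    ("synthesizer", "Final report"),
]
verdict = attribute_step_by_step("Summarise topic", log, stub_judge)
print(verdict)  # ('researcher', 1)
```

In practice the judge would be a prompt to a model, so the number of steps examined translates directly into cost. That trade-off is one reason to compare a step-wise scan against cheaper strategies that look at the whole log in fewer calls.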
By enabling faster and more accurate failure diagnosis, this work paves the way for more reliable LLM‑powered systems. The long-term aim is for developers to spend their time fixing the actual problem rather than hunting for it.
Open‑Source Resources
To accelerate research and practical application, the team has made all resources publicly available:
- Paper: arXiv preprint
- Code: GitHub repository
- Dataset: Hugging Face dataset
These open‑source assets empower other researchers and developers to build upon this work, ultimately making multi‑agent systems more robust and easier to debug.