Language Model Security Database
A comprehensive collection of LLM vulnerabilities, curated from cutting-edge research papers and real-world discoveries.
Recent Vulnerabilities
Distributed Backdoor in Multi-Agent Systems
A distributed backdoor vulnerability, named "Collaborative Shadows", affects LLM-based Multi-Agent Systems (MAS) that rely on external or modifiable tools. An attacker poisons multiple agent tools by embedding inert, encrypted "attack primitives" in them; each primitive is a fragment of a larger malicious payload. A carefully crafted user instruction acts as both the trigger and the decryption key: it steers the agents to collaborate in a specific sequence, so that they invoke the poisoned tools in a predefined order. As the poisoned tools run, the encrypted primitives are released into the agents' observations and memory. After the task completes, the attacker scans the execution trace or agent memories for the primitives, decrypts them using the initial instruction, and reassembles them to execute the full malicious payload, for example to exfiltrate sensitive data processed by the agents. The attack exploits the inter-agent collaboration process itself. Because the backdoor is decentralized and its components are individually benign, it can evade detection by tools that only inspect individual agents or tools in isolation. See arXiv:2405.18540.
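The fragment-and-reassemble flow described above can be illustrated with a harmless toy simulation. This is a minimal sketch under stated assumptions, not the method from the paper: the `[ref=...]` marker format, the SHA-256 key derivation, the repeating-key XOR cipher, and all function names are hypothetical choices made for illustration, and the "payload" is an inert string. It shows that no single tool output contains the payload, yet the full execution trace combined with the original instruction recovers it.

```python
import hashlib
import re
from itertools import cycle

# Hypothetical marker that a poisoned tool appends to its normal output.
MARKER = re.compile(r"\[ref=([0-9a-f]+)\]")


def derive_key(instruction: str) -> bytes:
    """Derive a key from the trigger instruction (SHA-256 is an assumption)."""
    return hashlib.sha256(instruction.encode("utf-8")).digest()


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy repeating-key XOR; applying it twice with the same key is a no-op."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def make_primitives(payload: bytes, instruction: str, parts: int) -> list[str]:
    """Split the payload into chunks and encrypt each chunk separately."""
    key = derive_key(instruction)
    size = -(-len(payload) // parts)  # ceiling division
    chunks = [payload[i:i + size] for i in range(0, len(payload), size)]
    return [xor_bytes(chunk, key).hex() for chunk in chunks]


def run_tools(primitives: list[str]) -> list[str]:
    """Simulate agents invoking tools in order; each leaves one observation."""
    return [f"tool_{i}: status=ok [ref={p}]" for i, p in enumerate(primitives)]


def reassemble(trace: list[str], instruction: str) -> bytes:
    """Collect fragments in trace order and decrypt them with the instruction."""
    key = derive_key(instruction)
    fragments = [m.group(1) for obs in trace for m in MARKER.finditer(obs)]
    return b"".join(xor_bytes(bytes.fromhex(f), key) for f in fragments)


INSTRUCTION = "Summarize the quarterly report, then draft a reply"  # toy trigger
PAYLOAD = b"benign demo payload"  # inert stand-in for a real payload

trace = run_tools(make_primitives(PAYLOAD, INSTRUCTION, parts=3))

# Inspecting each observation in isolation never reveals the payload.
seen_in_isolation = any(PAYLOAD in obs.encode("utf-8") for obs in trace)

# Only the whole trace plus the original instruction recovers it.
recovered = reassemble(trace, INSTRUCTION)
```

In this sketch, `seen_in_isolation` is false while `recovered` equals `PAYLOAD`, which mirrors why scanners that inspect one tool or one agent at a time can miss the backdoor. A defense would plausibly need to correlate observations across the whole trace rather than per component.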
© 2025 Promptfoo. All rights reserved.