Vulnerabilities in Retrieval-Augmented Generation systems
Large Language Model (LLM) tool-calling systems are vulnerable to adversarial tool injection attacks. Attackers can inject malicious tools ("Manipulator Tools") into the tool platform, manipulating the LLM's tool selection and execution process. This allows for privacy theft (extracting user queries), denial-of-service (DoS) attacks against legitimate tools, and unscheduled tool-calling (forcing the use of attacker-specified tools regardless of relevance). The attack exploits vulnerabilities in the tool retrieval mechanism and the LLM's decision-making process. Successful attacks require the malicious tool to be (1) retrieved by the system, (2) selected for execution by the LLM, and (3) its output to manipulate subsequent LLM actions.
CVE-2024-XXXX
A poisoning attack against a Retrieval-Augmented Generation (RAG) system that manipulates the retriever component by injecting a poisoned document into the data used by the embedding model. This poisoned document contains modified and incorrect information. When activated, the system retrieves the poisoned document and uses it to generate misleading, biased, and unfaithful responses to user queries.
Jailbreaking vulnerabilities in Large Language Models (LLMs) used in Retrieval-Augmented Generation (RAG) systems allow escalation of attacks from entity extraction to full document extraction and enable the propagation of self-replicating malicious prompts ("worms") within interconnected RAG applications. Exploitation leverages prompt injection to force the LLM to return retrieved documents or execute malicious actions specified within the prompt.
A vulnerability in Retrieval-Augmented Generation (RAG)-based Large Language Model (LLM) agents allows attackers to inject malicious demonstrations into the agent's memory or knowledge base. By crafting a carefully optimized trigger, an attacker can manipulate the agent's retrieval mechanism to preferentially retrieve these poisoned demonstrations, causing the agent to produce adversarial outputs or take malicious actions even when seemingly benign prompts are used. The attack, termed AgentPoison, does not require model retraining or fine-tuning.
A vulnerability in Retrieval-Augmented Generation (RAG) systems utilizing LangChain allows for indirect jailbreaks of Large Language Models (LLMs). By poisoning the external knowledge base accessed by the LLM through LangChain, attackers can manipulate the LLM's responses, causing it to generate malicious or inappropriate content. The attack exploits the LLM's reliance on the external knowledge base and bypasses direct prompt-based jailbreak defenses.
Large Language Models (LLMs) utilizing Retrieval Augmented Generation (RAG) are vulnerable to a novel attack vector, termed "RAG Poisoning," where malicious content is injected into the external knowledge base accessed by the LLM via prompt manipulation. This allows attackers to elicit undesirable or malicious outputs from the LLM, bypassing its safety filters. The attack exploits the LLM's reliance on the retrieved information during response generation.
© 2025 Promptfoo. All rights reserved.