Rag Vulnerabilities

Vulnerabilities in Retrieval-Augmented Generation systems

Related Vulnerabilities

8 entries

Enterprise Multi-Turn Data Exfiltration

7/28/2025

Large Language Model (LLM) systems integrated with private enterprise data, such as those using Retrieval-Augmented Generation (RAG), are vulnerable to multi-stage prompt inference attacks. An attacker can use a sequence of individually benign-looking queries to incrementally extract confidential information from the LLM's context. Each query appears innocuous in isolation, bypassing safety filters designed to block single malicious prompts. By chaining these queries, the attacker can reconstruct sensitive data from internal documents, emails, or other private sources accessible to the LLM. The attack exploits the conversational context and the model's inability to recognize the cumulative intent of a prolonged, strategic dialogue.

Multi-Stage Prompt Inference Attacks on Enterprise LLM Systems

Affects: gpt-4, gpt-3, gpt-2, roberta, gemini

Adversarial Tool Injection Attacks

12/29/2024

Large Language Model (LLM) tool-calling systems are vulnerable to adversarial tool injection attacks. Attackers can inject malicious tools ("Manipulator Tools") into the tool platform, manipulating the LLM's tool selection and execution process. This allows for privacy theft (extracting user queries), denial-of-service (DoS) attacks against legitimate tools, and unscheduled tool-calling (forcing the use of attacker-specified tools regardless of relevance). The attack exploits vulnerabilities in the tool retrieval mechanism and the LLM's decision-making process. Successful attacks require the malicious tool to be (1) retrieved by the system, (2) selected for execution by the LLM, and (3) its output to manipulate subsequent LLM actions.

From Allies to Adversaries: Manipulating LLM Tool-Calling through Adversarial Injection

Affects: gpt-4, llama 3, qwen2

Agent Action Hijacking

3/19/2025

CVE-2024-XXXX

Towards Action Hijacking of Large Language Model-based Agent

Affects: llama, vicuna, qwen2, alpaca, gpt-3, gpt-4, minilm, m3e, bert

BarkPlug Data Poisoning Attack

3/19/2025

A poisoning attack against a Retrieval-Augmented Generation (RAG) system that manipulates the retriever component by injecting a poisoned document into the data used by the embedding model. This poisoned document contains modified and incorrect information. When activated, the system retrieves the poisoned document and uses it to generate misleading, biased, and unfaithful responses to user queries.

Poison Attacks and Adversarial Prompts Against an Informed University Virtual Assistant

Affects: barkplug v.2

RAG Worm Jailbreak

12/29/2024

Jailbreaking vulnerabilities in Large Language Models (LLMs) used in Retrieval-Augmented Generation (RAG) systems allow escalation of attacks from entity extraction to full document extraction and enable the propagation of self-replicating malicious prompts ("worms") within interconnected RAG applications. Exploitation leverages prompt injection to force the LLM to return retrieved documents or execute malicious actions specified within the prompt.

Unleashing worms and extracting data: Escalating the outcome of attacks against rag-based inference in scale and severity using jailbreaking

Affects: gemini 1.5 flash

LLM Memory Poisoning Attack

12/28/2024

A vulnerability in Retrieval-Augmented Generation (RAG)-based Large Language Model (LLM) agents allows attackers to inject malicious demonstrations into the agent's memory or knowledge base. By crafting a carefully optimized trigger, an attacker can manipulate the agent's retrieval mechanism to preferentially retrieve these poisoned demonstrations, causing the agent to produce adversarial outputs or take malicious actions even when seemingly benign prompts are used. The attack, termed AgentPoison, does not require model retraining or fine-tuning.

Agentpoison: Red-teaming llm agents via poisoning memory or knowledge bases

Affects: gpt-3.5, llama3-70b, llama3-8b, text-embedding-ada-002, gpt-2

LangChain Poisoning Jailbreak

12/29/2024

A vulnerability in Retrieval-Augmented Generation (RAG) systems utilizing LangChain allows for indirect jailbreaks of Large Language Models (LLMs). By poisoning the external knowledge base accessed by the LLM through LangChain, attackers can manipulate the LLM's responses, causing it to generate malicious or inappropriate content. The attack exploits the LLM's reliance on the external knowledge base and bypasses direct prompt-based jailbreak defenses.

Poisoned langchain: Jailbreak llms by langchain

Affects: chatglm2-6b, chatglm3-6b, xinghuo-3.5, qwen-14b-chat, ernie-3.5, llama2-7b

RAG Poisoning Jailbreak

12/29/2024

Large Language Models (LLMs) utilizing Retrieval Augmented Generation (RAG) are vulnerable to a novel attack vector, termed "RAG Poisoning," where malicious content is injected into the external knowledge base accessed by the LLM via prompt manipulation. This allows attackers to elicit undesirable or malicious outputs from the LLM, bypassing its safety filters. The attack exploits the LLM's reliance on the retrieved information during response generation.

Pandora: Jailbreak gpts by retrieval augmented generation poisoning

Affects: gpt-3.5, gpt-4, mistral-7b