📄️ Intro
LLM red teaming is a way to find vulnerabilities in AI systems before they're deployed, using simulated adversarial inputs.
📄️ Types of LLM vulnerabilities
This page documents categories of potential LLM vulnerabilities and failure modes.
📄️ Quickstart
Promptfoo is an open-source tool for red teaming gen AI applications.
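For orientation, a minimal configuration of the kind the quickstart walks through might look like the sketch below; the scaffolding produced by promptfoo redteam init may differ, and the target shown (openai:gpt-4o-mini) and purpose text are only examples. After the config exists, promptfoo redteam run generates and executes the tests.

```yaml
# promptfooconfig.yaml — minimal sketch; your generated file may differ
targets:
  - openai:gpt-4o-mini # example target; substitute your own model or endpoint

redteam:
  purpose: 'Customer support chatbot for a retail bank' # describe your app so generated tests are relevant
```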
📄️ Configuration
The redteam section in your promptfooconfig.yaml file is used when generating redteam tests via promptfoo redteam run or promptfoo redteam generate. It allows you to specify the plugins and other parameters of your redteam tests.
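A sketch of what that section can contain; the plugin and strategy names are common examples from the Plugins and Strategies references, and should be adjusted to your application:

```yaml
redteam:
  purpose: 'Internal HR assistant' # context used to generate relevant attacks
  numTests: 5                      # number of tests generated per plugin
  plugins:
    - harmful                      # collection of harmful-content plugins
    - pii                          # personally identifiable information leaks
  strategies:
    - jailbreak                    # iterative jailbreak attempts
    - prompt-injection             # wraps test cases in prompt injection payloads
```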
📄️ Architecture
Promptfoo automated red teaming consists of three main components: plugins, strategies, and targets.
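In configuration terms, the three components map roughly onto the keys below (a simplified sketch; the specific target, plugins, and strategies are placeholders):

```yaml
# Target: the system under test (a model, API endpoint, or application)
targets:
  - openai:gpt-4o-mini

redteam:
  # Plugins: generate adversarial test cases for specific vulnerability categories
  plugins:
    - harmful
    - excessive-agency
  # Strategies: determine how those test cases are delivered (jailbreaks, injections, encodings, ...)
  strategies:
    - jailbreak
    - base64
```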
📄️ OWASP LLM Top 10
The OWASP Top 10 for Large Language Model Applications educates developers about security risks in deploying and managing LLMs. It lists the top critical vulnerabilities in LLM applications based on impact, exploitability, and prevalence. OWASP recently released the updated 2025 version of the Top 10 for LLMs.
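Promptfoo's plugins can be mapped onto this framework; the sketch below assumes the owasp:llm plugin collection alias is available in your version, and otherwise the underlying plugins can be listed individually:

```yaml
redteam:
  plugins:
    - owasp:llm       # OWASP LLM Top 10 collection (alias availability depends on your version)
    # or target a single category, for example:
    # - owasp:llm:01  # LLM01: Prompt Injection
```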
📄️ Guardrails
Guardrails are an active mitigation layer for LLM security, implemented to control and monitor user interactions. They help prevent misuse, detect potential security risks, and ensure appropriate model behavior by filtering or blocking problematic inputs and outputs.
🗃️ Plugins
34 items
🗃️ Strategies
20 items
🗃️ Troubleshooting
6 items
📄️ How to red team RAG applications
Retrieval-Augmented Generation (RAG) is an increasingly popular LLM-based architecture for knowledge-based AI products. This guide focuses on application-layer attacks that developers deploying RAG applications should consider.
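For a RAG target, a red team config typically emphasizes retrieval-related risks such as context leakage and injected documents. A hedged sketch follows; verify plugin names against the Plugins reference, and note that some plugins (for example prompt-extraction) may need additional configuration:

```yaml
redteam:
  purpose: 'RAG assistant over internal knowledge-base documents' # example description
  plugins:
    - prompt-extraction   # attempts to leak the system prompt or retrieval instructions
    - pii                 # leakage of personal data from retrieved context
    - harmful:privacy     # privacy violations
    - rbac                # access-control bypass across document permissions
  strategies:
    - prompt-injection    # simulates injected instructions, e.g. from poisoned documents
    - jailbreak
```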
📄️ How to red team Agents
LLM agents are capable of interacting with their environment and executing complex tasks through natural language interfaces. As these agents gain access to external systems and sensitive data, security assessments are essential.
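For agents with tool or API access, the emphasis shifts to unauthorized actions and injection into downstream systems. A sketch using plugins aimed at those risks (names per the Plugins reference; the purpose text is an example):

```yaml
redteam:
  purpose: 'Agent that files tickets and queries a customer database' # example description
  plugins:
    - excessive-agency    # takes actions beyond its intended scope
    - rbac                # role-based access-control bypass
    - bola                # broken object-level authorization (accessing other users' data)
    - bfla                # broken function-level authorization (calling privileged functions)
    - sql-injection       # injection into database-backed tools
    - ssrf                # server-side request forgery via URL-fetching tools
  strategies:
    - prompt-injection
    - jailbreak
```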
📄️ How to red team LLM applications
Promptfoo is a popular open-source evaluation framework that includes LLM red teaming and penetration testing capabilities.
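When testing a deployed application rather than a bare model, the target is typically an HTTP endpoint. A sketch of an HTTP target, where the URL and request body are placeholders for your own API:

```yaml
targets:
  - id: https
    config:
      url: https://example.com/api/chat   # placeholder endpoint
      method: POST
      headers:
        Content-Type: application/json
      body:
        message: '{{prompt}}'             # the generated attack prompt is substituted here

redteam:
  plugins:
    - harmful
  strategies:
    - jailbreak
```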