📄️ Intro
LLM red teaming uses simulated adversarial inputs to find vulnerabilities in AI systems before they are deployed.
📄️ Types of LLM vulnerabilities
This page documents categories of potential LLM vulnerabilities and failure modes.
📄️ Quickstart
Promptfoo is an open-source tool for red teaming gen AI applications.
📄️ Configuration
The redteam section in your promptfooconfig.yaml file is used when generating redteam tests via promptfoo redteam run or promptfoo redteam generate. It allows you to specify the plugins and other parameters of your redteam tests.
📄️ Architecture
Promptfoo automated red teaming consists of three main components: plugins, strategies, and targets.
📄️ OWASP LLM Top 10
The OWASP Top 10 for Large Language Model Applications educates developers about security risks in deploying and managing LLMs. It lists the most critical vulnerabilities in LLM applications based on impact, exploitability, and prevalence. OWASP has released an updated version of the list for 2025.
📄️ Guardrails
The Guardrails API helps detect potential security risks in user inputs to language models, identify personally identifiable information (PII), and assess potential harm in content.
🗃️ Plugins
31 items
🗃️ Strategies
17 items
🗃️ Troubleshooting
6 items
📄️ How to red team RAG applications
Retrieval-Augmented Generation (RAG) is an increasingly popular LLM-based architecture for knowledge-based AI products. This guide focuses on application-layer attacks that developers deploying RAGs should consider.
📄️ How to red team Agents
LLM agents can interact with their environment and execute complex tasks using natural language interfaces. As these agents gain access to external systems and sensitive data, security assessments are essential.
📄️ How to red team LLM applications
Promptfoo is a popular open-source evaluation framework that includes LLM red teaming and penetration testing capabilities.