How to Red Team a HuggingFace Model
Want to break a HuggingFace model? This guide shows how to use Promptfoo to systematically probe a model for vulnerabilities through adversarial testing (red teaming).
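As a starting point, a Promptfoo red-team run is typically driven by a `promptfooconfig.yaml` that names a target model and the attack plugins and strategies to try. The sketch below is illustrative only: the model ID and the specific plugin/strategy names are assumptions chosen for the example, and the exact schema may differ across Promptfoo versions.

```yaml
# promptfooconfig.yaml — minimal red-team sketch (illustrative assumptions)
targets:
  # Hypothetical HuggingFace target; swap in the model you want to test
  - huggingface:text-generation:mistralai/Mistral-7B-Instruct-v0.2

redteam:
  # Describing the app's intended purpose helps generate relevant attacks
  purpose: "A general-purpose assistant with no special permissions"
  plugins:
    - harmful   # probe for harmful-content generation
    - pii       # probe for personal-data leakage
  strategies:
    - jailbreak # iterative jailbreak attempts against each probe
```

With a config like this in place, the scan is kicked off from the CLI (command shown as of recent Promptfoo releases; check the current docs for your version):

```shell
npx promptfoo@latest redteam run
```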