Latest Posts

GPT-5.2 Initial Trust and Safety Assessment
Red Teaming

Day-0 red team results for GPT-5.2.

Michael D'Angelo · Dec 11, 2025
Your model upgrade just broke your agent's safety
Security Vulnerability

Model upgrades can change refusal, instruction-following, and tool-use behavior.

Guangshuo Zang · Dec 8, 2025
Real-Time Fact Checking for LLM Outputs
Feature Announcement

Promptfoo now supports web search in assertions, so you can verify time-sensitive information like stock prices, weather, and case citations during testing.

Michael D'Angelo · Nov 28, 2025
How to replicate the Claude Code attack with Promptfoo
AI Security

A recent cyber espionage campaign revealed how state actors weaponized Anthropic's Claude Code, not through traditional hacking, but by convincing the AI itself to carry out malicious operations.

Ian Webster · Nov 17, 2025
Will agents hack everything?
Security Vulnerability

The first state-level AI cyberattack raises hard questions: Can we stop AI agents from helping attackers? Should we?

Dane Schneider · Nov 14, 2025
When AI becomes the attacker: The rise of AI-orchestrated cyberattacks
Security Vulnerability

Google's November 2025 discovery of PROMPTFLUX and PROMPTSTEAL confirms Anthropic's August threat intelligence findings on AI-orchestrated attacks.

Michael D'Angelo · Nov 10, 2025
Reinforcement Learning with Verifiable Rewards Makes Models Faster, Not Smarter
Technical Guide

RLVR trains reasoning models with programmatic verifiers instead of human labels.

Michael D'Angelo · Oct 24, 2025
Top 10 Open Datasets for LLM Safety, Toxicity & Bias Evaluation
AI Safety

A comprehensive guide to the most important open-source datasets for evaluating LLM safety, including toxicity detection, bias measurement, and truthfulness benchmarks.

Ian Webster · Oct 6, 2025
Testing AI’s “Lethal Trifecta” with Promptfoo
Security Vulnerability

Learn what the lethal trifecta is and how to use Promptfoo red teaming to detect prompt injection and data exfiltration risks in AI agents.

Ian Webster · Sep 28, 2025
Autonomy and agency in AI: We should secure LLMs with the same fervor spent realizing AGI
AI Safety

Exploring the critical need to secure LLMs with the same urgency and resources dedicated to achieving AGI, focusing on autonomy and agency in AI systems.

Tabs Fakier · Sep 2, 2025
Prompt Injection vs Jailbreaking: What's the Difference?
Security Vulnerability

Learn the critical difference between prompt injection and jailbreaking attacks, with real CVEs, production defenses, and test configurations.

Michael D'Angelo · Aug 18, 2025
AI Safety vs AI Security in LLM Applications: What Teams Must Know
AI Security

How AI safety differs from AI security in LLM applications, and why teams need both.

Michael D'Angelo · Aug 17, 2025
Top Open Source AI Red-Teaming and Fuzzing Tools in 2025
Red Teaming

Compare the top open source AI red teaming tools in 2025.

Tabs Fakier · Aug 14, 2025
Promptfoo Raises $18.4M Series A to Build the Definitive AI Security Stack
Company Update

We raised $18.4M from Insight Partners with participation from Andreessen Horowitz.

Ian Webster · Jul 29, 2025
Evaluating political bias in LLMs
Research Analysis

How right-leaning is Grok? We've released a new testing methodology alongside a dataset of 2,500 political questions.

Michael D'Angelo · Jul 24, 2025
Join Promptfoo at Hacker Summer Camp 2025
Company Update

Join Promptfoo at AI Summit, Black Hat, and DEF CON for demos, workshops, and discussions on LLM security and red teaming.

Vanessa Sauter · Jul 24, 2025
AI Red Teaming for complete first-timers
Red Teaming

A comprehensive guide to AI red teaming for beginners, covering the basics, culture building, and operational feedback loops.

Tabs Fakier · Jul 22, 2025
System Cards Go Hard
Research Analysis

Master LLM system cards for responsible AI deployment with transparency, safety documentation, and compliance requirements.

Tabs Fakier · Jul 15, 2025