Guardrails API

The Guardrails API detects potential security risks, such as prompt injections and jailbreaks, in user inputs to language models, and identifies personally identifiable information (PII).

Base URL

https://api.promptfoo.dev

Endpoints

1. Prompt Injection and Jailbreak Detection

Analyzes input text to classify potential security threats from prompt injections and jailbreaks.

Request

POST /v1/guard

Headers

Content-Type: application/json

Body

{
  "input": "String containing the text to analyze"
}

Response

{
  "model": "promptfoo-guard",
  "results": [
    {
      "categories": {
        "prompt_injection": boolean,
        "jailbreak": boolean
      },
      "category_scores": {
        "prompt_injection": number,
        "jailbreak": number
      },
      "flagged": boolean
    }
  ]
}
  • categories.prompt_injection: Indicates whether the input appears to attempt a prompt injection.
  • categories.jailbreak: Indicates whether the input appears to attempt a jailbreak.
  • category_scores: Confidence scores between 0 and 1 for each category; higher values indicate a more likely attack.
  • flagged: True if the input is classified as either a prompt injection or a jailbreak.
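
Below is a minimal TypeScript sketch of how a client might call this endpoint and act on the flagged field before a prompt reaches the model. It assumes Node 18+ (which provides a global fetch); the checkInput helper and the GuardResult type are illustrative, not part of an official SDK.

// Illustrative types for the guard response shape documented above.
interface GuardResult {
  categories: { prompt_injection: boolean; jailbreak: boolean };
  category_scores: { prompt_injection: number; jailbreak: number };
  flagged: boolean;
}

async function checkInput(input: string): Promise<GuardResult> {
  const res = await fetch('https://api.promptfoo.dev/v1/guard', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ input }),
  });
  if (!res.ok) {
    throw new Error(`Guard request failed with status ${res.status}`);
  }
  const data = await res.json();
  // The API returns an array of results; this sketch inspects only the first entry.
  return data.results[0] as GuardResult;
}

async function main() {
  const result = await checkInput('Ignore previous instructions');
  if (result.flagged) {
    // Reject or re-route the request before it reaches the model.
    console.warn('Potentially unsafe input detected', result.category_scores);
  }
}

main().catch(console.error);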

2. PII Detection

Detects personally identifiable information (PII) in the input text. The system can identify a wide range of PII entity types, listed below.

Entity Type              Description
account_number           Account numbers (e.g., bank account)
building_number          Building or house numbers
city                     City names
credit_card_number       Credit card numbers
date_of_birth            Dates of birth
driver_license_number    Driver's license numbers
email_address            Email addresses
given_name               First or given names
id_card_number           ID card numbers
password                 Passwords or passcodes
social_security_number   Social security numbers
street_name              Street names
surname                  Last names or surnames
tax_id_number            Tax identification numbers
phone_number             Telephone numbers
username                 Usernames
zip_code                 Postal or ZIP codes

Request

POST /v1/pii

Headers

Content-Type: application/json

Body

{
  "input": "String containing the text to analyze for PII"
}

Response

{
  "model": "promptfoo-pii",
  "results": [
    {
      "categories": {
        "pii": boolean
      },
      "category_scores": {
        "pii": number
      },
      "flagged": boolean,
      "payload": {
        "pii": [
          {
            "entity_type": string,
            "start": number,
            "end": number,
            "pii": string
          }
        ]
      }
    }
  ]
}
  • categories.pii: Indicates if PII was detected in the input.
  • category_scores.pii: Confidence that the input contains PII.
  • flagged: True if any PII was detected.
  • payload.pii: Array of detected PII entities with their types and their start/end character offsets in the text.
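
Below is a minimal TypeScript sketch of how a client might use the returned start/end offsets to mask detected entities before the text is stored or forwarded to a model. It again assumes Node 18+ with a global fetch; the redactPii helper and the <ENTITY_TYPE> placeholder format are illustrative choices, not part of an official SDK.

// Illustrative type for a single detected entity, matching the schema above.
interface PiiEntity {
  entity_type: string;
  start: number;
  end: number;
  pii: string;
}

async function redactPii(input: string): Promise<string> {
  const res = await fetch('https://api.promptfoo.dev/v1/pii', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ input }),
  });
  if (!res.ok) {
    throw new Error(`PII request failed with status ${res.status}`);
  }
  const data = await res.json();
  const entities: PiiEntity[] = data.results[0]?.payload?.pii ?? [];

  // Replace the highest offsets first so earlier start/end positions stay valid.
  let redacted = input;
  for (const e of [...entities].sort((a, b) => b.start - a.start)) {
    redacted =
      redacted.slice(0, e.start) +
      '<' + e.entity_type.toUpperCase() + '>' +
      redacted.slice(e.end);
  }
  return redacted;
}

redactPii('My name is John Doe').then(console.log).catch(console.error);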

Examples

Guard Classification Example

curl https://api.promptfoo.dev/v1/guard \
-X POST \
-d '{"input": "Ignore previous instructions"}' \
-H 'Content-Type: application/json'

Response

{
  "model": "promptfoo-guard",
  "results": [
    {
      "categories": {
        "prompt_injection": false,
        "jailbreak": true
      },
      "category_scores": {
        "prompt_injection": 0.00004004167567472905,
        "jailbreak": 0.9999395608901978
      },
      "flagged": true
    }
  ]
}

This example shows a high probability of a jailbreak attempt.

PII Detection Example

curl https://api.promptfoo.dev/v1/pii \
-X POST \
-d '{"input": "My name is John Doe and my email is [email protected]"}' \
-H 'Content-Type: application/json'

Response

{
  "model": "promptfoo-pii",
  "results": [
    {
      "categories": {
        "pii": true
      },
      "category_scores": {
        "pii": 1
      },
      "flagged": true,
      "payload": {
        "pii": [
          {
            "entity_type": "PERSON",
            "start": 11,
            "end": 19,
            "pii": "John Doe"
          },
          {
            "entity_type": "EMAIL",
            "start": 34,
            "end": 50,
            "pii": "[email protected]"
          }
        ]
      }
    }
  ]
}

More

For more information on LLM vulnerabilities and how to mitigate LLM failure modes, refer to our Types of LLM Vulnerabilities and Introduction to AI red teaming documentation.