LLM Providers

Providers in promptfoo are the interfaces to various language models and AI services. This guide will help you understand how to configure and use providers in your promptfoo evaluations.

Quick Start

Here's a basic example of configuring providers in your promptfoo YAML config:

providers:
  - anthropic:messages:claude-sonnet-4-20250514
  - openai:gpt-4.1
  - openai:o4-mini
  - google:gemini-2.5-pro-preview-06-05
  - vertex:gemini-2.5-pro-exp-03-25
  - mistral:magistral-medium-latest
  - mistral:magistral-small-latest

Available Providers

Api Providers	Description	Syntax & Example
OpenAI	GPT models including GPT-4.1 and reasoning models	`openai:gpt-4.1` or `openai:o4-mini`
Anthropic	Claude models	`anthropic:messages:claude-sonnet-4-20250514`
HTTP	Generic HTTP-based providers	`https://api.example.com/v1/chat/completions`
Javascript	Custom - JavaScript file	`file://path/to/custom_provider.js`
Python	Custom - Python file	`file://path/to/custom_provider.py`
Shell Command	Custom - script-based providers	`exec: python chain.py`
AI21 Labs	Jurassic and Jamba models	`ai21:jamba-1.5-mini`
AWS Bedrock	AWS-hosted models from various providers	`bedrock:us.meta.llama3-2-90b-instruct-v1:0`
Amazon SageMaker	Models deployed on SageMaker endpoints	`sagemaker:my-endpoint-name`
Azure OpenAI	Azure-hosted OpenAI models	`azureopenai:gpt-4o-custom-deployment-name`
Cerebras	High-performance inference API for Llama models	`cerebras:llama-4-scout-17b-16e-instruct`
Adaline Gateway	Unified interface for multiple providers	Compatible with OpenAI syntax
Cloudflare AI	Cloudflare's AI platform	`cloudflare-ai:@cf/meta/llama-3-8b-instruct`
Cohere	Cohere's language models	`cohere:command`
DeepSeek	DeepSeek's language models	`deepseek:deepseek-chat`
F5	OpenAI-compatible AI Gateway interface	`f5:path-name`
fal.ai	Image Generation Provider	`fal:image:fal-ai/fast-sdxl`
Fireworks AI	Various hosted models	`fireworks:accounts/fireworks/models/qwen-v2p5-7b`
GitHub	GitHub AI Gateway	`github:gpt-4.1`
Google AI Studio	Gemini models	`google:gemini-2.5-pro`, `google:gemini-2.5-flash`
Google Vertex AI	Google Cloud's AI platform	`vertex:gemini-2.5-pro`, `vertex:gemini-2.5-flash`
Groq	High-performance inference API	`groq:llama-3.3-70b-versatile`
Hyperbolic	OpenAI-compatible Llama 3 provider	`hyperbolic:meta-llama/Llama-3.3-70B-Instruct`
Hugging Face	Access thousands of models	`huggingface:text-generation:gpt2`
IBM BAM	IBM's foundation models	`bam:chat:ibm/granite-13b-chat-v2`
JFrog ML	JFrog's LLM Model Library	`jfrog:llama_3_8b_instruct`
Lambda Labs	Access Lambda Labs models via their Inference API	`lambdalabs:model-name`
LiteLLM	Unified interface for multiple providers	Compatible with OpenAI syntax
Mistral AI	Mistral's language models	`mistral:open-mistral-nemo`
OpenLLM	BentoML's model serving framework	Compatible with OpenAI syntax
OpenRouter	Unified API for multiple providers	`openrouter:mistral/7b-instruct`
Perplexity AI	Search-augmented chat with citations	`perplexity:sonar-pro`
Replicate	Various hosted models	`replicate:stability-ai/sdxl`
Together AI	Various hosted models	Compatible with OpenAI syntax
Voyage AI	Specialized embedding models	`voyage:voyage-3`
vLLM	Local	Compatible with OpenAI syntax
Ollama	Local	`ollama:llama3.2:latest`
LocalAI	Local	`localai:gpt4all-j`
Llamafile	OpenAI-compatible llamafile server	Uses OpenAI provider with custom endpoint
llama.cpp	Local	`llama:7b`
Text Generation WebUI	Gradio WebUI	Compatible with OpenAI syntax
WebSocket	WebSocket-based providers	`ws://example.com/ws`
Webhook	Custom - Webhook integration	`webhook:http://example.com/webhook`
Echo	Custom - For testing purposes	`echo`
Manual Input	Custom - CLI manual entry	`promptfoo:manual-input`
Go	Custom - Go file	`file://path/to/your/script.go`
Web Browser	Custom - Automate web browser interactions	`browser`
Sequence	Custom - Multi-prompt sequencing	`sequence` with config.inputs array
Simulated User	Custom - Conversation simulator	`promptfoo:simulated-user`
WatsonX	IBM's WatsonX	`watsonx:ibm/granite-13b-chat-v2`
X.AI	X.AI's models	`xai:grok-3-beta`

Provider Syntax

Providers are specified using various syntax options:

Simple string format:
```
provider_name:model_name
```
Example: openai:gpt-4.1 or anthropic:claude-sonnet-4-20250514

Object format with configuration:

- id: provider_name:model_name
  config:
    option1: value1
    option2: value2

Example:

- id: openai:gpt-4.1
  config:
    temperature: 0.7
    max_tokens: 150

File-based configuration:

Load a single provider:

provider.yaml
id: openai:chat:gpt-4.1
config:
  temperature: 0.7

Or multiple providers:

providers.yaml
- id: openai:gpt-4.1
  config:
    temperature: 0.7
- id: anthropic:messages:claude-sonnet-4-20250514
  config:
    max_tokens: 1000

Reference in your configuration:

promptfooconfig.yaml
providers:
  - file://provider.yaml # single provider as an object
  - file://providers.yaml # multiple providers as an array

Configuring Providers

Most providers use environment variables for authentication:

export OPENAI_API_KEY=your_api_key_here
export ANTHROPIC_API_KEY=your_api_key_here

You can also specify API keys in your configuration file:

providers:
  - id: openai:gpt-4.1
    config:
      apiKey: your_api_key_here

Custom Integrations

promptfoo supports several types of custom integrations:

File-based providers:

providers:
  - file://path/to/provider_config.yaml

JavaScript providers:

providers:
  - file://path/to/custom_provider.js

Python providers:

providers:
  - id: file://path/to/custom_provider.py

HTTP/HTTPS API:

providers:
  - id: https://api.example.com/v1/chat/completions
    config:
      headers:
        Authorization: 'Bearer your_api_key'

WebSocket:

providers:
  - id: ws://example.com/ws
    config:
      messageTemplate: '{"prompt": "{{prompt}}"}'

Custom scripts:
```
providers:
  - 'exec: python chain.py'
```

Common Configuration Options

Many providers support these common configuration options:

temperature: Controls randomness (0.0 to 1.0)
max_tokens: Maximum number of tokens to generate
top_p: Nucleus sampling parameter
frequency_penalty: Penalizes frequent tokens
presence_penalty: Penalizes new tokens based on presence in text
stop: Sequences where the API will stop generating further tokens

Example:

providers:
  - id: openai:gpt-4.1
    config:
      temperature: 0.7
      max_tokens: 150
      top_p: 0.9
      frequency_penalty: 0.5
      presence_penalty: 0.5
      stop: ["\n", 'Human:', 'AI:']

Model Context Protocol (MCP)

Promptfoo supports the Model Context Protocol (MCP) for enabling advanced tool use and agentic capabilities in LLM providers. MCP allows you to connect providers to external MCP servers to enable tool orchestration, memory, and more.

Basic MCP Configuration

Enable MCP for a provider by adding the mcp block to your provider's configuration:

providers:
  - id: openai:gpt-4.1
    config:
      temperature: 0.7
      mcp:
        enabled: true
        server:
          command: npx
          args: ['-y', '@modelcontextprotocol/server-memory']
          name: memory

Multiple MCP Servers

You can connect a single provider to multiple MCP servers:

providers:
  - id: openai:gpt-4.1
    config:
      mcp:
        enabled: true
        servers:
          - command: npx
            args: ['-y', '@modelcontextprotocol/server-memory']
            name: server_a
          - url: http://localhost:8001
            name: server_b

For detailed MCP documentation and advanced configurations, see the MCP Integration Guide.

Quick Start​

Available Providers​

Provider Syntax​

Configuring Providers​

Custom Integrations​

Common Configuration Options​

Model Context Protocol (MCP)​

Basic MCP Configuration​

Multiple MCP Servers​