LLM Providers
Providers in promptfoo are the interfaces to various language models and AI services. This guide will help you understand how to configure and use providers in your promptfoo evaluations.
Quick Start
Here's a basic example of configuring providers in your promptfoo YAML config:
providers:
- openai:gpt-4o-mini
- anthropic:messages:claude-3-5-sonnet-20241022
- vertex:gemini-pro
Available Providers
API Providers | Description | Syntax & Example |
---|---|---|
OpenAI | GPT models including GPT-4 and GPT-3.5 | openai:o1-preview |
Anthropic | Claude models | anthropic:messages:claude-3-5-sonnet-20241022 |
HTTP | Generic HTTP-based providers | https://api.example.com/v1/chat/completions |
JavaScript | Custom - JavaScript file | file://path/to/custom_provider.js |
Python | Custom - Python file | file://path/to/custom_provider.py |
Shell Command | Custom - script-based providers | exec: python chain.py |
AI21 Labs | Jurassic and Jamba models | ai21:jamba-1.5-mini |
AWS Bedrock | AWS-hosted models from various providers | bedrock:us.meta.llama3-2-90b-instruct-v1:0 |
Azure OpenAI | Azure-hosted OpenAI models | azureopenai:gpt-4o-custom-deployment-name |
Cloudflare AI | Cloudflare's AI platform | cloudflare-ai:@cf/meta/llama-3-8b-instruct |
Cohere | Cohere's language models | cohere:command |
fal.ai | Image generation provider | fal:image:fal-ai/fast-sdxl |
Google AI Studio (PaLM) | Gemini and PaLM models | google:gemini-pro |
Google Vertex AI | Google Cloud's AI platform | vertex:gemini-pro |
Groq | High-performance inference API | groq:llama3-70b-8192-tool-use-preview |
Hugging Face | Access thousands of models | huggingface:text-generation:gpt2 |
IBM BAM | IBM's foundation models | bam:chat:ibm/granite-13b-chat-v2 |
LiteLLM | Unified interface for multiple providers | Compatible with OpenAI syntax |
Mistral AI | Mistral's language models | mistral:open-mistral-nemo |
OpenLLM | BentoML's model serving framework | Compatible with OpenAI syntax |
OpenRouter | Unified API for multiple providers | openrouter:mistral/7b-instruct |
Perplexity AI | Specialized in question-answering | Compatible with OpenAI syntax |
Replicate | Various hosted models | replicate:stability-ai/sdxl |
Together AI | Various hosted models | Compatible with OpenAI syntax |
Voyage AI | Specialized embedding models | voyage:voyage-3 |
vLLM | Local | Compatible with OpenAI syntax |
Ollama | Local | ollama:llama3.2:latest |
LocalAI | Local | localai:gpt4all-j |
llama.cpp | Local | llama:7b |
WebSocket | WebSocket-based providers | ws://example.com/ws |
Echo | Custom - For testing purposes | echo |
Manual Input | Custom - CLI manual entry | promptfoo:manual-input |
Go | Custom - Go file | file://path/to/your/script.go |
Web Browser | Custom - Automate web browser interactions | browser |
Text Generation WebUI | Gradio WebUI | Compatible with OpenAI syntax |
WatsonX | IBM's WatsonX | watsonx:ibm/granite-13b-chat-v2 |
X.AI | X.AI's models | xai:grok-2 |
Adaline Gateway | Unified interface for multiple providers | Compatible with OpenAI syntax |
Provider Syntax
Providers can be specified in any of three ways:
-
Simple string format:
provider_name:model_name
Example: openai:gpt-4o-mini or anthropic:claude-3-sonnet-20240229
-
Object format with configuration:
- id: provider_name:model_name
  config:
    option1: value1
    option2: value2
Example:
- id: openai:gpt-4o-mini
  config:
    temperature: 0.7
    max_tokens: 150
-
File-based configuration:
- file://path/to/provider_config.yaml
Configuring Providers
Most providers use environment variables for authentication:
export OPENAI_API_KEY=your_api_key_here
export ANTHROPIC_API_KEY=your_api_key_here
You can also specify API keys in your configuration file:
providers:
- id: openai:gpt-4o-mini
  config:
    apiKey: your_api_key_here
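Before running an evaluation, it can help to confirm that the expected environment variables are actually set. The sketch below is only an illustration: `missing_keys` is a hypothetical helper, not part of promptfoo, and the variable names are the two shown in the export lines above.

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    # Default to the real process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    # Variable names taken from the export lines above.
    missing = missing_keys(["OPENAI_API_KEY", "ANTHROPIC_API_KEY"])
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All expected API keys are set.")
```

A check like this only covers keys supplied through the environment; a key given as `apiKey` in the configuration file would not be detected by it.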
Custom Integrations
promptfoo supports several types of custom integrations:
-
File-based providers:
providers:
- file://path/to/provider_config.yaml
-
JavaScript providers:
providers:
- file://path/to/custom_provider.js
-
Python providers:
providers:
- id: file://path/to/custom_provider.py
-
HTTP/HTTPS API:
providers:
- id: https://api.example.com/v1/chat/completions
  config:
    headers:
      Authorization: 'Bearer your_api_key'
-
WebSocket:
providers:
- id: ws://example.com/ws
  config:
    messageTemplate: '{"prompt": "{{prompt}}"}'
-
Custom scripts:
providers:
- 'exec: python chain.py'
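For the Python case, the referenced file has to define the function that promptfoo calls. Here is a minimal sketch, assuming the `call_api(prompt, options, context)` entry point described in promptfoo's Python provider documentation. The `prefix` option is a hypothetical setting invented for this example, and the provider simply echoes the prompt instead of calling a model.

```python
# custom_provider.py: an echo-style provider for illustration only.


def call_api(prompt, options, context):
    """Return a result dict with an `output` key, or an `error` key on failure."""
    # `options` is assumed to carry the `config` block from the YAML provider entry.
    config = (options or {}).get("config") or {}
    # `prefix` is a hypothetical option used only in this sketch.
    prefix = config.get("prefix", "ECHO: ")
    if not isinstance(prompt, str):
        return {"error": "prompt must be a string"}
    # A real provider would call a model or service here.
    return {"output": prefix + prompt}
```

Under these assumptions, a `config` block placed under the provider's `id` would be the natural place to set `prefix`, in the same way as for built-in providers.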
Common Configuration Options
Many providers support these common configuration options:
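- temperature: Controls randomness (typically 0.0 to 1.0; the accepted range varies by provider)
- max_tokens: Maximum number of tokens to generate
- top_p: Nucleus sampling threshold; only the most likely tokens whose cumulative probability reaches this value are considered
- frequency_penalty: Penalizes tokens according to how often they have already appeared
- presence_penalty: Penalizes tokens that have already appeared in the text
- stop: Sequences at which the API stops generating further tokens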
Example:
providers:
- id: openai:gpt-4o-mini
  config:
    temperature: 0.7
    max_tokens: 150
    top_p: 0.9
    frequency_penalty: 0.5
    presence_penalty: 0.5
    stop: ["\n", 'Human:', 'AI:']
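To give a feel for what temperature and top_p do, the sketch below applies both to a toy set of token scores. It illustrates the general technique only and is not how promptfoo or any particular provider implements sampling; `apply_temperature` and `top_p_filter` are hypothetical helper names.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    # temperature must be > 0 here; providers treat 0 as a special "greedy" case.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their combined probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    # Renormalize so the surviving tokens sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


if __name__ == "__main__":
    probs = apply_temperature([2.0, 1.0, 0.0], temperature=1.0)
    print(top_p_filter(probs, top_p=0.9))
```

With these toy scores, the least likely token is dropped at `top_p=0.9`, and lowering the temperature shifts more probability onto the top-scoring token.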