Setting up promptfoo with Jenkins
This guide demonstrates how to integrate promptfoo's LLM evaluation into your Jenkins pipeline. This setup enables automatic testing of your prompts and models whenever changes are made to your repository.
Prerequisites
- Jenkins server with pipeline support
- Node.js installed on the Jenkins agent
- Your LLM provider's API keys (e.g., OpenAI API key)
- Basic familiarity with Jenkins Pipeline syntax
- Pipeline Utility Steps plugin (provides the readJSON step used in the pipelines below)
Configuration Steps
1. Create Jenkinsfile
Create a Jenkinsfile in your repository root. Here's a basic configuration that installs promptfoo and runs evaluations:
pipeline {
    agent any
    environment {
        // Injected from the Jenkins credential with ID 'openai-api-key'
        OPENAI_API_KEY = credentials('openai-api-key')
        // '~' is passed through literally here; use an absolute path if the cache directory is not picked up
        PROMPTFOO_CACHE_PATH = '~/.promptfoo/cache'
    }
    stages {
        stage('Setup') {
            steps {
                sh 'npm install -g promptfoo'
            }
        }
        stage('Evaluate Prompts') {
            steps {
                script {
                    try {
                        // The glob is quoted so the shell passes it to promptfoo unexpanded
                        sh 'promptfoo eval -c promptfooconfig.yaml --prompts "prompts/**/*.json" --share -o output.json'
                    } catch (Exception e) {
                        currentBuild.result = 'FAILURE'
                        error("Prompt evaluation failed: ${e.message}")
                    }
                }
            }
        }
        stage('Process Results') {
            steps {
                script {
                    def output = readJSON file: 'output.json'
                    echo "Evaluation Results:"
                    echo "Successes: ${output.results.stats.successes}"
                    echo "Failures: ${output.results.stats.failures}"
                    if (output.shareableUrl) {
                        echo "View detailed results at: ${output.shareableUrl}"
                    }
                    // Any failed test case marks the build UNSTABLE rather than FAILURE
                    if (output.results.stats.failures > 0) {
                        currentBuild.result = 'UNSTABLE'
                    }
                }
            }
        }
    }
    post {
        always {
            // allowEmptyArchive avoids a second error when the evaluation produced no output.json
            archiveArtifacts artifacts: 'output.json', fingerprint: true, allowEmptyArchive: true
        }
    }
}
2. Configure Jenkins Credentials
Add a credential for each LLM provider you use. For OpenAI, for example, store the OpenAI API key as follows:
- Navigate to Jenkins Dashboard → Manage Jenkins → Credentials
- Add a new credential:
  - Kind: Secret text
  - Scope: Global
  - ID: openai-api-key (must match the ID passed to credentials() in the Jenkinsfile)
  - Description: OpenAI API Key
  - Secret: Your API key value
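To confirm that the credential actually reaches the build environment, a small pre-flight check can run before the evaluation. The sketch below is only an illustration: require_env is a hypothetical helper, not part of promptfoo or Jenkins, and it reports whether a variable is set without ever printing its value.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (an assumption for illustration, not a promptfoo or Jenkins feature).
# Succeeds if the named environment variable is set and non-empty; never prints the value.
require_env() {
  name="$1"
  # Accept only plain identifiers so the eval below cannot run arbitrary input.
  case "$name" in
    ''|*[!A-Za-z0-9_]*)
      echo "invalid variable name" >&2
      return 2
      ;;
  esac
  # Indirect lookup of the variable whose name is stored in $name.
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "missing: $name" >&2
    return 1
  fi
  echo "found: $name"
}
```

In a pipeline this could be called from an sh step, for example `require_env OPENAI_API_KEY`, so a missing credential fails the build early instead of surfacing as a provider authentication error.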
3. Set Up Caching
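To cache responses across builds, which improves performance and reduces API costs:
- Create a cache directory on your Jenkins agent:
    mkdir -p ~/.promptfoo/cache
- Ensure the Jenkins user has write permissions:
    chown -R jenkins:jenkins ~/.promptfoo/cache
Run these commands as the user that executes builds (jenkins in this example), or replace ~ with that user's home directory, because ~ expands to the home of whoever runs the command.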
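The two steps above can also be folded into a check that runs on the agent before each evaluation. This is a minimal sketch under the assumption that the cache location is passed in as an argument; ensure_cache_dir is a hypothetical helper name, not a promptfoo command.

```shell
#!/bin/sh
# Hypothetical helper: create the cache directory if needed and confirm it is writable.
ensure_cache_dir() {
  dir="$1"
  # mkdir -p is a no-op when the directory already exists.
  if ! mkdir -p "$dir" 2>/dev/null; then
    echo "cannot create: $dir" >&2
    return 1
  fi
  # -w tests write permission for the user running this script.
  if [ ! -w "$dir" ]; then
    echo "not writable: $dir" >&2
    return 1
  fi
  echo "cache ready: $dir"
}
```

Called as `ensure_cache_dir "$HOME/.promptfoo/cache"` from the build user's shell, it fails early with a specific message instead of leaving the evaluation to run without a usable cache.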
4. Advanced Pipeline Configuration
Here's an example of a more advanced pipeline with additional features. It relies on steps from the Pipeline Utility Steps (readJSON), Timestamper (timestamps), HTML Publisher (publishHTML), Workspace Cleanup (cleanWs), and Email Extension (emailext) plugins.
The advanced configuration includes several important improvements:
- Build timeouts: The timeout option ensures builds don't run indefinitely (1 hour limit)
- Timestamps: Adds timestamps to console output for better debugging
- SCM polling: Automatically checks for changes every 15 minutes using pollSCM
- Conditional execution: Only runs evaluations when files in the prompts/ directory change
- Email notifications: Sends emails to developers on pipeline failures
- Workspace cleanup: Automatically cleans up the workspace after each run
- Artifact management: Archives both JSON and HTML reports with fingerprinting
- Error handling: Marks the build FAILURE when the evaluation command errors and UNSTABLE when any test case fails
pipeline {
    agent any
    environment {
        OPENAI_API_KEY = credentials('openai-api-key')
        // '~' is passed through literally here; use an absolute path if the cache directory is not picked up
        PROMPTFOO_CACHE_PATH = '~/.promptfoo/cache'
    }
    options {
        timeout(time: 1, unit: 'HOURS')
        timestamps()
    }
    triggers {
        pollSCM('H/15 * * * *')
    }
    stages {
        stage('Setup') {
            steps {
                sh 'npm install -g promptfoo'
            }
        }
        stage('Evaluate Prompts') {
            when {
                changeset 'prompts/**'
            }
            steps {
                script {
                    try {
                        // The glob is quoted so the shell passes it to promptfoo unexpanded
                        sh '''
                            promptfoo eval \
                                -c promptfooconfig.yaml \
                                --prompts "prompts/**/*.json" \
                                --share \
                                -o output.json
                        '''
                    } catch (Exception e) {
                        currentBuild.result = 'FAILURE'
                        error("Prompt evaluation failed: ${e.message}")
                    }
                }
            }
        }
        stage('Process Results') {
            // Same condition as the evaluation stage: without it, readJSON fails
            // on a missing output.json whenever the evaluation was skipped
            when {
                changeset 'prompts/**'
            }
            steps {
                script {
                    def output = readJSON file: 'output.json'
                    // Create HTML report
                    writeFile file: 'evaluation-report.html', text: """
                        <html>
                        <body>
                        <h1>Prompt Evaluation Results</h1>
                        <p>Successes: ${output.results.stats.successes}</p>
                        <p>Failures: ${output.results.stats.failures}</p>
                        <p>View detailed results: <a href="${output.shareableUrl}">${output.shareableUrl}</a></p>
                        </body>
                        </html>
                    """
                    // Publish HTML report
                    publishHTML([
                        allowMissing: false,
                        alwaysLinkToLastBuild: true,
                        keepAll: true,
                        reportDir: '.',
                        reportFiles: 'evaluation-report.html',
                        reportName: 'Prompt Evaluation Report'
                    ])
                    if (output.results.stats.failures > 0) {
                        currentBuild.result = 'UNSTABLE'
                    }
                }
            }
        }
    }
    post {
        always {
            // allowEmptyArchive covers builds where the evaluation was skipped or produced no files
            archiveArtifacts artifacts: 'output.json,evaluation-report.html', fingerprint: true, allowEmptyArchive: true
            cleanWs()
        }
        failure {
            emailext(
                subject: "Failed Pipeline: ${currentBuild.fullDisplayName}",
                body: "Prompt evaluation failed. Check console output at ${env.BUILD_URL}",
                recipientProviders: [[$class: 'DevelopersRecipientProvider']]
            )
        }
    }
}
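When debugging why the evaluation stage was or was not triggered, it can help to reproduce the path check outside Jenkins. The sketch below is a rough stand-in for the changeset condition, not Jenkins' actual matching logic (Jenkins uses Ant-style globs against the build's changelog). has_changes_under is a hypothetical helper that reads changed file paths on stdin and succeeds if any of them sits under the given directory.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any path read from stdin lies under the given directory.
# Plain prefix matching only; an approximation of a 'prompts/**' changeset pattern.
has_changes_under() {
  # Normalize the argument so both "prompts" and "prompts/" become "prompts/".
  prefix="${1%/}/"
  # The extra test keeps a final line that lacks a trailing newline.
  while IFS= read -r path || [ -n "$path" ]; do
    case "$path" in
      "$prefix"*) return 0 ;;
    esac
  done
  return 1
}
```

For example, `git diff --name-only HEAD~1 HEAD | has_changes_under prompts` exits with status 0 only when the last commit touched a file under prompts/.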
Troubleshooting
Common issues and solutions:
- Permission issues:
  - Ensure Jenkins has appropriate permissions to install global npm packages
  - Verify cache directory permissions
  - Check API key credential permissions
- Pipeline timeout:
  - Adjust the timeout in pipeline options
  - Consider splitting evaluations into smaller batches
  - Monitor API rate limits
- Cache problems:
  - Verify cache path exists and is writable
  - Check disk space availability
  - Clear cache if needed:
      rm -rf ~/.promptfoo/cache/*
- Node.js issues:
  - Ensure Node.js is installed on the Jenkins agent
  - Verify npm is available in PATH
  - Consider using the nodejs tool installer in Jenkins
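The Node.js checks above can be scripted so the build log states exactly which tool is missing. This is a minimal sketch; check_tools is a hypothetical helper name, and it relies only on the POSIX `command -v` builtin.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named tool is found in PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_tools node npm` at the start of the Setup stage would fail the build before `npm install` is attempted on an agent without Node.js.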
For more information on promptfoo configuration and usage, refer to the configuration reference.