Setting up promptfoo with Jenkins

This guide demonstrates how to integrate promptfoo's LLM evaluations into your Jenkins pipeline, so that your prompts and models are tested automatically whenever changes are made to your repository.

Prerequisites

  • Jenkins server with Pipeline support
  • Node.js installed on the Jenkins agent
  • Your LLM provider's API keys (e.g., an OpenAI API key)
  • The Pipeline Utility Steps plugin (for readJSON); the advanced example below also uses the HTML Publisher, Workspace Cleanup, and Email Extension plugins
  • Basic familiarity with Jenkins Pipeline syntax
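
You can confirm the Node.js prerequisite on an agent with a quick check, either directly on the machine or in a trivial sh step:

# Verify Node.js and npm are installed and on PATH
node --version
npm --version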

Configuration Steps

1. Create Jenkinsfile

Create a Jenkinsfile in your repository root. Here's a basic configuration that installs promptfoo and runs evaluations:

pipeline {
    agent any

    environment {
        OPENAI_API_KEY = credentials('openai-api-key')
        PROMPTFOO_CACHE_PATH = '~/.promptfoo/cache'
    }

    stages {
        stage('Setup') {
            steps {
                sh 'npm install -g promptfoo'
            }
        }

        stage('Evaluate Prompts') {
            steps {
                script {
                    try {
                        sh 'promptfoo eval -c promptfooconfig.yaml --prompts prompts/**/*.json --share -o output.json'
                    } catch (Exception e) {
                        currentBuild.result = 'FAILURE'
                        error("Prompt evaluation failed: ${e.message}")
                    }
                }
            }
        }

        stage('Process Results') {
            steps {
                script {
                    def output = readJSON file: 'output.json'
                    echo "Evaluation Results:"
                    echo "Successes: ${output.results.stats.successes}"
                    echo "Failures: ${output.results.stats.failures}"

                    if (output.shareableUrl) {
                        echo "View detailed results at: ${output.shareableUrl}"
                    }

                    if (output.results.stats.failures > 0) {
                        currentBuild.result = 'UNSTABLE'
                    }
                }
            }
        }
    }

    post {
        always {
            archiveArtifacts artifacts: 'output.json', fingerprint: true
        }
    }
}
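
The Process Results stage reads a few fields from output.json. As an illustration, here is a heavily trimmed sketch of the shape it expects (only the fields used above; the values are placeholders, and the real file contains much more detail):

{
  "shareableUrl": "https://...",
  "results": {
    "stats": {
      "successes": 12,
      "failures": 1
    }
  }
}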

2. Configure Jenkins Credentials

Add a Jenkins credential for each LLM provider's API key you use. For example, to add an OpenAI API key:

  1. Navigate to Jenkins Dashboard → Manage Jenkins → Credentials
  2. Add a new credential:
    • Kind: Secret text
    • Scope: Global
    • ID: openai-api-key
    • Description: OpenAI API Key
    • Secret: Your API key value
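
If you use several providers, repeat these steps with a distinct ID per key and map each one in the pipeline's environment block. A minimal sketch, where anthropic-api-key is a hypothetical second credential ID:

environment {
    // credentials() exposes each Jenkins secret to the build as an environment variable
    OPENAI_API_KEY    = credentials('openai-api-key')
    ANTHROPIC_API_KEY = credentials('anthropic-api-key') // hypothetical second provider
}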

3. Set Up Caching

To implement caching for better performance and reduced API costs:

  1. Create a cache directory on your Jenkins agent:
     mkdir -p ~/.promptfoo/cache
  2. Ensure the Jenkins user has write permissions:
     chown -R jenkins:jenkins ~/.promptfoo/cache
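
To keep the cache from growing unbounded, you can prune stale entries periodically, for example from a cron job on the agent. A minimal sketch; the 14-day window is an arbitrary choice:

# Delete cache files not modified within the last 14 days
find ~/.promptfoo/cache -type f -mtime +14 -delete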

4. Advanced Pipeline Configuration

Here's a more advanced pipeline with additional features. It includes several important improvements:

  • Build timeouts: The timeout option ensures builds don't run indefinitely (1 hour limit)
  • Timestamps: Adds timestamps to console output for better debugging
  • SCM polling: Automatically checks for changes every 15 minutes using pollSCM
  • Conditional execution: Only runs evaluations when files in prompts/ directory change
  • Email notifications: Sends emails to developers on pipeline failures
  • Workspace cleanup: Automatically cleans up workspace after each run
  • Artifact management: Archives both JSON and HTML reports with fingerprinting
  • Better error handling: More robust error catching and build status management

pipeline {
    agent any

    environment {
        OPENAI_API_KEY = credentials('openai-api-key')
        PROMPTFOO_CACHE_PATH = '~/.promptfoo/cache'
    }

    options {
        timeout(time: 1, unit: 'HOURS')
        timestamps()
    }

    triggers {
        pollSCM('H/15 * * * *')
    }

    stages {
        stage('Setup') {
            steps {
                sh 'npm install -g promptfoo'
            }
        }

        stage('Evaluate Prompts') {
            when {
                changeset 'prompts/**'
            }
            steps {
                script {
                    try {
                        sh '''
                            promptfoo eval \
                                -c promptfooconfig.yaml \
                                --prompts prompts/**/*.json \
                                --share \
                                -o output.json
                        '''
                    } catch (Exception e) {
                        currentBuild.result = 'FAILURE'
                        error("Prompt evaluation failed: ${e.message}")
                    }
                }
            }
        }

        stage('Process Results') {
            steps {
                script {
                    def output = readJSON file: 'output.json'

                    // Create HTML report
                    writeFile file: 'evaluation-report.html', text: """
                        <html>
                        <body>
                        <h1>Prompt Evaluation Results</h1>
                        <p>Successes: ${output.results.stats.successes}</p>
                        <p>Failures: ${output.results.stats.failures}</p>
                        <p>View detailed results: <a href="${output.shareableUrl}">${output.shareableUrl}</a></p>
                        </body>
                        </html>
                    """

                    // Publish HTML report
                    publishHTML([
                        allowMissing: false,
                        alwaysLinkToLastBuild: true,
                        keepAll: true,
                        reportDir: '.',
                        reportFiles: 'evaluation-report.html',
                        reportName: 'Prompt Evaluation Report'
                    ])

                    if (output.results.stats.failures > 0) {
                        currentBuild.result = 'UNSTABLE'
                    }
                }
            }
        }
    }

    post {
        always {
            archiveArtifacts artifacts: 'output.json,evaluation-report.html', fingerprint: true
            cleanWs()
        }
        failure {
            emailext (
                subject: "Failed Pipeline: ${currentBuild.fullDisplayName}",
                body: "Prompt evaluation failed. Check console output at ${env.BUILD_URL}",
                recipientProviders: [[$class: 'DevelopersRecipientProvider']]
            )
        }
    }
}

Troubleshooting

Common issues and solutions:

  1. Permission issues:

    • Ensure Jenkins has appropriate permissions to install global npm packages
    • Verify cache directory permissions
    • Check API key credential permissions
  2. Pipeline timeout:

    • Adjust the timeout in pipeline options
    • Consider splitting evaluations into smaller batches (see the sketch below)
    • Monitor API rate limits
  3. Cache problems:

    • Verify the cache path exists and is writable
    • Check disk space availability
    • Confirm PROMPTFOO_CACHE_PATH resolves to a real directory (a literal ~ set in the environment block may not be expanded; an absolute path avoids the ambiguity)
    • Clear the cache if needed: rm -rf ~/.promptfoo/cache/*
  4. Node.js issues:

    • Ensure Node.js is installed on the Jenkins agent
    • Verify npm is available in PATH
    • Consider using the nodejs tool installer in Jenkins (see the sketch below)
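
One way to split a large evaluation into smaller batches (item 2 above) is to run promptfoo once per prompt subdirectory instead of over the whole tree. A minimal sketch, assuming prompts are grouped in subdirectories of prompts/ (this layout is hypothetical):

# Run one evaluation per prompt subdirectory to keep individual runs short
for dir in prompts/*/; do
    promptfoo eval \
        -c promptfooconfig.yaml \
        --prompts "${dir}**/*.json" \
        -o "output-$(basename "$dir").json"
done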
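
If installing global npm packages is troublesome (item 4 above), the NodeJS plugin can provision Node.js per build via the tools block. A minimal sketch, assuming the plugin is installed and a tool installation named node-20 (a hypothetical name) is configured under Manage Jenkins → Tools:

pipeline {
    agent any

    // nodejs is provided by the NodeJS plugin; the name must match a configured tool installation
    tools {
        nodejs 'node-20'
    }

    stages {
        stage('Setup') {
            steps {
                // node and npm from the tool installation are now on PATH
                sh 'npm install -g promptfoo'
            }
        }
    }
}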

For more information on promptfoo configuration and usage, refer to the configuration reference.