# Model-graded Closed QA
`model-graded-closedqa` is a criteria-checking evaluation that uses OpenAI's public evals prompt to determine whether an LLM output meets specific requirements.
## How to use it
To use the `model-graded-closedqa` assertion type, add it to your test configuration like this:
```yaml
assert:
  - type: model-graded-closedqa
    # Specify the criteria that the output must meet:
    value: Provides a clear answer without hedging or uncertainty
```
This assertion will use a language model to evaluate whether the output meets the specified criterion, returning a simple yes/no response.
## How it works
Under the hood, `model-graded-closedqa` uses OpenAI's closed QA evaluation prompt to analyze the output. The grader will return:

- `Y` if the output meets the criterion
- `N` if the output does not meet the criterion
The assertion passes if the grader's response ends with `Y` and fails if it ends with `N`.
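For illustration, the pass/fail decision reduces to checking how the grader's response ends. Here is a minimal sketch of that check; the function name and sample responses are hypothetical and not promptfoo's actual internals:

```ts
// Hypothetical helper mirroring the behavior described above:
// the assertion passes when the grader's answer ends with 'Y'.
function closedQaPasses(graderResponse: string): boolean {
  return graderResponse.trim().endsWith('Y');
}

// Illustrative grader responses
console.log(closedQaPasses('The answer is direct and unhedged.\nY')); // true
console.log(closedQaPasses('The answer contains significant hedging.\nN')); // false
```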
## Example Configuration
Here's a complete example showing how to use `model-graded-closedqa`:
```yaml
prompts:
  - 'What is {{topic}}?'

providers:
  - openai:gpt-4

tests:
  - vars:
      topic: quantum computing
    assert:
      - type: model-graded-closedqa
        value: Explains the concept without using technical jargon
      - type: model-graded-closedqa
        value: Includes a practical real-world example
```
## Overriding the Grader
Like other model-graded assertions, you can override the default grader:
- Using the CLI:

  ```sh
  promptfoo eval --grader openai:gpt-4o-mini
  ```

- Using test options:

  ```yaml
  defaultTest:
    options:
      provider: openai:gpt-4o-mini
  ```

- Using assertion-level override:

  ```yaml
  assert:
    - type: model-graded-closedqa
      value: Is concise and clear
      provider: openai:gpt-4o-mini
  ```
## Customizing the Prompt
You can customize the evaluation prompt using the `rubricPrompt` property:
```yaml
defaultTest:
  options:
    rubricPrompt: |
      Question: {{input}}
      Criterion: {{criteria}}
      Response: {{completion}}
      Does this response meet the criterion? Answer Y or N.
```
## Further reading
See model-graded metrics for more options.