Similarity (embeddings)

The similar assertion checks if an embedding of the LLM's output is semantically similar to the expected value, using a cosine similarity threshold.

By default, embeddings are computed via OpenAI's text-embedding-3-large model.

Example:

assert:
  - type: similar
    value: 'The expected output'
    threshold: 0.8

If you provide an array of values, the test will pass if it is similar to at least one of them:

assert:
  - type: similar
    value:
      - The expected output
      - Expected output
      - file://my_expected_output.txt
    threshold: 0.8

Overriding the provider

By default similar will use OpenAI. To specify the model that creates the embeddings, do one of the following:

Use test.options or defaultTest.options to override the provider across the entire test suite. For example:

defaultTest:
  options:
    provider:
      embedding:
        id: azureopenai:embedding:text-embedding-ada-002
        config:
          apiHost: xxx.openai.azure.com
tests:
  assert:
    - type: similar
      value: Hello world

Set assertion.provider on a per-assertion basis. For example:

tests:
  assert:
    - type: similar
      value: Hello world
      provider: huggingface:sentence-similarity:sentence-transformers/all-MiniLM-L6-v2

Overriding the provider​

Overriding the provider