Bedrock

The bedrock provider lets you use Amazon Bedrock in your evals. This is a common way to access Anthropic's Claude and a variety of other models.

Setup

First, ensure that you have access to the desired models under the Providers page in Amazon Bedrock.

Next, install @aws-sdk/client-bedrock-runtime:

npm install -g @aws-sdk/client-bedrock-runtime

The AWS SDK will automatically pull credentials from the following locations:

- Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and optionally AWS_SESSION_TOKEN)
- The shared credentials and config files (~/.aws/credentials, ~/.aws/config)
- IAM roles, such as those attached to EC2 instances, ECS tasks, or Lambda functions
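For example, you might provide credentials via environment variables before running your eval (the values below are placeholders):

export AWS_ACCESS_KEY_ID=your-access-key-id
export AWS_SECRET_ACCESS_KEY=your-secret-access-key
export AWS_REGION=us-east-1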

Finally, edit your configuration file to point to the AWS Bedrock provider. Here's an example:

providers:
  - bedrock:anthropic.claude-3-haiku-20240307-v1:0

Additional config parameters are passed like so:

providers:
  - id: bedrock:anthropic.claude-3-haiku-20240307-v1:0
    config:
      region: 'us-west-2'
      temperature: 0.7
      max_tokens: 256
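Config values are passed through to the underlying model. For Anthropic models on Bedrock, inference parameters such as top_p and top_k may also be set; this is a sketch, and which parameters are honored depends on the model family:

providers:
  - id: bedrock:anthropic.claude-3-haiku-20240307-v1:0
    config:
      region: 'us-west-2'
      temperature: 0.7
      max_tokens: 256
      # Anthropic inference parameters; support varies by model family
      top_p: 0.9
      top_k: 250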

Example

See GitHub for full examples of Claude and Titan model usage.

prompts:
  - 'Write a tweet about {{topic}}'

providers:
  - id: bedrock:anthropic.claude-instant-v1
    config:
      region: 'us-east-1'
      temperature: 0.7
      max_tokens_to_sample: 256
  - id: bedrock:anthropic.claude-3-haiku-20240307-v1:0
    config:
      region: 'us-east-1'
      temperature: 0.7
      max_tokens: 256
  - id: bedrock:anthropic.claude-3-sonnet-20240229-v1:0
    config:
      region: 'us-east-1'
      temperature: 0.7
      max_tokens: 256

tests:
  - vars:
      topic: Our eco-friendly packaging
  - vars:
      topic: A sneak peek at our secret menu item
  - vars:
      topic: Behind-the-scenes at our latest photoshoot

Model-graded tests

By default, model-graded tests use OpenAI and require the OPENAI_API_KEY environment variable to be set. When using AWS Bedrock, you can override the grader for model-graded assertions to point to AWS Bedrock or any other provider.

Note that because of how model-graded evals are implemented, the LLM grading models must support chat-formatted prompts (except for embedding or classification models).

The easiest way to do this for all your test cases is to add the defaultTest property to your config:

defaultTest:
  options:
    provider:
      id: provider:chat:modelname
      config:
        # Provider config options
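For example, to grade with one of the Claude models from the config above (the model ID and settings here are illustrative; any chat-capable Bedrock model you have access to will work):

defaultTest:
  options:
    provider:
      id: bedrock:anthropic.claude-3-sonnet-20240229-v1:0
      config:
        region: 'us-east-1'
        temperature: 0
        max_tokens: 256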

But you can also do this for individual assertions:

# ...
assert:
  - type: llm-rubric
    value: Do not mention that you are an AI or chat assistant
    provider:
      id: provider:chat:modelname
      config:
        # Provider config options

Or individual tests:

# ...
tests:
  - vars:
      # ...
    options:
      provider:
        id: provider:chat:modelname
        config:
          # Provider config options
    assert:
      - type: llm-rubric
        value: Do not mention that you are an AI or chat assistant