Testing LLM chains
A "chain" is defined by a list of LLM prompts that are executed sequentially (and sometimes conditionally). The output of each LLM call is parsed/manipulated/executed, and then the result is fed into the next prompt.
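To make the definition concrete, here is a minimal sketch of a two-step chain in Python. It is illustrative only: `fake_llm`, `parse_first_topic`, and `run_chain` are hypothetical names that are not part of promptfoo, and the model call is replaced by a stub that returns canned text so the example runs without an API key.

```python
from typing import Callable


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real model call; returns canned text."""
    if prompt.startswith("List three topics"):
        return "1. cats\n2. dogs\n3. birds"
    # Echo the last word of the prompt to simulate a topic-specific answer.
    return f"A short poem about {prompt.rsplit(' ', 1)[-1]}"


def parse_first_topic(raw: str) -> str:
    """Parse step 1 output: take the first numbered line, drop the '1. ' prefix."""
    first_line = raw.splitlines()[0]
    return first_line.split(". ", 1)[1]


def run_chain(subject: str, llm: Callable[[str], str] = fake_llm) -> str:
    """Run two prompts in sequence, feeding parsed output of step 1 into step 2."""
    step1_output = llm(f"List three topics related to {subject}")
    topic = parse_first_topic(step1_output)
    return llm(f"Write a short poem about {topic}")


if __name__ == "__main__":
    print(run_chain("pets"))
```

The parsing function between the two calls is the "parsed/manipulated" stage: it is ordinary code and can be tested separately from either prompt.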
This page explains how to test an LLM chain. At a high level, you have these options:
Break the chain into separate calls and test each one. This is useful if your testing strategy is closer to unit testing than to end-to-end testing.
Test the full end-to-end chain, with a single input and single output. This is useful if you only care about the end result, and are not interested in how the LLM chain got there.
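The two options can be contrasted with a small Python sketch. Everything here is an assumption made for illustration: `stub_llm`, `extract_city`, `describe_city`, and `chain` are hypothetical names, and the stub returns fixed strings in place of a real model. The first two tests exercise each step on its own; the third checks only the final output.

```python
from typing import Callable

LLM = Callable[[str], str]


def stub_llm(prompt: str) -> str:
    """Hypothetical canned responses so the example runs without a model."""
    if "Extract the city" in prompt:
        return "City: Paris"
    return "Paris is known for the Eiffel Tower."


def extract_city(text: str, llm: LLM) -> str:
    """Step 1: ask for the city, then parse the 'City: <name>' reply."""
    raw = llm(f"Extract the city from: {text}")
    return raw.split(":", 1)[1].strip()


def describe_city(city: str, llm: LLM) -> str:
    """Step 2: ask about the city produced by step 1."""
    return llm(f"Name one landmark in {city}.")


def chain(text: str, llm: LLM) -> str:
    """Full chain: step 1 output feeds step 2."""
    return describe_city(extract_city(text, llm), llm)


# Option 1: test each step separately (unit-style).
def test_step_one() -> None:
    assert extract_city("I flew to Paris last week", stub_llm) == "Paris"


def test_step_two() -> None:
    assert "Eiffel" in describe_city("Paris", stub_llm)


# Option 2: test only the end result of the whole chain.
def test_end_to_end() -> None:
    assert "Eiffel" in chain("I flew to Paris last week", stub_llm)


if __name__ == "__main__":
    test_step_one()
    test_step_two()
    test_end_to_end()
```

If the end-to-end test fails, the per-step tests help locate which prompt or parsing stage is responsible.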
Unit testing LLM chains
As described above, the simplest approach is to test one prompt at a time. You can do this with a basic promptfoo configuration.
Run npx promptfoo@latest init chain_step_X to create the test harness for the first step of your chain. After configuring test cases for that step, create a new set of test cases for step 2, and so on.