Red teaming a Chatbase Chatbot

Chatbase is a platform for building custom AI chatbots that can be embedded into websites for customer support, lead generation, and user engagement. These chatbots use RAG (Retrieval-Augmented Generation) to access your organization's knowledge base and maintain conversations with users.

Multi-turn vs Single-turn Testing

Single-turn Systems

Many LLM applications process each query independently, treating every interaction as a new conversation. Like talking to someone with no memory of previous exchanges, they can answer your current question but don't retain context from earlier messages.

This makes single-turn systems inherently more secure since attackers can't manipulate conversation history. However, this security comes at the cost of usability - users must provide complete context with every message, making interactions cumbersome.

Multi-turn Systems (Like Chatbase)

Modern conversational AI, including Chatbase, maintains context throughout the interaction. When users ask follow-up questions, the system understands the context from previous messages, enabling natural dialogue.

In Promptfoo, this state is managed through a conversationId that links messages together. While this enables a better user experience, it introduces security challenges. Attackers might try to manipulate the conversation context across multiple messages, either building false premises or attempting to extract sensitive information.

Initial Setup

Prerequisites

Node.js 18+
promptfoo CLI (npm install -g promptfoo)
Chatbase API credentials:
- API Bearer Token (from your Chatbase dashboard)
- Chatbot ID (found in your bot's settings)

Basic Configuration

Initialize the red team testing environment:

promptfoo redteam init

Configure your Chatbase target in the setup UI. Your configuration file should look similar to this:

targets:
  - id: 'http'
    config:
      method: 'POST'
      url: 'https://www.chatbase.co/api/v1/chat'
      headers:
        'Content-Type': 'application/json'
        'Authorization': 'Bearer YOUR_API_TOKEN'
      body:
        {
          'messages': '{{prompt}}',
          'chatbotId': 'YOUR_CHATBOT_ID',
          'stream': false,
          'temperature': 0,
          'model': 'gpt-4.1-mini',
          'conversationId': '{{conversationId}}',
        }
      transformResponse: 'json.text'
      transformRequest: '[{ role: "user", content: prompt }]'
defaultTest:
  options:
    transformVars: '{ ...vars, conversationId: context.uuid }'

Configuration Notes

Configure both the transformRequest and transformResponse for your chatbot:
- transformRequest: Formats the request as OpenAI-compatible messages
- transformResponse: Extracts the response text from the JSON body
The context.uuid generates a unique conversation ID for each test, enabling Chatbase to track conversation state across multiple messages.

Strategy Configuration

Enable multi-turn testing strategies in your promptfooconfig.yaml:

strategies:
  - id: 'goat'
    config:
      stateful: true
  - id: 'crescendo'
    config:
      stateful: true
  - id: 'mischievous-user'
    config:
      stateful: true

Test Execution

Run your tests with these commands:

# Generate test cases
promptfoo redteam generate

# Execute evaluation
promptfoo redteam eval

# View detailed results in the web UI
promptfoo view

Common issues and solutions

If you encounter issues:

If tests fail to connect, verify your API credentials
If the message content is garbled, verify your request parser and response parser are correct.

Multi-turn vs Single-turn Testing​

Single-turn Systems​

Multi-turn Systems (Like Chatbase)​

Initial Setup​

Prerequisites​

Basic Configuration​

Strategy Configuration​

Test Execution​

Common issues and solutions​

Additional Resources​