Success Evaluation

Define custom criteria to assess conversation quality, goal achievement, and customer satisfaction.

Success evaluation allows you to define custom goals and success metrics for your conversations. Each criterion is evaluated against the conversation transcript and returns a result of success, failure, or unknown, along with a detailed rationale.

Overview

Success evaluation uses LLM-powered analysis to assess conversation quality against your specific business objectives. This enables systematic performance measurement and quality assurance across all customer interactions.

How It Works

Each evaluation criterion analyzes the conversation transcript using a custom prompt and returns:

  • Result: success, failure, or unknown
  • Rationale: Detailed explanation of why the result was chosen
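As a rough illustration of that shape, the sketch below models a single criterion outcome in Python. The class and field names (`CriterionResult`, `criterion_id`, `parse_result`) are hypothetical and chosen for this example; only the three result values and the rationale come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Result(str, Enum):
    """The three outcomes a criterion can return."""
    SUCCESS = "success"
    FAILURE = "failure"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class CriterionResult:
    criterion_id: str  # hypothetical field name for the criterion identifier
    result: Result
    rationale: str


def parse_result(criterion_id: str, raw: dict) -> CriterionResult:
    """Build a CriterionResult from a raw dict, rejecting unexpected result values."""
    return CriterionResult(
        criterion_id=criterion_id,
        result=Result(raw["result"]),  # raises ValueError for any other value
        rationale=raw.get("rationale", ""),
    )
```

Modelling the result as an enum means a misspelled or unexpected value fails loudly at parse time instead of being counted silently in downstream reports.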

Types of Evaluation Criteria

Goal Prompt Criteria

Goal prompt criteria pass the conversation transcript, along with a custom prompt, to an LLM to verify whether a specific goal was met. This is the most flexible type of evaluation and suits complex business logic.

Examples:

  • Customer satisfaction assessment
  • Issue resolution verification
  • Compliance checking
  • Custom business rule validation
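To make the idea concrete, here is a minimal sketch of how a transcript and a goal prompt could be combined and sent to a model. It is not the platform's actual prompt template, which is not documented here. The function names, the `role` and `message` transcript fields, and the `llm` callable are all assumptions made for illustration.

```python
from typing import Callable

ALLOWED = {"success", "failure", "unknown"}


def build_goal_prompt(transcript: list[dict], goal: str) -> str:
    """Combine the goal and the transcript into one prompt string (hypothetical format)."""
    lines = [f"{turn['role']}: {turn['message']}" for turn in transcript]
    return (
        "Evaluate the conversation against the goal.\n"
        f"Goal: {goal}\n"
        "Transcript:\n" + "\n".join(lines) + "\n"
        "Answer with success, failure, or unknown, then a rationale."
    )


def evaluate(transcript: list[dict], goal: str, llm: Callable[[str], str]) -> str:
    """Ask the supplied model and normalise its first word to one of the three results."""
    answer = llm(build_goal_prompt(transcript, goal)).strip().lower()
    first = answer.split()[0].strip(".,:") if answer else ""
    # Anything that is not a recognised verdict is treated as unknown.
    return first if first in ALLOWED else "unknown"
```

Passing the model in as a plain callable keeps the sketch testable with a stub, for example `evaluate(transcript, goal, lambda prompt: "success. Customer confirmed.")`.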

Configuration

1. Access agent settings

Navigate to your agent’s dashboard and select the Analysis tab to configure evaluation criteria.

Analysis settings

2. Add evaluation criteria

Click Add criteria to create a new evaluation criterion.

Define your criterion with:

  • Identifier: A unique name for the criterion (e.g., user_was_not_upset)
  • Description: Detailed prompt describing what should be evaluated

Setting up evaluation criteria

Evaluation criteria are limited to 30 per agent.
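If you keep criteria definitions in your own tooling before entering them in the dashboard, a small pre-flight check can catch mistakes early. The sketch below assumes a hypothetical local representation, a list of dicts with `identifier` and `description` keys. It checks the rules stated above: unique identifiers, a non-empty description, and at most 30 criteria per agent.

```python
MAX_CRITERIA_PER_AGENT = 30  # limit stated in this guide


def validate_criteria(criteria: list[dict]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the set looks valid."""
    problems = []
    if len(criteria) > MAX_CRITERIA_PER_AGENT:
        problems.append(
            f"too many criteria: {len(criteria)} > {MAX_CRITERIA_PER_AGENT}"
        )
    seen = set()
    for criterion in criteria:
        identifier = criterion.get("identifier", "")
        if not identifier:
            problems.append("missing identifier")
        elif identifier in seen:
            problems.append(f"duplicate identifier: {identifier}")
        seen.add(identifier)
        if not criterion.get("description", "").strip():
            problems.append(f"empty description: {identifier}")
    return problems
```

For example, `validate_criteria([{"identifier": "user_was_not_upset", "description": "Mark as successful if the user stayed calm."}])` returns an empty list.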

3. View results

After conversations complete, evaluation results appear in your conversation history dashboard. Each conversation shows the evaluation outcome and rationale for every configured criterion.

Evaluation results in conversation history

Best Practices

Writing effective evaluation prompts
  • Be specific about what constitutes success vs. failure
  • Include edge cases and examples in your prompt
  • Use clear, measurable criteria when possible
  • Test your prompts with various conversation scenarios
Common evaluation criteria
  • Customer satisfaction: “Mark as successful if the customer expresses satisfaction or their issue was resolved”
  • Goal completion: “Mark as successful if the customer completed the requested action (booking, purchase, etc.)”
  • Compliance: “Mark as successful if the agent followed all required compliance procedures”
  • Issue resolution: “Mark as successful if the customer’s technical issue was resolved during the call”
Handling ambiguous results

The unknown result is returned when the LLM cannot determine success or failure from the transcript. This often happens with:

  • Incomplete conversations
  • Ambiguous customer responses
  • Missing information in the transcript

Monitor unknown results to identify areas where your criteria prompts may need refinement.
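One way to monitor this is to compute the share of unknown results per criterion over a batch of conversations. The sketch below assumes you have already exported results into a simple hypothetical structure: one dict per conversation, mapping criterion identifier to its result string.

```python
from collections import Counter, defaultdict


def unknown_rates(conversations: list[dict[str, str]]) -> dict[str, float]:
    """Return, for each criterion, the fraction of evaluations that came back unknown."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for results in conversations:
        for criterion_id, result in results.items():
            counts[criterion_id][result] += 1
    return {
        criterion_id: tally["unknown"] / sum(tally.values())
        for criterion_id, tally in counts.items()
    }
```

A criterion whose rate stays high across many conversations is a candidate for a more specific prompt.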

Use Cases

Customer Support Quality

Measure issue resolution rates, customer satisfaction, and support quality metrics to improve service delivery.

Sales Performance

Track goal achievement, objection handling, and conversion rates across sales conversations.

Compliance Monitoring

Ensure agents follow required procedures and capture necessary consent or disclosure confirmations.

Training & Development

Identify coaching opportunities and measure improvement in agent performance over time.

Troubleshooting

Evaluation criteria returning unexpected results
  • Review your prompt for clarity and specificity
  • Test with sample conversations to validate logic
  • Consider edge cases in your evaluation criteria
  • Check if the transcript contains sufficient information for evaluation
High frequency of 'unknown' results
  • Ensure your prompts are specific about what information to look for
  • Consider if conversations contain enough context for evaluation
  • Review transcript quality and completeness
  • Adjust criteria to handle common edge cases
Performance considerations
  • Each evaluation criterion adds processing time to conversation analysis
  • Complex prompts may take longer to evaluate
  • Consider the trade-off between comprehensive analysis and how quickly analysis results become available
  • Monitor your usage to optimize for your specific needs

Success evaluation results are available through Post-call Webhooks for integration with external systems and analytics platforms.
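A receiving service might pull the evaluation results out of a webhook body along the following lines. This is a sketch under assumptions: the payload keys used here (`data`, `analysis`, `evaluation_criteria_results`, and the per-criterion `result` and `rationale` fields) are not specified on this page and should be checked against the Post-call Webhooks reference before use.

```python
import json


def extract_evaluations(body: str) -> dict[str, dict]:
    """Map each criterion identifier to its result and rationale.

    The key names below are assumptions about the payload layout.
    """
    payload = json.loads(body)
    analysis = payload.get("data", {}).get("analysis") or {}
    results = analysis.get("evaluation_criteria_results") or {}
    return {
        criterion_id: {
            "result": entry.get("result", "unknown"),
            "rationale": entry.get("rationale", ""),
        }
        for criterion_id, entry in results.items()
    }
```

The function tolerates a missing or empty analysis section and returns an empty dict in that case, so a handler can forward whatever is present without special-casing conversations that have no configured criteria.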