This guide will walk you through creating a web application that implements ElevenLabs Conversational AI using the @11labs/client SDK. You’ll build a simple interface that allows users to have voice conversations with an AI agent.

Prerequisites

  • Node.js installed on your system
  • Basic knowledge of JavaScript/HTML/CSS
  • An API key and an Agent ID from the ElevenLabs platform

Project Setup

  1. Create a new directory for your project:
mkdir elevenlabs-convai-demo
cd elevenlabs-convai-demo
  2. Initialize a new Node.js project:
npm init -y
  3. Install the required dependencies:
npm install @11labs/client cors dotenv express
  4. Install development dependencies:
npm install --save-dev webpack webpack-cli webpack-dev-server copy-webpack-plugin concurrently
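
These tools handle the build and development workflow: webpack and webpack-cli bundle the frontend, webpack-dev-server serves it locally with hot reloading, copy-webpack-plugin copies index.html and styles.css into the output folder, and concurrently lets a single command run the backend and the dev server together.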

Project Structure

Create the following directory structure:

elevenlabs-convai-demo/
│── src/
│   ├── app.js
│   ├── index.html
│   └── styles.css
│── backend/
│   └── server.js
│── webpack.config.js
│── package.json
│── .env

Environment Configuration

  1. You’ll find a .env.example file in the root of the GitHub project:
XI_API_KEY=your_elevenlabs_api_key
AGENT_ID=your_agent_id
PORT=3000
  2. Create your .env file:
    • Copy .env.example to .env
    • Replace the placeholder values with your actual credentials
    • Make sure .env is listed in your .gitignore to keep your credentials secure

⚠️ Security Note: Never commit your .env file or share your API credentials. Always keep them private and secure.

  3. Set up webpack.config.js:
const path = require('path');
const CopyPlugin = require('copy-webpack-plugin');

module.exports = {
    entry: './src/app.js',
    output: {
        filename: 'bundle.js',
        path: path.resolve(__dirname, 'dist'),
        publicPath: '/'
    },
    devServer: {
        static: {
            directory: path.join(__dirname, 'dist'),
        },
        port: 8080,
        proxy: {
            '/api': 'http://localhost:3000'
        },
        hot: true
    },
    plugins: [
        new CopyPlugin({
            patterns: [
                { from: 'src/index.html', to: 'index.html' },
                { from: 'src/styles.css', to: 'styles.css' }
            ],
        }),
    ]
};
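
A version note: the object form of the proxy option shown above works with webpack-dev-server v4. If you’re on v5 or later, the option is expected as an array instead, for example proxy: [{ context: ['/api'], target: 'http://localhost:3000' }].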

Frontend Implementation

Let’s build our user interface step by step.

Basic UI Structure

First, we create a simple interface in index.html with these key components:

  • Status indicators for connection and speaking states
  • Start/End conversation buttons

Here’s the basic HTML structure:

<div class="status-container">
    <div id="connectionStatus" class="status">Disconnected</div>
    <div id="speakingStatus" class="speaking-status">Agent Silent</div>
</div>

<div class="controls">
    <button id="startButton" class="button">Start Conversation</button>
    <button id="endButton" class="button" disabled>End Conversation</button>
    
</div>
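
Note that index.html also needs to load the bundled JavaScript and the stylesheet (for example a <script src="bundle.js"></script> tag and a <link rel="stylesheet" href="styles.css"> tag), otherwise the application logic and styling described below won’t be applied.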

Without any styling, the interface looks basic but functional.

Adding Visual Polish

Let’s enhance the UI with CSS to make it more user-friendly and visually appealing. We’re using raw CSS here, but you can use alternatives like Tailwind CSS or shadcn/ui based on your preference.

The key styling touches are visual treatments for the connection and speaking status indicators (driven by the connected and speaking classes toggled from the JavaScript) and for the buttons’ enabled and disabled states.

Core Application Logic

Now let’s build the application logic piece by piece in app.js:

  1. Microphone Permission Handler
async function requestMicrophonePermission() {
    try {
        await navigator.mediaDevices.getUserMedia({ audio: true });
        return true;
    } catch (error) {
        console.error('Microphone permission denied:', error);
        return false;
    }
}

This function requests microphone access from the user. We need this before starting any conversation to ensure we can capture the user’s voice.

  2. Signed URL Retrieval
async function getSignedUrl() {
    try {
        const response = await fetch('/api/signed-url');
        if (!response.ok) throw new Error('Failed to get signed URL');
        const data = await response.json();
        return data.signedUrl;
    } catch (error) {
        console.error('Error getting signed URL:', error);
        throw error;
    }
}

The signed URL contains a short-lived token that can be used to start the conversation without exposing your ElevenLabs API key to end users.

  3. Status Management
function updateStatus(isConnected) {
    const statusElement = document.getElementById('connectionStatus');
    statusElement.textContent = isConnected ? 'Connected' : 'Disconnected';
    statusElement.classList.toggle('connected', isConnected);
}

function updateSpeakingStatus(mode) {
    const statusElement = document.getElementById('speakingStatus');
    const isSpeaking = mode.mode === 'speaking';
    statusElement.textContent = isSpeaking ? 'Agent Speaking' : 'Agent Silent';
    statusElement.classList.toggle('speaking', isSpeaking);
}

These functions handle the UI updates for connection and speaking states. They provide visual feedback to users about the current state of the conversation.

  4. Conversation Initialization
async function startConversation() {
    // First, check for microphone permission
    const hasPermission = await requestMicrophonePermission();
    if (!hasPermission) {
        alert('Microphone permission is required for the conversation.');
        return;
    }

    // Get the signed URL for authentication
    const signedUrl = await getSignedUrl();
    
    // Initialize the conversation with the ElevenLabs SDK
    conversation = await Conversation.startSession({
        signedUrl,
        onConnect: () => {
            // Handle successful connection
            updateStatus(true);
            toggleButtons(true);
        },
        onDisconnect: () => {
            // Handle disconnection
            updateStatus(false);
            toggleButtons(false);
            updateSpeakingStatus({ mode: 'listening' });
        },
        onError: (error) => {
            // Handle errors during conversation
            console.error('Conversation error:', error);
            alert('An error occurred during the conversation.');
        },
        onModeChange: (mode) => {
            // Update UI when agent switches between speaking and listening
            updateSpeakingStatus(mode);
        }
    });
}

This function orchestrates startup: it checks microphone permission, fetches a signed URL from the backend, and opens a session whose callbacks keep the UI in sync with the connection and speaking state.

  5. Conversation Control
async function endConversation() {
    if (conversation) {
        await conversation.endSession();
        conversation = null;
    }
}

A simple function to properly end the conversation session and clean up resources.
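
The snippets above also rely on some glue code that isn’t shown: importing Conversation from @11labs/client, a module-level conversation variable, the toggleButtons helper used in the callbacks, and listeners that wire the buttons to these handlers. Here’s a minimal sketch of what that glue might look like (element IDs taken from the HTML above):

import { Conversation } from '@11labs/client';

let conversation = null;

// Enable or disable the control buttons depending on whether a session is active
function toggleButtons(isActive) {
    document.getElementById('startButton').disabled = isActive;
    document.getElementById('endButton').disabled = !isActive;
}

// Wire the buttons to the handlers defined above
document.getElementById('startButton').addEventListener('click', startConversation);
document.getElementById('endButton').addEventListener('click', endConversation);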

Backend Implementation

The backend server handles secure communication with the ElevenLabs API and serves your frontend application. Let’s build it step by step:

1. Setting Up Dependencies

First, we need to import our required packages:

const express = require('express');
const cors = require('cors');
const dotenv = require('dotenv');
const path = require('path');

dotenv.config();

These imports provide:

  • express: Web framework for handling HTTP requests
  • cors: Middleware for enabling Cross-Origin Resource Sharing
  • dotenv: Loads environment variables from .env file
  • path: Utilities for handling file paths

2. Configuring Express Server

const app = express();
app.use(cors());
app.use(express.json());
app.use(express.static(path.join(__dirname, '../dist')));

This configuration:

  • Creates an Express application
  • Enables CORS for development (allows frontend to communicate with backend)
  • Adds JSON parsing middleware for handling request bodies
  • Serves the frontend files from the ‘dist’ directory

3. Creating the Signed URL Endpoint

app.get('/api/signed-url', async (req, res) => {
    try {
        const response = await fetch(
            `https://api.elevenlabs.io/v1/convai/conversation/get_signed_url?agent_id=${process.env.AGENT_ID}`,
            {
                method: 'GET',
                headers: {
                    'xi-api-key': process.env.XI_API_KEY,
                }
            }
        );

        if (!response.ok) {
            throw new Error('Failed to get signed URL');
        }

        const data = await response.json();
        res.json({ signedUrl: data.signed_url });
    } catch (error) {
        console.error('Error:', error);
        res.status(500).json({ error: 'Failed to get signed URL' });
    }
});

The signed URL is essential because:

  • It provides secure, temporary access to the conversational API
  • Keeps your API key secure on the backend
  • Allows the frontend to connect without exposing sensitive credentials
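
One thing to be aware of: this endpoint uses the global fetch API, which is built into Node.js 18 and later. If you’re running an older Node version, you’ll need a polyfill such as node-fetch.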

4. Adding a Catch-all Route

app.get('*', (req, res) => {
    res.sendFile(path.join(__dirname, '../dist/index.html'));
});

This route:

  • Catches any unmatched routes
  • Serves the main index.html file
  • Enables client-side routing in your single-page application
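
Because Express matches routes in the order they’re registered, define this catch-all after the /api/signed-url endpoint; otherwise it would intercept the API request and return index.html instead.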

5. Starting the Server

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
    console.log(`Server running on port ${PORT}`);
});

This final step:

  • Sets the server port (from environment variables or defaults to 3000)
  • Starts the Express server
  • Logs a confirmation message when the server is running

Running the Application

  1. Add the following scripts to your package.json:
{
  "scripts": {
    "start:backend": "node backend/server.js",
    "build": "webpack --mode production",
    "dev": "concurrently \"npm run start:backend\" \"webpack serve --mode development\"",
    "start": "npm run build && npm run start:backend"
  }
}
  2. Start the development server:
npm run dev
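
Here, npm run dev runs the Express backend and the webpack dev server side by side, npm run build produces the production bundle in dist/, and npm start builds the bundle and then serves everything from the backend alone.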

Your application will now be running at http://localhost:8080 (the webpack dev server), with requests to /api proxied to the backend on http://localhost:3000.

Key Features

  1. Microphone Permission Handling: The application requests and verifies microphone access before starting a conversation.

  2. Connection Status Display: Visual indicators show whether the application is connected and if the AI agent is speaking.

  3. Error Handling: Errors are caught on both the frontend and the backend and surfaced through console messages, user alerts, and HTTP error responses.

Important Notes

  • Make sure to keep your API key secure and never expose it in the frontend code
  • The application requires HTTPS in production for microphone access
  • Always handle errors appropriately to provide a good user experience
  • Test microphone permissions and audio playback thoroughly

Troubleshooting

  1. Microphone Access Issues:

    • Ensure your browser has permission to access the microphone
    • Check if your microphone is properly connected and working
    • Try using HTTPS in production environments
  2. Connection Problems:

    • Verify your API key and Agent ID are correct
    • Check your internet connection
    • Look for any CORS-related errors in the console
  3. Audio Issues:

    • Ensure your browser’s audio output is working
    • Verify that no other applications are blocking audio output