This guide will walk you through creating a web application that implements ElevenLabs Conversational AI using the @11labs/client SDK. You’ll build a simple interface that allows users to have voice conversations with an AI agent.

Before you start, make sure you have:

Node.js installed on your system

Basic knowledge of JavaScript/HTML/CSS

An API key and Agent ID from the ElevenLabs platform

​ Project Setup

Create a new directory for your project:

mkdir elevenlabs-convai-demo
cd elevenlabs-convai-demo

Initialize a new Node.js project:

npm init -y

Install the required dependencies:

npm install @11labs/client cors dotenv express

Install development dependencies:

npm install --save-dev webpack webpack-cli webpack-dev-server copy-webpack-plugin concurrently

​ Project Structure

Create the following directory structure:

elevenlabs-convai-demo/
├── src/
│   ├── app.js
│   ├── index.html
│   └── styles.css
├── backend/
│   └── server.js
├── webpack.config.js
├── package.json
└── .env

​ Environment Configuration

You’ll find .env.example in the root directory of the GitHub project:

XI_API_KEY=your_elevenlabs_api_key
AGENT_ID=your_agent_id
PORT=3000

Create your .env file:

Copy .env.example to .env

Replace the placeholder values with your actual credentials

Make sure .env is listed in your .gitignore to keep your credentials secure

⚠️ Security Note: Never commit your .env file or share your API credentials. Always keep them private and secure.
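Because the backend reads XI_API_KEY and AGENT_ID from the environment, it can help to fail fast when one of them is missing. The following is a minimal sketch and not part of the original project: the helper name findMissingEnvVars and the idea of calling it at startup are assumptions made for illustration.

```javascript
// Hypothetical helper (not part of the original project): returns the names
// of required variables that are unset or blank in the given env object.
function findMissingEnvVars(env, required = ['XI_API_KEY', 'AGENT_ID']) {
  return required.filter((name) => {
    const value = env[name];
    return typeof value !== 'string' || value.trim() === '';
  });
}

// Possible usage near the top of backend/server.js, after dotenv.config():
//   const missing = findMissingEnvVars(process.env);
//   if (missing.length > 0) {
//     console.error(`Missing environment variables: ${missing.join(', ')}`);
//     process.exit(1);
//   }
```

Taking the env object as a parameter, rather than reading process.env inside the helper, keeps the check easy to exercise with plain objects.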

Set up webpack.config.js:

const path = require('path');
const CopyPlugin = require('copy-webpack-plugin');

module.exports = {
  entry: './src/app.js',
  output: {
    filename: 'bundle.js',
    path: path.resolve(__dirname, 'dist'),
    publicPath: '/'
  },
  devServer: {
    static: {
      directory: path.join(__dirname, 'dist'),
    },
    port: 8080,
    // Forward /api requests to the Express backend. The array form is
    // accepted by webpack-dev-server 4 and required by version 5.
    proxy: [
      {
        context: ['/api'],
        target: 'http://localhost:3000'
      }
    ],
    hot: true
  },
  plugins: [
    new CopyPlugin({
      patterns: [
        { from: 'src/index.html', to: 'index.html' },
        { from: 'src/styles.css', to: 'styles.css' }
      ],
    }),
  ]
};

​ Frontend Implementation

Let’s build our user interface step by step.

​ Basic UI Structure

First, we create a simple interface in index.html with these key components:

Status indicators for connection and speaking states

Start/End conversation buttons

Here’s the basic HTML structure:

<div class="status-container">
  <div id="connectionStatus" class="status">Disconnected</div>
  <div id="speakingStatus" class="speaking-status">Agent Silent</div>
</div>
<div class="controls">
  <button id="startButton" class="button">Start Conversation</button>
  <button id="endButton" class="button" disabled>End Conversation</button>
</div>

Without any styling, the interface looks basic but functional.

​ Adding Visual Polish

Let’s enhance the UI with CSS to make it more user-friendly and visually appealing. We’re using raw CSS here, but you can use alternatives like Tailwind CSS or shadcn/ui based on your preference.

Key styling highlights:

/* styles.css */
body {
  font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
  margin: 0;
  padding: 20px;
  background-color: #f5f5f5;
}

.container {
  max-width: 800px;
  margin: 0 auto;
  background-color: white;
  padding: 20px;
  border-radius: 10px;
  box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
}

h1 {
  text-align: center;
  color: #333;
}

.status-container {
  display: flex;
  justify-content: center;
  gap: 20px;
  margin-bottom: 20px;
}

.status,
.speaking-status {
  padding: 8px 16px;
  border-radius: 20px;
  font-size: 14px;
}

.status {
  background-color: #ff4444;
  color: white;
}

.status.connected {
  background-color: #00C851;
}

.speaking-status {
  background-color: #eee;
}

.speaking-status.speaking {
  background-color: #33b5e5;
  color: white;
}

.controls {
  display: flex;
  flex-direction: column;
  gap: 10px;
  align-items: center;
  margin-bottom: 20px;
}

.button {
  padding: 10px 20px;
  border: none;
  border-radius: 5px;
  cursor: pointer;
  font-size: 16px;
  transition: background-color 0.3s;
}

.button:disabled {
  opacity: 0.5;
  cursor: not-allowed;
}

#startButton {
  background-color: #00C851;
  color: white;
}

#endButton {
  background-color: #ff4444;
  color: white;
}

This is how the interface looks after styling.

​ Core Application Logic

Now let’s build the application logic piece by piece in app.js:

Microphone Permission Handler

async function requestMicrophonePermission() {
  try {
    await navigator.mediaDevices.getUserMedia({ audio: true });
    return true;
  } catch (error) {
    console.error('Microphone permission denied:', error);
    return false;
  }
}

This function requests microphone access from the user. We need this before starting any conversation to ensure we can capture the user’s voice.
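One detail the handler leaves open is that the stream returned by getUserMedia stays active after the permission check. A possible variant, sketched below, stops the tracks once permission is confirmed. It assumes the SDK opens its own microphone stream when the session starts. The function name and the mediaDevices parameter are hypothetical; the parameter exists only so the function can be exercised outside a browser.

```javascript
// Variant of requestMicrophonePermission that releases the microphone after
// the permission check. `mediaDevices` is a hypothetical parameter added for
// testability; in the browser it falls back to navigator.mediaDevices.
async function requestMicrophonePermissionAndRelease(mediaDevices) {
  const devices = mediaDevices || navigator.mediaDevices;
  try {
    const stream = await devices.getUserMedia({ audio: true });
    // Stop every track so the microphone is not held open until the
    // conversation actually starts.
    stream.getTracks().forEach((track) => track.stop());
    return true;
  } catch (error) {
    console.error('Microphone permission denied:', error);
    return false;
  }
}
```

In the browser the call site stays the same as before: await the function with no arguments and check the boolean result.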

Signed URL Retrieval

async function getSignedUrl() {
  try {
    const response = await fetch('/api/signed-url');
    if (!response.ok) throw new Error('Failed to get signed URL');
    const data = await response.json();
    return data.signedUrl;
  } catch (error) {
    console.error('Error getting signed URL:', error);
    throw error;
  }
}

The signed URL contains a short-lived token that can be used to start the conversation without exposing your ElevenLabs API key to your end users.

Status Management

function updateStatus(isConnected) {
  const statusElement = document.getElementById('connectionStatus');
  statusElement.textContent = isConnected ? 'Connected' : 'Disconnected';
  statusElement.classList.toggle('connected', isConnected);
}

function updateSpeakingStatus(mode) {
  const statusElement = document.getElementById('speakingStatus');
  const isSpeaking = mode.mode === 'speaking';
  statusElement.textContent = isSpeaking ? 'Agent Speaking' : 'Agent Silent';
  statusElement.classList.toggle('speaking', isSpeaking);
}

These functions handle the UI updates for connection and speaking states. They provide visual feedback to users about the current state of the conversation.
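If you want to check the status logic without a browser, the decision about what to display can be separated from the DOM calls. The helpers below are a sketch under that assumption. The names describeConnection and describeSpeaking are hypothetical, and the mode object is assumed to have the same shape the snippets use, for example { mode: 'speaking' }.

```javascript
// Hypothetical pure helpers: compute what the status badges should show,
// without touching the DOM.
function describeConnection(isConnected) {
  return {
    text: isConnected ? 'Connected' : 'Disconnected',
    connected: Boolean(isConnected),
  };
}

function describeSpeaking(mode) {
  // Treat a missing mode object the same as "not speaking".
  const speaking = Boolean(mode) && mode.mode === 'speaking';
  return {
    text: speaking ? 'Agent Speaking' : 'Agent Silent',
    speaking,
  };
}
```

The DOM functions could then set textContent from the text field and pass the boolean to classList.toggle.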

Conversation Initialization

// Place the import at the top of app.js.
import { Conversation } from '@11labs/client';

// Handle to the active session, shared with endConversation().
let conversation = null;

async function startConversation() {
  const hasPermission = await requestMicrophonePermission();
  if (!hasPermission) {
    alert('Microphone permission is required for the conversation.');
    return;
  }

  const signedUrl = await getSignedUrl();

  conversation = await Conversation.startSession({
    signedUrl,
    onConnect: () => {
      updateStatus(true);
      toggleButtons(true);
    },
    onDisconnect: () => {
      updateStatus(false);
      toggleButtons(false);
      updateSpeakingStatus({ mode: 'listening' });
    },
    onError: (error) => {
      console.error('Conversation error:', error);
      alert('An error occurred during the conversation.');
    },
    onModeChange: (mode) => {
      updateSpeakingStatus(mode);
    }
  });
}

This function orchestrates the conversation startup process.
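The startup order is: check microphone permission, fetch a signed URL, then open the session. The sketch below expresses that order with the three steps passed in as functions, and adds a try/catch so that a failed signed-URL request or session start is reported to the user instead of surfacing as an unhandled rejection. The name startConversationSafely and the shape of the deps object are hypothetical and not part of the SDK.

```javascript
// Sketch of the startup order with injected steps (all names hypothetical).
// Resolves to the session object, or null if any step fails.
async function startConversationSafely(deps) {
  const { requestPermission, getSignedUrl, startSession, notify } = deps;

  // 1. The microphone must be available before anything else happens.
  if (!(await requestPermission())) {
    notify('Microphone permission is required for the conversation.');
    return null;
  }

  try {
    // 2. Ask the backend for a fresh signed URL.
    const signedUrl = await getSignedUrl();
    // 3. Open the session using that URL.
    return await startSession({ signedUrl });
  } catch (error) {
    console.error('Failed to start conversation:', error);
    notify('Could not start the conversation. Please try again.');
    return null;
  }
}
```

In app.js, requestPermission would be requestMicrophonePermission, startSession would wrap Conversation.startSession together with the callbacks, and notify could be a small function that calls alert.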

Conversation Control

async function endConversation() {
  if (conversation) {
    await conversation.endSession();
    conversation = null;
  }
}

A simple function to properly end the conversation session and clean up resources.
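The snippets call toggleButtons and rely on the Start and End buttons being connected to startConversation and endConversation, but neither piece is shown. The sketch below is one plausible way to fill that in, not necessarily the original implementation. The element IDs come from index.html; wireControls and the doc parameter are hypothetical additions, the latter so the helpers can be exercised without a browser.

```javascript
// One possible toggleButtons: while connected, only "End" is clickable;
// while disconnected, only "Start" is.
function toggleButtons(isConnected, doc) {
  const root = doc || document;
  root.getElementById('startButton').disabled = isConnected;
  root.getElementById('endButton').disabled = !isConnected;
}

// Attach the click handlers to the two buttons.
function wireControls(handlers, doc) {
  const root = doc || document;
  root.getElementById('startButton').addEventListener('click', handlers.start);
  root.getElementById('endButton').addEventListener('click', handlers.end);
}

// In app.js this could be called once at load time:
//   wireControls({ start: startConversation, end: endConversation });
```

Because the bundle has to run after the buttons exist, the script would need to be loaded at the end of the body or the call deferred until the DOM is ready.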

​ Backend Implementation

The backend server handles secure communication with the ElevenLabs API and serves your frontend application. Let’s build it step by step:

​ 1. Setting Up Dependencies

First, we need to import our required packages:

const express = require('express');
const cors = require('cors');
const dotenv = require('dotenv');
const path = require('path');

dotenv.config();

These imports provide:

express: Web framework for handling HTTP requests

cors: Middleware for enabling Cross-Origin Resource Sharing

dotenv: Loads environment variables from the .env file

path: Utilities for handling file paths

​ 2. Configuring Express Server

const app = express();

app.use(cors());
app.use(express.json());
app.use(express.static(path.join(__dirname, '../dist')));

This configuration:

Creates an Express application

Enables CORS for development (allows frontend to communicate with backend)

Adds JSON parsing middleware for handling request bodies

Serves the frontend files from the ‘dist’ directory

​ 3. Creating the Signed URL Endpoint

// Uses the global fetch API, which is available in Node.js 18 and later.
app.get('/api/signed-url', async (req, res) => {
  try {
    const response = await fetch(
      `https://api.elevenlabs.io/v1/convai/conversation/get_signed_url?agent_id=${process.env.AGENT_ID}`,
      {
        method: 'GET',
        headers: {
          'xi-api-key': process.env.XI_API_KEY,
        }
      }
    );

    if (!response.ok) {
      throw new Error('Failed to get signed URL');
    }

    const data = await response.json();
    res.json({ signedUrl: data.signed_url });
  } catch (error) {
    console.error('Error:', error);
    res.status(500).json({ error: 'Failed to get signed URL' });
  }
});

The signed URL is essential because:

It provides secure, temporary access to the conversational API

Keeps your API key secure on the backend

Allows the frontend to connect without exposing sensitive credentials
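The endpoint interpolates AGENT_ID directly into the query string. As a small hardening sketch, the request URL and headers can be built with the standard URL API so the agent ID is always encoded. The helper name buildSignedUrlRequest is hypothetical; the endpoint path and the xi-api-key header are the ones used in the route.

```javascript
// Hypothetical helper: builds the URL and fetch options for the signed-URL
// request. The endpoint and header name are taken from the route handler.
function buildSignedUrlRequest(agentId, apiKey) {
  const url = new URL(
    'https://api.elevenlabs.io/v1/convai/conversation/get_signed_url'
  );
  // searchParams.set() encodes the value, so unusual characters are safe.
  url.searchParams.set('agent_id', agentId);
  return {
    url: url.toString(),
    options: { method: 'GET', headers: { 'xi-api-key': apiKey } },
  };
}

// Possible usage inside the route handler:
//   const { url, options } = buildSignedUrlRequest(
//     process.env.AGENT_ID, process.env.XI_API_KEY);
//   const response = await fetch(url, options);
```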

​ 4. Adding a Catch-all Route

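// Note: this is the Express 4 syntax. Express 5 no longer accepts a bare '*'
// path and requires a named wildcard such as '/*splat'.
app.get('*', (req, res) => {
  res.sendFile(path.join(__dirname, '../dist/index.html'));
});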

This route:

Catches any unmatched routes

Serves the main index.html file

Enables client-side routing in your single-page application
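Because the catch-all matches every GET path, a mistyped API path such as /api/unknown would also receive index.html. The sketch below shows one way to write the fallback as plain middleware that skips non-GET requests and anything under /api. The name createSpaFallback and the /api prefix check are assumptions for illustration; the code relies only on req.method, req.path, res.sendFile, and next, which Express provides.

```javascript
// Hypothetical fallback middleware: serves index.html for page requests and
// passes everything else on to the next handler.
function createSpaFallback(indexPath) {
  return function spaFallback(req, res, next) {
    const isPageRequest =
      req.method === 'GET' && !req.path.startsWith('/api/');
    if (!isPageRequest) {
      return next();
    }
    res.sendFile(indexPath);
  };
}

// Registered last, after the API routes:
//   app.use(createSpaFallback(path.join(__dirname, '../dist/index.html')));
```

Registering the handler with app.use and no path avoids the wildcard route string altogether.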

​ 5. Starting the Server

const PORT = process.env.PORT || 3000;

app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`);
});

This final step:

Sets the server port (from environment variables or defaults to 3000)

Starts the Express server

Logs a confirmation message when the server is running

​ Running the Application

Add the following scripts to your package.json:

{
  "scripts": {
    "start:backend": "node backend/server.js",
    "build": "webpack --mode production",
    "dev": "concurrently \"npm run start:backend\" \"webpack serve --mode development\"",
    "start": "npm run build && npm run start:backend"
  }
}

Start the development server:

npm run dev

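With the configuration in this guide, the webpack dev server serves the frontend at http://localhost:8080 and the Express backend listens on http://localhost:3000 (or on the PORT set in .env).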

​ Key Features

Microphone Permission Handling: The application requests and verifies microphone access before starting a conversation.

Connection Status Display: Visual indicators show whether the application is connected and if the AI agent is speaking.

Error Handling: The application includes error handling for both the frontend and backend.

​ Important Notes

Make sure to keep your API key secure and never expose it in the frontend code

The application requires HTTPS in production for microphone access

Always handle errors appropriately to provide a good user experience

Test microphone permissions and audio playback thoroughly