Kotlin SDK

Conversational AI SDK: deploy customized, interactive voice agents in minutes for Android apps.

Refer to the Conversational AI overview for an explanation of how Conversational AI works.

Installation

Add the ElevenLabs SDK to your Android project by including the following dependency in your app-level build.gradle file:

build.gradle.kts

```kotlin
dependencies {
    // ElevenLabs Conversational AI SDK (Android)
    implementation("io.elevenlabs:elevenlabs-android:<latest>")

    // Kotlin coroutines, AndroidX, etc., as needed by your app
}
```

An example Android app using this SDK can be found in the ElevenLabs Android SDK repository.

Requirements

  • Android API level 21 (Android 5.0) or higher
  • Internet permission for API calls
  • Microphone permission for voice input
  • Network security configuration for HTTPS calls

Setup

Manifest Configuration

Add the necessary permissions to your AndroidManifest.xml:

```xml
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
```

Runtime Permissions

For Android 6.0 (API level 23) and higher, you must request microphone permission at runtime:

```kotlin
import android.Manifest
import android.content.pm.PackageManager
import androidx.core.app.ActivityCompat
import androidx.core.content.ContextCompat

// Request code used to identify this permission request in onRequestPermissionsResult
private const val MICROPHONE_PERMISSION_REQUEST_CODE = 1001

private fun requestMicrophonePermission() {
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
        != PackageManager.PERMISSION_GRANTED
    ) {
        if (ActivityCompat.shouldShowRequestPermissionRationale(this, Manifest.permission.RECORD_AUDIO)) {
            // Show an explanation to the user, then request the permission again
            showPermissionExplanationDialog()
        } else {
            ActivityCompat.requestPermissions(
                this,
                arrayOf(Manifest.permission.RECORD_AUDIO),
                MICROPHONE_PERMISSION_REQUEST_CODE
            )
        }
    }
}
```

Network Security Configuration

For apps targeting Android 9 (API level 28) or higher, cleartext traffic is blocked by default. If your app needs it, allow it explicitly in your network security configuration:

AndroidManifest.xml

```xml
<application
    android:networkSecurityConfig="@xml/network_security_config"
    ... >
```

res/xml/network_security_config.xml

```xml
<?xml version="1.0" encoding="utf-8"?>
<network-security-config>
    <domain-config cleartextTrafficPermitted="true">
        <domain includeSubdomains="true">your-api-domain.com</domain>
    </domain-config>
</network-security-config>
```

Usage

Initialize the ElevenLabs SDK in your Application class or main activity, then start a conversation session with either:

  • a public agent: pass agentId, or
  • a private agent: pass a conversationToken provisioned by your backend (never expose your API key to the client).
```kotlin
import io.elevenlabs.ConversationClient
import io.elevenlabs.ConversationConfig
import io.elevenlabs.ConversationSession

// Start a public agent session (a conversation token is generated for you)
val config = ConversationConfig(
    agentId = "<your_public_agent_id>" // OR conversationToken = "<token>"
)

// In an Activity context
val session: ConversationSession = ConversationClient.startSession(config, this)
```

Note that Conversational AI requires microphone access. Consider explaining and requesting permissions in your app’s UI before the conversation starts, especially on Android 6.0+ where runtime permissions are required.
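The either/or rule above can be checked before the config is built. The helper below is a hypothetical sketch, not part of the SDK; it only encodes the rule that exactly one of agentId or conversationToken should be supplied:

```kotlin
// Hypothetical pre-flight check (not part of the ElevenLabs SDK):
// exactly one of agentId / conversationToken should be provided.
fun validateAuthChoice(agentId: String?, conversationToken: String?): Boolean {
    val hasAgentId = !agentId.isNullOrBlank()
    val hasToken = !conversationToken.isNullOrBlank()
    return hasAgentId != hasToken // exactly one must be set
}
```

For example, `validateAuthChoice("agent_123", null)` passes (public agent), while passing both or neither fails.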

Callbacks

The ConversationConfig can be configured with callbacks to handle conversation events:

```kotlin
import android.util.Log
import io.elevenlabs.ClientTool
import io.elevenlabs.ClientToolResult
import io.elevenlabs.ConversationConfig

val config = ConversationConfig(
    agentId = "<your_public_agent_id>", // OR conversationToken = "<token>"
    userId = "your-user-id",
    // Optional callbacks
    onConnect = { conversationId ->
        // Called when the conversation connects; conversationId is also available via session.getId()
    },
    onMessage = { source, messageJson ->
        // Raw JSON messages from the data channel; useful for logging/telemetry
    },
    onModeChange = { mode ->
        // "speaking" | "listening": drive UI indicators
    },
    onStatusChange = { status ->
        // "connected" | "connecting" | "disconnected"
    },
    onCanSendFeedbackChange = { canSend ->
        // Enable/disable thumbs up/down buttons for feedback reporting
    },
    onUnhandledClientToolCall = { call ->
        // The agent requested a client tool that is not registered on the device
    },
    onVadScore = { score ->
        // Voice Activity Detection score in [0, 1]; higher values indicate higher confidence of speech
    },
    // Client tools the agent can invoke
    clientTools = mapOf(
        // Example client tool that logs a message to the console.
        // Client tools must also be registered on the agent in the ElevenLabs dashboard.
        "logMessage" to object : ClientTool {
            override suspend fun execute(parameters: Map<String, Any>): ClientToolResult {
                val message = parameters["message"] as? String
                Log.d("ExampleApp", "[INFO] Client Tool Log: $message")
                return ClientToolResult.success("Message logged successfully")
            }
        }
    )
)
```
  • onConnect - Called when the WebRTC connection is established.
  • onMessage - Called when a new message is received. These can be tentative or final transcriptions of the user's voice, replies produced by the LLM, or debug messages.
  • onModeChange - Called when the conversation mode changes. This is useful for indicating whether the agent is speaking or listening.
  • onStatusChange - Called when the conversation status changes.
  • onCanSendFeedbackChange - Called when the ability to send feedback changes.
  • onUnhandledClientToolCall - Called when the agent requests a client tool that is not registered on the device.
  • onVadScore - Called when the voice activity detection score changes.

Not all client events are enabled by default for an agent. If you have enabled a callback but aren’t seeing events come through, ensure that your Conversational AI agent has the corresponding event enabled. You can do this in the “Advanced” tab of the agent settings in the ElevenLabs dashboard.
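As a sketch of how onVadScore might drive a speaking indicator, the raw score can be filtered with hysteresis so the UI does not flicker when the score hovers near a single threshold. The threshold values below are illustrative, not SDK defaults:

```kotlin
// Illustrative hysteresis filter for VAD scores (thresholds are made up, not SDK defaults).
// Enter the "speaking" state above 0.7; leave it only below 0.3.
class VadIndicator(
    private val enterThreshold: Double = 0.7,
    private val exitThreshold: Double = 0.3,
) {
    var isSpeaking: Boolean = false
        private set

    fun onVadScore(score: Double): Boolean {
        isSpeaking = when {
            score >= enterThreshold -> true
            score <= exitThreshold -> false
            else -> isSpeaking // in the dead band, keep the previous state
        }
        return isSpeaking
    }
}
```

Forwarding each score from the onVadScore callback into this filter yields a stable boolean for the indicator view.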

Methods

startSession

The startSession method initiates the WebRTC connection and starts using the microphone to communicate with the ElevenLabs Conversational AI agent.

Public agents

For public agents (i.e. agents that don’t have authentication enabled), only the agentId is required. The Agent ID can be acquired through the ElevenLabs UI.

```kotlin
val session = ConversationClient.startSession(
    config = ConversationConfig(
        agentId = "your-agent-id"
    ),
    context = this
)
```

Private agents

For private agents, you must pass in a conversationToken obtained from the ElevenLabs API. Generating this token requires an ElevenLabs API key.

The conversationToken is valid for 10 minutes.

```javascript
// Server-side token generation (Node.js example)

app.get('/conversation-token', yourAuthMiddleware, async (req, res) => {
  const response = await fetch(
    `https://api.elevenlabs.io/v1/convai/conversation/token?agent_id=${process.env.AGENT_ID}`,
    {
      headers: {
        // Requesting a conversation token requires your ElevenLabs API key.
        // Do NOT expose your API key to the client!
        'xi-api-key': process.env.ELEVENLABS_API_KEY,
      },
    }
  );

  if (!response.ok) {
    return res.status(500).send('Failed to get conversation token');
  }

  const body = await response.json();
  res.send(body.token);
});
```

Then, pass the token to the startSession method. Note that only the conversationToken is required for private agents.

```kotlin
// Get a conversation token from your server
val conversationToken = fetchConversationTokenFromServer()

// For private agents, pass in the conversation token
val session = ConversationClient.startSession(
    config = ConversationConfig(
        conversationToken = conversationToken
    ),
    context = this
)
```

You can optionally pass a user ID to identify the user in the conversation. This can be your own customer identifier. This will be included in the conversation initiation data sent to the server.

```kotlin
val session = ConversationClient.startSession(
    config = ConversationConfig(
        agentId = "your-agent-id",
        userId = "your-user-id"
    ),
    context = this
)
```

endSession

Manually ends the conversation and disconnects the session.

```kotlin
session.endSession()
```

sendUserMessage

Send a text message to the agent during an active conversation. This will trigger a response from the agent.

```kotlin
session.sendUserMessage("Hello, how can you help me?")
```

sendContextualUpdate

Sends contextual information to the agent that won’t trigger a response.

```kotlin
session.sendContextualUpdate(
    "User navigated to the profile page. Consider this for next response."
)
```
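A lightweight pattern is to turn in-app events into short natural-language updates before passing them to sendContextualUpdate. The formatter below is a hypothetical helper, not part of the SDK:

```kotlin
// Hypothetical helper (not part of the ElevenLabs SDK): turn app navigation
// events into short contextual updates suitable for sendContextualUpdate.
fun contextualUpdateForNavigation(screen: String): String =
    "User navigated to the $screen screen. Consider this for the next response."
```

For example, `session.sendContextualUpdate(contextualUpdateForNavigation("profile"))` keeps the phrasing of updates consistent across the app.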

sendFeedback

Provide feedback on the conversation quality. This helps improve the agent’s performance.

```kotlin
// Positive feedback
session.sendFeedback(true)

// Negative feedback
session.sendFeedback(false)
```
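Feedback is only accepted at certain points in the conversation, as reported by the onCanSendFeedbackChange callback. A small state holder, hypothetical and not part of the SDK, can gate the feedback buttons accordingly:

```kotlin
// Hypothetical gate (not part of the ElevenLabs SDK): only forward feedback
// while the session currently accepts it, per onCanSendFeedbackChange.
class FeedbackGate {
    private var canSend = false

    // Wire this to the onCanSendFeedbackChange callback
    fun onCanSendFeedbackChange(canSendNow: Boolean) {
        canSend = canSendNow
    }

    /** Returns true if a tap should be forwarded to session.sendFeedback(positive). */
    fun shouldForward(): Boolean = canSend
}
```

The same boolean can also drive the enabled state of the thumbs up/down buttons in the UI.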

sendUserActivity

Notifies the agent about user activity to prevent it from interrupting the user. Useful when the user is actively using the app, e.g. typing in a chat.

The agent will pause speaking for ~2 seconds after receiving this signal.

```kotlin
session.sendUserActivity()
```
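Since each signal pauses the agent for roughly two seconds, an app that reports typing activity should resend the signal periodically rather than on every keystroke. The throttle below is an illustrative sketch, not part of the SDK; the clock is injected so the logic is testable:

```kotlin
// Illustrative throttle (not part of the ElevenLabs SDK): while the user is
// typing, forward the activity signal at most once per interval instead of
// on every keystroke.
class UserActivityThrottle(
    private val intervalMs: Long = 1500,
    private val now: () -> Long = System::currentTimeMillis,
) {
    private var lastSentAt: Long? = null

    /** Returns true when the caller should invoke session.sendUserActivity(). */
    fun shouldSend(): Boolean {
        val t = now()
        val last = lastSentAt
        if (last == null || t - last >= intervalMs) {
            lastSentAt = t
            return true
        }
        return false
    }
}
```

In a text-change listener: `if (throttle.shouldSend()) session.sendUserActivity()`.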

Mute / Unmute

```kotlin
session.toggleMute()
session.setMicMuted(true)  // mute
session.setMicMuted(false) // unmute
```

Observe session.isMuted to update the UI label between “Mute” and “Unmute”.

Properties

status

Get the current status of the conversation.

```kotlin
val status = session.status
Log.d("Conversation", "Current status: $status")
// Values: DISCONNECTED, CONNECTING, CONNECTED
```
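One common use of the status is driving user-facing text on a call button. The sketch below mirrors the documented values in a local enum purely for illustration (the SDK exposes its own status type); the labels are arbitrary:

```kotlin
// Local stand-in for the documented status values (the SDK has its own type;
// this enum exists only so the example is self-contained).
enum class ConversationStatus { DISCONNECTED, CONNECTING, CONNECTED }

// Map a status to user-facing button text with an exhaustive `when`.
fun statusLabel(status: ConversationStatus): String = when (status) {
    ConversationStatus.DISCONNECTED -> "Start call"
    ConversationStatus.CONNECTING -> "Connecting..."
    ConversationStatus.CONNECTED -> "End call"
}
```

Because the `when` is exhaustive over the enum, adding a new status later becomes a compile-time error instead of a silent UI bug.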

Example Implementation

For an example implementation, see the example app in the ElevenLabs Android SDK repository.