Transcription Telegram Bot

Build a Telegram bot that transcribes audio and video messages in 90+ languages using TypeScript with Deno in Supabase Edge Functions.

How-to guide · Assumes you have completed the Speech to Text quickstart and have a Telegram bot token and a Supabase account.

Introduction

In this tutorial you will learn how to build a Telegram bot that transcribes audio and video messages in 90+ languages using TypeScript and the ElevenLabs Scribe model via the speech-to-text API.

Requirements

  • An ElevenLabs account with an API key.
  • A Supabase account (you can sign up for a free account via database.new).
  • The Supabase CLI installed on your machine.
  • The Deno runtime installed on your machine and, optionally, set up in your favourite IDE.
  • A Telegram account.

Setup

Register a Telegram bot

Use the BotFather to create a new Telegram bot. Run the /newbot command and follow the instructions to create a new bot. At the end, you will receive your secret bot token. Note it down securely for the next step.

Create a Supabase project locally

After installing the Supabase CLI, run the following command to create a new Supabase project locally:

$ supabase init

Create a database table to log the transcription results

Next, create a new database table to log the transcription results:

$ supabase migrations new init

This will create a new migration file in the supabase/migrations directory. Open the file and add the following SQL:

supabase/migrations/init.sql
CREATE TABLE IF NOT EXISTS transcription_logs (
  id BIGSERIAL PRIMARY KEY,
  file_type VARCHAR NOT NULL,
  duration INTEGER NOT NULL,
  chat_id BIGINT NOT NULL,
  message_id BIGINT NOT NULL,
  username VARCHAR,
  transcript TEXT,
  language_code VARCHAR,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT CURRENT_TIMESTAMP,
  error TEXT
);

ALTER TABLE transcription_logs ENABLE ROW LEVEL SECURITY;
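
The columns above map one-to-one onto the object the Edge Function inserts later in this guide. As a rough sketch, that shape can be written down as a TypeScript type so typos in column names surface at compile time. The names TranscriptionLogInsert and buildLogRow are hypothetical and not part of the guide's code, and turning an empty username into NULL is an assumption made for illustration. Because row level security is enabled without any policies, only the service role key used by the function can read or write this table.

```typescript
// Hypothetical type mirroring the transcription_logs columns.
// id and created_at are omitted because the database fills them in.
interface TranscriptionLogInsert {
  file_type: string;
  duration: number;
  chat_id: number;
  message_id: number;
  username: string | null;
  transcript: string | null;
  language_code: string | null;
  error: string | null;
}

// Hypothetical helper: converts camelCase values into a row for insertion.
function buildLogRow(input: {
  fileType: string;
  duration: number;
  chatId: number;
  messageId: number;
  username: string;
  transcript: string | null;
  languageCode: string | null;
  error: string | null;
}): TranscriptionLogInsert {
  return {
    file_type: input.fileType,
    duration: input.duration,
    chat_id: input.chatId,
    message_id: input.messageId,
    // Assumption: store NULL rather than an empty string when no username exists.
    username: input.username === "" ? null : input.username,
    transcript: input.transcript,
    language_code: input.languageCode,
    error: input.error,
  };
}

const exampleRow = buildLogRow({
  fileType: "audio/ogg",
  duration: 12,
  chatId: 42,
  messageId: 7,
  username: "",
  transcript: "hello",
  languageCode: "en",
  error: null,
});
console.log(exampleRow);
```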

Create a Supabase Edge Function to handle Telegram webhook requests

Next, create a new Edge Function to handle Telegram webhook requests:

$ supabase functions new scribe-bot

If you’re using VS Code or Cursor, select y when the CLI prompts “Generate VS Code settings for Deno? [y/N]”.

Set up the environment variables

Within the supabase/functions directory, create a new .env file and add the following variables:

supabase/functions/.env
# Find / create an API key at https://elevenlabs.io/app/settings/api-keys
ELEVENLABS_API_KEY=your_api_key

# The bot token you received from the BotFather.
TELEGRAM_BOT_TOKEN=your_bot_token

# A random secret chosen by you to secure the function.
FUNCTION_SECRET=random_secret
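
The function code in this guide falls back to an empty string when a variable is missing, which can hide a misconfigured deployment until the first request fails. A minimal sketch of a stricter alternative follows. The requireEnv helper is hypothetical and not part of the guide's code; in an Edge Function the lookup argument would be Deno.env.get, while a plain object stands in for it here so the snippet runs anywhere.

```typescript
// Hypothetical helper: reads a required variable through any lookup function
// and throws when the variable is missing or blank.
function requireEnv(
  name: string,
  lookup: (key: string) => string | undefined,
): string {
  const value = lookup(name);
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Stand-in for the real environment (assumption: placeholder values only).
const fakeEnv: Record<string, string | undefined> = {
  ELEVENLABS_API_KEY: "your_api_key",
  TELEGRAM_BOT_TOKEN: "your_bot_token",
  FUNCTION_SECRET: "random_secret",
};
const lookup = (key: string): string | undefined => fakeEnv[key];

const apiKey = requireEnv("ELEVENLABS_API_KEY", lookup);
console.log(`Loaded an API key of length ${apiKey.length}`);
```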

Dependencies

The project uses a couple of dependencies:

  • The open-source grammY Framework to handle the Telegram webhook requests.
  • The @supabase/supabase-js library to interact with the Supabase database.
  • The ElevenLabs JavaScript SDK to interact with the speech-to-text API.

Since Supabase Edge Functions use the Deno runtime, you don’t need to install the dependencies. Instead, you can import them directly by URL or via the jsr: and npm: prefixes.

Code the Telegram Bot

In your newly created scribe-bot/index.ts file, add the following code:

supabase/functions/scribe-bot/index.ts
import { Bot, webhookCallback } from "https://deno.land/x/grammy@v1.34.0/mod.ts";
import "jsr:@supabase/functions-js/edge-runtime.d.ts";
import { createClient } from "jsr:@supabase/supabase-js@2";
import { ElevenLabsClient } from "npm:elevenlabs@1.50.5";

console.log(`Function "elevenlabs-scribe-bot" up and running!`);

const elevenlabs = new ElevenLabsClient({
  apiKey: Deno.env.get("ELEVENLABS_API_KEY") || "",
});

const supabase = createClient(
  Deno.env.get("SUPABASE_URL") || "",
  Deno.env.get("SUPABASE_SERVICE_ROLE_KEY") || ""
);

// Downloads the file from Telegram, transcribes it, replies to the user,
// and logs the outcome to the transcription_logs table.
async function scribe({
  fileURL,
  fileType,
  duration,
  chatId,
  messageId,
  username,
}: {
  fileURL: string;
  fileType: string;
  duration: number;
  chatId: number;
  messageId: number;
  username: string;
}) {
  let transcript: string | null = null;
  let languageCode: string | null = null;
  let errorMsg: string | null = null;
  try {
    const sourceFileArrayBuffer = await fetch(fileURL).then((res) => res.arrayBuffer());
    const sourceBlob = new Blob([sourceFileArrayBuffer], {
      type: fileType,
    });

    const scribeResult = await elevenlabs.speechToText.convert({
      file: sourceBlob,
      model_id: "scribe_v2",
      tag_audio_events: false,
    });

    transcript = scribeResult.text;
    languageCode = scribeResult.language_code;

    // Reply to the user with the transcript
    await bot.api.sendMessage(chatId, transcript, {
      reply_parameters: { message_id: messageId },
    });
  } catch (error) {
    // A caught value is `unknown` in strict TypeScript, so narrow it before reading `.message`.
    errorMsg = error instanceof Error ? error.message : String(error);
    console.log(errorMsg);
    await bot.api.sendMessage(chatId, "Sorry, there was an error. Please try again.", {
      reply_parameters: { message_id: messageId },
    });
  }
  // Write log to Supabase.
  const logLine = {
    file_type: fileType,
    duration,
    chat_id: chatId,
    message_id: messageId,
    username,
    language_code: languageCode,
    error: errorMsg,
  };
  console.log({ logLine });
  await supabase.from("transcription_logs").insert({ ...logLine, transcript });
}

const telegramBotToken = Deno.env.get("TELEGRAM_BOT_TOKEN");
const bot = new Bot(telegramBotToken || "");
const startMessage = `Welcome to the ElevenLabs Scribe Bot\\! I can transcribe speech in 90\\+ languages with super high accuracy\\!
  \nTry it out by sending or forwarding me a voice message, video, or audio file\\!
  \n[Learn more about Scribe](https://elevenlabs.io/speech-to-text) or [build your own bot](https://elevenlabs.io/developers/guides/cookbooks/speech-to-text/telegram-bot)\\!
  `;
bot.command("start", (ctx) => ctx.reply(startMessage.trim(), { parse_mode: "MarkdownV2" }));

bot.on([":voice", ":audio", ":video"], async (ctx) => {
  try {
    const file = await ctx.getFile();
    const fileURL = `https://api.telegram.org/file/bot${telegramBotToken}/${file.file_path}`;
    const fileMeta = ctx.message?.video ?? ctx.message?.voice ?? ctx.message?.audio;

    if (!fileMeta) {
      return ctx.reply("No video|audio|voice metadata found. Please try again.");
    }

    // Run the transcription in the background.
    EdgeRuntime.waitUntil(
      scribe({
        fileURL,
        fileType: fileMeta.mime_type!,
        duration: fileMeta.duration,
        chatId: ctx.chat.id,
        messageId: ctx.message?.message_id!,
        username: ctx.from?.username || "",
      })
    );

    // Reply to the user immediately to let them know we received their file.
    return ctx.reply("Received. Scribing...");
  } catch (error) {
    console.error(error);
    return ctx.reply(
      "Sorry, there was an error getting the file. Please try again with a smaller file!"
    );
  }
});

const handleUpdate = webhookCallback(bot, "std/http");

Deno.serve(async (req) => {
  try {
    const url = new URL(req.url);
    if (url.searchParams.get("secret") !== Deno.env.get("FUNCTION_SECRET")) {
      return new Response("not allowed", { status: 405 });
    }

    return await handleUpdate(req);
  } catch (err) {
    console.error(err);
    // Return an explicit Response so the handler never resolves to undefined.
    return new Response("internal error", { status: 500 });
  }
});

Code deep dive

There are a couple of things worth noting about the code. Let’s walk through it step by step.

1. Handling the incoming request

To handle the incoming request, use the Deno.serve handler. The handler checks whether the request has the correct secret and then passes the request to the handleUpdate function.

const handleUpdate = webhookCallback(bot, 'std/http');

Deno.serve(async (req) => {
  try {
    const url = new URL(req.url);
    if (url.searchParams.get('secret') !== Deno.env.get('FUNCTION_SECRET')) {
      return new Response('not allowed', { status: 405 });
    }

    return await handleUpdate(req);
  } catch (err) {
    console.error(err);
    // Return an explicit Response so the handler never resolves to undefined.
    return new Response('internal error', { status: 500 });
  }
});
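
The secret check can also be pulled out into a small pure function, which makes it easy to test without starting a server. The sketch below is one possible variant, not the guide's implementation: isAuthorized and safeEqual are hypothetical names, and the comparison that avoids returning early on the first mismatch is an optional hardening step added here as an assumption.

```typescript
// Hypothetical helper: compares two strings without exiting early on the
// first differing byte.
function safeEqual(a: string, b: string): boolean {
  const encoder = new TextEncoder();
  const x = encoder.encode(a);
  const y = encoder.encode(b);
  let diff = x.length ^ y.length;
  const length = Math.max(x.length, y.length);
  for (let i = 0; i < length; i++) {
    diff |= (x[i] ?? 0) ^ (y[i] ?? 0);
  }
  return diff === 0;
}

// Hypothetical helper: true only when the request URL carries the expected
// `secret` query parameter.
function isAuthorized(requestUrl: string, expectedSecret: string | undefined): boolean {
  // An unset or empty secret must never authorize a request.
  if (!expectedSecret) return false;
  const provided = new URL(requestUrl).searchParams.get("secret");
  return provided !== null && safeEqual(provided, expectedSecret);
}

console.log(isAuthorized("https://example.com/scribe-bot?secret=random_secret", "random_secret"));
console.log(isAuthorized("https://example.com/scribe-bot", "random_secret"));
```

Inside the Deno.serve handler, a call such as isAuthorized(req.url, Deno.env.get("FUNCTION_SECRET")) would take the place of the inline comparison.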
2. Handle voice, audio, and video messages

The grammY framework provides a convenient way to filter for specific message types. In this case, the bot is listening for voice, audio, and video messages.

Using the request context, the bot extracts the file metadata and then uses Supabase Background Tasks (EdgeRuntime.waitUntil) to run the transcription in the background.

This way you can provide an immediate response to the user and handle the transcription of the file in the background.

bot.on([':voice', ':audio', ':video'], async (ctx) => {
  try {
    const file = await ctx.getFile();
    const fileURL = `https://api.telegram.org/file/bot${telegramBotToken}/${file.file_path}`;
    const fileMeta = ctx.message?.video ?? ctx.message?.voice ?? ctx.message?.audio;

    if (!fileMeta) {
      return ctx.reply('No video|audio|voice metadata found. Please try again.');
    }

    // Run the transcription in the background.
    EdgeRuntime.waitUntil(
      scribe({
        fileURL,
        fileType: fileMeta.mime_type!,
        duration: fileMeta.duration,
        chatId: ctx.chat.id,
        messageId: ctx.message?.message_id!,
        username: ctx.from?.username || '',
      })
    );

    // Reply to the user immediately to let them know we received their file.
    return ctx.reply('Received. Scribing...');
  } catch (error) {
    console.error(error);
    return ctx.reply(
      'Sorry, there was an error getting the file. Please try again with a smaller file!'
    );
  }
});
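
The handler asserts that mime_type is present with the ! operator, although Telegram marks that field as optional on voice, audio, and video objects. A minimal sketch of a more defensive metadata lookup follows. The pickFileMeta helper, the simplified MediaMeta and MediaMessage types, and the fallback MIME types are all assumptions made for illustration rather than part of the guide's code.

```typescript
// Simplified stand-ins for the Telegram media objects (assumption: only the
// two fields this guide reads are modelled).
interface MediaMeta {
  duration: number;
  mime_type?: string;
}
interface MediaMessage {
  video?: MediaMeta;
  voice?: MediaMeta;
  audio?: MediaMeta;
}

// Hypothetical helper: picks the first available media object in the same
// order as the handler (video, voice, audio) and fills in a default MIME
// type when Telegram did not send one.
function pickFileMeta(
  message: MediaMessage | undefined,
): { duration: number; mimeType: string } | null {
  if (!message) return null;
  const candidates: Array<[MediaMeta | undefined, string]> = [
    [message.video, "video/mp4"], // assumed default
    [message.voice, "audio/ogg"], // assumed default
    [message.audio, "audio/mpeg"], // assumed default
  ];
  for (const [meta, fallback] of candidates) {
    if (meta) {
      return { duration: meta.duration, mimeType: meta.mime_type ?? fallback };
    }
  }
  return null;
}

console.log(pickFileMeta({ voice: { duration: 3 } }));
```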
3. Transcription with the ElevenLabs API

Finally, in the background worker, the bot uses the ElevenLabs JavaScript SDK to transcribe the file. Once the transcription is complete, the bot replies to the user with the transcript and writes a log entry to the Supabase database using supabase-js.

const elevenlabs = new ElevenLabsClient({
  apiKey: Deno.env.get('ELEVENLABS_API_KEY') || '',
});

const supabase = createClient(
  Deno.env.get('SUPABASE_URL') || '',
  Deno.env.get('SUPABASE_SERVICE_ROLE_KEY') || ''
);

async function scribe({
  fileURL,
  fileType,
  duration,
  chatId,
  messageId,
  username,
}: {
  fileURL: string;
  fileType: string;
  duration: number;
  chatId: number;
  messageId: number;
  username: string;
}) {
  let transcript: string | null = null;
  let languageCode: string | null = null;
  let errorMsg: string | null = null;
  try {
    const sourceFileArrayBuffer = await fetch(fileURL).then((res) => res.arrayBuffer());
    const sourceBlob = new Blob([sourceFileArrayBuffer], {
      type: fileType,
    });

    const scribeResult = await elevenlabs.speechToText.convert({
      file: sourceBlob,
      model_id: 'scribe_v2',
      tag_audio_events: false,
    });

    transcript = scribeResult.text;
    languageCode = scribeResult.language_code;

    // Reply to the user with the transcript
    await bot.api.sendMessage(chatId, transcript, {
      reply_parameters: { message_id: messageId },
    });
  } catch (error) {
    // A caught value is `unknown` in strict TypeScript, so narrow it before reading `.message`.
    errorMsg = error instanceof Error ? error.message : String(error);
    console.log(errorMsg);
    await bot.api.sendMessage(chatId, 'Sorry, there was an error. Please try again.', {
      reply_parameters: { message_id: messageId },
    });
  }
  // Write log to Supabase.
  const logLine = {
    file_type: fileType,
    duration,
    chat_id: chatId,
    message_id: messageId,
    username,
    language_code: languageCode,
    error: errorMsg,
  };
  console.log({ logLine });
  await supabase.from('transcription_logs').insert({ ...logLine, transcript });
}
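
Telegram limits a single text message to 4096 characters, so a long transcript sent in one sendMessage call would be rejected and end up in the catch branch. One way to handle this is to split the transcript before replying. The sketch below is a hypothetical helper, not part of the guide's code; it measures length with JavaScript's string length (UTF-16 code units), which is a simplification, and a hard cut may split a multi-unit character.

```typescript
const TELEGRAM_MAX_MESSAGE_LENGTH = 4096;

// Hypothetical helper: splits text into chunks no longer than `limit`,
// preferring to break at the last space inside each chunk.
function splitForTelegram(
  text: string,
  limit: number = TELEGRAM_MAX_MESSAGE_LENGTH,
): string[] {
  if (limit <= 0) throw new RangeError("limit must be positive");
  const chunks: string[] = [];
  let rest = text.trim();
  while (rest.length > limit) {
    // Look one character past the limit so a space exactly at the boundary counts.
    const head = rest.slice(0, limit + 1);
    const breakAt = head.lastIndexOf(" ");
    // Fall back to a hard cut when the chunk contains no space.
    const cut = breakAt > 0 ? breakAt : limit;
    chunks.push(rest.slice(0, cut).trimEnd());
    rest = rest.slice(cut).trimStart();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

const parts = splitForTelegram("aaaa bbbb cccc", 5);
console.log(parts);
```

In the scribe function, each chunk could then be sent in order with its own bot.api.sendMessage call.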

Deploy to Supabase

If you haven’t already, create a new Supabase account at database.new and link the local project to your Supabase account:

$ supabase link

Apply the database migrations

Run the following command to apply the database migrations from the supabase/migrations directory:

$ supabase db push

Navigate to the table editor in your Supabase dashboard and you should see an empty transcription_logs table.

Lastly, run the following command to deploy the Edge Function:

$ supabase functions deploy --no-verify-jwt scribe-bot

Navigate to the Edge Functions view in your Supabase dashboard and you should see the scribe-bot function deployed. Make a note of the function URL, as you’ll need it later. It should look something like https://<project-ref>.supabase.co/functions/v1/scribe-bot.

Set up the webhook

Set your bot’s webhook URL to https://<PROJECT_REFERENCE>.supabase.co/functions/v1/scribe-bot (replacing <...> with the respective values). To do that, send a GET request to the following URL (in your browser, for example):

https://api.telegram.org/bot<TELEGRAM_BOT_TOKEN>/setWebhook?url=https://<PROJECT_REFERENCE>.supabase.co/functions/v1/scribe-bot?secret=<FUNCTION_SECRET>

Note that the FUNCTION_SECRET is the secret you set in your .env file.
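
If the secret contains characters such as & or #, pasting it into the URL by hand can silently break the query string. A minimal sketch of building the same setWebhook URL with proper encoding is shown below. The buildSetWebhookUrl helper and the placeholder values are hypothetical; the host and path pattern follow the URL shown above.

```typescript
// Hypothetical helper: builds the Telegram setWebhook URL for the scribe-bot
// function, percent-encoding the secret and the nested function URL.
function buildSetWebhookUrl(
  botToken: string,
  projectRef: string,
  functionSecret: string,
): string {
  const functionUrl = new URL(`https://${projectRef}.supabase.co/functions/v1/scribe-bot`);
  functionUrl.searchParams.set("secret", functionSecret);

  const apiUrl = new URL(`https://api.telegram.org/bot${botToken}/setWebhook`);
  // The whole function URL becomes a single, encoded `url` parameter.
  apiUrl.searchParams.set("url", functionUrl.toString());
  return apiUrl.toString();
}

// Placeholder values only (assumption: replace with your own).
const webhookUrl = buildSetWebhookUrl("123456:ABC", "your-project-ref", "s3cr&t");
console.log(webhookUrl);
```

Opening the printed URL in a browser, or fetching it from a script, has the same effect as the manual GET request described above.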

Set the function secrets

Now that you have all your secrets set locally, you can run the following command to set the secrets in your Supabase project:

$ supabase secrets set --env-file supabase/functions/.env

Test the bot

Finally, test the bot by sending it a voice message, an audio file, or a video file.

After you see the transcript as a reply, navigate back to your table editor in the Supabase dashboard and you should see a new row in your transcription_logs table.

Next steps

API reference

Full Speech to Text API reference and parameters.

Twilio integration

Integrate ElevenLabs TTS with Twilio for phone-based voice applications.