ElevenLabs Text-to-Speech Integration

StarStreamer provides seamless integration with ElevenLabs for high-quality text-to-speech functionality. This integration allows streamers to add voice narration and interactive TTS commands to their streams.

Overview

The ElevenLabs integration is built on the official ElevenLabs Python SDK, providing reliable access to their AI voice generation service. The integration supports multiple voices, streaming audio generation, and is designed to be easily extensible for custom use cases.

Key Features

  • Official SDK Integration - Uses the official AsyncElevenLabs client for reliable API access
  • Multiple Voice Support - Access to all voices available in your ElevenLabs account
  • Streaming Support - Real-time audio streaming for longer texts
  • Module-Level Voice Management - Modules can create and manage their own voice configurations
  • Built-in Chat Commands - Ready-to-use TTS commands for streamers
  • Error Handling - Robust error handling with user-friendly error messages

Quick Start

1. Configuration

Add ElevenLabs configuration to your config.yaml:

elevenlabs:
  enabled: true
  api_key: "your_elevenlabs_api_key"

2. Get Your API Key

  1. Sign up at ElevenLabs
  2. Go to your Profile page
  3. Copy your API key
  4. Add it to your configuration

3. Start Using TTS

Once configured, the following commands are available in chat:

!tts Hello everyone, welcome to my stream!
!voices
!ttshelp

Chat Commands

!tts <text>

Converts the provided text to speech using the default voice.

Usage:

!tts This is a test message
!tts Hello viewers, thanks for watching!

Features:

  • Maximum 500 character limit for safety
  • Automatic voice selection (uses first available voice)
  • User feedback with generated audio confirmation
  • Error handling for API failures

!voices

Lists the available voices from your ElevenLabs account.

Usage:

!voices

Output:

@username Available voices: Rachel, Drew, Bella, Josh, Arnold

Up to five voices are displayed; if more are available, a count of the additional voices is shown.

!ttshelp

Shows help information for TTS commands.

Usage:

!ttshelp

Output:

@username TTS Commands:
• !tts <text> - Convert text to speech (max 500 chars)
• !voices - List available voices
• !ttshelp - Show this help message

Developer Guide

ElevenLabsClient

The core client provides access to the ElevenLabs API through the official SDK.

Basic Usage

from starstreamer.plugins.elevenlabs import ElevenLabsClient, Voice

# Get client instance (singleton)
client = ElevenLabsClient.get_instance()

# Connect to API
await client.connect()

# Get available voices
voices = await client.get_voices_as_objects()

# Generate speech
voice = voices[0]  # Use first available voice
audio_bytes = await client.text_to_speech("Hello world", voice=voice)

Dependency Injection

The client integrates with StarStreamer's dependency injection system:

from starstreamer import on_event
from starstreamer.plugins.elevenlabs import ElevenLabsClient
from starstreamer.plugins.twitch import TwitchClient
from starstreamer.runtime.types import Event

@on_event("twitch.chat.message")
async def my_tts_handler(event: Event, elevenlabs: ElevenLabsClient, twitch: TwitchClient) -> None:
    # Client is automatically injected
    voices = await elevenlabs.get_voices_as_objects()
    if voices:
        audio = await elevenlabs.text_to_speech("Hello!", voice=voices[0])
        await twitch.send_message("TTS generated successfully!")

Voice Model

The Voice class represents an ElevenLabs voice with all its configuration.

Voice Properties

from starstreamer.plugins.elevenlabs import Voice

voice = Voice(
    voice_id="21m00Tcm4TlvDq8ikWAM",  # ElevenLabs voice ID
    model_id="eleven_multilingual_v2",  # Model to use
    name="Rachel",  # Human-readable name
    description="Pleasant female voice"  # Voice description
)

Creating Voices from API Response

# From ElevenLabs API response
api_response = {"voice_id": "abc123", "name": "Custom Voice"}
voice = Voice.from_api_response(api_response)

Module-Level Voice Management

Modules can create and manage their own voices for specialized use cases:

from modules.base import BaseModule
from starstreamer.plugins.elevenlabs import ElevenLabsClient, Voice

class RPGModule(BaseModule):
    def __init__(self):
        super().__init__()
        # Define module-specific voices
        self.narrator_voice = Voice(
            voice_id="21m00Tcm4TlvDq8ikWAM",
            model_id="eleven_multilingual_v2",
            name="RPG Narrator"
        )
        self.villain_voice = Voice(
            voice_id="AZnzlk1XvdvUeBnXmlld",
            model_id="eleven_multilingual_v2",
            name="Evil Villain"
        )

    async def narrate_story(self, text: str, elevenlabs: ElevenLabsClient):
        """Use the narrator voice for story elements"""
        audio = await elevenlabs.text_to_speech(text, voice=self.narrator_voice)
        return audio

    async def villain_speak(self, text: str, elevenlabs: ElevenLabsClient):
        """Use the villain voice for antagonist dialogue"""
        audio = await elevenlabs.text_to_speech(text, voice=self.villain_voice)
        return audio

Advanced Features

Streaming TTS

For longer texts, use the streaming API:

async def stream_long_text(text: str, voice: Voice, elevenlabs: ElevenLabsClient):
    """Stream longer text as audio chunks"""
    async for chunk in elevenlabs.text_to_speech_stream(text, voice=voice):
        # Process each audio chunk as it arrives
        process_audio_chunk(chunk)
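
In the example above, process_audio_chunk stands in for whatever your module does with each chunk (playback, buffering, and so on). If you simply want the whole clip on disk, one option is to append the chunks to a file as they arrive. The helper below is a sketch; stream_to_file and the output path are illustrative names, and only text_to_speech_stream comes from the client:

from pathlib import Path

from starstreamer.plugins.elevenlabs import ElevenLabsClient, Voice

async def stream_to_file(text: str, voice: Voice, elevenlabs: ElevenLabsClient, path: str = "tts_output.mp3") -> Path:
    """Write streamed audio chunks to a file as they arrive (illustrative helper)."""
    output = Path(path)
    with output.open("wb") as f:
        async for chunk in elevenlabs.text_to_speech_stream(text, voice=voice):
            f.write(chunk)  # each chunk is a bytes object from the stream
    return output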

Custom Voice Settings

from starstreamer.plugins.elevenlabs import Voice, VoiceSettings

# Create custom voice settings
settings = VoiceSettings(
    stability=0.7,
    similarity_boost=0.8,
    style=0.2,
    use_speaker_boost=True
)

voice = Voice(
    voice_id="voice_id_here",
    model_id="eleven_multilingual_v2",
    name="Custom Voice",
    settings=settings
)

Error Handling

The integration includes comprehensive error handling:

try:
    audio = await elevenlabs.text_to_speech("Hello", voice=voice)
except RuntimeError as e:
    logger.error(f"ElevenLabs client not connected: {e}")
except TypeError as e:
    logger.error(f"Voice parameter required: {e}")
except Exception as e:
    logger.error(f"TTS generation failed: {e}")

Integration Architecture

Component Overview

graph TD
    A[TTS Commands] --> B[ElevenLabsClient]
    B --> C[AsyncElevenLabs SDK]
    C --> D[ElevenLabs API]

    E[Custom Modules] --> B
    E --> F[Voice Objects]
    F --> B

    B --> G[Config Manager]
    G --> H[ElevenLabsConfig]

Data Flow

  1. Configuration Loading - ElevenLabsClient loads config from ConfigManager
  2. SDK Initialization - AsyncElevenLabs is initialized with API key
  3. Voice Fetching - Available voices are fetched from API on connection
  4. TTS Generation - Text is converted to speech using specified voice
  5. Audio Delivery - Audio bytes are returned for processing

Dependencies

  • elevenlabs>=1.0.0 - Official ElevenLabs Python SDK
  • aiohttp>=3.9.0 - HTTP client for async operations
  • pydantic>=2.5.0 - Configuration validation

Best Practices

Voice Management

  1. Cache Voice Objects - Store frequently used voices at module level
  2. Use Descriptive Names - Give voices meaningful names for your use case
  3. Handle Missing Voices - Always check if voices are available before use

# Good: Cache voices for reuse
class MyModule(BaseModule):
    def __init__(self):
        super().__init__()
        self.narrator = Voice(voice_id="voice_id", model_id="model_id", name="Narrator")

    async def speak(self, text: str, elevenlabs: ElevenLabsClient):
        return await elevenlabs.text_to_speech(text, voice=self.narrator)

# Good: Check voice availability
voices = await elevenlabs.get_voices_as_objects()
if not voices:
    logger.warning("No voices available")
    return

Performance Optimization

  1. Reuse Client Instance - Use the singleton pattern for the client
  2. Limit Text Length - Keep TTS requests under 500 characters
  3. Use Streaming - For longer texts, use the streaming API
  4. Error Recovery - Implement retry logic for transient failures (see the sketch after this list)
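
A simple way to implement error recovery is to retry with a short backoff. The helper below is a sketch that assumes transient failures surface as ordinary exceptions from text_to_speech; the tts_with_retry name, attempt count, and delay are illustrative:

import asyncio
import logging

from starstreamer.plugins.elevenlabs import ElevenLabsClient, Voice

logger = logging.getLogger(__name__)

async def tts_with_retry(
    elevenlabs: ElevenLabsClient,
    text: str,
    voice: Voice,
    attempts: int = 3,
    delay: float = 1.0,
) -> bytes | None:
    """Retry TTS generation a few times before giving up (illustrative helper)."""
    for attempt in range(1, attempts + 1):
        try:
            return await elevenlabs.text_to_speech(text, voice=voice)
        except Exception as e:
            logger.warning("TTS attempt %d/%d failed: %s", attempt, attempts, e)
            if attempt < attempts:
                await asyncio.sleep(delay * attempt)  # back off a little more each time
    return None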

Security Considerations

  1. Protect API Keys - Never commit API keys to version control
  2. Validate Input - Sanitize user input for TTS commands (a sketch follows this list)
  3. Rate Limiting - Respect ElevenLabs API rate limits
  4. Content Filtering - Filter inappropriate content before TTS
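
As a starting point for input validation, content filtering, and cooldowns, the sketch below trims and length-checks the text, rejects blocklisted words, and enforces a per-user cooldown. The sanitize_tts_text and CooldownTracker names, the blocklist, and the cooldown duration are illustrative assumptions, not part of the built-in commands:

import time

BLOCKED_WORDS: set[str] = set()  # fill with terms you want to block before TTS
MAX_TTS_CHARS = 500  # matches the built-in !tts limit

def sanitize_tts_text(text: str) -> str | None:
    """Return cleaned text, or None if the message should be rejected."""
    cleaned = " ".join(text.split())  # collapse whitespace and strip newlines
    if not cleaned or len(cleaned) > MAX_TTS_CHARS:
        return None
    if any(word in cleaned.lower() for word in BLOCKED_WORDS):
        return None
    return cleaned

class CooldownTracker:
    """Track a per-user cooldown for TTS commands (illustrative helper)."""

    def __init__(self, seconds: float = 30.0):
        self.seconds = seconds
        self._last_use: dict[str, float] = {}

    def allowed(self, username: str) -> bool:
        now = time.monotonic()
        if now - self._last_use.get(username, 0.0) < self.seconds:
            return False
        self._last_use[username] = now
        return True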

Troubleshooting

Common Issues

"ElevenLabs client not connected"

Cause: Client initialization failed or API key is invalid.

Solution:

  1. Verify the API key is correct
  2. Check your ElevenLabs account status
  3. Ensure elevenlabs.enabled is set to true in the config

"No voices available for TTS"

Cause: ElevenLabs account has no accessible voices.

Solution:

  1. Check your ElevenLabs subscription plan
  2. Verify your API key permissions
  3. Check the ElevenLabs service status

Rate Limiting Errors

Cause: Exceeded ElevenLabs API rate limits.

Solution:

  1. Upgrade your ElevenLabs subscription
  2. Implement request queuing (see the sketch below)
  3. Add cooldowns to TTS commands
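
One way to implement request queuing is to funnel every TTS request through a single background worker fed by an asyncio queue, so API calls are spaced out regardless of how fast chat sends commands. The TTSQueue name, the spacing interval, and the handle_audio callback are illustrative assumptions:

import asyncio
from collections.abc import Awaitable, Callable

from starstreamer.plugins.elevenlabs import ElevenLabsClient, Voice

class TTSQueue:
    """Serialize TTS requests through one worker to stay under API rate limits."""

    def __init__(self, elevenlabs: ElevenLabsClient, min_interval: float = 2.0):
        self.elevenlabs = elevenlabs
        self.min_interval = min_interval  # minimum seconds between API calls
        self.queue: asyncio.Queue[tuple[str, Voice]] = asyncio.Queue()

    async def submit(self, text: str, voice: Voice) -> None:
        await self.queue.put((text, voice))

    async def worker(self, handle_audio: Callable[[bytes], Awaitable[None]]) -> None:
        """Run as a background task; handle_audio is your own playback callback."""
        while True:
            text, voice = await self.queue.get()
            audio = await self.elevenlabs.text_to_speech(text, voice=voice)
            await handle_audio(audio)
            await asyncio.sleep(self.min_interval)  # space out successive requests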

Debug Mode

Enable debug logging for detailed troubleshooting:

logging:
  level: "DEBUG"

This will show detailed API interactions and error information.

API Reference

ElevenLabsClient

Methods

  • get_instance() -> ElevenLabsClient - Get singleton instance
  • connect() -> None - Initialize connection to ElevenLabs API
  • disconnect() -> None - Disconnect from API
  • text_to_speech(text: str, voice: Voice, output_format: str = "mp3_44100_128") -> bytes - Generate speech
  • text_to_speech_stream(text: str, voice: Voice) -> AsyncIterator[bytes] - Stream speech generation
  • get_voices() -> dict[str, Any] - Get raw voice data from API
  • get_voices_as_objects() -> list[Voice] - Get voices as Voice objects
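
A typical call sequence wraps generation in connect() and disconnect() so the client is always cleaned up. This is a sketch of that pattern using only the methods listed above; the generate_once name is illustrative:

from starstreamer.plugins.elevenlabs import ElevenLabsClient

async def generate_once(text: str) -> bytes | None:
    """Connect, generate one clip with the first available voice, then disconnect."""
    client = ElevenLabsClient.get_instance()
    await client.connect()
    try:
        voices = await client.get_voices_as_objects()
        if not voices:
            return None
        return await client.text_to_speech(text, voice=voices[0])
    finally:
        await client.disconnect()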

Voice

Properties

  • voice_id: str - ElevenLabs voice identifier
  • model_id: str - Model to use for generation
  • name: str - Human-readable voice name
  • description: str - Voice description
  • settings: VoiceSettings - Voice generation settings

Methods

  • from_api_response(data: dict, model_id: str = "eleven_multilingual_v2") -> Voice - Create from API data
  • to_dict() -> dict[str, Any] - Convert to dictionary
  • __str__() -> str - String representation

VoiceSettings

Properties

  • stability: float - Voice stability (0.0-1.0)
  • similarity_boost: float - Similarity boost (0.0-1.0)
  • style: float - Style setting (0.0-1.0)
  • use_speaker_boost: bool - Enable speaker boost

Next Steps