ElevenLabs Text-to-Speech Integration

StarStreamer provides seamless integration with ElevenLabs for high-quality text-to-speech functionality. This integration allows streamers to add voice narration and interactive TTS commands to their streams.

Overview

The ElevenLabs integration is built on the official ElevenLabs Python SDK, providing reliable access to their AI voice generation service. The integration supports multiple voices, streaming audio generation, and is designed to be easily extensible for custom use cases.

Key Features

  • Official SDK Integration - Uses the official AsyncElevenLabs client for reliable API access
  • Multiple Voice Support - Access to all voices available in your ElevenLabs account
  • Streaming Support - Real-time audio streaming for longer texts
  • Module-Level Voice Management - Modules can create and manage their own voice configurations
  • Built-in Chat Commands - Ready-to-use TTS commands for streamers
  • Error Handling - Robust error handling with user-friendly error messages

Quick Start

1. Configuration

Add ElevenLabs configuration to your config.yaml:

elevenlabs:
  enabled: true
  api_key: "your_elevenlabs_api_key"

2. Get Your API Key

  1. Sign up at ElevenLabs
  2. Go to your Profile page
  3. Copy your API key
  4. Add it to your configuration

3. Start Using TTS

Once configured, the following commands are available in chat:

!tts Hello everyone, welcome to my stream!
!voices
!ttshelp

Chat Commands

!tts <text>

Converts the provided text to speech using the default voice.

Usage:

!tts This is a test message
!tts Hello viewers, thanks for watching!

Features:

  • Maximum 500 character limit for safety
  • Automatic voice selection (uses first available voice)
  • User feedback with generated audio confirmation
  • Error handling for API failures

!voices

Lists the available voices from your ElevenLabs account.

Usage:

!voices

Output:

@username Available voices: Rachel, Drew, Bella, Josh, Arnold

Up to five voices are displayed; if more are available, a count of the additional voices is shown.

!ttshelp

Shows help information for TTS commands.

Usage:

!ttshelp

Output:

@username TTS Commands:
• !tts <text> - Convert text to speech (max 500 chars)
• !voices - List available voices
• !ttshelp - Show this help message

Developer Guide

ElevenLabsClient

The core client provides access to the ElevenLabs API through the official SDK.

Basic Usage

from starstreamer.plugins.elevenlabs import ElevenLabsClient, Voice

# Get client instance (singleton)
client = ElevenLabsClient.get_instance()

# Connect to API
await client.connect()

# Get available voices
voices = await client.get_voices_as_objects()

# Generate speech
voice = voices[0]  # Use first available voice
audio_bytes = await client.text_to_speech("Hello world", voice=voice)

Dependency Injection

The client integrates with StarStreamer's dependency injection system:

from starstreamer import on_event
from starstreamer.plugins.elevenlabs import ElevenLabsClient
from starstreamer.plugins.twitch import TwitchClient
from starstreamer.runtime.types import Event

@on_event("twitch.chat.message")
async def my_tts_handler(event: Event, elevenlabs: ElevenLabsClient, twitch: TwitchClient) -> None:
    # Client is automatically injected
    voices = await elevenlabs.get_voices_as_objects()
    if voices:
        audio = await elevenlabs.text_to_speech("Hello!", voice=voices[0])
        await twitch.send_message("TTS generated successfully!")

Voice Model

The Voice class represents an ElevenLabs voice with all its configuration.

Voice Properties

from starstreamer.plugins.elevenlabs import Voice

voice = Voice(
    voice_id="21m00Tcm4TlvDq8ikWAM",  # ElevenLabs voice ID
    model_id="eleven_multilingual_v2",  # Model to use
    name="Rachel",  # Human-readable name
    description="Pleasant female voice"  # Voice description
)

Creating Voices from API Response

# From ElevenLabs API response
api_response = {"voice_id": "abc123", "name": "Custom Voice"}
voice = Voice.from_api_response(api_response)

Module-Level Voice Management

Modules can create and manage their own voices for specialized use cases:

from modules.base import BaseModule
from starstreamer.plugins.elevenlabs import ElevenLabsClient, Voice

class RPGModule(BaseModule):
    def __init__(self):
        super().__init__()
        # Define module-specific voices
        self.narrator_voice = Voice(
            voice_id="21m00Tcm4TlvDq8ikWAM",
            model_id="eleven_multilingual_v2",
            name="RPG Narrator"
        )
        self.villain_voice = Voice(
            voice_id="AZnzlk1XvdvUeBnXmlld",
            model_id="eleven_multilingual_v2",
            name="Evil Villain"
        )

    async def narrate_story(self, text: str, elevenlabs: ElevenLabsClient):
        """Use the narrator voice for story elements"""
        audio = await elevenlabs.text_to_speech(text, voice=self.narrator_voice)
        return audio

    async def villain_speak(self, text: str, elevenlabs: ElevenLabsClient):
        """Use the villain voice for antagonist dialogue"""
        audio = await elevenlabs.text_to_speech(text, voice=self.villain_voice)
        return audio

Advanced Features

Streaming TTS

For longer texts, use the streaming API:

async def stream_long_text(text: str, voice: Voice, elevenlabs: ElevenLabsClient):
    """Stream longer text as audio chunks"""
    async for chunk in elevenlabs.text_to_speech_stream(text, voice=voice):
        # Process each audio chunk as it arrives
        process_audio_chunk(chunk)
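
In the example above, process_audio_chunk stands in for whatever your module does with each chunk (playback, buffering, and so on). If you simply want the whole clip on disk, one option is to append the chunks to a file as they arrive. The helper below is a sketch; stream_to_file and the output path are illustrative names, and only text_to_speech_stream comes from the client:

from pathlib import Path

from starstreamer.plugins.elevenlabs import ElevenLabsClient, Voice

async def stream_to_file(text: str, voice: Voice, elevenlabs: ElevenLabsClient, path: str = "tts_output.mp3") -> Path:
    """Write streamed audio chunks to a file as they arrive (illustrative helper)."""
    output = Path(path)
    with output.open("wb") as f:
        async for chunk in elevenlabs.text_to_speech_stream(text, voice=voice):
            f.write(chunk)  # each chunk is a bytes object from the stream
    return output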

Custom Voice Settings

from starstreamer.plugins.elevenlabs import Voice, VoiceSettings

# Create custom voice settings
settings = VoiceSettings(
    stability=0.7,
    similarity_boost=0.8,
    style=0.2,
    use_speaker_boost=True
)

voice = Voice(
    voice_id="voice_id_here",
    model_id="eleven_multilingual_v2",
    name="Custom Voice",
    settings=settings
)

Error Handling

The integration includes comprehensive error handling:

try:
    audio = await elevenlabs.text_to_speech("Hello", voice=voice)
except RuntimeError as e:
    logger.error(f"ElevenLabs client not connected: {e}")
except TypeError as e:
    logger.error(f"Voice parameter required: {e}")
except Exception as e:
    logger.error(f"TTS generation failed: {e}")

Integration Architecture

Component Overview

graph TD
    A[TTS Commands] --> B[ElevenLabsClient]
    B --> C[AsyncElevenLabs SDK]
    C --> D[ElevenLabs API]

    E[Custom Modules] --> B
    E --> F[Voice Objects]
    F --> B

    B --> G[Config Manager]
    G --> H[ElevenLabsConfig]

Data Flow

  1. Configuration Loading - ElevenLabsClient loads config from ConfigManager
  2. SDK Initialization - AsyncElevenLabs is initialized with API key
  3. Voice Fetching - Available voices are fetched from API on connection
  4. TTS Generation - Text is converted to speech using specified voice
  5. Audio Delivery - Audio bytes are returned for processing

Dependencies

  • elevenlabs>=1.0.0 - Official ElevenLabs Python SDK
  • aiohttp>=3.9.0 - HTTP client for async operations
  • pydantic>=2.5.0 - Configuration validation

Best Practices

Voice Management

  1. Cache Voice Objects - Store frequently used voices at module level
  2. Use Descriptive Names - Give voices meaningful names for your use case
  3. Handle Missing Voices - Always check if voices are available before use

# Good: Cache voices for reuse
class MyModule(BaseModule):
    def __init__(self):
        super().__init__()
        self.narrator = Voice(voice_id="voice_id", model_id="model_id", name="Narrator")

    async def speak(self, text: str, elevenlabs: ElevenLabsClient):
        return await elevenlabs.text_to_speech(text, voice=self.narrator)

# Good: Check voice availability
voices = await elevenlabs.get_voices_as_objects()
if not voices:
    logger.warning("No voices available")
    return

Performance Optimization

  1. Reuse Client Instance - Use the singleton pattern for the client
  2. Limit Text Length - Keep TTS requests under 500 characters
  3. Use Streaming - For longer texts, use the streaming API
  4. Error Recovery - Implement retry logic for transient failures (see the sketch after this list)
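
A simple way to implement error recovery is to retry with a short backoff. The helper below is a sketch that assumes transient failures surface as ordinary exceptions from text_to_speech; the tts_with_retry name, attempt count, and delay are illustrative:

import asyncio
import logging

from starstreamer.plugins.elevenlabs import ElevenLabsClient, Voice

logger = logging.getLogger(__name__)

async def tts_with_retry(
    elevenlabs: ElevenLabsClient,
    text: str,
    voice: Voice,
    attempts: int = 3,
    delay: float = 1.0,
) -> bytes | None:
    """Retry TTS generation a few times before giving up (illustrative helper)."""
    for attempt in range(1, attempts + 1):
        try:
            return await elevenlabs.text_to_speech(text, voice=voice)
        except Exception as e:
            logger.warning("TTS attempt %d/%d failed: %s", attempt, attempts, e)
            if attempt < attempts:
                await asyncio.sleep(delay * attempt)  # back off a little more each time
    return None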

Security Considerations

  1. Protect API Keys - Never commit API keys to version control
  2. Validate Input - Sanitize user input for TTS commands (a sketch follows this list)
  3. Rate Limiting - Respect ElevenLabs API rate limits
  4. Content Filtering - Filter inappropriate content before TTS
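
As a starting point for input validation, content filtering, and cooldowns, the sketch below trims and length-checks the text, rejects blocklisted words, and enforces a per-user cooldown. The sanitize_tts_text and CooldownTracker names, the blocklist, and the cooldown duration are illustrative assumptions, not part of the built-in commands:

import time

BLOCKED_WORDS: set[str] = set()  # fill with terms you want to block before TTS
MAX_TTS_CHARS = 500  # matches the built-in !tts limit

def sanitize_tts_text(text: str) -> str | None:
    """Return cleaned text, or None if the message should be rejected."""
    cleaned = " ".join(text.split())  # collapse whitespace and strip newlines
    if not cleaned or len(cleaned) > MAX_TTS_CHARS:
        return None
    if any(word in cleaned.lower() for word in BLOCKED_WORDS):
        return None
    return cleaned

class CooldownTracker:
    """Track a per-user cooldown for TTS commands (illustrative helper)."""

    def __init__(self, seconds: float = 30.0):
        self.seconds = seconds
        self._last_use: dict[str, float] = {}

    def allowed(self, username: str) -> bool:
        now = time.monotonic()
        if now - self._last_use.get(username, 0.0) < self.seconds:
            return False
        self._last_use[username] = now
        return True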

Troubleshooting

Common Issues

"ElevenLabs client not connected"

Cause: Client initialization failed or API key is invalid.

Solution:

  1. Verify the API key is correct
  2. Check your ElevenLabs account status
  3. Ensure elevenlabs.enabled is set to true in the config

"No voices available for TTS"

Cause: ElevenLabs account has no accessible voices.

Solution:

  1. Check your ElevenLabs subscription plan
  2. Verify your API key permissions
  3. Check the ElevenLabs service status

Rate Limiting Errors

Cause: Exceeded ElevenLabs API rate limits.

Solution:

  1. Upgrade your ElevenLabs subscription
  2. Implement request queuing (see the sketch below)
  3. Add cooldowns to TTS commands
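
One way to implement request queuing is to funnel every TTS request through a single background worker fed by an asyncio queue, so API calls are spaced out regardless of how fast chat sends commands. The TTSQueue name, the spacing interval, and the handle_audio callback are illustrative assumptions:

import asyncio
from collections.abc import Awaitable, Callable

from starstreamer.plugins.elevenlabs import ElevenLabsClient, Voice

class TTSQueue:
    """Serialize TTS requests through one worker to stay under API rate limits."""

    def __init__(self, elevenlabs: ElevenLabsClient, min_interval: float = 2.0):
        self.elevenlabs = elevenlabs
        self.min_interval = min_interval  # minimum seconds between API calls
        self.queue: asyncio.Queue[tuple[str, Voice]] = asyncio.Queue()

    async def submit(self, text: str, voice: Voice) -> None:
        await self.queue.put((text, voice))

    async def worker(self, handle_audio: Callable[[bytes], Awaitable[None]]) -> None:
        """Run as a background task; handle_audio is your own playback callback."""
        while True:
            text, voice = await self.queue.get()
            audio = await self.elevenlabs.text_to_speech(text, voice=voice)
            await handle_audio(audio)
            await asyncio.sleep(self.min_interval)  # space out successive requests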

Debug Mode

Enable debug logging for detailed troubleshooting:

logging:
  level: "DEBUG"

This will show detailed API interactions and error information.

API Reference

ElevenLabsClient

Methods

  • get_instance() -> ElevenLabsClient - Get singleton instance
  • connect() -> None - Initialize connection to ElevenLabs API
  • disconnect() -> None - Disconnect from API
  • text_to_speech(text: str, voice: Voice, output_format: str = "mp3_44100_128") -> bytes - Generate speech
  • text_to_speech_stream(text: str, voice: Voice) -> AsyncIterator[bytes] - Stream speech generation
  • get_voices() -> dict[str, Any] - Get raw voice data from API
  • get_voices_as_objects() -> list[Voice] - Get voices as Voice objects
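
A typical call sequence wraps generation in connect() and disconnect() so the client is always cleaned up. This is a sketch of that pattern using only the methods listed above; the generate_once name is illustrative:

from starstreamer.plugins.elevenlabs import ElevenLabsClient

async def generate_once(text: str) -> bytes | None:
    """Connect, generate one clip with the first available voice, then disconnect."""
    client = ElevenLabsClient.get_instance()
    await client.connect()
    try:
        voices = await client.get_voices_as_objects()
        if not voices:
            return None
        return await client.text_to_speech(text, voice=voices[0])
    finally:
        await client.disconnect()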

Voice

Properties

  • voice_id: str - ElevenLabs voice identifier
  • model_id: str - Model to use for generation
  • name: str - Human-readable voice name
  • description: str - Voice description
  • settings: VoiceSettings - Voice generation settings

Methods

  • from_api_response(data: dict, model_id: str = "eleven_multilingual_v2") -> Voice - Create from API data
  • to_dict() -> dict[str, Any] - Convert to dictionary
  • __str__() -> str - String representation

VoiceSettings

Properties

  • stability: float - Voice stability (0.0-1.0)
  • similarity_boost: float - Similarity boost (0.0-1.0)
  • style: float - Style setting (0.0-1.0)
  • use_speaker_boost: bool - Enable speaker boost

Next Steps