AI Integration with OpenRouter

StarStreamer provides AI-powered chat interaction by integrating LiteLLM with OpenRouter, letting streamers add intelligent conversation capabilities to their streams. The integration gives access to multiple AI models through a single, unified interface.

Overview

The AI integration is built on LiteLLM, a library that offers a unified interface to multiple AI providers; StarStreamer configures it to use OpenRouter as its primary provider. OpenRouter provides access to various state-of-the-art AI models, including Claude, GPT-4, Llama, and many others.

Key Features

  • Multi-Model Support - Access to various AI models through OpenRouter (Claude, GPT-4, Llama, etc.)
  • Unified LiteLLM Interface - Consistent API regardless of the underlying model
  • Built-in Chat Commands - Ready-to-use AI commands for streamers
  • Singleton Client Pattern - Efficient resource management and configuration
  • Dependency Injection - Seamless integration with StarStreamer's DI system
  • Error Handling - Robust error handling with user-friendly messages

Quick Start

1. Configuration

Add AI configuration to your config.yaml:

# AI Integration using LiteLLM with OpenRouter
# Get your API key from https://openrouter.ai/keys
# Models are specified at the module level, not in configuration
ai:
  openrouter:
    enabled: false                         # Set to true to enable OpenRouter integration
    api_key: "${OPENROUTER_API_KEY}"      # OpenRouter API key (set via environment variable)

2. Get Your OpenRouter API Key

  1. Sign up at OpenRouter
  2. Go to API Keys
  3. Create a new API key
  4. Set the OPENROUTER_API_KEY environment variable (an optional startup check is sketched below)
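
If you want to fail fast when the key is missing, the snippet below is a minimal sketch of an optional startup check using only the standard library; where you run it is up to you:

import os

# Hypothetical startup check: fail fast if the OpenRouter key is missing.
if not os.environ.get("OPENROUTER_API_KEY"):
    raise RuntimeError(
        "OPENROUTER_API_KEY is not set; create a key at https://openrouter.ai/keys"
    )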

3. Start Using AI Commands

Once configured, the following commands are available in chat:

!ask How does machine learning work?
!explain quantum computing
!tldr Machine learning is a subset of artificial intelligence...

Chat Commands

!ask <question>

Asks the AI a question and returns a conversational response.

Usage:

!ask What is the best programming language for beginners?
!ask How do I improve at streaming?
!ask Can you explain blockchain in simple terms?

Features:

  • Conversational AI responses tailored for Twitch chat
  • Concise answers optimized for stream interaction
  • Context-aware responses based on the configured system prompt
  • Error handling for API failures

!explain <topic>

Requests an educational explanation of a topic or concept.

Usage:

!explain artificial intelligence
!explain how photosynthesis works
!explain the stock market

Features:

  • Educational focus with simple, accessible explanations
  • Optimized for general audiences
  • Concise responses suitable for chat interaction
  • Covers a wide range of topics

!tldr <text>

Provides a concise summary (TL;DR) of the provided text.

Usage:

!tldr The quick brown fox jumps over the lazy dog. This pangram contains every letter of the English alphabet at least once and is commonly used for testing typefaces and keyboards.
!tldr <long article or explanation>

Features:

  • Text summarization capabilities
  • Extracts key points from longer content
  • Maintains essential information while reducing length
  • Useful for condensing complex information

Developer Guide

AIClient

The core client provides access to AI models through the LiteLLM interface with OpenRouter.

Basic Usage

from starstreamer.plugins.litellm import AIClient

# Get client instance (singleton)
client = AIClient.get_instance()

# Generate a response
response = await client.complete("Hello, how are you?")
print(response.content)  # AI's response text
print(response.model)    # Model used for generation
print(response.usage)    # Token usage information

Dependency Injection

The client integrates with StarStreamer's dependency injection system:

from starstreamer import on_event
from starstreamer.plugins.litellm import AIClient
from starstreamer.plugins.twitch import TwitchClient
from starstreamer.runtime.types import Event

@on_event("twitch.chat.message")
async def my_ai_handler(event: Event, ai: AIClient, twitch: TwitchClient) -> None:
    # Client is automatically injected
    response = await ai.complete("Tell me a joke")
    await twitch.send_message(f"AI says: {response.content}")

AIResponse Model

The AIResponse class represents the response from an AI completion.

AIResponse Properties

from starstreamer.plugins.litellm import AIResponse

response = AIResponse(
    content="Hello! I'm doing well, thank you for asking.",
    model="anthropic/claude-3.5-sonnet",
    usage={"prompt_tokens": 10, "completion_tokens": 15},
    finish_reason="stop"
)

print(response.content)        # The AI's response text
print(response.model)          # Model that generated the response
print(response.usage)          # Token usage statistics
print(response.finish_reason)  # Why the generation stopped

Advanced Usage

Chat with Conversation History

messages = [
    {"role": "user", "content": "What's the weather like?"},
    {"role": "assistant", "content": "I don't have access to current weather data."},
    {"role": "user", "content": "What can you help me with then?"}
]

response = await client.chat(messages)
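
To continue the conversation across turns, append the assistant's reply to the history before the next call. A minimal sketch, reusing the client and messages from above:

# Append the assistant's reply, then ask a follow-up in the same conversation
messages.append({"role": "assistant", "content": response.content})
messages.append({"role": "user", "content": "Summarize that in one sentence."})
response = await client.chat(messages)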

Custom Parameters

# Models are chosen per request rather than set globally in configuration
response = await client.complete(
    "Write a haiku about programming",
    model="openrouter/openai/gpt-4o",  # Specify model for this request
    temperature=0.9,                    # More creative
    max_tokens=50                      # Limit response length
)

Module-Level AI Usage

Modules can integrate AI functionality for specialized use cases:

from modules.base import BaseModule
from starstreamer.plugins.litellm import AIClient

class RPGModule(BaseModule):
    def __init__(self):
        super().__init__()
        self.ai_system_prompt = """You are a fantasy RPG narrator. 
        Create engaging, brief descriptions for a Twitch stream audience."""

    async def generate_quest(self, ai: AIClient):
        """Generate a random quest description"""
        prompt = f"{self.ai_system_prompt}\n\nGenerate a short fantasy quest:"
        # Use Claude 3.5 Sonnet for creative quest generation
        response = await ai.complete(prompt, model="openrouter/anthropic/claude-3.5-sonnet")
        return response.content

    async def narrate_action(self, action: str, ai: AIClient):
        """Narrate player actions in RPG style"""
        prompt = f"{self.ai_system_prompt}\n\nNarrate this action: {action}"
        # Use Claude 3.5 Haiku for fast action narration
        response = await ai.complete(prompt, model="openrouter/anthropic/claude-3.5-haiku")
        return response.content

Integration Architecture

Component Overview

graph TD
    A[AI Commands] --> B[AIClient]
    B --> C[LiteLLM Library]
    C --> D[OpenRouter API]
    D --> E[AI Models]

    F[Custom Modules] --> B
    B --> G[AIResponse Objects]
    G --> F

    B --> H[Config Manager]
    H --> I[AIConfig]

    subgraph "AI Models via OpenRouter"
        E --> J[Claude 3.5 Sonnet]
        E --> K[GPT-4]
        E --> L[Llama 3.1]
        E --> M[Other Models]
    end

Data Flow

  1. Configuration Loading - AIClient loads config from ConfigManager
  2. LiteLLM Initialization - LiteLLM is configured with OpenRouter credentials
  3. Model Selection - Configured model is used for completions
  4. AI Generation - Text prompts are sent to OpenRouter/selected model
  5. Response Processing - Responses are wrapped in AIResponse objects
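
Conceptually, each completion reduces to a single LiteLLM call. The sketch below illustrates that underlying flow; it is not StarStreamer's actual implementation:

import os
import litellm

api_key = os.environ["OPENROUTER_API_KEY"]  # ai.openrouter.api_key in config

# LiteLLM routes the "openrouter/" model prefix to the OpenRouter API.
raw = await litellm.acompletion(
    model="openrouter/anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello"}],
    api_key=api_key,
)
content = raw.choices[0].message.content  # later wrapped in an AIResponse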

Dependencies

  • litellm>=1.0.0 - Unified interface for multiple AI providers
  • httpx>=0.25.0 - HTTP client for API requests
  • pydantic>=2.5.0 - Configuration validation

Model Selection

Architecture Approach

StarStreamer uses module-level model selection rather than global configuration. This approach provides:

  • Optimized Performance - Different commands use models optimized for their specific tasks
  • Cost Efficiency - Fast models for simple tasks, powerful models for complex ones
  • Simplified Configuration - Only API key needed in config, no model management complexity

Built-in Command Models

StarStreamer's AI commands use carefully selected models for optimal performance:

!ask Command:

  • Model: openrouter/anthropic/claude-3.5-sonnet
  • Purpose: Conversational question answering
  • Optimized for: Quality responses, general knowledge

!explain Command:

  • Model: openrouter/anthropic/claude-3.5-sonnet
  • Purpose: Educational explanations
  • Optimized for: Clear, detailed explanations

!tldr Command:

  • Model: openrouter/anthropic/claude-3.5-haiku
  • Purpose: Fast text summarization
  • Optimized for: Speed and conciseness

Supported Models

OpenRouter provides access to many AI models through the openrouter/ prefix:

Anthropic Models (Recommended):

  • openrouter/anthropic/claude-3.5-sonnet - High-quality conversational AI
  • openrouter/anthropic/claude-3.5-haiku - Fast, lightweight responses
  • openrouter/anthropic/claude-3-opus - Most capable model

OpenAI Models:

  • openrouter/openai/gpt-4o - Latest GPT-4 Omni model
  • openrouter/openai/gpt-4-turbo - GPT-4 Turbo for faster responses
  • openrouter/openai/gpt-3.5-turbo - Cost-effective option

Meta Models:

  • openrouter/meta-llama/llama-3.1-8b-instruct:free - Free tier option
  • openrouter/meta-llama/llama-3.1-70b-instruct - Large parameter model

Custom Module Model Selection

When creating custom AI functionality, specify models based on your use case:

# Fast responses for high-volume commands
response = await ai.complete(prompt, model="openrouter/anthropic/claude-3.5-haiku")

# High-quality responses for complex tasks  
response = await ai.complete(prompt, model="openrouter/anthropic/claude-3.5-sonnet")

# Maximum capability for complex reasoning
response = await ai.complete(prompt, model="openrouter/anthropic/claude-3-opus")

# Budget-friendly option
response = await ai.complete(prompt, model="openrouter/meta-llama/llama-3.1-8b-instruct:free")

Best Practices

Prompt Engineering

  1. Context in Prompts - Include clear context for your stream's AI personality directly in prompts
  2. Concise Instructions - Keep prompts focused for better responses
  3. Stream Context - Include relevant streaming context in prompts

# Good prompt engineering for streaming
prompt = """You are a helpful AI assistant for a Twitch stream about game development.
Be concise, engaging, and keep responses under 200 characters.

Question: How do I optimize my Python code?"""

response = await ai.complete(prompt, model="openrouter/anthropic/claude-3.5-sonnet")

Performance Optimization

  1. Model Selection - Choose models appropriate for your use case
  2. Token Limits - Set reasonable max_tokens to control costs
  3. Caching - Reuse client instances (singleton pattern)
  4. Error Handling - Always handle API failures gracefully (see the sketch below)
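
For point 4, a minimal sketch of graceful failure handling inside an event handler; the broad except is deliberate, since the exact exception types raised depend on the LiteLLM version:

from starstreamer import on_event
from starstreamer.plugins.litellm import AIClient
from starstreamer.plugins.twitch import TwitchClient
from starstreamer.runtime.types import Event

@on_event("twitch.chat.message")
async def safe_ai_handler(event: Event, ai: AIClient, twitch: TwitchClient) -> None:
    try:
        response = await ai.complete("Tell me a joke", max_tokens=150)
        await twitch.send_message(response.content)
    except Exception:
        # Never surface raw API errors to chat
        await twitch.send_message("Sorry, the AI is unavailable right now.")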

Security Considerations

  1. API Key Protection - Store keys in environment variables, never in code
  2. Input Validation - Sanitize user input before sending it to the AI (see the sketch after this list)
  3. Rate Limiting - Respect API rate limits to avoid service interruption
  4. Content Filtering - Filter inappropriate content before and after AI processing
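
For point 2, a minimal sanitizer sketch; the length cap and helper name are illustrative, not part of StarStreamer's API:

MAX_PROMPT_CHARS = 500  # hypothetical cap to bound cost and abuse

def sanitize_chat_input(text: str) -> str:
    """Collapse whitespace and cap length before sending user text to the AI."""
    cleaned = " ".join(text.split())
    return cleaned[:MAX_PROMPT_CHARS]

# e.g. response = await ai.complete(sanitize_chat_input(user_message))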

Cost Management

  1. Token Monitoring - Track usage through AIResponse.usage
  2. Model Optimization - Use cost-effective models when appropriate
  3. Response Limits - Set max_tokens to control per-request costs
  4. Caching Common Responses - Cache frequent queries to reduce API calls (see the sketch below)
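
For point 4, a minimal in-memory cache sketch; a real deployment would add a TTL and eviction, both omitted here:

from starstreamer.plugins.litellm import AIClient

_answer_cache: dict[str, str] = {}

async def cached_ask(ai: AIClient, question: str) -> str:
    """Answer repeated questions from the cache, calling the API only once."""
    key = question.strip().lower()
    if key not in _answer_cache:
        response = await ai.complete(question, max_tokens=150)
        _answer_cache[key] = response.content
    return _answer_cache[key]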

Troubleshooting

Common Issues

"AIClient not configured" Error

Cause: AI integration is not enabled or API key is missing.

Solution:

  1. Set ai.openrouter.enabled: true in your configuration
  2. Set the OPENROUTER_API_KEY environment variable
  3. Verify the API key is valid on OpenRouter

"Invalid model name" Error

Cause: Specified model is not available on OpenRouter.

Solution:

  1. Check OpenRouter Models for available options
  2. Verify the model name's spelling in your request
  3. Ensure your OpenRouter account has access to the model

Rate Limiting Errors

Cause: Exceeded OpenRouter API rate limits.

Solution:

  1. Implement cooldowns on AI commands (see the sketch below)
  2. Upgrade your OpenRouter plan for higher limits
  3. Add request queuing to manage burst traffic
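
A minimal per-command cooldown sketch using only the standard library; the 30-second window is an arbitrary example value:

import time

COOLDOWN_SECONDS = 30.0
_last_call: dict[str, float] = {}

def on_cooldown(command: str) -> bool:
    """Return True if the command fired within the cooldown window."""
    now = time.monotonic()
    if now - _last_call.get(command, 0.0) < COOLDOWN_SECONDS:
        return True
    _last_call[command] = now
    return False

# e.g. skip the AI call when on_cooldown("ask") is True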

Poor Response Quality

Cause: Suboptimal configuration or prompting.

Solution:

  1. Adjust temperature (0.7-0.9 for creative tasks, 0.3-0.5 for factual ones)
  2. Improve the system prompt with more context
  3. Try different models for your use case

Debug Mode

Enable debug logging for detailed troubleshooting:

logging:
  level: "DEBUG"

This will show:

  • AI API request/response details
  • Token usage information
  • Error stack traces
  • Configuration validation

API Reference

AIClient

Methods

  • get_instance() -> AIClient - Get singleton instance
  • complete(prompt: str, **kwargs) -> AIResponse - Generate response for single prompt
  • chat(messages: list[dict], **kwargs) -> AIResponse - Generate response with conversation history

Configuration Parameters

  • api_key: str - OpenRouter API key
  • enabled: bool - Enable/disable OpenRouter integration (default: false)

Runtime Parameters (per request)

  • model: str - AI model to use (specified in each request)
  • temperature: float - Response creativity (0.0-2.0, default: 0.7)
  • max_tokens: int - Maximum response length (default: 150)

AIResponse

Properties

  • content: str - The AI's response text
  • model: str - Model that generated the response
  • usage: dict[str, Any] - Token usage statistics
  • finish_reason: str | None - Why generation stopped

Usage Information

response = await client.complete("Hello")
tokens_used = response.usage.get("total_tokens", 0)
prompt_tokens = response.usage.get("prompt_tokens", 0)
completion_tokens = response.usage.get("completion_tokens", 0)

Next Steps