AI Integration with OpenRouter

StarStreamer provides AI-powered chat interaction by integrating LiteLLM with OpenRouter, letting streamers add intelligent conversation capabilities to their streams. The integration gives access to multiple AI models through a single, unified interface.

Overview

The AI integration is built on LiteLLM, a library that offers a unified interface to multiple AI providers; StarStreamer configures it to use OpenRouter as its primary provider. OpenRouter provides access to various state-of-the-art AI models, including Claude, GPT-4, Llama, and many others.

Key Features

  • Multi-Model Support - Access to various AI models through OpenRouter (Claude, GPT-4, Llama, etc.)
  • Unified LiteLLM Interface - Consistent API regardless of the underlying model
  • Built-in Chat Commands - Ready-to-use AI commands for streamers
  • Singleton Client Pattern - Efficient resource management and configuration
  • Dependency Injection - Seamless integration with StarStreamer's DI system
  • Error Handling - Robust error handling with user-friendly messages

Quick Start

1. Configuration

Add AI configuration to your config.yaml:

# AI Integration using LiteLLM with OpenRouter
# Get your API key from https://openrouter.ai/keys
# Models are specified at the module level, not in configuration
ai:
  openrouter:
    enabled: false                         # Set to true to enable OpenRouter integration
    api_key: "${OPENROUTER_API_KEY}"      # OpenRouter API key (set via environment variable)

2. Get Your OpenRouter API Key

  1. Sign up at OpenRouter
  2. Go to API Keys
  3. Create a new API key
  4. Set the OPENROUTER_API_KEY environment variable (an optional startup check is sketched below)
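
If you want to fail fast when the key is missing, the snippet below is a minimal sketch of an optional startup check using only the standard library; where you run it is up to you:

import os

# Hypothetical startup check: fail fast if the OpenRouter key is missing.
if not os.environ.get("OPENROUTER_API_KEY"):
    raise RuntimeError(
        "OPENROUTER_API_KEY is not set; create a key at https://openrouter.ai/keys"
    )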

3. Start Using AI Commands

Once configured, the following commands are available in chat:

!ask How does machine learning work?
!explain quantum computing
!tldr Machine learning is a subset of artificial intelligence...

Chat Commands

!ask <question>

Asks the AI a question and returns a conversational response.

Usage:

!ask What is the best programming language for beginners?
!ask How do I improve at streaming?
!ask Can you explain blockchain in simple terms?

Features:

  • Conversational AI responses tailored for Twitch chat
  • Concise answers optimized for stream interaction
  • Context-aware responses based on the configured system prompt
  • Error handling for API failures

!explain <topic>

Requests an educational explanation of a topic or concept.

Usage:

!explain artificial intelligence
!explain how photosynthesis works
!explain the stock market

Features:

  • Educational focus with simple, accessible explanations
  • Optimized for general audiences
  • Concise responses suitable for chat interaction
  • Covers a wide range of topics

!tldr <text>

Provides a concise summary (TL;DR) of the provided text.

Usage:

!tldr The quick brown fox jumps over the lazy dog. This pangram contains every letter of the English alphabet at least once and is commonly used for testing typefaces and keyboards.
!tldr <long article or explanation>

Features:

  • Text summarization capabilities
  • Extracts key points from longer content
  • Maintains essential information while reducing length
  • Useful for condensing complex information

Developer Guide

AIClient

The core client provides access to AI models through the LiteLLM interface with OpenRouter.

Basic Usage

from starstreamer.plugins.litellm import AIClient

# Get client instance (singleton)
client = AIClient.get_instance()

# Generate a response
response = await client.complete("Hello, how are you?")
print(response.content)  # AI's response text
print(response.model)    # Model used for generation
print(response.usage)    # Token usage information

Dependency Injection

The client integrates with StarStreamer's dependency injection system:

from starstreamer import on_event
from starstreamer.plugins.litellm import AIClient
from starstreamer.plugins.twitch import TwitchClient
from starstreamer.runtime.types import Event

@on_event("twitch.chat.message")
async def my_ai_handler(event: Event, ai: AIClient, twitch: TwitchClient) -> None:
    # Client is automatically injected
    response = await ai.complete("Tell me a joke")
    await twitch.send_message(f"AI says: {response.content}")

AIResponse Model

The AIResponse class represents the response from an AI completion.

AIResponse Properties

from starstreamer.plugins.litellm import AIResponse

response = AIResponse(
    content="Hello! I'm doing well, thank you for asking.",
    model="anthropic/claude-3.5-sonnet",
    usage={"prompt_tokens": 10, "completion_tokens": 15},
    finish_reason="stop"
)

print(response.content)        # The AI's response text
print(response.model)          # Model that generated the response
print(response.usage)          # Token usage statistics
print(response.finish_reason)  # Why the generation stopped

Advanced Usage

Chat with Conversation History

messages = [
    {"role": "user", "content": "What's the weather like?"},
    {"role": "assistant", "content": "I don't have access to current weather data."},
    {"role": "user", "content": "What can you help me with then?"}
]

response = await client.chat(messages)
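
To continue the conversation across turns, append the assistant's reply to the history before the next call. A minimal sketch, reusing the client and messages from above:

# Append the assistant's reply, then ask a follow-up in the same conversation
messages.append({"role": "assistant", "content": response.content})
messages.append({"role": "user", "content": "Summarize that in one sentence."})
response = await client.chat(messages)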

Custom Parameters

# Models are chosen per request rather than set globally in configuration
response = await client.complete(
    "Write a haiku about programming",
    model="openrouter/openai/gpt-4o",  # Specify model for this request
    temperature=0.9,                    # More creative
    max_tokens=50                      # Limit response length
)

Module-Level AI Usage

Modules can integrate AI functionality for specialized use cases:

from modules.base import BaseModule
from starstreamer.plugins.litellm import AIClient

class RPGModule(BaseModule):
    def __init__(self):
        super().__init__()
        self.ai_system_prompt = """You are a fantasy RPG narrator. 
        Create engaging, brief descriptions for a Twitch stream audience."""

    async def generate_quest(self, ai: AIClient):
        """Generate a random quest description"""
        prompt = f"{self.ai_system_prompt}\n\nGenerate a short fantasy quest:"
        # Use Claude 3.5 Sonnet for creative quest generation
        response = await ai.complete(prompt, model="openrouter/anthropic/claude-3.5-sonnet")
        return response.content

    async def narrate_action(self, action: str, ai: AIClient):
        """Narrate player actions in RPG style"""
        prompt = f"{self.ai_system_prompt}\n\nNarrate this action: {action}"
        # Use Claude 3.5 Haiku for fast action narration
        response = await ai.complete(prompt, model="openrouter/anthropic/claude-3.5-haiku")
        return response.content

Integration Architecture

Component Overview

graph TD
    A[AI Commands] --> B[AIClient]
    B --> C[LiteLLM Library]
    C --> D[OpenRouter API]
    D --> E[AI Models]

    F[Custom Modules] --> B
    B --> G[AIResponse Objects]
    G --> F

    B --> H[Config Manager]
    H --> I[AIConfig]

    subgraph "AI Models via OpenRouter"
        E --> J[Claude 3.5 Sonnet]
        E --> K[GPT-4]
        E --> L[Llama 3.1]
        E --> M[Other Models]
    end

Data Flow

  1. Configuration Loading - AIClient loads config from ConfigManager
  2. LiteLLM Initialization - LiteLLM is configured with OpenRouter credentials
  3. Model Selection - Configured model is used for completions
  4. AI Generation - Text prompts are sent to OpenRouter/selected model
  5. Response Processing - Responses are wrapped in AIResponse objects
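
Conceptually, each completion reduces to a single LiteLLM call. The sketch below illustrates that underlying flow; it is not StarStreamer's actual implementation:

import os
import litellm

api_key = os.environ["OPENROUTER_API_KEY"]  # ai.openrouter.api_key in config

# LiteLLM routes the "openrouter/" model prefix to the OpenRouter API.
raw = await litellm.acompletion(
    model="openrouter/anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "Hello"}],
    api_key=api_key,
)
content = raw.choices[0].message.content  # later wrapped in an AIResponse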

Dependencies

  • litellm>=1.0.0 - Unified interface for multiple AI providers
  • httpx>=0.25.0 - HTTP client for API requests
  • pydantic>=2.5.0 - Configuration validation

Model Selection

Architecture Approach

StarStreamer uses module-level model selection rather than global configuration. This approach provides:

  • Optimized Performance - Different commands use models optimized for their specific tasks
  • Cost Efficiency - Fast models for simple tasks, powerful models for complex ones
  • Simplified Configuration - Only API key needed in config, no model management complexity

Built-in Command Models

StarStreamer's AI commands use carefully selected models for optimal performance:

!ask Command:

  • Model: openrouter/anthropic/claude-3.5-sonnet
  • Purpose: Conversational question answering
  • Optimized for: Quality responses, general knowledge

!explain Command:

  • Model: openrouter/anthropic/claude-3.5-sonnet
  • Purpose: Educational explanations
  • Optimized for: Clear, detailed explanations

!tldr Command:

  • Model: openrouter/anthropic/claude-3.5-haiku
  • Purpose: Fast text summarization
  • Optimized for: Speed and conciseness

Supported Models

OpenRouter provides access to many AI models through the openrouter/ prefix:

Anthropic Models (Recommended):

  • openrouter/anthropic/claude-3.5-sonnet - High-quality conversational AI
  • openrouter/anthropic/claude-3.5-haiku - Fast, lightweight responses
  • openrouter/anthropic/claude-3-opus - Most capable model

OpenAI Models:

  • openrouter/openai/gpt-4o - Latest GPT-4 Omni model
  • openrouter/openai/gpt-4-turbo - GPT-4 Turbo for faster responses
  • openrouter/openai/gpt-3.5-turbo - Cost-effective option

Meta Models:

  • openrouter/meta-llama/llama-3.1-8b-instruct:free - Free tier option
  • openrouter/meta-llama/llama-3.1-70b-instruct - Large parameter model

Custom Module Model Selection

When creating custom AI functionality, specify models based on your use case:

# Fast responses for high-volume commands
response = await ai.complete(prompt, model="openrouter/anthropic/claude-3.5-haiku")

# High-quality responses for complex tasks  
response = await ai.complete(prompt, model="openrouter/anthropic/claude-3.5-sonnet")

# Maximum capability for complex reasoning
response = await ai.complete(prompt, model="openrouter/anthropic/claude-3-opus")

# Budget-friendly option
response = await ai.complete(prompt, model="openrouter/meta-llama/llama-3.1-8b-instruct:free")

Best Practices

Prompt Engineering

  1. Context in Prompts - Include clear context for your stream's AI personality directly in prompts
  2. Concise Instructions - Keep prompts focused for better responses
  3. Stream Context - Include relevant streaming context in prompts

# Good prompt engineering for streaming
prompt = """You are a helpful AI assistant for a Twitch stream about game development.
Be concise, engaging, and keep responses under 200 characters.

Question: How do I optimize my Python code?"""

response = await ai.complete(prompt, model="openrouter/anthropic/claude-3.5-sonnet")

Performance Optimization

  1. Model Selection - Choose models appropriate for your use case
  2. Token Limits - Set reasonable max_tokens to control costs
  3. Caching - Reuse client instances (singleton pattern)
  4. Error Handling - Always handle API failures gracefully (see the sketch below)
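
For point 4, a minimal sketch of graceful failure handling inside an event handler; the broad except is deliberate, since the exact exception types raised depend on the LiteLLM version:

from starstreamer import on_event
from starstreamer.plugins.litellm import AIClient
from starstreamer.plugins.twitch import TwitchClient
from starstreamer.runtime.types import Event

@on_event("twitch.chat.message")
async def safe_ai_handler(event: Event, ai: AIClient, twitch: TwitchClient) -> None:
    try:
        response = await ai.complete("Tell me a joke", max_tokens=150)
        await twitch.send_message(response.content)
    except Exception:
        # Never surface raw API errors to chat
        await twitch.send_message("Sorry, the AI is unavailable right now.")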

Security Considerations

  1. API Key Protection - Store keys in environment variables, never in code
  2. Input Validation - Sanitize user input before sending it to the AI (see the sketch after this list)
  3. Rate Limiting - Respect API rate limits to avoid service interruption
  4. Content Filtering - Filter inappropriate content before and after AI processing
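
For point 2, a minimal sanitizer sketch; the length cap and helper name are illustrative, not part of StarStreamer's API:

MAX_PROMPT_CHARS = 500  # hypothetical cap to bound cost and abuse

def sanitize_chat_input(text: str) -> str:
    """Collapse whitespace and cap length before sending user text to the AI."""
    cleaned = " ".join(text.split())
    return cleaned[:MAX_PROMPT_CHARS]

# e.g. response = await ai.complete(sanitize_chat_input(user_message))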

Cost Management

  1. Token Monitoring - Track usage through AIResponse.usage
  2. Model Optimization - Use cost-effective models when appropriate
  3. Response Limits - Set max_tokens to control per-request costs
  4. Caching Common Responses - Cache frequent queries to reduce API calls (see the sketch below)
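
For point 4, a minimal in-memory cache sketch; a real deployment would add a TTL and eviction, both omitted here:

from starstreamer.plugins.litellm import AIClient

_answer_cache: dict[str, str] = {}

async def cached_ask(ai: AIClient, question: str) -> str:
    """Answer repeated questions from the cache, calling the API only once."""
    key = question.strip().lower()
    if key not in _answer_cache:
        response = await ai.complete(question, max_tokens=150)
        _answer_cache[key] = response.content
    return _answer_cache[key]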

Troubleshooting

Common Issues

"AIClient not configured" Error

Cause: AI integration is not enabled or API key is missing.

Solution:

  1. Set ai.openrouter.enabled: true in your configuration
  2. Set the OPENROUTER_API_KEY environment variable
  3. Verify the API key is valid on OpenRouter

"Invalid model name" Error

Cause: Specified model is not available on OpenRouter.

Solution:

  1. Check OpenRouter Models for available options
  2. Verify the model name's spelling in your request
  3. Ensure your OpenRouter account has access to the model

Rate Limiting Errors

Cause: Exceeded OpenRouter API rate limits.

Solution:

  1. Implement cooldowns on AI commands (see the sketch below)
  2. Upgrade your OpenRouter plan for higher limits
  3. Add request queuing to manage burst traffic
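
A minimal per-command cooldown sketch using only the standard library; the 30-second window is an arbitrary example value:

import time

COOLDOWN_SECONDS = 30.0
_last_call: dict[str, float] = {}

def on_cooldown(command: str) -> bool:
    """Return True if the command fired within the cooldown window."""
    now = time.monotonic()
    if now - _last_call.get(command, 0.0) < COOLDOWN_SECONDS:
        return True
    _last_call[command] = now
    return False

# e.g. skip the AI call when on_cooldown("ask") is True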

Poor Response Quality

Cause: Suboptimal configuration or prompting.

Solution:

  1. Adjust temperature (0.7-0.9 for creative tasks, 0.3-0.5 for factual ones)
  2. Improve the system prompt with more context
  3. Try different models for your use case

Debug Mode

Enable debug logging for detailed troubleshooting:

logging:
  level: "DEBUG"

This will show:

  • AI API request/response details
  • Token usage information
  • Error stack traces
  • Configuration validation

API Reference

AIClient

Methods

  • get_instance() -> AIClient - Get singleton instance
  • complete(prompt: str, **kwargs) -> AIResponse - Generate response for single prompt
  • chat(messages: list[dict], **kwargs) -> AIResponse - Generate response with conversation history

Configuration Parameters

  • api_key: str - OpenRouter API key
  • enabled: bool - Enable/disable OpenRouter integration (default: false)

Runtime Parameters (per request)

  • model: str - AI model to use (specified in each request)
  • temperature: float - Response creativity (0.0-2.0, default: 0.7)
  • max_tokens: int - Maximum response length (default: 150)

AIResponse

Properties

  • content: str - The AI's response text
  • model: str - Model that generated the response
  • usage: dict[str, Any] - Token usage statistics
  • finish_reason: str | None - Why generation stopped

Usage Information

response = await client.complete("Hello")
tokens_used = response.usage.get("total_tokens", 0)
prompt_tokens = response.usage.get("prompt_tokens", 0)
completion_tokens = response.usage.get("completion_tokens", 0)

Next Steps