Skip to content

Muse Lightweight Architecture Documentation

Overview

The Muse Lightweight Router represents a hybrid AI agent orchestration system that combines:

  • Lightweight Routing: Fast, efficient request routing to specialist agents
  • Specialization: Different models/providers for different domains
  • Cost Management: Comprehensive cost tracking and optimization
  • Observability: Detailed logging of agent operations and decisions
  • Multi-Tenant Support: Complete tenant isolation with tenant-aware configuration

This document describes the architecture, design decisions, and implementation details of the Muse Lightweight system introduced in Phase 2 (Tasks T13-T23).

Architecture Overview

System Components

┌─────────────────────────────────────────────────────────────────┐
│                     Request Layer                               │
│              (HTTP Controllers, Livewire Events)                │
└──────────────────────────┬──────────────────────────────────────┘


┌─────────────────────────────────────────────────────────────────┐
│            Muse Router (Lightweight Coordinator)                │
│   - Route selection based on request classification           │
│   - Agent delegation and orchestration                        │
│   - Session state management                                  │
└──────────────────────────┬──────────────────────────────────────┘

        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
    ┌────────────┐   ┌────────────┐   ┌────────────┐
    │  SEO Agent │   │ WordPress  │   │ Analytics  │ ...
    │ (Claude)   │   │ Agent      │   │  Agent     │
    │            │   │ (GPT-4o)   │   │ (Claude)   │
    └────────────┘   └────────────┘   └────────────┘
        │                  │                  │
        └──────────────────┼──────────────────┘

        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
    ┌─────────────┐ ┌─────────────┐ ┌──────────────┐
    │ Cost Tracker│ │Observability│ │Session State │
    │             │ │   Logger    │ │  Persistence │
    └─────────────┘ └─────────────┘ └──────────────┘
        │                  │                  │
        └──────────────────┼──────────────────┘


        ┌──────────────────────────────────┐
        │      Tenant AI Config Manager    │
        │  (Per-Agent, Per-Tenant Settings)│
        └──────────────────────────────────┘

Path 4: Hybrid Agent Specialization

Design Rationale

The system implements Path 4 (Hybrid) from the Muse specification:

  • Each agent specializes with a chosen provider and model
  • Configuration is tenant-aware with global fallbacks
  • Costs vary by agent, enabling optimization
  • Performance characteristics differ per agent

Key Benefits

BenefitDescription
Cost OptimizationUse cheaper models for simple tasks, premium models for complex work
Quality FocusMatch model strengths to domain (Claude for analysis, GPT for code)
Multi-Provider SupportAvoid vendor lock-in, use best-in-class for each task
Tenant FlexibilityCustomers can override per-agent models within constraints
Performance ControlDifferent timeout/temperature settings per agent and domain

Core Components

1. TenantAiConfigManager (app/Domains/AI/Services/)

Purpose: Centralized management of agent configuration with tenant awareness.

Key Methods:

php
// Main public method with fallback chain
public function getAgentModelConfig(string $agentName): array

Fallback Chain (per agent):

  1. Check tenant-specific override in database
  2. Fall back to global specialization config
  3. Validate all required keys present
  4. Cache result for performance

Configuration Structure:

php
[
    'provider' => 'anthropic|openai|groq|gemini|ollama',
    'model' => 'claude-opus|gpt-4o|etc',
    'temperature' => 0.2 - 0.9,
    'timeout_seconds' => 5 - 120,
]

Tenant Context: Uses Laravel's tenant() helper function for multi-tenant isolation.

2. UsesAgentSpecializationConfig Trait (app/Domains/AI/Traits/)

Purpose: Enable agents to automatically use specialized configuration.

Usage:

php
class SeoAgent extends BaseLlmAgent {
    use UsesAgentSpecializationConfig;
}

Overridden Methods:

  • getProvider(): string - Returns provider from specialization config
  • getModel(): string - Returns model from specialization config
  • getAgentTemperature(): float - Returns temperature setting
  • getAgentTimeout(): int - Returns timeout in seconds

Caching: Results are cached per request to minimize database queries.

3. AgentCostService (app/Domains/AI/Services/)

Purpose: Track and calculate execution costs across all agents and providers.

Key Methods:

php
// Log execution cost to database
public function logExecution(
    string $agentName,
    string $provider,
    string $model,
    int $inputTokens,
    int $outputTokens,
    ?int $durationMs,
    string $status,
    ?string $errorMessage,
    ?array $metadata,
): void

// Calculate cost in USD for an execution
public function calculateCost(
    string $provider,
    string $model,
    int $inputTokens,
    int $outputTokens,
): float

// Get monthly summary by tenant
public function getCostSummary(
    int $tenantId,
    ?string $month = null,
): array

Pricing Tiers (per 1M tokens):

ProviderModelInputOutput
OpenAIgpt-4-turbo$10$30
OpenAIgpt-4o$5$15
OpenAIgpt-4o-mini$0.15$0.60
Anthropicclaude-opus$15$75
Anthropicclaude-3.5-sonnet$3$15
Anthropicclaude-3-haiku$0.80$4
Groqllama-3.1-8b-instant$0.05$0.10
Groqllama-3.3-70b$0.05$0.10
Gemini1.5-pro$1.50$6
Gemini1.5-flash$0.075$0.30
OllamaAny$0$0

Cost Aggregation: Supports grouping by:

  • Agent name
  • Provider
  • Status (success/timeout/error)
  • Time period (monthly)

4. AgentObservabilityService (app/Domains/AI/Services/)

Purpose: Comprehensive logging for all agent operations and decisions.

Logging Methods:

php
// Core execution logging
public function logAgentExecution(
    string $agentName,
    string $provider,
    string $model,
    int $inputTokens,
    int $outputTokens,
    int $durationMs,
    string $status,
    ?string $errorMessage,
    ?array $metadata,
): void

// Muse → Specialist delegation
public function logAgentDelegation(
    string $fromAgent,
    string $toAgent,
    int $durationMs,
    string $status,
    ?string $resultSummary,
    ?string $errorMessage,
    ?array $metadata,
): void

// Agent learning/memory updates
public function logSessionStateUpdate(
    string $agentName,
    int $factCount,
    int $totalFacts,
    array $facts,
    ?array $metadata,
): void

// Agent startup
public function logAgentInitialization(
    string $agentName,
    string $provider,
    string $model,
    int $loadedFactCount,
    ?array $metadata,
): void

// Muse routing decisions
public function logRoutingDecision(
    string $question,
    string $selectedAgent,
    float $confidence,
    string $reason,
    array $allCandidates,
): void

// Cost tracking
public function logCostEvent(
    string $agentName,
    string $model,
    int $totalTokens,
    float $costUsd,
    string $status,
): void

// Error context
public function logAgentError(
    string $agentName,
    string $errorMessage,
    ?\Throwable $exception,
    ?array $context,
): void

Features:

  • Tenant-aware (includes tenant ID in all logs)
  • Status-based log levels (info/warning/error)
  • Text truncation for large outputs (200 chars for summaries, 100 for facts)
  • Structured JSON metadata for downstream processing
  • Integration with Laravel's logging system

5. AgentCostController (app/Http/Controllers/Admin/)

Purpose: API endpoints for cost dashboard and reporting.

Endpoints:

GET /api/admin/agents/costs?tenant_id=X&month=YYYY-MM
    → Full cost summary

GET /api/admin/agents/costs/by-agent?tenant_id=X&month=YYYY-MM
    → Breakdown by agent

GET /api/admin/agents/costs/by-provider?tenant_id=X&month=YYYY-MM
    → Breakdown by provider

GET /api/admin/agents/costs/by-status?tenant_id=X&month=YYYY-MM
    → Breakdown by status

Response Format:

json
{
  "success": true,
  "data": {
    "month": "2026-01",
    "total_cost_usd": 1234.56,
    "total_tokens": 500000,
    "total_executions": 50,
    "by_agent": {
      "seo_agent": {
        "cost_usd": 523.50,
        "tokens": 200000,
        "executions": 20
      }
    },
    "by_provider": {
      "anthropic": {
        "cost_usd": 600.00,
        "tokens": 280000
      }
    },
    "by_status": {
      "success": {
        "count": 48,
        "cost_usd": 1200.00
      },
      "timeout": {
        "count": 2,
        "cost_usd": 34.56
      }
    }
  }
}

Multi-Tenant Isolation

Tenant Awareness

All services integrate with Laravel's tenancy package:

php
use Stancl\Tenancy\Facades\Tenant;

// In service methods
$tenantId = Tenant::current()->id; // Or use tenant() helper

Isolation Guarantees

  1. Configuration Isolation: Each tenant has separate specialization overrides
  2. Cost Tracking: Costs are tracked per tenant with no cross-tenant visibility
  3. Session State: Agent learning is isolated per tenant
  4. Logging: All logs include tenant context for audit trails
  5. Database Queries: Explicit WHERE tenant_id clauses on all queries

Multi-Tenant Testing

The test suite includes explicit multi-tenant scenarios:

  • Same agent configuration across different tenants
  • Parallel execution by different tenants
  • Cost isolation verification
  • No data leakage between tenants

Configuration System

Global Agent Specialization (config/agent_specialization.php)

Defines default configuration for all 15+ agents:

php
return [
    'seo_agent' => [
        'provider' => 'anthropic',
        'model' => 'claude-opus',
        'temperature' => 0.3,
        'timeout_seconds' => 30,
    ],
    // ... more agents
];

Tenant Overrides (Database - Future Enhancement)

Supports per-tenant customization:

  • Override any agent's provider/model
  • Custom temperature or timeout
  • Fallback to global if not set

Database Schema (future):

sql
CREATE TABLE tenant_agent_configurations (
    id BIGINT PRIMARY KEY,
    tenant_id BIGINT,
    agent_name VARCHAR(255),
    provider VARCHAR(50),
    model VARCHAR(255),
    temperature FLOAT,
    timeout_seconds INT,
    UNIQUE(tenant_id, agent_name)
);

Performance Characteristics

Measured Performance (Phase 3 Benchmarks)

OperationThresholdActualStatus
Agent config loading<50ms~2-5ms
Cost calculation<10ms~1-3ms
Session state operation<50ms~5-20ms
Routing decision (single)<40ms~5-8ms
Routing for 50 decisions<2000ms~300-400ms
Record batch insert (100)<50ms avg~25-35ms avg
Cost aggregation (80 records)<100ms~20-50ms

Optimization Strategies

  1. Configuration Caching: Results cached per request
  2. Batch Operations: Support for batch cost logging
  3. Lazy Loading: Agents load config only when needed
  4. Provider Efficiency: Use provider strengths (cost vs quality trade-off)
  5. Database Indexing: Indexes on (tenant_id, agent_name), (tenant_id, created_at)

Integration Patterns

Pattern 1: Simple Agent Execution

php
// Agent automatically uses specialization config
$agent = new SeoAgent();
$result = $agent->execute($request);

// Cost is automatically tracked
$costService->logExecution(
    agentName: 'seo_agent',
    provider: 'anthropic',
    model: 'claude-opus',
    inputTokens: $tokenCount['input'],
    outputTokens: $tokenCount['output'],
    durationMs: $duration,
    status: 'success',
);

Pattern 2: Specialist Delegation (Muse → Agent)

php
// Muse routes to appropriate specialist
$selectedAgent = 'seo_agent'; // Based on question classification

$config = $configManager->getAgentModelConfig($selectedAgent);
$agent = AgentFactory::create($selectedAgent, $config);

$result = $agent->execute($userRequest);

// Log delegation
$observability->logAgentDelegation(
    fromAgent: 'muse_mode_agent',
    toAgent: $selectedAgent,
    durationMs: $duration,
    status: 'success',
    resultSummary: substr($result, 0, 200),
);

Pattern 3: Multi-Step Workflow

php
// Track each step independently
$session = new AgentSession($tenantId, $sessionId);

foreach ($steps as $step) {
    $agent = AgentFactory::create($step['agent'], $configManager);
    $result = $agent->execute($step['input']);
    
    $costService->logExecution(...);
    $observability->logAgentExecution(...);
    $session->recordStep($step, $result);
}

// Aggregated cost
$totalCost = $costService->getCostSummary($tenantId);

Database Schema

AgentExecutionCost Table

sql
CREATE TABLE agent_execution_costs (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    tenant_id BIGINT NOT NULL,
    agent_name VARCHAR(255) NOT NULL,
    provider VARCHAR(50) NOT NULL,
    model VARCHAR(255) NOT NULL,
    input_tokens INT NOT NULL DEFAULT 0,
    output_tokens INT NOT NULL DEFAULT 0,
    cost_usd DECIMAL(10, 6) NOT NULL DEFAULT 0,
    duration_ms INT,
    status VARCHAR(50) NOT NULL DEFAULT 'success',
    error_message TEXT,
    metadata JSON,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    
    INDEX (tenant_id, agent_name),
    INDEX (tenant_id, created_at),
    INDEX (agent_name, created_at),
    INDEX (provider, model),
);

Indexes Rationale:

  • (tenant_id, agent_name): Common query for agent-specific costs per tenant
  • (tenant_id, created_at): Range queries for date filtering per tenant
  • (agent_name, created_at): Global agent performance analysis
  • (provider, model): Provider-wide reporting

Testing Strategy

Test Hierarchy

  1. Unit Tests (Tests/Unit/):

    • TenantAiConfigManager: 13 tests
    • UsesAgentSpecializationConfig: 16 tests
    • SpecialistAgentModels: 6 tests
    • AgentCostService: 13 tests
    • AgentObservabilityService: 12 tests
  2. Feature Tests (Tests/Feature/):

    • AgentCostController: 8 tests
    • AgentWorkflowIntegration: 7 tests
    • AgentPerformanceBenchmark: 8 tests
    • AgentRegression: 6 real-world workflows

Test Coverage

  • Configuration: All agents, all providers
  • Cost Calculation: All pricing tiers, edge cases
  • Multi-Tenant: Isolation, no data leakage
  • Performance: All critical thresholds
  • Workflows: 6 realistic scenarios (blog creation, e-commerce, email campaigns, etc.)

Total: 89 tests, 588 assertions

Migration Path & Future Enhancements

Completed (Phase 2-3)

✅ Per-agent model specialization ✅ Tenant-aware configuration ✅ Cost tracking and calculation ✅ Observability logging ✅ Performance benchmarking ✅ End-to-end regression testing

Planned (Phase 4+)

  • Tenant configuration overrides (per-agent customization in database)
  • Advanced cost optimization (agent recommendation based on cost vs quality)
  • Session state persistence (cross-request agent learning)
  • Real-time cost alerts (warn when approaching budget)
  • Comparative analytics (agent A vs B performance)
  • Automated provider switching based on availability

Troubleshooting

Issue: Configuration Not Loading

Cause: Missing agent in config/agent_specialization.php

Solution:

  1. Add agent to config/agent_specialization.php
  2. Use getAgentModelConfig() with correct agent name
  3. Verify config syntax

Issue: High Token Counts

Cause: Inefficient prompts or large input documents

Solution:

  1. Optimize prompt engineering
  2. Implement token counting before agent execution
  3. Consider model selection (use cheaper models for simple tasks)

Issue: Tenant Isolation Failures

Cause: Missing tenant context or direct database queries

Solution:

  1. Verify tenant() helper is available
  2. Add explicit WHERE tenant_id clauses
  3. Use services that handle tenant context automatically

References


Last Updated: 2026-05-09 Version: 1.0 Status: Production Ready (Phase 2-3 Complete)