Muse Lightweight Architecture Documentation

Overview

The Muse Lightweight Router represents a hybrid AI agent orchestration system that combines:

Lightweight Routing: Fast, efficient request routing to specialist agents
Specialization: Different models/providers for different domains
Cost Management: Comprehensive cost tracking and optimization
Observability: Detailed logging of agent operations and decisions
Multi-Tenant Support: Complete tenant isolation with tenant-aware configuration

This document describes the architecture, design decisions, and implementation details of the Muse Lightweight system introduced in Phase 2 (Tasks T13-T23).

Architecture Overview

System Components

┌─────────────────────────────────────────────────────────────────┐
│                     Request Layer                               │
│              (HTTP Controllers, Livewire Events)                │
└──────────────────────────┬──────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│            Muse Router (Lightweight Coordinator)                │
│   - Route selection based on request classification           │
│   - Agent delegation and orchestration                        │
│   - Session state management                                  │
└──────────────────────────┬──────────────────────────────────────┘
                           │
        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
    ┌────────────┐   ┌────────────┐   ┌────────────┐
    │  SEO Agent │   │ WordPress  │   │ Analytics  │ ...
    │ (Claude)   │   │ Agent      │   │  Agent     │
    │            │   │ (GPT-4o)   │   │ (Claude)   │
    └────────────┘   └────────────┘   └────────────┘
        │                  │                  │
        └──────────────────┼──────────────────┘
                           │
        ┌──────────────────┼──────────────────┐
        ▼                  ▼                  ▼
    ┌─────────────┐ ┌─────────────┐ ┌──────────────┐
    │ Cost Tracker│ │Observability│ │Session State │
    │             │ │   Logger    │ │  Persistence │
    └─────────────┘ └─────────────┘ └──────────────┘
        │                  │                  │
        └──────────────────┼──────────────────┘
                           │
                           ▼
        ┌──────────────────────────────────┐
        │      Tenant AI Config Manager    │
        │  (Per-Agent, Per-Tenant Settings)│
        └──────────────────────────────────┘

Path 4: Hybrid Agent Specialization

Design Rationale

The system implements Path 4 (Hybrid) from the Muse specification:

Each agent specializes with a chosen provider and model
Configuration is tenant-aware with global fallbacks
Costs vary by agent, enabling optimization
Performance characteristics differ per agent

Key Benefits

Benefit	Description
Cost Optimization	Use cheaper models for simple tasks, premium models for complex work
Quality Focus	Match model strengths to domain (Claude for analysis, GPT for code)
Multi-Provider Support	Avoid vendor lock-in, use best-in-class for each task
Tenant Flexibility	Customers can override per-agent models within constraints
Performance Control	Different timeout/temperature settings per agent and domain

Core Components

1. TenantAiConfigManager (app/Domains/AI/Services/)

Purpose: Centralized management of agent configuration with tenant awareness.

Key Methods:

php

// Main public method with fallback chain
public function getAgentModelConfig(string $agentName): array

Fallback Chain (per agent):

Check tenant-specific override in database
Fall back to global specialization config
Validate all required keys present
Cache result for performance

Configuration Structure:

php

[
    'provider' => 'anthropic|openai|groq|gemini|ollama',
    'model' => 'claude-opus|gpt-4o|etc',
    'temperature' => 0.2 - 0.9,
    'timeout_seconds' => 5 - 120,
]

Tenant Context: Uses Laravel's tenant() helper function for multi-tenant isolation.

2. UsesAgentSpecializationConfig Trait (app/Domains/AI/Traits/)

Purpose: Enable agents to automatically use specialized configuration.

Usage:

php

class SeoAgent extends BaseLlmAgent {
    use UsesAgentSpecializationConfig;
}

Overridden Methods:

getProvider(): string - Returns provider from specialization config
getModel(): string - Returns model from specialization config
getAgentTemperature(): float - Returns temperature setting
getAgentTimeout(): int - Returns timeout in seconds

Caching: Results are cached per request to minimize database queries.

3. AgentCostService (app/Domains/AI/Services/)

Purpose: Track and calculate execution costs across all agents and providers.

Key Methods:

php

// Log execution cost to database
public function logExecution(
    string $agentName,
    string $provider,
    string $model,
    int $inputTokens,
    int $outputTokens,
    ?int $durationMs,
    string $status,
    ?string $errorMessage,
    ?array $metadata,
): void

// Calculate cost in USD for an execution
public function calculateCost(
    string $provider,
    string $model,
    int $inputTokens,
    int $outputTokens,
): float

// Get monthly summary by tenant
public function getCostSummary(
    int $tenantId,
    ?string $month = null,
): array

Pricing Tiers (per 1M tokens):

Provider	Model	Input	Output
OpenAI	gpt-4-turbo	$10	$30
OpenAI	gpt-4o	$5	$15
OpenAI	gpt-4o-mini	$0.15	$0.60
Anthropic	claude-opus	$15	$75
Anthropic	claude-3.5-sonnet	$3	$15
Anthropic	claude-3-haiku	$0.80	$4
Groq	llama-3.1-8b-instant	$0.05	$0.10
Groq	llama-3.3-70b	$0.05	$0.10
Gemini	1.5-pro	$1.50	$6
Gemini	1.5-flash	$0.075	$0.30
Ollama	Any	$0	$0

Cost Aggregation: Supports grouping by:

Agent name
Provider
Status (success/timeout/error)
Time period (monthly)

4. AgentObservabilityService (app/Domains/AI/Services/)

Purpose: Comprehensive logging for all agent operations and decisions.

Logging Methods:

php

// Core execution logging
public function logAgentExecution(
    string $agentName,
    string $provider,
    string $model,
    int $inputTokens,
    int $outputTokens,
    int $durationMs,
    string $status,
    ?string $errorMessage,
    ?array $metadata,
): void

// Muse → Specialist delegation
public function logAgentDelegation(
    string $fromAgent,
    string $toAgent,
    int $durationMs,
    string $status,
    ?string $resultSummary,
    ?string $errorMessage,
    ?array $metadata,
): void

// Agent learning/memory updates
public function logSessionStateUpdate(
    string $agentName,
    int $factCount,
    int $totalFacts,
    array $facts,
    ?array $metadata,
): void

// Agent startup
public function logAgentInitialization(
    string $agentName,
    string $provider,
    string $model,
    int $loadedFactCount,
    ?array $metadata,
): void

// Muse routing decisions
public function logRoutingDecision(
    string $question,
    string $selectedAgent,
    float $confidence,
    string $reason,
    array $allCandidates,
): void

// Cost tracking
public function logCostEvent(
    string $agentName,
    string $model,
    int $totalTokens,
    float $costUsd,
    string $status,
): void

// Error context
public function logAgentError(
    string $agentName,
    string $errorMessage,
    ?\Throwable $exception,
    ?array $context,
): void

Features:

Tenant-aware (includes tenant ID in all logs)
Status-based log levels (info/warning/error)
Text truncation for large outputs (200 chars for summaries, 100 for facts)
Structured JSON metadata for downstream processing
Integration with Laravel's logging system

5. AgentCostController (app/Http/Controllers/Admin/)

Purpose: API endpoints for cost dashboard and reporting.

Endpoints:

GET /api/admin/agents/costs?tenant_id=X&month=YYYY-MM
    → Full cost summary

GET /api/admin/agents/costs/by-agent?tenant_id=X&month=YYYY-MM
    → Breakdown by agent

GET /api/admin/agents/costs/by-provider?tenant_id=X&month=YYYY-MM
    → Breakdown by provider

GET /api/admin/agents/costs/by-status?tenant_id=X&month=YYYY-MM
    → Breakdown by status

Response Format:

json

{
  "success": true,
  "data": {
    "month": "2026-01",
    "total_cost_usd": 1234.56,
    "total_tokens": 500000,
    "total_executions": 50,
    "by_agent": {
      "seo_agent": {
        "cost_usd": 523.50,
        "tokens": 200000,
        "executions": 20
      }
    },
    "by_provider": {
      "anthropic": {
        "cost_usd": 600.00,
        "tokens": 280000
      }
    },
    "by_status": {
      "success": {
        "count": 48,
        "cost_usd": 1200.00
      },
      "timeout": {
        "count": 2,
        "cost_usd": 34.56
      }
    }
  }
}

Multi-Tenant Isolation

Tenant Awareness

All services integrate with Laravel's tenancy package:

php

use Stancl\Tenancy\Facades\Tenant;

// In service methods
$tenantId = Tenant::current()->id; // Or use tenant() helper

Isolation Guarantees

Configuration Isolation: Each tenant has separate specialization overrides
Cost Tracking: Costs are tracked per tenant with no cross-tenant visibility
Session State: Agent learning is isolated per tenant
Logging: All logs include tenant context for audit trails
Database Queries: Explicit WHERE tenant_id clauses on all queries

Multi-Tenant Testing

The test suite includes explicit multi-tenant scenarios:

Same agent configuration across different tenants
Parallel execution by different tenants
Cost isolation verification
No data leakage between tenants

Configuration System

Global Agent Specialization (config/agent_specialization.php)

Defines default configuration for all 15+ agents:

php

return [
    'seo_agent' => [
        'provider' => 'anthropic',
        'model' => 'claude-opus',
        'temperature' => 0.3,
        'timeout_seconds' => 30,
    ],
    // ... more agents
];

Tenant Overrides (Database - Future Enhancement)

Supports per-tenant customization:

Override any agent's provider/model
Custom temperature or timeout
Fallback to global if not set

Database Schema (future):

sql

CREATE TABLE tenant_agent_configurations (
    id BIGINT PRIMARY KEY,
    tenant_id BIGINT,
    agent_name VARCHAR(255),
    provider VARCHAR(50),
    model VARCHAR(255),
    temperature FLOAT,
    timeout_seconds INT,
    UNIQUE(tenant_id, agent_name)
);

Performance Characteristics

Measured Performance (Phase 3 Benchmarks)

Operation	Threshold	Actual	Status
Agent config loading	<50ms	~2-5ms	✅
Cost calculation	<10ms	~1-3ms	✅
Session state operation	<50ms	~5-20ms	✅
Routing decision (single)	<40ms	~5-8ms	✅
Routing for 50 decisions	<2000ms	~300-400ms	✅
Record batch insert (100)	<50ms avg	~25-35ms avg	✅
Cost aggregation (80 records)	<100ms	~20-50ms	✅

Optimization Strategies

Configuration Caching: Results cached per request
Batch Operations: Support for batch cost logging
Lazy Loading: Agents load config only when needed
Provider Efficiency: Use provider strengths (cost vs quality trade-off)
Database Indexing: Indexes on (tenant_id, agent_name), (tenant_id, created_at)

Integration Patterns

Pattern 1: Simple Agent Execution

php

// Agent automatically uses specialization config
$agent = new SeoAgent();
$result = $agent->execute($request);

// Cost is automatically tracked
$costService->logExecution(
    agentName: 'seo_agent',
    provider: 'anthropic',
    model: 'claude-opus',
    inputTokens: $tokenCount['input'],
    outputTokens: $tokenCount['output'],
    durationMs: $duration,
    status: 'success',
);

Pattern 2: Specialist Delegation (Muse → Agent)

php

// Muse routes to appropriate specialist
$selectedAgent = 'seo_agent'; // Based on question classification

$config = $configManager->getAgentModelConfig($selectedAgent);
$agent = AgentFactory::create($selectedAgent, $config);

$result = $agent->execute($userRequest);

// Log delegation
$observability->logAgentDelegation(
    fromAgent: 'muse_mode_agent',
    toAgent: $selectedAgent,
    durationMs: $duration,
    status: 'success',
    resultSummary: substr($result, 0, 200),
);

Pattern 3: Multi-Step Workflow

php

// Track each step independently
$session = new AgentSession($tenantId, $sessionId);

foreach ($steps as $step) {
    $agent = AgentFactory::create($step['agent'], $configManager);
    $result = $agent->execute($step['input']);
    
    $costService->logExecution(...);
    $observability->logAgentExecution(...);
    $session->recordStep($step, $result);
}

// Aggregated cost
$totalCost = $costService->getCostSummary($tenantId);

Database Schema

AgentExecutionCost Table

sql

CREATE TABLE agent_execution_costs (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    tenant_id BIGINT NOT NULL,
    agent_name VARCHAR(255) NOT NULL,
    provider VARCHAR(50) NOT NULL,
    model VARCHAR(255) NOT NULL,
    input_tokens INT NOT NULL DEFAULT 0,
    output_tokens INT NOT NULL DEFAULT 0,
    cost_usd DECIMAL(10, 6) NOT NULL DEFAULT 0,
    duration_ms INT,
    status VARCHAR(50) NOT NULL DEFAULT 'success',
    error_message TEXT,
    metadata JSON,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    
    INDEX (tenant_id, agent_name),
    INDEX (tenant_id, created_at),
    INDEX (agent_name, created_at),
    INDEX (provider, model),
);

Indexes Rationale:

(tenant_id, agent_name): Common query for agent-specific costs per tenant
(tenant_id, created_at): Range queries for date filtering per tenant
(agent_name, created_at): Global agent performance analysis
(provider, model): Provider-wide reporting

Testing Strategy

Test Hierarchy

Unit Tests (Tests/Unit/):
- TenantAiConfigManager: 13 tests
- UsesAgentSpecializationConfig: 16 tests
- SpecialistAgentModels: 6 tests
- AgentCostService: 13 tests
- AgentObservabilityService: 12 tests
Feature Tests (Tests/Feature/):
- AgentCostController: 8 tests
- AgentWorkflowIntegration: 7 tests
- AgentPerformanceBenchmark: 8 tests
- AgentRegression: 6 real-world workflows

Test Coverage

Configuration: All agents, all providers
Cost Calculation: All pricing tiers, edge cases
Multi-Tenant: Isolation, no data leakage
Performance: All critical thresholds
Workflows: 6 realistic scenarios (blog creation, e-commerce, email campaigns, etc.)

Total: 89 tests, 588 assertions

Migration Path & Future Enhancements

Completed (Phase 2-3)

✅ Per-agent model specialization ✅ Tenant-aware configuration ✅ Cost tracking and calculation ✅ Observability logging ✅ Performance benchmarking ✅ End-to-end regression testing

Planned (Phase 4+)

Tenant configuration overrides (per-agent customization in database)
Advanced cost optimization (agent recommendation based on cost vs quality)
Session state persistence (cross-request agent learning)
Real-time cost alerts (warn when approaching budget)
Comparative analytics (agent A vs B performance)
Automated provider switching based on availability

Troubleshooting

Issue: Configuration Not Loading

Cause: Missing agent in config/agent_specialization.php

Solution:

Add agent to config/agent_specialization.php
Use getAgentModelConfig() with correct agent name
Verify config syntax

Issue: High Token Counts

Cause: Inefficient prompts or large input documents

Solution:

Optimize prompt engineering
Implement token counting before agent execution
Consider model selection (use cheaper models for simple tasks)

Issue: Tenant Isolation Failures

Cause: Missing tenant context or direct database queries

Solution:

Verify tenant() helper is available
Add explicit WHERE tenant_id clauses
Use services that handle tenant context automatically

References

Configuration: config/agent_specialization.php
Services: app/Domains/AI/Services/
Tests: tests/Feature/AI/
Database: database/migrations/agent_execution_costs

Last Updated: 2026-05-09 Version: 1.0 Status: Production Ready (Phase 2-3 Complete)

Muse Lightweight Architecture Documentation ​

Overview ​

Architecture Overview ​

System Components ​

Path 4: Hybrid Agent Specialization ​

Design Rationale ​

Key Benefits ​

Core Components ​

1. TenantAiConfigManager (app/Domains/AI/Services/) ​

2. UsesAgentSpecializationConfig Trait (app/Domains/AI/Traits/) ​

3. AgentCostService (app/Domains/AI/Services/) ​

4. AgentObservabilityService (app/Domains/AI/Services/) ​

5. AgentCostController (app/Http/Controllers/Admin/) ​

Multi-Tenant Isolation ​

Tenant Awareness ​

Isolation Guarantees ​

Multi-Tenant Testing ​

Configuration System ​

Global Agent Specialization (config/agent_specialization.php) ​

Tenant Overrides (Database - Future Enhancement) ​

Performance Characteristics ​

Measured Performance (Phase 3 Benchmarks) ​

Optimization Strategies ​

Integration Patterns ​

Pattern 1: Simple Agent Execution ​

Pattern 2: Specialist Delegation (Muse → Agent) ​

Pattern 3: Multi-Step Workflow ​

Database Schema ​

AgentExecutionCost Table ​

Testing Strategy ​

Test Hierarchy ​

Test Coverage ​

Migration Path & Future Enhancements ​

Completed (Phase 2-3) ​

Planned (Phase 4+) ​

Troubleshooting ​

Issue: Configuration Not Loading ​

Issue: High Token Counts ​

Issue: Tenant Isolation Failures ​

References ​

Muse Lightweight Architecture Documentation

Overview

Architecture Overview

System Components

Path 4: Hybrid Agent Specialization

Design Rationale

Key Benefits

Core Components

1. TenantAiConfigManager (app/Domains/AI/Services/)

2. UsesAgentSpecializationConfig Trait (app/Domains/AI/Traits/)

3. AgentCostService (app/Domains/AI/Services/)

4. AgentObservabilityService (app/Domains/AI/Services/)

5. AgentCostController (app/Http/Controllers/Admin/)

Multi-Tenant Isolation

Tenant Awareness

Isolation Guarantees

Multi-Tenant Testing

Configuration System

Global Agent Specialization (config/agent_specialization.php)

Tenant Overrides (Database - Future Enhancement)

Performance Characteristics

Measured Performance (Phase 3 Benchmarks)

Optimization Strategies

Integration Patterns

Pattern 1: Simple Agent Execution

Pattern 2: Specialist Delegation (Muse → Agent)

Pattern 3: Multi-Step Workflow

Database Schema

AgentExecutionCost Table

Testing Strategy

Test Hierarchy

Test Coverage

Migration Path & Future Enhancements

Completed (Phase 2-3)

Planned (Phase 4+)

Troubleshooting

Issue: Configuration Not Loading

Issue: High Token Counts

Issue: Tenant Isolation Failures

References