Corentin Joguet bff653acd6 first commit

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-15 10:30:37 +02:00

12 KiB

Raw Permalink Blame History

BYAN v2 Workers - Lightweight LLM Agents

Version: 2.0.0
Last Updated: 2026-02-10
Status: ✅ Production Ready

⚠️ IMPORTANT: Worker vs Module

WORKERS = Petits agents LLM légers (Haiku, gpt-5-mini)
MODULES = Code technique (Context, Dispatcher, Generation, etc.)

Ne confondez pas les deux !

Qu'est-ce qu'un Worker ?

Un Worker est un petit agent LLM optimisé pour des tâches simples et répétitives.

Caractéristiques

┌────────────────────┬──────────────┬──────────────┐
│ Feature            │ Worker       │ Agent        │
├────────────────────┼──────────────┼──────────────┤
│ Model              │ Haiku/Mini   │ Sonnet/Opus  │
│ Cost               │ 0.0003$/call │ 0.003$/call  │
│ Complexity Score   │ < 30         │ ≥ 60         │
│ Task Type          │ Simple       │ Complex      │
│ Context Window     │ Small        │ Large        │
│ Response Time      │ Fast         │ Slower       │
│ Intelligence       │ Low          │ High         │
└────────────────────┴──────────────┴──────────────┘

Économie

// Scénario : 100 tâches/semaine
// 100% Agent (Sonnet) = 100 × 0.003$ = 0.30$
// 60% Worker + 40% Agent = (60 × 0.0003$) + (40 × 0.003$) = 0.138$
// 
// Économie : 54% réduction de coût

Architecture BYAN v2

Dispatcher Rule-Based

Le Dispatcher (module technique, PAS un worker) analyse la complexité de chaque tâche et route :

┌─────────────────────────────────────────────────┐
│            TASK ARRIVES                         │
└────────────────┬────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────┐
│        DISPATCHER (Code Module)                 │
│  Analyse complexité → Score 0-100               │
└────────────────┬────────────────────────────────┘
                 │
        ┌────────┴────────┐
        │                 │
        ▼                 ▼
┌──────────────┐  ┌──────────────┐
│  Score < 30  │  │  Score ≥ 60  │
│              │  │              │
│   WORKER     │  │    AGENT     │
│   (Cheap)    │  │  (Expensive) │
│              │  │              │
│ gpt-5-mini   │  │ claude-sonnet│
│ 0.0003$      │  │ 0.003$       │
└──────────────┘  └──────────────┘

Worker Pool

// Pool de 2 workers LLM
WorkerPool (size=2)
  ├── Worker #0 (idle/busy) → Model: gpt-5-mini
  └── Worker #1 (idle/busy) → Model: gpt-5-mini

Gestion automatique :

Allocation du worker disponible
File d'attente si tous occupés
Fallback vers Agent si worker échoue
Retry logic avec backoff

Types de Workers

1. Task Workers (Dans Worker Pool)

Utilisation : Tâches génériques simples

Exemples :

Format JSON
Extraire données structurées
Valider format
Traductions simples
Résumés courts

Routing :

// Dispatcher analyse complexité
if (complexityScore < 30) {
  executeWithWorker(task);
} else {
  executeWithAgent(task);
}

2. Launcher Workers (Platform-Specific)

Utilisation : Lancer yanstaller sur chaque plateforme

Fichiers :

_byan/workers/launchers/launch-yanstaller-copilot.md
_byan/workers/launchers/launch-yanstaller-claude.md
_byan/workers/launchers/launch-yanstaller-codex.md

Caractéristiques :

Single task (execute npx create-byan-agent)
No LLM call (just shell command)
Ultra-light (< 5 KB)
Idempotent
Platform hints via env vars

Architecture :

User → Stub Agent → Launcher Worker → Yanstaller Agent

Worker Pool Implementation

Code Location

src/core/worker-pool/worker-pool.js

API

class WorkerPool {
  /**
   * @param {number} size - Number of workers (default: 2)
   * @param {Object} options
   * @param {string} options.model - LLM model (default: 'gpt-5-mini')
   * @param {number} options.timeout - Timeout in ms (default: 30000)
   */
  constructor(size = 2, options = {}) {
    this.size = size;
    this.model = options.model || 'gpt-5-mini';
    this.workers = [];
    this.queue = [];
    this.stats = {
      total: 0,
      success: 0,
      failed: 0,
      fallbackToAgent: 0
    };
  }

  /**
   * Execute task with next available worker
   * @param {Object} task
   * @returns {Promise<Object>}
   */
  async executeTask(task) {
    const worker = await this.getAvailableWorker();
    
    try {
      const result = await worker.execute(task);
      this.stats.success++;
      return result;
    } catch (error) {
      // Fallback to Agent if configured
      if (task.fallbackToAgent) {
        return await this.fallbackToAgent(task);
      }
      throw error;
    } finally {
      worker.release();
    }
  }

  /**
   * Get next available worker
   * @returns {Promise<Worker>}
   */
  async getAvailableWorker() {
    // Check for idle worker
    const idle = this.workers.find(w => w.isIdle());
    if (idle) {
      idle.markBusy();
      return idle;
    }

    // Queue if all busy
    return new Promise((resolve) => {
      this.queue.push(resolve);
    });
  }

  /**
   * Fallback to Agent for complex tasks
   * @param {Object} task
   * @returns {Promise<Object>}
   */
  async fallbackToAgent(task) {
    console.log(`Worker failed, falling back to Agent for task ${task.id}`);
    this.stats.fallbackToAgent++;
    
    // Use more powerful model
    const agent = new Agent({ model: 'claude-sonnet-4' });
    return await agent.execute(task);
  }
}

Complexity Scoring

Algorithm

function calculateComplexity(task) {
  let score = 0;

  // Input length
  if (task.input.length > 1000) score += 20;
  else if (task.input.length > 500) score += 10;

  // Task type
  const complexTypes = ['analysis', 'reasoning', 'creation'];
  if (complexTypes.includes(task.type)) score += 30;

  // Context required
  if (task.contextSize > 5000) score += 20;

  // Multi-step
  if (task.steps && task.steps.length > 3) score += 15;

  // Output structure
  if (task.outputFormat === 'complex') score += 15;

  return score;
}

Routing Logic

const score = calculateComplexity(task);

if (score < 30) {
  // Simple task → Worker (gpt-5-mini)
  await workerPool.executeTask(task);
} else if (score < 60) {
  // Medium task → Worker with Agent fallback
  await workerPool.executeTask({
    ...task,
    fallbackToAgent: true
  });
} else {
  // Complex task → Agent directly (claude-sonnet)
  await agent.execute(task);
}

Worker Lifecycle

States

IDLE → BUSY → IDLE (success)
       ↓
     FAILED → RETRY → IDLE (success)
                ↓
              FALLBACK_TO_AGENT

Flow

// 1. Worker allocated from pool
const worker = await pool.getAvailableWorker();

// 2. Worker executes task
worker.markBusy();
const result = await worker.callLLM(task);

// 3. Worker released back to pool
worker.markIdle();
pool.releaseWorker(worker);

// 4. If failed and fallback enabled
if (failed && task.fallbackToAgent) {
  const agent = new Agent();
  return await agent.execute(task);
}

Monitoring & Observability

Metrics Tracked

{
  total: 1000,              // Total tasks
  success: 850,             // Successful
  failed: 50,               // Failed
  fallbackToAgent: 100,     // Escalated to agent
  avgDuration: 1200,        // Average ms
  totalCost: 0.255,         // Total $
  avgCostPerTask: 0.000255  // Average $/task
}

Cost Breakdown

// Worker calls
workerCalls = 850;
workerCost = 850 × 0.0003$ = 0.255$;

// Agent fallback calls
agentCalls = 100;
agentCost = 100 × 0.003$ = 0.30$;

// Total
totalCost = 0.255$ + 0.30$ = 0.555$;

// vs All Agent approach
allAgentCost = 950 × 0.003$ = 2.85$;

// Savings
savings = (2.85$ - 0.555$) / 2.85$ = 80.5%

Configuration

Worker Pool Config

# config/worker-pool.yaml
workerPool:
  size: 2
  model: gpt-5-mini
  timeout: 30000
  maxRetries: 3
  retryDelay: 1000
  
  fallback:
    enabled: true
    model: claude-sonnet-4
    threshold: 2  # Fallback after 2 failures

Model Selection

// Available models for workers
const WORKER_MODELS = {
  'gpt-5-mini': {
    cost: 0.0003,
    contextWindow: 128000,
    speed: 'fast'
  },
  'claude-haiku': {
    cost: 0.0005,
    contextWindow: 200000,
    speed: 'fast'
  }
};

// Available models for agents
const AGENT_MODELS = {
  'claude-sonnet-4': {
    cost: 0.003,
    contextWindow: 200000,
    speed: 'medium'
  },
  'claude-opus-4': {
    cost: 0.015,
    contextWindow: 200000,
    speed: 'slow'
  }
};

Best Practices

When to Use Workers

✅ Use Workers for:

Format validation
Data extraction
Simple transformations
JSON parsing/generation
String operations
Template filling

❌ Don't Use Workers for:

Complex reasoning
Multi-step analysis
Code generation
Architecture design
Creative writing
Decision making

Optimization Tips

Tune complexity thresholds based on your use cases
Monitor fallback rate - high rate means threshold too low
Batch simple tasks to maximize worker utilization
Cache common results to avoid redundant calls
Use async/await properly to avoid blocking

File Structure

_byan/
├── workers.md (this file)
└── workers/
    └── launchers/
        ├── README.md
        ├── launch-yanstaller-copilot.md
        ├── launch-yanstaller-claude.md
        └── launch-yanstaller-codex.md

src/
└── core/
    └── worker-pool/
        ├── worker-pool.js
        ├── worker.js
        └── complexity-scorer.js

Worker Pool Implementation: src/core/worker-pool/worker-pool.js
Dispatcher Logic: src/byan-v2/dispatcher/
Launcher Workers: _byan/workers/launchers/README.md
Architecture: _bmad-output/conception/01-vision-et-principes.md

Summary

Workers = Lightweight LLM agents for simple tasks

┌─────────────────────────────────────────────────┐
│            BYAN v2 Task Execution               │
│                                                 │
│  Simple Task (score < 30)                       │
│    → Worker Pool (gpt-5-mini, 0.0003$)         │
│                                                 │
│  Medium Task (30 ≤ score < 60)                  │
│    → Worker Pool with Agent fallback            │
│                                                 │
│  Complex Task (score ≥ 60)                      │
│    → Agent directly (claude-sonnet, 0.003$)     │
│                                                 │
│  Result: 54-80% cost reduction                  │
└─────────────────────────────────────────────────┘

Key Principle: Right model for right task = optimal cost/performance

Maintainer: BYAN Core Team
Version: 2.0.0
Status: ✅ Production Ready

12 KiB Raw Permalink Blame History Unescape Escape