# BYAN v2 Workers - Lightweight LLM Agents

**Version:** 2.0.0  
**Last Updated:** 2026-02-10  
**Status:** ✅ Production Ready

---

## ⚠️ IMPORTANT: Worker vs Module

**WORKERS** = Petits agents LLM légers (Haiku, gpt-5-mini)  
**MODULES** = Code technique (Context, Dispatcher, Generation, etc.)

**Ne confondez pas les deux !**

---

## Qu'est-ce qu'un Worker ?

Un **Worker** est un **petit agent LLM** optimisé pour des tâches simples et répétitives.

### Caractéristiques

```
┌────────────────────┬──────────────┬──────────────┐
│ Feature            │ Worker       │ Agent        │
├────────────────────┼──────────────┼──────────────┤
│ Model              │ Haiku/Mini   │ Sonnet/Opus  │
│ Cost               │ 0.0003$/call │ 0.003$/call  │
│ Complexity Score   │ < 30         │ ≥ 60         │
│ Task Type          │ Simple       │ Complex      │
│ Context Window     │ Small        │ Large        │
│ Response Time      │ Fast         │ Slower       │
│ Intelligence       │ Low          │ High         │
└────────────────────┴──────────────┴──────────────┘
```

### Économie

```javascript
// Scénario : 100 tâches/semaine
// 100% Agent (Sonnet) = 100 × 0.003$ = 0.30$
// 60% Worker + 40% Agent = (60 × 0.0003$) + (40 × 0.003$) = 0.138$
// 
// Économie : 54% réduction de coût
```

---

## Architecture BYAN v2

### Dispatcher Rule-Based

Le **Dispatcher** (module technique, PAS un worker) analyse la complexité de chaque tâche et route :

```
┌─────────────────────────────────────────────────┐
│            TASK ARRIVES                         │
└────────────────┬────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────┐
│        DISPATCHER (Code Module)                 │
│  Analyse complexité → Score 0-100               │
└────────────────┬────────────────────────────────┘
                 │
        ┌────────┴────────┐
        │                 │
        ▼                 ▼
┌──────────────┐  ┌──────────────┐
│  Score < 30  │  │  Score ≥ 60  │
│              │  │              │
│   WORKER     │  │    AGENT     │
│   (Cheap)    │  │  (Expensive) │
│              │  │              │
│ gpt-5-mini   │  │ claude-sonnet│
│ 0.0003$      │  │ 0.003$       │
└──────────────┘  └──────────────┘
```

### Worker Pool

```javascript
// Pool de 2 workers LLM
WorkerPool (size=2)
  ├── Worker #0 (idle/busy) → Model: gpt-5-mini
  └── Worker #1 (idle/busy) → Model: gpt-5-mini
```

**Gestion automatique :**
- Allocation du worker disponible
- File d'attente si tous occupés
- Fallback vers Agent si worker échoue
- Retry logic avec backoff

---

## Types de Workers

### 1. Task Workers (Dans Worker Pool)

**Utilisation :** Tâches génériques simples

**Exemples :**
- Format JSON
- Extraire données structurées
- Valider format
- Traductions simples
- Résumés courts

**Routing :**
```javascript
// Dispatcher analyse complexité
if (complexityScore < 30) {
  executeWithWorker(task);
} else {
  executeWithAgent(task);
}
```

---

### 2. Launcher Workers (Platform-Specific)

**Utilisation :** Lancer yanstaller sur chaque plateforme

**Fichiers :**
- `_byan/workers/launchers/launch-yanstaller-copilot.md`
- `_byan/workers/launchers/launch-yanstaller-claude.md`
- `_byan/workers/launchers/launch-yanstaller-codex.md`

**Caractéristiques :**
- Single task (execute `npx create-byan-agent`)
- No LLM call (just shell command)
- Ultra-light (< 5 KB)
- Idempotent
- Platform hints via env vars

**Architecture :**
```
User → Stub Agent → Launcher Worker → Yanstaller Agent
```

---

## Worker Pool Implementation

### Code Location

`src/core/worker-pool/worker-pool.js`

### API

```javascript
class WorkerPool {
  /**
   * @param {number} size - Number of workers (default: 2)
   * @param {Object} options
   * @param {string} options.model - LLM model (default: 'gpt-5-mini')
   * @param {number} options.timeout - Timeout in ms (default: 30000)
   */
  constructor(size = 2, options = {}) {
    this.size = size;
    this.model = options.model || 'gpt-5-mini';
    this.workers = [];
    this.queue = [];
    this.stats = {
      total: 0,
      success: 0,
      failed: 0,
      fallbackToAgent: 0
    };
  }

  /**
   * Execute task with next available worker
   * @param {Object} task
   * @returns {Promise<Object>}
   */
  async executeTask(task) {
    const worker = await this.getAvailableWorker();
    
    try {
      const result = await worker.execute(task);
      this.stats.success++;
      return result;
    } catch (error) {
      // Fallback to Agent if configured
      if (task.fallbackToAgent) {
        return await this.fallbackToAgent(task);
      }
      throw error;
    } finally {
      worker.release();
    }
  }

  /**
   * Get next available worker
   * @returns {Promise<Worker>}
   */
  async getAvailableWorker() {
    // Check for idle worker
    const idle = this.workers.find(w => w.isIdle());
    if (idle) {
      idle.markBusy();
      return idle;
    }

    // Queue if all busy
    return new Promise((resolve) => {
      this.queue.push(resolve);
    });
  }

  /**
   * Fallback to Agent for complex tasks
   * @param {Object} task
   * @returns {Promise<Object>}
   */
  async fallbackToAgent(task) {
    console.log(`Worker failed, falling back to Agent for task ${task.id}`);
    this.stats.fallbackToAgent++;
    
    // Use more powerful model
    const agent = new Agent({ model: 'claude-sonnet-4' });
    return await agent.execute(task);
  }
}
```

---

## Complexity Scoring

### Algorithm

```javascript
function calculateComplexity(task) {
  let score = 0;

  // Input length
  if (task.input.length > 1000) score += 20;
  else if (task.input.length > 500) score += 10;

  // Task type
  const complexTypes = ['analysis', 'reasoning', 'creation'];
  if (complexTypes.includes(task.type)) score += 30;

  // Context required
  if (task.contextSize > 5000) score += 20;

  // Multi-step
  if (task.steps && task.steps.length > 3) score += 15;

  // Output structure
  if (task.outputFormat === 'complex') score += 15;

  return score;
}
```

### Routing Logic

```javascript
const score = calculateComplexity(task);

if (score < 30) {
  // Simple task → Worker (gpt-5-mini)
  await workerPool.executeTask(task);
} else if (score < 60) {
  // Medium task → Worker with Agent fallback
  await workerPool.executeTask({
    ...task,
    fallbackToAgent: true
  });
} else {
  // Complex task → Agent directly (claude-sonnet)
  await agent.execute(task);
}
```

---

## Worker Lifecycle

### States

```
IDLE → BUSY → IDLE (success)
       ↓
     FAILED → RETRY → IDLE (success)
                ↓
              FALLBACK_TO_AGENT
```

### Flow

```javascript
// 1. Worker allocated from pool
const worker = await pool.getAvailableWorker();

// 2. Worker executes task
worker.markBusy();
const result = await worker.callLLM(task);

// 3. Worker released back to pool
worker.markIdle();
pool.releaseWorker(worker);

// 4. If failed and fallback enabled
if (failed && task.fallbackToAgent) {
  const agent = new Agent();
  return await agent.execute(task);
}
```

---

## Monitoring & Observability

### Metrics Tracked

```javascript
{
  total: 1000,              // Total tasks
  success: 850,             // Successful
  failed: 50,               // Failed
  fallbackToAgent: 100,     // Escalated to agent
  avgDuration: 1200,        // Average ms
  totalCost: 0.255,         // Total $
  avgCostPerTask: 0.000255  // Average $/task
}
```

### Cost Breakdown

```javascript
// Worker calls
workerCalls = 850;
workerCost = 850 × 0.0003$ = 0.255$;

// Agent fallback calls
agentCalls = 100;
agentCost = 100 × 0.003$ = 0.30$;

// Total
totalCost = 0.255$ + 0.30$ = 0.555$;

// vs All Agent approach
allAgentCost = 950 × 0.003$ = 2.85$;

// Savings
savings = (2.85$ - 0.555$) / 2.85$ = 80.5%
```

---

## Configuration

### Worker Pool Config

```yaml
# config/worker-pool.yaml
workerPool:
  size: 2
  model: gpt-5-mini
  timeout: 30000
  maxRetries: 3
  retryDelay: 1000
  
  fallback:
    enabled: true
    model: claude-sonnet-4
    threshold: 2  # Fallback after 2 failures
```

### Model Selection

```javascript
// Available models for workers
const WORKER_MODELS = {
  'gpt-5-mini': {
    cost: 0.0003,
    contextWindow: 128000,
    speed: 'fast'
  },
  'claude-haiku': {
    cost: 0.0005,
    contextWindow: 200000,
    speed: 'fast'
  }
};

// Available models for agents
const AGENT_MODELS = {
  'claude-sonnet-4': {
    cost: 0.003,
    contextWindow: 200000,
    speed: 'medium'
  },
  'claude-opus-4': {
    cost: 0.015,
    contextWindow: 200000,
    speed: 'slow'
  }
};
```

---

## Best Practices

### When to Use Workers

✅ **Use Workers for:**
- Format validation
- Data extraction
- Simple transformations
- JSON parsing/generation
- String operations
- Template filling

❌ **Don't Use Workers for:**
- Complex reasoning
- Multi-step analysis
- Code generation
- Architecture design
- Creative writing
- Decision making

### Optimization Tips

1. **Tune complexity thresholds** based on your use cases
2. **Monitor fallback rate** - high rate means threshold too low
3. **Batch simple tasks** to maximize worker utilization
4. **Cache common results** to avoid redundant calls
5. **Use async/await** properly to avoid blocking

---

## File Structure

```
_byan/
├── workers.md (this file)
└── workers/
    └── launchers/
        ├── README.md
        ├── launch-yanstaller-copilot.md
        ├── launch-yanstaller-claude.md
        └── launch-yanstaller-codex.md

src/
└── core/
    └── worker-pool/
        ├── worker-pool.js
        ├── worker.js
        └── complexity-scorer.js
```

---

## Related Documentation

- **Worker Pool Implementation:** `src/core/worker-pool/worker-pool.js`
- **Dispatcher Logic:** `src/byan-v2/dispatcher/`
- **Launcher Workers:** `_byan/workers/launchers/README.md`
- **Architecture:** `_bmad-output/conception/01-vision-et-principes.md`

---

## Summary

**Workers = Lightweight LLM agents for simple tasks**

```
┌─────────────────────────────────────────────────┐
│            BYAN v2 Task Execution               │
│                                                 │
│  Simple Task (score < 30)                       │
│    → Worker Pool (gpt-5-mini, 0.0003$)         │
│                                                 │
│  Medium Task (30 ≤ score < 60)                  │
│    → Worker Pool with Agent fallback            │
│                                                 │
│  Complex Task (score ≥ 60)                      │
│    → Agent directly (claude-sonnet, 0.003$)     │
│                                                 │
│  Result: 54-80% cost reduction                  │
└─────────────────────────────────────────────────┘
```

**Key Principle:** Right model for right task = optimal cost/performance

---

**Maintainer:** BYAN Core Team  
**Version:** 2.0.0  
**Status:** ✅ Production Ready