12 KiB
BYAN v2 Workers - Lightweight LLM Agents
Version: 2.0.0
Last Updated: 2026-02-10
Status: ✅ Production Ready
⚠️ IMPORTANT: Worker vs Module
WORKERS = Petits agents LLM légers (Haiku, gpt-5-mini)
MODULES = Code technique (Context, Dispatcher, Generation, etc.)
Ne confondez pas les deux !
Qu'est-ce qu'un Worker ?
Un Worker est un petit agent LLM optimisé pour des tâches simples et répétitives.
Caractéristiques
┌────────────────────┬──────────────┬──────────────┐
│ Feature │ Worker │ Agent │
├────────────────────┼──────────────┼──────────────┤
│ Model │ Haiku/Mini │ Sonnet/Opus │
│ Cost │ 0.0003$/call │ 0.003$/call │
│ Complexity Score │ < 30 │ ≥ 60 │
│ Task Type │ Simple │ Complex │
│ Context Window │ Small │ Large │
│ Response Time │ Fast │ Slower │
│ Intelligence │ Low │ High │
└────────────────────┴──────────────┴──────────────┘
Économie
// Scénario : 100 tâches/semaine
// 100% Agent (Sonnet) = 100 × 0.003$ = 0.30$
// 60% Worker + 40% Agent = (60 × 0.0003$) + (40 × 0.003$) = 0.138$
//
// Économie : 54% réduction de coût
Architecture BYAN v2
Dispatcher Rule-Based
Le Dispatcher (module technique, PAS un worker) analyse la complexité de chaque tâche et route :
┌─────────────────────────────────────────────────┐
│ TASK ARRIVES │
└────────────────┬────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ DISPATCHER (Code Module) │
│ Analyse complexité → Score 0-100 │
└────────────────┬────────────────────────────────┘
│
┌────────┴────────┐
│ │
▼ ▼
┌──────────────┐ ┌──────────────┐
│ Score < 30 │ │ Score ≥ 60 │
│ │ │ │
│ WORKER │ │ AGENT │
│ (Cheap) │ │ (Expensive) │
│ │ │ │
│ gpt-5-mini │ │ claude-sonnet│
│ 0.0003$ │ │ 0.003$ │
└──────────────┘ └──────────────┘
Worker Pool
// Pool de 2 workers LLM
WorkerPool (size=2)
├── Worker #0 (idle/busy) → Model: gpt-5-mini
└── Worker #1 (idle/busy) → Model: gpt-5-mini
Gestion automatique :
- Allocation du worker disponible
- File d'attente si tous occupés
- Fallback vers Agent si worker échoue
- Retry logic avec backoff
Types de Workers
1. Task Workers (Dans Worker Pool)
Utilisation : Tâches génériques simples
Exemples :
- Format JSON
- Extraire données structurées
- Valider format
- Traductions simples
- Résumés courts
Routing :
// Dispatcher analyse complexité
if (complexityScore < 30) {
executeWithWorker(task);
} else {
executeWithAgent(task);
}
2. Launcher Workers (Platform-Specific)
Utilisation : Lancer yanstaller sur chaque plateforme
Fichiers :
_byan/workers/launchers/launch-yanstaller-copilot.md_byan/workers/launchers/launch-yanstaller-claude.md_byan/workers/launchers/launch-yanstaller-codex.md
Caractéristiques :
- Single task (execute
npx create-byan-agent) - No LLM call (just shell command)
- Ultra-light (< 5 KB)
- Idempotent
- Platform hints via env vars
Architecture :
User → Stub Agent → Launcher Worker → Yanstaller Agent
Worker Pool Implementation
Code Location
src/core/worker-pool/worker-pool.js
API
class WorkerPool {
/**
* @param {number} size - Number of workers (default: 2)
* @param {Object} options
* @param {string} options.model - LLM model (default: 'gpt-5-mini')
* @param {number} options.timeout - Timeout in ms (default: 30000)
*/
constructor(size = 2, options = {}) {
this.size = size;
this.model = options.model || 'gpt-5-mini';
this.workers = [];
this.queue = [];
this.stats = {
total: 0,
success: 0,
failed: 0,
fallbackToAgent: 0
};
}
/**
* Execute task with next available worker
* @param {Object} task
* @returns {Promise<Object>}
*/
async executeTask(task) {
const worker = await this.getAvailableWorker();
try {
const result = await worker.execute(task);
this.stats.success++;
return result;
} catch (error) {
// Fallback to Agent if configured
if (task.fallbackToAgent) {
return await this.fallbackToAgent(task);
}
throw error;
} finally {
worker.release();
}
}
/**
* Get next available worker
* @returns {Promise<Worker>}
*/
async getAvailableWorker() {
// Check for idle worker
const idle = this.workers.find(w => w.isIdle());
if (idle) {
idle.markBusy();
return idle;
}
// Queue if all busy
return new Promise((resolve) => {
this.queue.push(resolve);
});
}
/**
* Fallback to Agent for complex tasks
* @param {Object} task
* @returns {Promise<Object>}
*/
async fallbackToAgent(task) {
console.log(`Worker failed, falling back to Agent for task ${task.id}`);
this.stats.fallbackToAgent++;
// Use more powerful model
const agent = new Agent({ model: 'claude-sonnet-4' });
return await agent.execute(task);
}
}
Complexity Scoring
Algorithm
function calculateComplexity(task) {
let score = 0;
// Input length
if (task.input.length > 1000) score += 20;
else if (task.input.length > 500) score += 10;
// Task type
const complexTypes = ['analysis', 'reasoning', 'creation'];
if (complexTypes.includes(task.type)) score += 30;
// Context required
if (task.contextSize > 5000) score += 20;
// Multi-step
if (task.steps && task.steps.length > 3) score += 15;
// Output structure
if (task.outputFormat === 'complex') score += 15;
return score;
}
Routing Logic
const score = calculateComplexity(task);
if (score < 30) {
// Simple task → Worker (gpt-5-mini)
await workerPool.executeTask(task);
} else if (score < 60) {
// Medium task → Worker with Agent fallback
await workerPool.executeTask({
...task,
fallbackToAgent: true
});
} else {
// Complex task → Agent directly (claude-sonnet)
await agent.execute(task);
}
Worker Lifecycle
States
IDLE → BUSY → IDLE (success)
↓
FAILED → RETRY → IDLE (success)
↓
FALLBACK_TO_AGENT
Flow
// 1. Worker allocated from pool
const worker = await pool.getAvailableWorker();
// 2. Worker executes task
worker.markBusy();
const result = await worker.callLLM(task);
// 3. Worker released back to pool
worker.markIdle();
pool.releaseWorker(worker);
// 4. If failed and fallback enabled
if (failed && task.fallbackToAgent) {
const agent = new Agent();
return await agent.execute(task);
}
Monitoring & Observability
Metrics Tracked
{
total: 1000, // Total tasks
success: 850, // Successful
failed: 50, // Failed
fallbackToAgent: 100, // Escalated to agent
avgDuration: 1200, // Average ms
totalCost: 0.255, // Total $
avgCostPerTask: 0.000255 // Average $/task
}
Cost Breakdown
// Worker calls
workerCalls = 850;
workerCost = 850 × 0.0003$ = 0.255$;
// Agent fallback calls
agentCalls = 100;
agentCost = 100 × 0.003$ = 0.30$;
// Total
totalCost = 0.255$ + 0.30$ = 0.555$;
// vs All Agent approach
allAgentCost = 950 × 0.003$ = 2.85$;
// Savings
savings = (2.85$ - 0.555$) / 2.85$ = 80.5%
Configuration
Worker Pool Config
# config/worker-pool.yaml
workerPool:
size: 2
model: gpt-5-mini
timeout: 30000
maxRetries: 3
retryDelay: 1000
fallback:
enabled: true
model: claude-sonnet-4
threshold: 2 # Fallback after 2 failures
Model Selection
// Available models for workers
const WORKER_MODELS = {
'gpt-5-mini': {
cost: 0.0003,
contextWindow: 128000,
speed: 'fast'
},
'claude-haiku': {
cost: 0.0005,
contextWindow: 200000,
speed: 'fast'
}
};
// Available models for agents
const AGENT_MODELS = {
'claude-sonnet-4': {
cost: 0.003,
contextWindow: 200000,
speed: 'medium'
},
'claude-opus-4': {
cost: 0.015,
contextWindow: 200000,
speed: 'slow'
}
};
Best Practices
When to Use Workers
✅ Use Workers for:
- Format validation
- Data extraction
- Simple transformations
- JSON parsing/generation
- String operations
- Template filling
❌ Don't Use Workers for:
- Complex reasoning
- Multi-step analysis
- Code generation
- Architecture design
- Creative writing
- Decision making
Optimization Tips
- Tune complexity thresholds based on your use cases
- Monitor fallback rate - high rate means threshold too low
- Batch simple tasks to maximize worker utilization
- Cache common results to avoid redundant calls
- Use async/await properly to avoid blocking
File Structure
_byan/
├── workers.md (this file)
└── workers/
└── launchers/
├── README.md
├── launch-yanstaller-copilot.md
├── launch-yanstaller-claude.md
└── launch-yanstaller-codex.md
src/
└── core/
└── worker-pool/
├── worker-pool.js
├── worker.js
└── complexity-scorer.js
Related Documentation
- Worker Pool Implementation:
src/core/worker-pool/worker-pool.js - Dispatcher Logic:
src/byan-v2/dispatcher/ - Launcher Workers:
_byan/workers/launchers/README.md - Architecture:
_bmad-output/conception/01-vision-et-principes.md
Summary
Workers = Lightweight LLM agents for simple tasks
┌─────────────────────────────────────────────────┐
│ BYAN v2 Task Execution │
│ │
│ Simple Task (score < 30) │
│ → Worker Pool (gpt-5-mini, 0.0003$) │
│ │
│ Medium Task (30 ≤ score < 60) │
│ → Worker Pool with Agent fallback │
│ │
│ Complex Task (score ≥ 60) │
│ → Agent directly (claude-sonnet, 0.003$) │
│ │
│ Result: 54-80% cost reduction │
└─────────────────────────────────────────────────┘
Key Principle: Right model for right task = optimal cost/performance
Maintainer: BYAN Core Team
Version: 2.0.0
Status: ✅ Production Ready