site-mariage/_byan/workers.md
Corentin Joguet bff653acd6 first commit
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 10:30:37 +02:00

# BYAN v2 Workers - Lightweight LLM Agents
**Version:** 2.0.0
**Last Updated:** 2026-02-10
**Status:** ✅ Production Ready

---
## ⚠️ IMPORTANT: Worker vs Module
**WORKERS** = small, lightweight LLM agents (Haiku, gpt-5-mini)
**MODULES** = technical code (Context, Dispatcher, Generation, etc.)
**Do not confuse the two!**

---
## What is a Worker?
A **Worker** is a **small LLM agent** optimized for simple, repetitive tasks.
### Characteristics
```
┌────────────────────┬──────────────┬──────────────┐
│ Feature            │ Worker       │ Agent        │
├────────────────────┼──────────────┼──────────────┤
│ Model              │ Haiku/Mini   │ Sonnet/Opus  │
│ Cost               │ 0.0003$/call │ 0.003$/call  │
│ Complexity Score   │ < 30         │ ≥ 60         │
│ Task Type          │ Simple       │ Complex      │
│ Context Window     │ Small        │ Large        │
│ Response Time      │ Fast         │ Slower       │
│ Intelligence       │ Low          │ High         │
└────────────────────┴──────────────┴──────────────┘
```
### Cost Savings
```javascript
// Scenario: 100 tasks/week
// 100% Agent (Sonnet)    = 100 × 0.003$ = 0.30$
// 60% Worker + 40% Agent = (60 × 0.0003$) + (40 × 0.003$) = 0.138$
//
// Savings: 54% cost reduction
```
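The arithmetic above can be reproduced for any task count and worker share. The following is a minimal sketch; the helper names `blendedCost` and `savingsVsAllAgent` are illustrative and not part of BYAN, and the per-call prices are the ones quoted in the Worker vs Agent table.

```javascript
// Per-call prices taken from the Worker vs Agent table.
const WORKER_COST = 0.0003;
const AGENT_COST = 0.003;

/**
 * Estimated cost of `taskCount` tasks when a fraction `workerShare` (0..1)
 * runs on workers and the rest runs on agents.
 */
function blendedCost(taskCount, workerShare) {
  const workerTasks = taskCount * workerShare;
  const agentTasks = taskCount - workerTasks;
  return workerTasks * WORKER_COST + agentTasks * AGENT_COST;
}

/** Fractional saving compared with sending every task to an agent. */
function savingsVsAllAgent(taskCount, workerShare) {
  const allAgent = taskCount * AGENT_COST;
  return (allAgent - blendedCost(taskCount, workerShare)) / allAgent;
}

console.log(blendedCost(100, 0.6).toFixed(3));                // "0.138"
console.log(Math.round(savingsVsAllAgent(100, 0.6) * 100));   // 54
```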
---
## Architecture BYAN v2
### Dispatcher Rule-Based
The **Dispatcher** (a code module, NOT a worker) scores the complexity of each task and routes it accordingly:
```
┌─────────────────────────────────────────────────┐
│                  TASK ARRIVES                   │
└────────────────┬────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────┐
│            DISPATCHER (Code Module)             │
│        Complexity analysis → Score 0-100        │
└────────────────┬────────────────────────────────┘
                 │
        ┌────────┴────────┐
        │                 │
        ▼                 ▼
┌──────────────┐  ┌──────────────┐
│ Score < 30   │  │ Score ≥ 60   │
│              │  │              │
│    WORKER    │  │    AGENT     │
│   (Cheap)    │  │ (Expensive)  │
│              │  │              │
│ gpt-5-mini   │  │ claude-sonnet│
│ 0.0003$      │  │ 0.003$       │
└──────────────┘  └──────────────┘
```
### Worker Pool
```javascript
// Pool of 2 LLM workers
WorkerPool (size=2)
├── Worker #0 (idle/busy)  Model: gpt-5-mini
└── Worker #1 (idle/busy)  Model: gpt-5-mini
```
**Automatic management:**
- Allocates the next available worker
- Queues tasks when all workers are busy
- Falls back to an Agent if the worker fails
- Retries with backoff
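
The retry behaviour listed above is not spelled out in this document, so the following is only a sketch of one common approach: exponential backoff. The helper name `retryWithBackoff` and the doubling schedule are assumptions; the defaults `maxRetries: 3` and `retryDelay: 1000` are taken from the Worker Pool Config section. The `sleep` option is injectable so the helper can be exercised without real waiting.

```javascript
/**
 * Call `fn` and, if it rejects, retry up to `maxRetries` more times,
 * doubling the delay before each new attempt (assumed schedule).
 */
async function retryWithBackoff(fn, options = {}) {
  const {
    maxRetries = 3,
    retryDelay = 1000,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < maxRetries) {
        // Waits 1000, 2000, 4000 ms with the default settings
        await sleep(retryDelay * 2 ** attempt);
      }
    }
  }
  // Every attempt failed: surface the last error to the caller,
  // which may then decide to fall back to an Agent.
  throw lastError;
}
```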
---
## Worker Types
### 1. Task Workers (in the Worker Pool)
**Use:** simple, generic tasks
**Examples:**
- JSON formatting
- Structured data extraction
- Format validation
- Simple translations
- Short summaries
**Routing:**
```javascript
// The Dispatcher scores task complexity (simplified two-way view;
// the Routing Logic section adds the medium band)
if (complexityScore < 30) {
  executeWithWorker(task);
} else {
  executeWithAgent(task);
}
```
---
### 2. Launcher Workers (Platform-Specific)
**Use:** launch yanstaller on each platform
**Files:**
- `_byan/workers/launchers/launch-yanstaller-copilot.md`
- `_byan/workers/launchers/launch-yanstaller-claude.md`
- `_byan/workers/launchers/launch-yanstaller-codex.md`
**Characteristics:**
- Single task (execute `npx create-byan-agent`)
- No LLM call (just shell command)
- Ultra-light (< 5 KB)
- Idempotent
- Platform hints via env vars
**Architecture:**
```
User → Stub Agent → Launcher Worker → Yanstaller Agent
```
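The launchers themselves are Markdown files, so the snippet below is only a sketch of what the underlying shell step could look like if driven from Node.js. The only command taken from this document is `npx create-byan-agent`; the function names and the `BYAN_PLATFORM` environment variable are hypothetical stand-ins for the "platform hints via env vars" mentioned above. On Windows, `spawn` may additionally need `shell: true` to resolve `npx`.

```javascript
const { spawn } = require('child_process');

// Platforms that have a launcher file in _byan/workers/launchers/
const SUPPORTED_PLATFORMS = ['copilot', 'claude', 'codex'];

/**
 * Build the command description for one platform without running it.
 * BYAN_PLATFORM is a hypothetical variable name used for illustration.
 */
function buildLaunchSpec(platform, baseEnv = process.env) {
  if (!SUPPORTED_PLATFORMS.includes(platform)) {
    throw new Error(`Unsupported platform: ${platform}`);
  }
  return {
    command: 'npx',
    args: ['create-byan-agent'],
    env: { ...baseEnv, BYAN_PLATFORM: platform },
  };
}

/** Run the launcher: a plain shell command, no LLM call involved. */
function launch(platform) {
  const { command, args, env } = buildLaunchSpec(platform);
  return spawn(command, args, { env, stdio: 'inherit' });
}
```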
---
## Worker Pool Implementation
### Code Location
`src/core/worker-pool/worker-pool.js`
### API
```javascript
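// worker.js sits next to this file (see File Structure). The export shape,
// the Worker constructor options, and the Agent module path are assumptions.
const { Worker } = require('./worker');
const { Agent } = require('../agent'); // hypothetical path

class WorkerPool {
  /**
   * @param {number} size - Number of workers (default: 2)
   * @param {Object} options
   * @param {string} options.model - LLM model (default: 'gpt-5-mini')
   * @param {number} options.timeout - Timeout in ms (default: 30000)
   */
  constructor(size = 2, options = {}) {
    this.size = size;
    this.model = options.model || 'gpt-5-mini';
    this.timeout = options.timeout || 30000;
    // Create the workers up front; otherwise the pool would stay empty
    this.workers = Array.from(
      { length: size },
      (_, id) => new Worker({ id, model: this.model, timeout: this.timeout })
    );
    this.queue = []; // resolve callbacks of callers waiting for a worker
    this.stats = {
      total: 0,
      success: 0,
      failed: 0,
      fallbackToAgent: 0
    };
  }

  /**
   * Execute task with next available worker
   * @param {Object} task
   * @returns {Promise<Object>}
   */
  async executeTask(task) {
    this.stats.total++;
    const worker = await this.getAvailableWorker();
    try {
      const result = await worker.execute(task);
      this.stats.success++;
      return result;
    } catch (error) {
      // Fallback to Agent if configured
      if (task.fallbackToAgent) {
        return await this.fallbackToAgent(task);
      }
      this.stats.failed++;
      throw error;
    } finally {
      this.releaseWorker(worker);
    }
  }

  /**
   * Get next available worker
   * @returns {Promise<Worker>}
   */
  async getAvailableWorker() {
    // Check for idle worker
    const idle = this.workers.find(w => w.isIdle());
    if (idle) {
      idle.markBusy();
      return idle;
    }
    // Queue if all busy; releaseWorker() resolves this promise later
    return new Promise((resolve) => {
      this.queue.push(resolve);
    });
  }

  /**
   * Return a worker to the pool, or hand it to the next queued caller
   * @param {Worker} worker
   */
  releaseWorker(worker) {
    const next = this.queue.shift();
    if (next) {
      worker.markBusy(); // stays busy on behalf of the waiting caller
      next(worker);
    } else {
      worker.markIdle();
    }
  }

  /**
   * Fallback to Agent for complex tasks
   * @param {Object} task
   * @returns {Promise<Object>}
   */
  async fallbackToAgent(task) {
    console.log(`Worker failed, falling back to Agent for task ${task.id}`);
    this.stats.fallbackToAgent++;
    // Use more powerful model
    const agent = new Agent({ model: 'claude-sonnet-4' });
    return await agent.execute(task);
  }
}

module.exports = { WorkerPool };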
```
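The constructor documents a `timeout` option, but this document does not show how it is enforced. One possible approach, sketched here under that caveat, is to race the worker call against a timer with `Promise.race`. The helper name `withTimeout` is illustrative. Note that this only stops waiting for the result; it does not cancel the underlying LLM request.

```javascript
/**
 * Reject if `promise` has not settled within `ms` milliseconds.
 * The timer is always cleared so it cannot keep the process alive.
 */
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Possible use inside a pool (worker.execute is the call shown in the API):
//   const result = await withTimeout(worker.execute(task), this.timeout);
```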
---
## Complexity Scoring
### Algorithm
```javascript
/**
 * Score a task from 0 (no rule hit) to 100 (every rule hit).
 */
function calculateComplexity(task) {
  let score = 0;
  // Input length (a missing input counts as empty)
  const inputLength = (task.input || '').length;
  if (inputLength > 1000) score += 20;
  else if (inputLength > 500) score += 10;
  // Task type
  const complexTypes = ['analysis', 'reasoning', 'creation'];
  if (complexTypes.includes(task.type)) score += 30;
  // Context required
  if (task.contextSize > 5000) score += 20;
  // Multi-step
  if (task.steps && task.steps.length > 3) score += 15;
  // Output structure
  if (task.outputFormat === 'complex') score += 15;
  return score;
}
```
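When tuning thresholds it helps to see which rules contributed to a score. The sketch below restates the same rules as data so each hit can be reported. `explainComplexity`, `RULES`, and the sample task are illustrative names and values, not part of the BYAN code base.

```javascript
const COMPLEX_TYPES = ['analysis', 'reasoning', 'creation'];

// The scoring rules from the algorithm above, expressed as a table.
const RULES = [
  { label: 'input > 1000 chars', points: 20, applies: (t) => t.input.length > 1000 },
  { label: 'input 501-1000 chars', points: 10, applies: (t) => t.input.length > 500 && t.input.length <= 1000 },
  { label: 'complex task type', points: 30, applies: (t) => COMPLEX_TYPES.includes(t.type) },
  { label: 'context size > 5000', points: 20, applies: (t) => t.contextSize > 5000 },
  { label: 'more than 3 steps', points: 15, applies: (t) => Boolean(t.steps) && t.steps.length > 3 },
  { label: 'complex output format', points: 15, applies: (t) => t.outputFormat === 'complex' },
];

/** Return the total score together with the rules that produced it. */
function explainComplexity(task) {
  const hits = RULES.filter((rule) => rule.applies(task));
  const score = hits.reduce((sum, rule) => sum + rule.points, 0);
  return { score, reasons: hits.map((rule) => `${rule.label} (+${rule.points})`) };
}

const sample = {
  input: 'x'.repeat(600),
  type: 'analysis',
  contextSize: 1200,
  steps: ['load', 'parse'],
  outputFormat: 'simple',
};
console.log(explainComplexity(sample));
// score 40 = 10 (input length) + 30 (task type): the medium band
```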
### Routing Logic
```javascript
// `workerPool` and `agent` are created elsewhere and passed in.
async function dispatch(task, workerPool, agent) {
  const score = calculateComplexity(task);
  if (score < 30) {
    // Simple task → Worker (gpt-5-mini)
    return workerPool.executeTask(task);
  }
  if (score < 60) {
    // Medium task → Worker with Agent fallback
    return workerPool.executeTask({ ...task, fallbackToAgent: true });
  }
  // Complex task → Agent directly (claude-sonnet)
  return agent.execute(task);
}
```
---
## Worker Lifecycle
### States
```
IDLE → BUSY → IDLE (success)
         ↓
       FAILED → RETRY → IDLE (success)
                  ↓
          FALLBACK_TO_AGENT
```
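One way to keep workers from entering an invalid state is an explicit transition table. The table below is one reading of the state diagram above, not a confirmed specification; in particular, the `FALLBACK_TO_AGENT → IDLE` edge (the worker becoming available again after escalation) is an assumption, and the names `TRANSITIONS`, `canTransition`, and `transition` are illustrative.

```javascript
// Allowed moves between worker states (assumed from the diagram).
const TRANSITIONS = {
  IDLE: ['BUSY'],
  BUSY: ['IDLE', 'FAILED'],
  FAILED: ['RETRY'],
  RETRY: ['IDLE', 'FALLBACK_TO_AGENT'],
  FALLBACK_TO_AGENT: ['IDLE'], // assumption: the worker is freed after escalation
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}

/** Return the next state, or throw if the move is not allowed. */
function transition(from, to) {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid worker transition: ${from} -> ${to}`);
  }
  return to;
}

let state = 'IDLE';
state = transition(state, 'BUSY');
state = transition(state, 'FAILED');
state = transition(state, 'RETRY');
state = transition(state, 'IDLE');
```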
### Flow
```javascript
async function runOnWorker(pool, task) {
  // 1. Worker allocated from pool (getAvailableWorker() already marks it busy)
  const worker = await pool.getAvailableWorker();
  try {
    // 2. Worker executes task
    return await worker.callLLM(task);
  } catch (error) {
    // 3. If failed and fallback enabled, escalate to an Agent
    if (task.fallbackToAgent) {
      const agent = new Agent();
      return await agent.execute(task);
    }
    throw error;
  } finally {
    // 4. Worker released back to pool, whether the call succeeded or not
    worker.markIdle();
    pool.releaseWorker(worker);
  }
}
```
---
## Monitoring & Observability
### Metrics Tracked
```javascript
const stats = {
  total: 1000,              // Total tasks
  success: 850,             // Successful
  failed: 50,               // Failed
  fallbackToAgent: 100,     // Escalated to agent
  avgDuration: 1200,        // Average ms
  totalCost: 0.255,         // Worker-call cost in $ (agent fallbacks are added in Cost Breakdown)
  avgCostPerTask: 0.000255  // totalCost / total, in $/task
};
```
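Raw counters are easier to act on as rates. The following sketch derives them from the counters shown above; the function name `summarize` and the returned field names are illustrative.

```javascript
/**
 * Turn raw pool counters into rates between 0 and 1.
 * An empty pool (total === 0) reports 0 for every rate.
 */
function summarize(stats) {
  const rate = (count) => (stats.total === 0 ? 0 : count / stats.total);
  return {
    successRate: rate(stats.success),
    failureRate: rate(stats.failed),
    fallbackRate: rate(stats.fallbackToAgent),
  };
}

console.log(summarize({ total: 1000, success: 850, failed: 50, fallbackToAgent: 100 }));
// { successRate: 0.85, failureRate: 0.05, fallbackRate: 0.1 }
```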
### Cost Breakdown
```javascript
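const WORKER_COST = 0.0003; // $ per call
const AGENT_COST = 0.003;   // $ per call
// Worker calls
const workerCalls = 850;
const workerCost = workerCalls * WORKER_COST; // ≈ 0.255$
// Agent fallback calls (the failed worker attempt before each fallback is not counted)
const agentCalls = 100;
const agentCost = agentCalls * AGENT_COST; // ≈ 0.30$
// Total
const totalCost = workerCost + agentCost; // ≈ 0.555$
// vs All Agent approach
const allAgentCost = (workerCalls + agentCalls) * AGENT_COST; // ≈ 2.85$
// Savings
const savings = (allAgentCost - totalCost) / allAgentCost; // ≈ 0.805 (80.5%)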
```
---
## Configuration
### Worker Pool Config
```yaml
# config/worker-pool.yaml
workerPool:
  size: 2
  model: gpt-5-mini
  timeout: 30000
  maxRetries: 3
  retryDelay: 1000
  fallback:
    enabled: true
    model: claude-sonnet-4
    threshold: 2 # Fallback after 2 failures
```
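How the YAML file is loaded is not described here. Assuming it has already been parsed into a plain object, the sketch below shows one way to merge it with the defaults listed above and reject obviously invalid values. `DEFAULT_POOL_CONFIG` and `resolvePoolConfig` are illustrative names.

```javascript
// Defaults mirror the values in config/worker-pool.yaml.
const DEFAULT_POOL_CONFIG = {
  size: 2,
  model: 'gpt-5-mini',
  timeout: 30000,
  maxRetries: 3,
  retryDelay: 1000,
  fallback: { enabled: true, model: 'claude-sonnet-4', threshold: 2 },
};

/**
 * Merge user overrides into the defaults. The nested `fallback` object is
 * merged separately so a partial override does not drop the other keys.
 */
function resolvePoolConfig(overrides = {}) {
  const config = {
    ...DEFAULT_POOL_CONFIG,
    ...overrides,
    fallback: { ...DEFAULT_POOL_CONFIG.fallback, ...(overrides.fallback || {}) },
  };
  if (!Number.isInteger(config.size) || config.size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  if (!(config.timeout > 0)) {
    throw new RangeError('timeout must be a positive number of milliseconds');
  }
  return config;
}

console.log(resolvePoolConfig({ size: 4, fallback: { threshold: 3 } }));
```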
### Model Selection
```javascript
// Available models for workers
const WORKER_MODELS = {
  'gpt-5-mini': {
    cost: 0.0003,
    contextWindow: 128000,
    speed: 'fast'
  },
  'claude-haiku': {
    cost: 0.0005,
    contextWindow: 200000,
    speed: 'fast'
  }
};
// Available models for agents
const AGENT_MODELS = {
  'claude-sonnet-4': {
    cost: 0.003,
    contextWindow: 200000,
    speed: 'medium'
  },
  'claude-opus-4': {
    cost: 0.015,
    contextWindow: 200000,
    speed: 'slow'
  }
};
```
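A catalog like this can drive model choice directly. The sketch below picks the cheapest entry whose context window is large enough for the task. The `cheapestModel` helper and the local `catalog` copy are illustrative; the figures are the worker entries listed above.

```javascript
// Local copy of the worker entries, so the example runs on its own.
const catalog = {
  'gpt-5-mini': { cost: 0.0003, contextWindow: 128000, speed: 'fast' },
  'claude-haiku': { cost: 0.0005, contextWindow: 200000, speed: 'fast' },
};

/**
 * Return the name of the cheapest model that can hold `minContextWindow`
 * tokens, or null when no model in the catalog is large enough.
 */
function cheapestModel(models, minContextWindow) {
  const candidates = Object.entries(models)
    .filter(([, spec]) => spec.contextWindow >= minContextWindow)
    .sort(([, a], [, b]) => a.cost - b.cost);
  return candidates.length > 0 ? candidates[0][0] : null;
}

console.log(cheapestModel(catalog, 100000)); // gpt-5-mini
console.log(cheapestModel(catalog, 150000)); // claude-haiku
console.log(cheapestModel(catalog, 300000)); // null
```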
---
## Best Practices
### When to Use Workers
**Use Workers for:**
- Format validation
- Data extraction
- Simple transformations
- JSON parsing/generation
- String operations
- Template filling
**Don't Use Workers for:**
- Complex reasoning
- Multi-step analysis
- Code generation
- Architecture design
- Creative writing
- Decision making
### Optimization Tips
1. **Tune complexity thresholds** based on your use cases
2. **Monitor fallback rate** - a high rate means too many hard tasks are reaching workers; lower the score cutoffs or raise `fallback.threshold`
3. **Batch simple tasks** to maximize worker utilization
4. **Cache common results** to avoid redundant calls
5. **Use async/await** so that independent tasks run concurrently instead of blocking one another
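
Tip 4 can be sketched as a small wrapper around any task executor. This is an illustration only: the name `withCache`, the default key function (the JSON form of the task), and the in-memory `Map` are assumptions, and a real deployment would also need an eviction policy. Storing the promise rather than the result means concurrent identical tasks share a single call.

```javascript
/**
 * Wrap `execute` so identical tasks are only run once.
 * Failed calls are removed from the cache so they can be retried.
 */
function withCache(execute, keyOf = (task) => JSON.stringify(task)) {
  const cache = new Map();
  return function cachedExecute(task) {
    const key = keyOf(task);
    if (!cache.has(key)) {
      const pending = Promise.resolve(execute(task)).catch((error) => {
        cache.delete(key); // do not cache failures
        throw error;
      });
      cache.set(key, pending);
    }
    return cache.get(key);
  };
}

// Example with a stand-in executor that counts how often it is called.
let executions = 0;
const cachedUppercase = withCache(async (task) => {
  executions++;
  return task.input.toUpperCase();
});
```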
---
## File Structure
```
_byan/
├── workers.md (this file)
└── workers/
    └── launchers/
        ├── README.md
        ├── launch-yanstaller-copilot.md
        ├── launch-yanstaller-claude.md
        └── launch-yanstaller-codex.md
src/
└── core/
    └── worker-pool/
        ├── worker-pool.js
        ├── worker.js
        └── complexity-scorer.js
```
---
## Related Documentation
- **Worker Pool Implementation:** `src/core/worker-pool/worker-pool.js`
- **Dispatcher Logic:** `src/byan-v2/dispatcher/`
- **Launcher Workers:** `_byan/workers/launchers/README.md`
- **Architecture:** `_bmad-output/conception/01-vision-et-principes.md`
---
## Summary
**Workers = Lightweight LLM agents for simple tasks**
```
┌─────────────────────────────────────────────────┐
│ BYAN v2 Task Execution                          │
│                                                 │
│ Simple Task (score < 30)                        │
│   → Worker Pool (gpt-5-mini, 0.0003$)           │
│                                                 │
│ Medium Task (30 ≤ score < 60)                   │
│   → Worker Pool with Agent fallback             │
│                                                 │
│ Complex Task (score ≥ 60)                       │
│   → Agent directly (claude-sonnet, 0.003$)      │
│                                                 │
│ Result: 54-80% cost reduction                   │
└─────────────────────────────────────────────────┘
```
**Key Principle:** Right model for right task = optimal cost/performance

---
**Maintainer:** BYAN Core Team