feat(workflows): create 5 BYAN workflows for agent collaboration
Some checks are pending
CI / Lint bridge (Biome) (push) Waiting to run
CI / Type-check bridge (push) Blocked by required conditions
CI / Tests unit bridge (push) Blocked by required conditions
CI / Tests integration bridge (push) Blocked by required conditions
CI / Security scan (push) Waiting to run
CI / Docker build + healthcheck (push) Blocked by required conditions
Workflows (markdown playbooks) to orchestrate the 4 specialized agents:

- README.md: index + common conventions + future BYAN web integration
- build-story.md: full cycle to deliver 1 Phase 2 story (bridge-dev → bridge-tester → review → CI → deploy staging → business validation)
- sync-bidirec.md: event-driven Docmost ↔ Baserow sync (idempotence + anti-loop X-Bridge-Origin)
- release.md: semver release process (E2E staging → tag → approval → deploy prod → 30 min watch)
- incident.md: SEV1/2/3 response + blameless post-mortem + runbooks
- bump-deps.md: Dependabot PRs + major bumps + Docmost/Baserow upstream

Each workflow specifies: trigger, actors (agents + humans), ordered sequence with outputs, blocking human gates, rollback, comm templates.

Workflows = declarative playbooks for Claude main, which orchestrates the agents sequentially through the Agent tool. To be migrated later to BYAN web workflow runs once the BYAN runtime is fixed.

Complete team for formation-hub:

- 4 specialized agents (bridge-dev, bridge-tester, acadenice-devops, docmost-fork-dev)
- 5 workflows orchestrating their collaboration
This commit is contained in:
parent
b37220d432
commit
460f7effe0
6 changed files with 831 additions and 0 deletions
54
.claude/workflows/README.md
Normal file
@ -0,0 +1,54 @@
# Workflows formation-hub

Orchestration of the specialized agents (`bridge-dev`, `bridge-tester`, `acadenice-devops`, `docmost-fork-dev`) to carry out the project's recurring operations.

## How to read these workflows

Each `<name>.md` workflow describes:
- **Trigger**: event that starts the workflow
- **Sequence**: ordered steps, each with an actor (agent or human) and an expected output
- **Gates**: blocking human validation points
- **Rollback**: failure scenarios + actions
- **Outputs**: artifacts produced

## How to trigger them

**Manually**: you tell me "run WF BUILD for story S-XX" and I invoke the agents in sequence according to the workflow.

**Ideally (future)**: also create these workflows in BYAN web (`byan-bmb-workflow-builder`) to get native orchestration + run tracking. Not done yet; the current workflows are **markdown playbooks**.

## Available workflows

| Workflow | Trigger | Typical duration |
|----------|---------|------------------|
| [`build-story.md`](./build-story.md) | New Phase 2 story to deliver | 1-3 days |
| [`sync-bidirec.md`](./sync-bidirec.md) | Baserow webhook OR custom Docmost action | < 5 s per event |
| [`release.md`](./release.md) | Semver tag `v*` | 30 min + 30 min watch |
| [`incident.md`](./incident.md) | SEV1/2/3 alert detected | depends on severity |
| [`bump-deps.md`](./bump-deps.md) | Dependabot PR or manual bump | 1-2 h |

## Principles common to all workflows

- **Explicit human gates**: an agent cannot merge into main without approval from Corentin (or Yan)
- **Reproducibility**: every workflow can be tested in staging before prod
- **Traced logs**: every step logs its output (who did what, when, result)
- **Idempotence**: re-running a workflow causes no unwanted side effects
- **Documented rollback**: if step N fails, the workflow states how to revert

## Integration with BYAN web

Eventually, these workflows can be created in BYAN web:
- `byan-bmb-workflow-builder` skill to model them
- Workflow runs traced in `byan_api_workflow_runs`
- Trigger via `byan_api_workflows_run` or MCP

For now, I (Claude main) orchestrate them sequentially through the Agent tool.

## Common agent conventions

All agents follow:
- **Tao Acadenice**: direct, structured with dashes, zero emoji, solution-oriented
- **Commit conventions**: `type(scope): description` (feat/fix/docs/refactor/test/chore/ops/sec)
- **Short-lived branches**: 3 days maximum
- **Prod-like code**: tests + lint + types + security gates
- **No changes to design docs** without an explicit ADR

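The commit convention above could be checked mechanically. The sketch below is only an illustration: `isConventionalCommit` is a hypothetical helper, and the rule that a scope contains only lowercase letters, digits, and hyphens is an assumption, not something the convention states.

```typescript
// Commit types listed in the convention: type(scope): description
const COMMIT_TYPES = ["feat", "fix", "docs", "refactor", "test", "chore", "ops", "sec"] as const;

// Assumption: scope is lowercase letters, digits, and hyphens; description is non-empty.
const COMMIT_SUBJECT_PATTERN = new RegExp(
  `^(${COMMIT_TYPES.join("|")})\\(([a-z0-9-]+)\\): (\\S.*)$`,
);

// Hypothetical helper: true when the subject line follows the convention.
function isConventionalCommit(subject: string): boolean {
  return COMMIT_SUBJECT_PATTERN.test(subject);
}
```

For instance, the subject of this very commit, `feat(workflows): create 5 BYAN workflows for agent collaboration`, passes the check, while a subject such as `update stuff` does not.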
142
.claude/workflows/build-story.md
Normal file
@ -0,0 +1,142 @@
# Workflow: BUILD STORY

Workflow to deliver one Phase 2 **story** of the Fast-App plan (see `_byan-output/fast-app/formation-hub/cdcf-stories.json` and `plan.json`).

BYAN-native equivalent: Sprint Planning + FD (Feature Development) restricted to a single story.

## Trigger

- Story selected from `cdcf-stories.json` (S-01 to S-10 + future ones)
- Corentin invokes this workflow with: `WF BUILD for story S-XX`

## Actors

- **Corentin** (decision-maker)
- **bridge-dev** (business code)
- **bridge-tester** (tests + validation)
- **acadenice-devops** (deploy if a staging push is required)
- (optional) **docmost-fork-dev** if the story involves the Docmost frontend

## Sequence

```
[1] Pre-flight (Corentin)
    - Read the story (Connextra + Gherkin AC) in cdcf-stories.json
    - Identify the UCs + entities involved (see doc 11 + doc 06/07)
    - Choose branch: feat/<story-slug> from main
    - Output: story understood + branch created

[2] Code (bridge-dev)
    - Read brief + doc 19 + relevant Merise docs
    - Implement the story (adapters, domain, routes as needed)
    - Local self-test: npm test && npx biome ci . && npx tsc --noEmit
    - Incremental commits: type(scope): description
    - Output: code committed on the feat/ branch

[3] Tests (bridge-tester)
    - Read the story's Gherkin AC
    - Write Vitest unit tests (coverage >= 80% on domain)
    - Write testcontainers integration tests if an adapter changed
    - Run: npm run test:coverage
    - If coverage is more than 10% below target: alert bridge-dev
    - Output: green tests + coverage report

[4] User gate: Review (Corentin)
    - Check that the diff actually implements the story
    - Test manually where relevant (curl the new bridge endpoint)
    - 3 possible decisions:
        * APPROVED: go to [5]
        * NEEDS_REWORK: back to [2] with precise feedback
        * BLOCKED: story removed from the sprint
    - Output: decision documented in the PR description

[5] Push selfhost + GitHub (Corentin OR bridge-dev with admin override)
    - git push selfhost feat/<branch>
    - Open PR on Forgejo
    - Open the same PR on the GitHub mirror if configured
    - Output: PR opened, CI triggered automatically

[6] CI verification (acadenice-devops via CI/CD)
    - The ci.yml workflow runs:
        * Biome lint
        * tsc type-check
        * Unit + integration tests
        * Security (TruffleHog + Semgrep + npm audit)
        * Docker build healthcheck
    - If green: continue to [7]
    - If red: back to [2] with the failure logs
    - Output: CI status

[7] User gate: Merge (Corentin)
    - Check review OK (1+ approval) + CI green
    - Squash merge into main
    - Auto-delete branch
    - Output: commit on main

[8] Deploy staging (acadenice-devops via deploy-staging.yml)
    - Phase 0/1: workflow_dispatch only (not automatic)
    - Once staging is ready: automatic on push to main
    - Post-deploy smoke test
    - Output: working staging URL

[9] Business validation (Corentin + Yan + target users)
    - Test the user flow in staging
    - If OK: go to [10]
    - If KO: back to [2] with an issue or hotfix branch
    - Output: business sign-off

[10] Artifact update (bridge-dev OR Corentin)
    - Update build-state.json (story S-XX completed)
    - Update CHANGELOG.md (Unreleased section)
    - Output: artifacts up to date
```

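The coverage rule in step [3] can be read as a small decision function. This is a sketch under assumptions: `assessDomainCoverage` and the verdict names are hypothetical, and "10% below target" is interpreted as 10 percentage points; the playbook does not say whether it means points or a relative gap.

```typescript
type CoverageVerdict = "ok" | "below_target" | "alert_bridge_dev";

const DOMAIN_COVERAGE_TARGET = 80; // percent, from step [3]
const ALERT_GAP_POINTS = 10; // assumption: the "10%" gap is in percentage points

// Hypothetical helper: classifies a measured domain coverage percentage.
function assessDomainCoverage(coveragePercent: number): CoverageVerdict {
  if (coveragePercent >= DOMAIN_COVERAGE_TARGET) return "ok";
  const gap = DOMAIN_COVERAGE_TARGET - coveragePercent;
  // Only a gap strictly greater than 10 points triggers the alert to bridge-dev.
  return gap > ALERT_GAP_POINTS ? "alert_bridge_dev" : "below_target";
}
```

Under this reading, 72% is below target but stays with bridge-tester, while 65% is escalated to bridge-dev.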
## Blocking human gates

| Gate | Possible decisions | Owner |
|------|--------------------|-------|
| Review gate (4) | APPROVED / NEEDS_REWORK / BLOCKED | Corentin |
| Merge gate (7) | APPROVED / WAIT_FIX_CI / BLOCKED | Corentin |
| Business validation gate (9) | APPROVED / NEEDS_REWORK | Corentin + users |

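As an illustration of how an orchestrator might route the review gate (4), here is a minimal sketch. The type and function names are hypothetical; only the three decisions and their targets ([5], [2], story removed from the sprint) come from the sequence. The merge gate is left out because the playbook does not say where WAIT_FIX_CI leads.

```typescript
type ReviewDecision = "APPROVED" | "NEEDS_REWORK" | "BLOCKED";

type GateOutcome =
  | { kind: "goto"; step: number }
  | { kind: "story_removed" };

// Hypothetical router for the review gate (4) of BUILD STORY.
function afterReviewGate(decision: ReviewDecision): GateOutcome {
  switch (decision) {
    case "APPROVED":
      return { kind: "goto", step: 5 }; // push + open PR
    case "NEEDS_REWORK":
      return { kind: "goto", step: 2 }; // back to bridge-dev with feedback
    case "BLOCKED":
      return { kind: "story_removed" }; // story leaves the sprint
  }
}
```

Modeling the outcome as a tagged union keeps "go to a step" and "drop the story" distinct, so a caller cannot mistake a removed story for a step number.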
## Rollback

| Failure | Action |
|---------|--------|
| Step [2]: code broken locally | bridge-dev fixes, retries |
| Step [3]: tests fail | bridge-tester explains + bridge-dev fixes |
| Step [6]: CI red | acadenice-devops or bridge-dev fixes, depending on the job (lint/test/security) |
| Step [8]: staging deploy fails | acadenice-devops investigates (SSH logs + healthcheck) |
| Step [9]: business rejects | Corentin decides: minor fix (loop to [2]) or re-PRUNE the story |

## Outputs

- Branch `feat/<story-slug>` merged into main (squash)
- Tests + coverage reports
- CHANGELOG.md up to date
- build-state.json up to date (story marked completed)
- If applicable: working staging URL

## Example invocation (manual)

```
Corentin: "Run WF BUILD for S-02 (Setup Baserow tables)"

Me (Claude main):
  [1] Read S-02 in cdcf-stories.json. Check prerequisites (Baserow admin account OK).
  [2] Invoke bridge-dev:
      "Implement S-02: Baserow PERSONNE table with 16 fields + the
       heures_restantes formulas per doc 15 MPD. Branch feat/personne-table.
       Commit + push to selfhost on the feature branch."
  [3] Invoke bridge-tester:
      "Write tests for S-02. Verify table creation + field types + formulas.
       Minimum coverage 80% on the touched code."
  [4] Report to Corentin for review.
  ... etc
```

## Notes

- For all of Phase 2: this workflow runs **per story** (10+ stories in total in cdcf-stories.json)
- Estimate: 1-3 days per story depending on complexity (see `expected_loops` in plan.json)
- The user may choose to chain several stories without intermediate gates when confidence is high
152
.claude/workflows/bump-deps.md
Normal file
@ -0,0 +1,152 @@
# Workflow: BUMP DEPENDENCIES

Process for updating dependencies (Dependabot PRs, manual bumps, CVE security fixes).

## Trigger

One of the following:
- Automatic Dependabot PR (configured in `.github/dependabot.yml`)
- GitHub Security CVE alert
- Manual bump decided (e.g. moving Docmost from v0.8.x to v0.9.x)
- Monthly review cron (Corentin on call)

## Actors

- **acadenice-devops** (orchestrator)
- **bridge-tester** (post-bump validation)
- **bridge-dev** (fix if a dependency introduces a breaking change)
- **Corentin** (decision-maker on major bumps)

## Bump categories

| Type | Frequency | Process |
|------|-----------|---------|
| **Security patch** (CVE high/critical) | ASAP | Automatic Dependabot + auto-merge if CI green |
| **Patch** (1.2.3 → 1.2.4) | Weekly | Automatic Dependabot + 5 min review + merge |
| **Minor** (1.2.x → 1.3.0) | Weekly | Automatic Dependabot + review + tests + merge |
| **Major** (1.x.x → 2.0.0) | Manual | Dedicated feat branch, exhaustive testing, Corentin's decision |
| **Docmost upstream** | Monthly or on signal from Yan/Corentin | Fork-specific process (see docmost-fork-dev) |
| **Baserow upstream** | Monthly or on a relevant changelog | Pin new version, test compose, deploy staging |
| **Postgres major** | Yearly at most, planned | Mandatory backup + migration test + restore + careful deploy |
| **Node LTS** | Every 2 years (LTS change) | Exhaustive bridge testing, possible refactor |

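The patch / minor / major rows of the table can be derived from two version strings. The sketch below is illustrative only: `classifyBump` and `requiresManualProcess` are hypothetical names, and the parser deliberately accepts plain `X.Y.Z` versions only (no pre-release or build suffixes).

```typescript
type BumpKind = "major" | "minor" | "patch" | "none";

// Parses "1.2.3" or "v1.2.3"; pre-release/build suffixes are out of scope for this sketch.
function parseSemverTriple(version: string): [number, number, number] {
  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`Not a plain semver version: ${version}`);
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Hypothetical helper: which component changes first between two versions.
function classifyBump(from: string, to: string): BumpKind {
  const [fromMajor, fromMinor, fromPatch] = parseSemverTriple(from);
  const [toMajor, toMinor, toPatch] = parseSemverTriple(to);
  if (toMajor !== fromMajor) return "major";
  if (toMinor !== fromMinor) return "minor";
  if (toPatch !== fromPatch) return "patch";
  return "none";
}

// Per the table, only major bumps leave the automatic Dependabot path.
function requiresManualProcess(kind: BumpKind): boolean {
  return kind === "major";
}
```

Note that this compares components for inequality, so a downgrade is classified the same way as an upgrade; a real check would probably want to reject downgrades explicitly.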
## Sequence: Patch / Minor (automatic Dependabot)

```
[1] Dependabot PR created (automatic, weekly, Monday 06:00)
    - Configured in .github/dependabot.yml
    - PR with package changelog + diff
    - Output: PR opened on Forgejo + GitHub mirror

[2] CI runs automatically
    - ci.yml workflow runs on the PR
    - Tests + lint + security scan + docker build
    - Output: CI status

[3] Human review (Corentin, 5-10 min)
    - Read the package changelog
    - Check potential impact
    - If new type / breaking: check tests
    - Output: decision merge / hold / close

[4] If CI green + review OK: merge (squash)
    - Auto-delete branch
    - Output: commit on main

[5] Automatic staging deploy (deploy-staging.yml workflow)
    - Phase 0/1: workflow_dispatch only
    - Phase 2+: automatic on push to main
    - Output: working staging, or alert on failure
```

## Sequence: Major (manual)

```
[1] Decision (Corentin)
    - Read the package's changelog and official upgrade guide
    - Identify breaking changes
    - Decide: bump now or wait
    - Output: go/no-go

[2] Feat branch (bridge-dev)
    - feat/bump-<package>-vX.Y
    - Bump in package.json
    - npm install + commit lockfile
    - Output: branch with the bump

[3] Code migration (bridge-dev)
    - Adapt the code to the breaking changes
    - Run tests: npm test
    - Iterate on fixes until green
    - Output: adapted code

[4] Exhaustive tests (bridge-tester)
    - Run unit + integration: npm test
    - Run E2E on staging if Phase 2.3+
    - Check coverage is maintained (>= 80% on domain)
    - Output: test report

[5] Staging validation (Corentin)
    - Deploy staging
    - Test critical flows
    - Output: staging sign-off

[6] PR + merge (see workflow build-story.md, steps [4]-[7])

[7] Deploy prod (see workflow release.md)
    - Follows the standard release process with watch period
    - Output: prod deployed
```

## Sequence: Docmost / Baserow upstream

```
[1] Detect new version (Corentin via GitHub release watch)
[2] Read the official release notes
[3] Test on a cloned env: pull image + restore data backup → smoke
[4] If OK: update compose.yml or Dockerfile.fork
[5] Standard release process (see release.md)
[6] If KO: report upstream (issue) or wait for the next release
```

See the BYAN workflow `docker-stack-safe-upgrade` (id `75abc7aa-8ba7-47ce-b6b8-bf5573e82f62`) for stateful bumps in prod (12 phases with gates).

## Human gates

| Gate | Decision | Owner |
|------|----------|-------|
| Dependabot PR review (3) | merge / hold / close | Corentin |
| Major decision (1) | go / no-go | Corentin |
| Staging validation (5) | OK / SEND BACK | Corentin |

## Rollback / error handling

| Scenario | Action |
|----------|--------|
| CI red on a Dependabot PR | hold the PR, analyze logs, decide to fix or close |
| Major bump introduces a regression not caught by CI | rollback (revert commit + redeploy) + add a regression test |
| Docmost upgrade breaks data | restore pre-upgrade backup + downgrade image + investigate |

## Frequency and planning

- **Monday morning**: review Dependabot PRs (15-30 min, Corentin)
- **1st of the month**: security alerts audit + capacity planning + DR test
- **Quarterly**: review possible major bumps (Node, Postgres, Hono, Tiptap, etc.)

## Outputs

- package.json + lock file up to date
- CI green post-bump
- Tests + coverage maintained
- CHANGELOG.md updated if user-facing
- For a major bump: internal migration doc in `docs/migrations/<package>-vX.md`

## Notes

- Dependabot configured in `.github/dependabot.yml` (already done):
  * npm ecosystem (bridge/): weekly
  * github-actions ecosystem: weekly
  * docker ecosystem (compose): weekly
- Dependabot open PR limit: 10 max (avoid spam)
- Group production-deps and dev-deps separately
- **No prod bump on Friday** (tradition + same reason as for releases)

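The "no prod bump on Friday" rule is simple enough to encode as a guard in a deploy script. A minimal sketch, assuming the check runs in the team's local timezone; `isProdBumpAllowed` is a hypothetical name.

```typescript
const FRIDAY = 5; // Date.prototype.getDay(): 0 = Sunday ... 6 = Saturday

// Hypothetical guard: false on Fridays, evaluated in the machine's local timezone.
function isProdBumpAllowed(day: Date): boolean {
  return day.getDay() !== FRIDAY;
}
```

If the CI runner uses UTC while the team works in another timezone, the guard would need an explicit timezone conversion; that is left out here.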
193
.claude/workflows/incident.md
Normal file
@ -0,0 +1,193 @@
# Workflow: INCIDENT RESPONSE

Process for handling production incidents. See doc 18 section 9.

## Trigger

One of the following:
- Automatic alert (UptimeRobot, monitoring, failed healthcheck)
- User report (Slack, email, ticket)
- Abnormal logs detected

## Severities

| Level | Definition | Target response |
|-------|------------|-----------------|
| **SEV1 (CRITICAL)** | Complete service outage or data loss in progress | < 15 min |
| **SEV2 (WARNING)** | Major degradation, part of the service unavailable, data loss avoided | < 4 business hours |
| **SEV3 (INFO)** | Isolated bug, workaround possible | < 24 business hours |

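The severity table maps naturally onto a lookup structure. In this sketch the names (`RESPONSE_TARGETS`, `isWithinTarget`) are hypothetical, and treating SEV1 as wall-clock time rather than business hours is an assumption; the table only marks SEV2 and SEV3 as business hours.

```typescript
type Severity = "SEV1" | "SEV2" | "SEV3";

interface ResponseTarget {
  maxMinutes: number;
  businessHoursOnly: boolean;
}

// Targets from the severity table. Assumption: SEV1 is counted in wall-clock minutes.
const RESPONSE_TARGETS: Record<Severity, ResponseTarget> = {
  SEV1: { maxMinutes: 15, businessHoursOnly: false },
  SEV2: { maxMinutes: 4 * 60, businessHoursOnly: true },
  SEV3: { maxMinutes: 24 * 60, businessHoursOnly: true },
};

// Hypothetical check. For SEV2/SEV3 the caller is expected to pass elapsed business minutes.
function isWithinTarget(severity: Severity, elapsedMinutes: number): boolean {
  return elapsedMinutes < RESPONSE_TARGETS[severity].maxMinutes;
}
```

Computing elapsed business minutes (working days, opening hours) is intentionally left to the caller, since the playbook does not define the business calendar.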
## Actors

- **Corentin** (primary on-call)
- **Yan** (backup on-call)
- **acadenice-devops** (investigation + restore)
- **bridge-dev** (if code bug)
- **bridge-tester** (post-fix regression test)

## Sequence: SEV1 (service down)

```
[1] DETECT (automatic or manual)
    - UptimeRobot/Slack/email alert
    - Confirm the scope: what is down, since when, what has been lost
    - Output: situation understood

[2] TRIAGE (15 min, Corentin on call)
    - Severity confirmed as SEV1?
    - Notify Yan + Ludo if data loss
    - Announce in the #ops channel + status banner if user-facing:
      "[SEV1] formation-hub - investigating, ETA <unknown>"
    - Output: team alerted

[3] INVESTIGATE (acadenice-devops)
    - Check containers: docker compose ps
    - Check healthcheck: ./scripts/healthcheck.sh
    - Check logs: docker compose logs --tail=200 <service>
    - Check metrics: CPU, memory, disk
    - Check deps: Postgres, Redis reachable?
    - Output: root cause identified or strong hypothesis

[4] MITIGATE (acadenice-devops + bridge-dev if code)
    - Depending on root cause:
        * Service down: restart container, check resources
        * DB corruption: restore a recent backup
        * Code bug: roll back to the previous version (see release.md)
        * Compromise: rotate secrets, isolate env
        * Disk full: clean up logs/backups, upsize
    - Output: service restored

[5] VERIFY (Corentin + acadenice-devops)
    - Full healthcheck: 4/4 OK
    - Smoke test: ./scripts/smoke-test.sh
    - Test a real user flow
    - Output: confirmation that prod is restored

[6] COMMUNICATE (Corentin)
    - Slack/Teams: "[SEV1 RESOLVED] formation-hub - back online. Cause: ..."
    - Email to all if data loss: RGPD (GDPR) compliance
    - Update status banner: removed
    - Output: team and users informed

[7] POST-MORTEM (within 7 days, Corentin + Yan)
    - Create doc: docs/post-mortems/YYYY-MM-DD-<title>.md
    - Blameless format (focus on the system, not the person)
    - Sections: Timeline / Impact / Root cause / AI / Lessons learned
    - Action items (AI): owner + due date
    - Share with the team
    - Update runbooks if the pattern recurs
    - Output: post-mortem published + AIs opened
```

## Sequence: SEV2 (degradation)

Same as SEV1 but without the < 15 min urgency. Target response: 4 business hours. No email-to-all announcement unless user-facing.

## Sequence: SEV3 (isolated bug)

```
[1] Triage via a GitHub/Forgejo issue with label `bug` + severity `low`
[2] Assign to bridge-dev for a fix in the next release
[3] If a workaround exists: document it in the ticket
[4] No post-mortem (unless the pattern recurs)
```

## Comm template SEV1/2 during an incident

```
[SEV1] formation-hub - Service degraded
Symptom: <what exactly>
Started: <when>
Investigating: <where we are>
ETA: <estimated restore time or "investigating">
Channel: #ops
```

Update at least every 30 min.

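An agent posting these updates could fill the template from structured data. A minimal sketch: the `IncidentUpdate` shape and `renderIncidentUpdate` are hypothetical, and only the line layout comes from the template above.

```typescript
interface IncidentUpdate {
  severity: "SEV1" | "SEV2";
  symptom: string;
  startedAt: string; // free-form, e.g. "14:05 local"
  investigating: string;
  eta?: string; // omitted while no estimate exists
}

// Hypothetical renderer for the in-incident comm template.
function renderIncidentUpdate(update: IncidentUpdate): string {
  return [
    `[${update.severity}] formation-hub - Service degraded`,
    `Symptom: ${update.symptom}`,
    `Started: ${update.startedAt}`,
    `Investigating: ${update.investigating}`,
    `ETA: ${update.eta ?? "investigating"}`, // fall back to the template's default wording
    "Channel: #ops",
  ].join("\n");
}
```

Keeping the ETA optional mirrors the template, which allows "investigating" when no restore estimate is available yet.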
## Comm template SEV1 resolved

```
[SEV1 RESOLVED] formation-hub - back online
Duration down: <X>h<Y>m
Root cause: <one-liner>
Impact: <users affected, data loss yes/no>
Post-mortem: docs/post-mortems/YYYY-MM-DD-<title>.md (published within 7 days)
```

## Post-mortem template

`docs/post-mortems/YYYY-MM-DD-<incident-title>.md`:

```markdown
# Post-mortem: <incident title>

## Timeline (local time)
- HH:MM detection
- HH:MM triage
- HH:MM mitigation start
- HH:MM service restored
- HH:MM root cause confirmed

## Impact
- Downtime duration: Xh Ym
- Users impacted: Y
- Data loss: yes/no; if yes, how much and what
- Estimated cost: XX€ (if quantifiable)

## Root cause
<one paragraph: what broke + why>

## Why didn't our monitoring alert sooner?
<honest analysis - detection blind spot?>

## Action items
- [ ] AI 1: <description> (owner @who, due YYYY-MM-DD)
- [ ] AI 2: ...

## Lessons learned
<what to remember to avoid recurrence>

## Blameless statement
This incident is not one person's fault. It reflects missing system guardrails. The AIs above aim to add those guardrails.
```

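The post-mortem file name follows a fixed pattern, so it can be generated rather than typed. A sketch under assumptions: `postMortemPath` and `slugifyIncidentTitle` are hypothetical, the date is taken in UTC, and the slug rule (lowercase ASCII, hyphen-separated) is a choice made for this example rather than a documented convention.

```typescript
// Assumption: titles are reduced to lowercase ASCII letters/digits joined by hyphens.
function slugifyIncidentTitle(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything else into a single hyphen
    .replace(/^-+|-+$/g, ""); // trim leading/trailing hyphens
}

// Hypothetical helper: builds docs/post-mortems/YYYY-MM-DD-<title>.md (date in UTC).
function postMortemPath(incidentDate: Date, title: string): string {
  const isoDay = incidentDate.toISOString().slice(0, 10);
  return `docs/post-mortems/${isoDay}-${slugifyIncidentTitle(title)}.md`;
}
```

The template's timeline uses local time, so a team far from UTC may prefer to derive the date from local components instead.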
## Related runbooks (to create in Phase 1)

In `docs/runbooks/`:
- `runbook-docmost-down.md`
- `runbook-baserow-down.md`
- `runbook-disk-full.md`
- `runbook-postgres-corrupted.md`
- `runbook-restore-from-backup.md`
- `runbook-rotate-secrets.md`

Runbook format:
```
# Runbook: <INCIDENT_TYPE>
## Symptoms
## Diagnosis (steps)
## Resolution (steps)
## Future prevention
## Rollback / escalation
```

## On-call rotation

Phase 0/1: **Corentin = primary on-call**, Yan = backup.

If we hire in the future:
- Weekly rotation
- Weekly handoff with recap
- On-call compensation (day off or bonus)

## Limits

- No strict SLA for Phase 1 (internal tool, not 24/7 critical). Best effort.
- No public status page in Phase 1 (internal Slack updates are enough).
- Phase 3+: if we open the tool to external clients, consider an SLA + status page.

## Notes

- After a SEV1/2 incident: update doc 18 section 6 (runbooks) if a pattern is detected
- After 3 similar incidents within 1 month: strategic escalation (architecture refactor, additional resources, etc.)

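The "3 similar incidents within 1 month" rule could be evaluated over an incident log. This is a sketch only: the `IncidentRecord` shape and `needsStrategicEscalation` are hypothetical, "similar" is reduced to an identical pattern label, and "1 month" is approximated as a rolling 30-day window.

```typescript
interface IncidentRecord {
  patternLabel: string; // assumption: similar incidents share the same label
  occurredAt: Date;
}

const ESCALATION_THRESHOLD = 3;
const ESCALATION_WINDOW_DAYS = 30; // assumption: "1 month" as a rolling 30 days

// Hypothetical check: true when enough same-pattern incidents fall inside the window.
function needsStrategicEscalation(
  incidents: IncidentRecord[],
  patternLabel: string,
  now: Date,
): boolean {
  const windowStart = now.getTime() - ESCALATION_WINDOW_DAYS * 24 * 60 * 60 * 1000;
  const recent = incidents.filter(
    (incident) =>
      incident.patternLabel === patternLabel &&
      incident.occurredAt.getTime() >= windowStart &&
      incident.occurredAt.getTime() <= now.getTime(),
  );
  return recent.length >= ESCALATION_THRESHOLD;
}
```

Passing `now` explicitly keeps the function deterministic, which makes it straightforward to unit-test.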
146
.claude/workflows/release.md
Normal file
@ -0,0 +1,146 @@
# Workflow: RELEASE PROD

Semver release process to production. See doc 17 section 4.

## Trigger

- A set of stories merged into main, ready for prod
- Corentin decides "we release" → semver tag

## Actors

- **Corentin** (decision-maker + approver)
- **bridge-tester** (E2E staging validation)
- **acadenice-devops** (deploy + watch + rollback if needed)
- (optional) **Yan** (backup approver for prod deploy)

## Prerequisites

- All CI runs on main = green
- Staging E2E tests = green
- Recent backup (< 24h) verified
- No critical business window (classes in session, hours-entry deadline)

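The backup prerequisite is a simple age check. A minimal sketch, assuming the backup timestamp is available as a `Date`; `isBackupFresh` is a hypothetical name, and how the timestamp is obtained (file mtime, backup tool metadata) is outside this example.

```typescript
const MAX_BACKUP_AGE_HOURS = 24; // from the prerequisite "Recent backup (< 24h)"

// Hypothetical check: true when the backup is strictly younger than 24 hours.
// A timestamp in the future is treated as invalid rather than fresh.
function isBackupFresh(backupTakenAt: Date, now: Date): boolean {
  const ageMs = now.getTime() - backupTakenAt.getTime();
  return ageMs >= 0 && ageMs < MAX_BACKUP_AGE_HOURS * 60 * 60 * 1000;
}
```

Rejecting future timestamps guards against clock skew between the backup host and the machine running the check.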
## Sequence

```
[1] Release decision (Corentin)
    - List commits on main since the last release: git log v<last>..HEAD --oneline
    - Decide release type: MAJOR / MINOR / PATCH (semver)
    - Output: decision + target version

[2] Update CHANGELOG.md (Corentin OR bridge-dev assist)
    - Move the [Unreleased] section into a new [vX.Y.Z] section
    - Add the date
    - Check that all entries are present
    - Commit: `docs(changelog): release vX.Y.Z`
    - Output: CHANGELOG.md up to date

[3] E2E tests staging (bridge-tester via CI)
    - Trigger: a push to main already runs the automatic staging deploy
    - Check: e2e.yml workflow passes (Playwright against the staging URL)
    - If it fails: fix before releasing
    - Output: E2E status

[4] Manual staging validation (Corentin)
    - Test a few critical flows on the staging URL:
        * Docmost login
        * Page creation + share link
        * Entry of completed hours (UC-13)
        * Project + task creation (UCA-02 + UCA-03)
    - Output: staging sign-off

[5] Backup verification (acadenice-devops)
    - Check that a backup less than 24h old exists
    - Optional: trigger an ad-hoc backup before the prod deploy
    - Output: backup verified

[6] Semver tag + push (Corentin)
    - git tag -a vX.Y.Z -m "Release vX.Y.Z - <one-liner>"
    - git push origin vX.Y.Z (or push selfhost, depending on the source of truth)
    - Trigger: the deploy-prod.yml workflow starts
    - Output: prod tag created

[7] Approval review (Yan or Corentin)
    - GitHub UI: the 'production' environment asks for a required reviewer
    - Approve in the GitHub Actions UI
    - Output: approval recorded

[8] Prod deploy runs (acadenice-devops via deploy-prod.yml)
    - SSH to prod host
    - git checkout vX.Y.Z
    - docker compose -f compose.yml -f compose.prod.yml pull
    - docker compose -f compose.yml -f compose.prod.yml up -d
    - Post-deploy healthcheck
    - Output: prod deployed

[9] Prod smoke tests (acadenice-devops + script)
    - Run scripts/smoke-test.sh against PROD_URL
    - Check 3 critical endpoints
    - Output: smoke OK / KO

[10] Watch period (Corentin + acadenice-devops, 30 min)
    - Watch container logs: docker compose logs -f --tail=200
    - Watch monitoring: UptimeRobot + (Phase 3+) Prometheus/Grafana
    - Watch user entries: no sudden drop?
    - Output: 30 min green, or alert

[11] Release announcement (Corentin)
    - Internal Slack/Teams: "Released vX.Y.Z. Highlights: ..."
    - Update the GitHub/Forgejo release notes from the CHANGELOG.md commit
    - Output: team informed

[12] If all OK: RELEASE COMPLETE
    - Notify ops: new version live
    - If KO: trigger the rollback WF (see incident.md)
```

## Blocking human gates

| Gate | Decision | Owner |
|------|----------|-------|
| Manual staging validation (4) | OK / SEND BACK FOR FIX | Corentin |
| Semver tag (6) | release or abort | Corentin |
| Prod approval (7) | APPROVE / DENY | Yan or Corentin (manual, GitHub UI) |
| Watch period (10) | all OK / rollback | Corentin |

## Rollback (on failure)

See doc 17 section 6 + workflow `incident.md`:

| Scenario | Action |
|----------|--------|
| Healthcheck KO post-deploy | Redeploy the previous version: `git tag vX.Y.Z-rollback v<previous> && git push --tags` → triggers deploy-prod.yml |
| Critical bug found during the watch period | Same: automatic rollback to the stable version |
| Schema migration breaks rollups | Restore pre-deploy Postgres backup + redeploy stable version |
| Credentials compromised post-deploy | Rotate secrets + redeploy + audit logs |

## Outputs
|
||||||
|
|
||||||
|
- Tag semver cree sur main
|
||||||
|
- Image Docker tagged + pushed registry
|
||||||
|
- Prod deployee + verifiee
|
||||||
|
- CHANGELOG release section publiee
|
||||||
|
- Notification equipe envoyee
|
||||||
|
- (Si rollback) post-mortem dans `docs/post-mortems/YYYY-MM-DD-<title>.md`

## Semver convention (reminder)

| Type | When | Example |
|------|------|---------|
| MAJOR | Breaking change (forced data migration, API break) | v1.x.x → v2.0.0 |
| MINOR | New backward-compatible feature | v1.2.x → v1.3.0 |
| PATCH | Bug fix / security fix | v1.2.3 → v1.2.4 |
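
The table above can be turned into a small helper that computes the next tag. This is only a sketch: the name `nextVersion` and the `Bump` type are hypothetical, and it assumes plain `vMAJOR.MINOR.PATCH` tags without pre-release suffixes.

```typescript
// Hypothetical helper, not part of the repository: computes the next
// release tag from the current one and the kind of change being shipped.
type Bump = "major" | "minor" | "patch";

function nextVersion(current: string, bump: Bump): string {
  // Only plain tags such as "v1.2.3" are accepted (assumption).
  const match = /^v(\d+)\.(\d+)\.(\d+)$/.exec(current);
  if (!match) {
    throw new Error(`Not a vMAJOR.MINOR.PATCH tag: ${current}`);
  }
  const major = Number(match[1]);
  const minor = Number(match[2]);
  const patch = Number(match[3]);

  switch (bump) {
    case "major":
      // Breaking change: lower components reset to zero.
      return `v${major + 1}.0.0`;
    case "minor":
      // Backward-compatible feature: patch resets to zero.
      return `v${major}.${minor + 1}.0`;
    case "patch":
      // Bug fix or security fix.
      return `v${major}.${minor}.${patch + 1}`;
  }
}
```

For example, `nextVersion("v1.2.3", "minor")` yields `v1.3.0`, matching the MINOR row.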

## Release frequency

- **Phase 1 vanilla**: initial release v0.1.0 once Phase 1 is stable and has been in use for 1 week
- **Phase 2 bridge**: releases v0.2.x → v0.9.x as stories are validated
- **Phase 3 maturity**: v1.0.0 once bidirectional backlinks + dual-mode editor + MCP server are delivered
- **Phase 4+**: at least one release per month

## Notes

- **No release on Friday evening** (tech tradition: avoid having to fix things over the weekend)
- **No release during an Acadenice maintenance window** (classes in session, etc.)
- If urgent in prod: hotfix branch from the stable tag, patch micro-release (e.g. v1.2.4 → v1.2.5)
144
.claude/workflows/sync-bidirec.md
Normal file
@ -0,0 +1,144 @@
# Workflow: BIDIRECTIONAL SYNC Docmost ↔ Baserow

Orchestrates bidirectional synchronization between Docmost (wiki) and Baserow (databases). Phase 2: requires the bridge service to be deployed and operational.

BYAN-native equivalent: event-driven workflow with idempotence.

## Trigger

Any of the following:
- Baserow webhook `row.created` / `row.updated` / `row.deleted` on a given table
- Docmost webhook `page.created` (if configured on the custom Docmost side)
- Explicit admin action: "Forced sync of project 42 → Docmost"
- Periodic reconciliation cron (Phase 3+)

## Actors

- **bridge-dev** (webhook handler + sync logic)
- **acadenice-devops** (webhook config + monitoring)
- **bridge-tester** (idempotence + anti-loop validation)
- **Corentin** (alerted on capacity overrun)

## Sequence: typical Baserow row.created webhook on the 'projet' table

```
[1] Webhook received (bridge endpoint POST /api/webhooks/baserow/projet-changed)
    - Verify the X-Baserow-Signature HMAC signature (anti-spoofing)
    - If invalid: log + 401, ABORT
    - Output: valid event

[2] Idempotence check (bridge + Redis)
    - Read event_id from the payload
    - Redis: SET bridge:webhook:event:<event_id> "1" EX 86400 NX
    - If SET returns null (key already existed): event already processed, ABORT with 200
    - Otherwise: continue
    - Output: new event, marked as processed

[3] Anti-loop check
    - Check the X-Bridge-Origin header on the Baserow row
    - If X-Bridge-Origin == "bridge": the bridge created the row itself, ABORT
    - Otherwise: a user created it, continue
    - Output: legitimate event source

[4] Business logic (bridge service)
    - For 'row.created' on 'projet':
        * Fetch project details from Baserow (BaserowClient.getRow)
        * Fetch the linked client (BaserowClient.getRow)
        * Compute the Docmost page name: "Projet [nom] - [client]"
        * Determine the target space: "Agence" → fetch space ID
    - Output: payload for Docmost creation

[5] Docmost action (bridge service via DocmostClient)
    - DocmostClient.createPage({ spaceId, title, content: template_projet(projet) })
    - Header: X-Bridge-Origin: bridge (prevents a future loop)
    - Output: pageId of the created Docmost page

[6] Update Baserow row (bridge service)
    - BaserowClient.updateRow(projet_id, { docmost_page_id: pageId })
    - Header: X-Bridge-Origin: bridge
    - Output: Baserow project enriched with docmost_page_id

[7] Cache invalidation (bridge + Redis)
    - RedisCache.invalidatePattern("bridge:projet:*")
    - RedisCache.invalidatePattern("bridge:client:<id>:projets")
    - Output: caches invalidated

[8] Notification if trainer capacity is exceeded (attribution case)
    - If event = creation of an 'attribution':
        * Recompute Personne.heures_restantes_total
        * If < 0: notify admin via SMTP/Slack
    - Output: notification sent on overrun

[9] Audit log
    - Structured log: { event_id, source: 'baserow', target: 'docmost', action: 'createPage', success: true, duration_ms, ... }
    - Output: trace persisted

[10] Webhook response
    - Return 200 OK { processed: true, page_id: <docmost_page_id> }
```
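
Steps [2] and [3] of the sequence can be sketched as a single guard. This is a minimal illustration under stated assumptions: `EventStore`, `InMemoryEventStore`, `IncomingEvent`, and `guardEvent` are hypothetical names, and the in-memory store only stands in for Redis so the sketch runs without a server. With Redis, `setIfAbsent` would correspond to `SET <key> "1" EX 86400 NX`.

```typescript
// Hypothetical abstraction over Redis `SET key "1" EX <ttl> NX`.
interface EventStore {
  // Resolves to true if the key was newly set, false if it already existed.
  setIfAbsent(key: string, ttlSeconds: number): Promise<boolean>;
}

// In-memory stand-in for Redis (assumption, for illustration only).
class InMemoryEventStore implements EventStore {
  private readonly expiries = new Map<string, number>();
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  async setIfAbsent(key: string, ttlSeconds: number): Promise<boolean> {
    const expiresAt = this.expiries.get(key);
    if (expiresAt !== undefined && expiresAt > this.now()) {
      return false; // key still alive: this event was already seen
    }
    this.expiries.set(key, this.now() + ttlSeconds * 1000);
    return true;
  }
}

type GuardResult = "process" | "duplicate" | "bridge-origin";

interface IncomingEvent {
  eventId: string;
  // Value of the X-Bridge-Origin marker, if the handler can read one.
  bridgeOrigin?: string;
}

async function guardEvent(
  store: EventStore,
  event: IncomingEvent,
): Promise<GuardResult> {
  // Step [2]: the first delivery wins; repeats of the same event_id are dropped.
  const isNew = await store.setIfAbsent(
    `bridge:webhook:event:${event.eventId}`,
    86400, // 24 h TTL, as in the sequence above
  );
  if (!isNew) return "duplicate";

  // Step [3]: ignore events caused by the bridge's own writes.
  if (event.bridgeOrigin === "bridge") return "bridge-origin";

  return "process";
}
```

A handler would return 200 immediately for `"duplicate"` and `"bridge-origin"`, and continue to step [4] only for `"process"`.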

## Event-specific patterns

| Trigger | Sync action |
|---------|-------------|
| Baserow row.created on `projet` | Auto-create a Docmost page in the Agence space |
| Baserow row.created on `formation` | Auto-create a Docmost collection (sub-pages per block) |
| Baserow row.updated on `projet`/`formation` (title, status) | Update the title/icon of the linked Docmost page |
| Baserow row.created on `intervention` | Check capacity → notify admin on overrun |
| Baserow row.created on `attribution` | Notify trainer (email) + check capacity |
| Docmost page.created (specific 'compte-rendu' template) | Auto-create a row in the Baserow `comptes_rendus` table (Phase 3+) |
| Docmost share.created | Audit log + notify admin (data leak risk alert) |
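
One way to keep this table maintainable in code is a routing map from event key to actions. The sketch below is illustrative only: `SyncEvent`, `routes`, `routeEvent`, and every action name are hypothetical, and the template-specific Docmost `page.created` case (Phase 3+) is left out because it needs an extra template check.

```typescript
// Hypothetical event shape; the real webhook payloads may differ.
interface SyncEvent {
  source: "baserow" | "docmost";
  type: string; // e.g. "row.created"
  table?: string; // Baserow table name; absent for Docmost events
}

// Routing table mirroring the rows above. Action names are placeholders
// for handlers that would live in the bridge service.
const routes = new Map<string, string[]>([
  ["baserow:row.created:projet", ["createDocmostPage"]],
  ["baserow:row.created:formation", ["createDocmostCollection"]],
  ["baserow:row.updated:projet", ["updateDocmostPage"]],
  ["baserow:row.updated:formation", ["updateDocmostPage"]],
  ["baserow:row.created:intervention", ["checkCapacity"]],
  ["baserow:row.created:attribution", ["notifyTrainer", "checkCapacity"]],
  ["docmost:share.created", ["auditLog", "notifyAdmin"]],
]);

function routeEvent(event: SyncEvent): string[] {
  const key = event.table
    ? `${event.source}:${event.type}:${event.table}`
    : `${event.source}:${event.type}`;
  // Events without a route are ignored rather than treated as errors.
  return routes.get(key) ?? [];
}
```

Adding a new pattern then means adding one entry to `routes`, without touching the webhook handler itself.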

## Human gates

No blocking gate: this workflow is event-driven and runs in real time. However:
- Corentin is notified on capacity overrun (asynchronous)
- Corentin is notified on critical errors (sync still failing after 3 retries)

## Rollback / error handling

| Failure | Strategy |
|---------|----------|
| Docmost API down | Retry 3x with exponential backoff. If still KO: push to a Redis queue for batch retry |
| Baserow row not found (race condition) | Retry the fetch 2x with a 200 ms delay. Otherwise: log + skip event |
| Cache invalidation failure | Log warning, continue (5 min TTL fallback) |
| SMTP notification failure | Log warning, degraded alert |
| Loop detected (X-Bridge-Origin missing on bridge writes) | URGENT: alert Corentin, audit bridge code |
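
The "retry 3x with exponential backoff" strategy could look like the sketch below. Several points are assumptions: "3x" is read as three retries after the initial attempt, the 500 ms base delay is arbitrary, and `withRetry` and `Sleep` are hypothetical names. The sleep function is injectable so the behavior can be tested without waiting.

```typescript
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `operation` once, then retries up to `retries` times,
// doubling the delay before each new attempt.
async function withRetry<T>(
  operation: () => Promise<T>,
  retries: number = 3,
  baseDelayMs: number = 500, // assumed base delay
  sleep: Sleep = realSleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        // Delays with the defaults: 500 ms, 1000 ms, 2000 ms.
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  // All attempts failed: the caller can enqueue the event for batch retry.
  throw lastError;
}
```

A caller would wrap the Docmost call, for example `withRetry(() => docmost.createPage(payload))`, and push the event to the retry queue in its `catch` block.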

## Anti-loop strategy (CRITICAL)

To prevent an infinite Docmost → bridge → Baserow → bridge → Docmost → ... loop:

1. **X-Bridge-Origin header**: every bridge write to Baserow and Docmost adds this header
2. **Handler-side detection**: if the event comes from a row/page carrying this flag, ABORT
3. **event_id idempotence**: even if a loop forms, max 1 cycle (24 h TTL in Redis)
4. **Rate limit**: max 1 identical sync / 5 min per entity (key: `bridge:sync:<entity>:<id>`)
5. **Monitoring**: alert if > 10 identical events in 1 min (sign of a loop)
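
Points 4 and 5 can be sketched with two small in-memory classes. This is an illustration rather than the bridge's implementation: `SyncRateLimiter` and `LoopDetector` are hypothetical names, state is kept in process memory instead of Redis, and the clock is injectable for testing.

```typescript
// Point 4: at most one identical sync per entity within the window.
class SyncRateLimiter {
  private readonly lastSync = new Map<string, number>();
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(windowMs: number = 5 * 60 * 1000, now: () => number = () => Date.now()) {
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true if the sync may run, false if the entity was synced too recently.
  tryAcquire(entity: string, id: string | number): boolean {
    const key = `bridge:sync:${entity}:${id}`;
    const current = this.now();
    const last = this.lastSync.get(key);
    if (last !== undefined && current - last < this.windowMs) return false;
    this.lastSync.set(key, current);
    return true;
  }
}

// Point 5: flag a probable loop when identical events pile up.
class LoopDetector {
  private readonly seen = new Map<string, number[]>();
  private readonly threshold: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(threshold: number = 10, windowMs: number = 60 * 1000, now: () => number = () => Date.now()) {
    this.threshold = threshold;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Records one occurrence; returns true when more than `threshold`
  // identical events fall inside the sliding window.
  record(signature: string): boolean {
    const current = this.now();
    const recent = (this.seen.get(signature) ?? []).filter(
      (timestamp) => current - timestamp < this.windowMs,
    );
    recent.push(current);
    this.seen.set(signature, recent);
    return recent.length > this.threshold;
  }
}
```

In a multi-instance deployment the same logic would need shared state (for example Redis keys with a TTL), since per-process maps do not see events handled by other instances.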

## Outputs

- Docmost pages created automatically
- Baserow rows enriched with Docmost ids (bidirectional link)
- Caches invalidated
- Audit log of the processed event
- Business notifications when needed

## Mandatory tests

- **Idempotence test**: send the same event 5 times → a single effect (bridge-tester)
- **Anti-loop test**: simulate a bridge write → verify the webhook is ignored (bridge-tester)
- **Rate limit test**: 100 identical events in 1 min → verify the rate limit kicks in
- **Recovery test**: Docmost down for 5 min → events are queued and processed after recovery
- **Invalid webhook signature test**: event with a bad HMAC → 401 (bridge-tester)

## Invocation example

Non-manual trigger: the workflow fires automatically when Baserow sends a webhook to the bridge. It can still be invoked manually for:
- Reconciliation: "WF SYNC: force a re-sync of all unmapped projects to Docmost"
- Debug: "WF SYNC: trace event ID xyz to understand why it aborted"

## Notes

- Baserow webhooks: to be configured in the Baserow UI or via the API (after the bridge is deployed)
- Endpoint signature secret: `BASEROW_WEBHOOK_SECRET` in the bridge `.env`
- Logs: all sync operations are logged in structured form (Pino) with event_id for traceability
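
The signature check that uses `BASEROW_WEBHOOK_SECRET` might look like the following sketch. The signing scheme is an assumption: it supposes that the `X-Baserow-Signature` header carries the hex-encoded HMAC-SHA256 of the raw request body, which should be confirmed against the actual webhook configuration. `isValidSignature` is a hypothetical name.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: signature = hex(HMAC-SHA256(secret, raw request body)).
function isValidSignature(
  rawBody: string,
  signatureHex: string,
  secret: string,
): boolean {
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws when lengths differ, so check the length first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The check has to run on the raw body bytes, before any JSON parsing or re-serialization, otherwise a valid signature may no longer match. A `false` result maps to the "log + 401, ABORT" branch of step [1].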