Wiki/docs/17-plan-deployment.md
Corentin JOGUET 668576cdc4 chore: initial commit — formation-hub conception phase
2026-05-07 12:16:19 +02:00


# Deployment plan
> Deployment strategy: provisioning, detailed CI/CD, releases, migrations, rollback.
> Complements `14-repo-structure-gitops.md` (which lays out the CI/CD structure).
## 1. Overview: 3 environments
```mermaid
flowchart LR
Dev[Local dev<br/>make up<br/>seed fixtures] --> Push[git push origin main]
Push -->|auto| Staging[Staging<br/>wiki.staging.acadenice.fr<br/>anonymized data]
Staging --> Tag[git tag v1.X.Y]
Tag -->|approval review| Prod[Prod<br/>wiki.acadenice.fr<br/>real data]
```
| Env | Deploy trigger | Approval | Data | Target |
|-----|----------------|----------|------|--------|
| local | `make up` | n/a | seed fixtures | day-to-day development |
| staging | push to `main` | automatic | anonymized | business acceptance + E2E |
| prod | tag `v*` | manual reviewer | real | end users |
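
The mapping in the table can be sketched as a small shell helper. This is only an illustration: the function name `target_env` is hypothetical and does not exist in the repo; the ref patterns are the ones from the table.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a Git ref to the environment it deploys to,
# following the table above (main -> staging, v* tags -> prod).
target_env() {
  local ref="$1"
  case "$ref" in
    refs/heads/main) echo "staging" ;;
    refs/tags/v*)    echo "prod" ;;
    *)               echo "none" ;;   # feature branches deploy nowhere
  esac
}
```

For example, `target_env "refs/tags/v1.2.3"` prints `prod`, while a feature branch ref prints `none`.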
## 2. Infrastructure provisioning
### 2.1 Target hardware
| Env | Specs | Cost/month | Candidate provider |
|-----|-------|------------|--------------------|
| staging | 2 vCPU, 4 GB RAM, 40 GB SSD | ~7€ | Hetzner CX21 or OVH equivalent |
| prod | 4 vCPU, 8 GB RAM, 80 GB SSD | ~15€ | Hetzner CPX31 |
| off-site backup | 100 GB object storage | ~5€ | Backblaze B2 or OVH Object Storage |
### 2.2 OS and base stack
- **OS**: Debian 12 (stable, long support window, already familiar to Corentin)
- **Docker**: version 25+ from the official repository
- **Docker Compose**: v2 (standard plugin)
- **Reverse proxy**: Traefik 3 (already in use at Acadenice)
- **Cron**: system crond for nightly backups
### 2.3 DNS and TLS
| Subdomain | Points to | TLS |
|-----------|-----------|-----|
| `wiki.acadenice.fr` | VPS prod | Let's Encrypt via Traefik |
| `baserow.acadenice.fr` | VPS prod | Let's Encrypt via Traefik |
| `bridge.acadenice.fr` | VPS prod | Let's Encrypt via Traefik (Phase 2+) |
| `wiki.staging.acadenice.fr` | VPS staging | Let's Encrypt |
| `baserow.staging.acadenice.fr` | VPS staging | Let's Encrypt |

Traefik issues the certificates automatically through ACME (HTTP-01 challenge), which requires each subdomain to be reachable on port 80.
### 2.4 Initial provisioning (first time)
```bash
# 1. SSH into the fresh VPS
ssh root@<vps-ip>
# 2. Install Docker first, so that the docker group exists before step 3
curl -fsSL https://get.docker.com | sh
# 3. Basic hardening
adduser corentin --gecos ""
usermod -aG sudo,docker corentin
# From your workstation (not from the VPS), install your public key:
#   ssh-copy-id corentin@<vps-ip>
# Edit /etc/ssh/sshd_config:
#   PermitRootLogin no
#   PasswordAuthentication no
# Check that key-based login as corentin works before restarting sshd
systemctl restart sshd
# 4. Clone the repo
mkdir -p /opt/formation-hub && cd /opt/formation-hub
git clone git@github.com:acadenice/formation-hub.git .
# 5. Configure .env
cp .env.example .env.staging # or .env.prod
nano .env.staging # fill in the real secrets
# 6. Start the stack
docker compose -f compose.yml -f compose.staging.yml up -d
# 7. Verify
./scripts/healthcheck.sh
```
Note: **run this only once** per environment. After that, CI/CD takes over.
## 3. Detailed CI/CD
### 3.1 Workflow overview
| Workflow | Trigger | Max duration | On failure |
|----------|---------|--------------|------------|
| `ci.yml` | push + PR | 10 min | Merge blocked |
| `deploy-staging.yml` | push to `main` (after green CI) | 5 min | No deploy if CI is red |
| `deploy-prod.yml` | tag `v*` | 5 min + approval | No deploy without approval |
| `nightly-backup-test.yml` | cron, 03:00, monthly | 30 min | Slack alert on failure |
| `e2e.yml` | after a successful `deploy-staging` | 15 min | Non-blocking for staging, blocking for the prod tag |
### 3.2 Workflow `ci.yml` (complete)
```yaml
name: CI
on:
  push:
    # Every branch, main included: deploy-staging chains on the CI run for main
    branches: ['**']
  pull_request:
    branches: [main]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 22, cache: 'npm', cache-dependency-path: 'bridge/package-lock.json' }
      - run: cd bridge && npm ci
      - run: cd bridge && npm run lint
  type-check:
    runs-on: ubuntu-latest
    needs: lint
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 22, cache: 'npm', cache-dependency-path: 'bridge/package-lock.json' }
      - run: cd bridge && npm ci
      - run: cd bridge && npm run typecheck
  test-unit:
    runs-on: ubuntu-latest
    needs: type-check
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 22 }
      - run: cd bridge && npm ci
      - run: cd bridge && npm run test:unit -- --coverage
      - uses: actions/upload-artifact@v4
        with:
          name: coverage-unit
          path: bridge/coverage
  test-integration:
    runs-on: ubuntu-latest
    needs: type-check
    services:
      postgres:
        image: postgres:16-alpine
        env: { POSTGRES_PASSWORD: test, POSTGRES_DB: testdb }
        ports: ['5432:5432']
        options: --health-cmd pg_isready --health-interval 5s --health-retries 10
      redis:
        image: redis:7-alpine
        ports: ['6379:6379']
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: 22 }
      - run: cd bridge && npm ci
      - run: cd bridge && npm run test:integration
        env:
          DATABASE_URL: postgresql://postgres:test@localhost:5432/testdb
          REDIS_URL: redis://localhost:6379
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with: { fetch-depth: 0 }
      - name: Secret scanning
        uses: trufflesecurity/trufflehog@main
        with:
          path: ./
          base: ${{ github.event.pull_request.base.sha || github.event.before }}
      - name: SAST
        uses: returntocorp/semgrep-action@v1
        with: { config: 'p/javascript p/typescript p/security-audit' }
      - name: Dep audit
        run: cd bridge && npm audit --audit-level=high
      - name: License check
        run: cd bridge && npx license-checker --failOn 'GPL-3.0;AGPL-3.0' --excludePackages 'bridge'
  docker-build:
    runs-on: ubuntu-latest
    needs: [test-unit, test-integration, security]
    steps:
      - uses: actions/checkout@v4
      - run: docker compose build
      - run: docker compose up -d
      - run: ./scripts/healthcheck.sh
      - run: docker compose down -v
```
### 3.3 Workflow `deploy-staging.yml`
```yaml
name: Deploy Staging
on:
  # Chained on CI instead of on push, so a red CI can never deploy
  workflow_run:
    workflows: ['CI']
    types: [completed]
    branches: [main]
jobs:
  deploy:
    # Only deploy CI runs that were triggered by a push to main and succeeded
    if: ${{ github.event.workflow_run.conclusion == 'success' && github.event.workflow_run.event == 'push' }}
    runs-on: ubuntu-latest
    environment: staging
    env:
      # In a workflow_run event, github.sha is the tip of the default branch,
      # so the commit that CI actually tested is taken from head_sha
      COMMIT_SHA: ${{ github.event.workflow_run.head_sha }}
      BRIDGE_IMAGE: registry.acadenice.fr/formation-hub/bridge:${{ github.event.workflow_run.head_sha }}
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.workflow_run.head_sha }}
      - name: Build & push image
        run: |
          docker build -t "$BRIDGE_IMAGE" bridge/
          echo "${{ secrets.REGISTRY_PASSWORD }}" | docker login registry.acadenice.fr -u "${{ secrets.REGISTRY_USER }}" --password-stdin
          docker push "$BRIDGE_IMAGE"
      - name: Deploy via SSH
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.STAGING_HOST }}
          username: corentin
          key: ${{ secrets.STAGING_SSH_KEY }}
          script: |
            set -e
            cd /opt/formation-hub
            git fetch && git checkout ${{ env.COMMIT_SHA }}
            export BRIDGE_IMAGE=${{ env.BRIDGE_IMAGE }}
            docker compose -f compose.yml -f compose.staging.yml pull
            docker compose -f compose.yml -f compose.staging.yml up -d
            ./scripts/healthcheck.sh
      - name: Notify Slack on failure
        if: failure()
        uses: slackapi/slack-github-action@v1
        with:
          payload: '{"text":"Deploy staging FAILED, sha ${{ github.event.workflow_run.head_sha }}"}'
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
          # Assumes SLACK_WEBHOOK is an incoming-webhook URL
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
```
### 3.4 Workflow `deploy-prod.yml`
Same as staging but targeting prod, with `environment: production`, which enforces **required reviewers** (Yan or Corentin must approve in the GitHub UI).
```yaml
name: Deploy Production
on:
  push:
    tags: ['v*']
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production # required reviewers: Yan, Corentin
    steps:
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.ref_name }}
      - name: Tag image as prod
        run: |
          # Assumes the registry secrets are also defined for the production environment
          echo "${{ secrets.REGISTRY_PASSWORD }}" | docker login registry.acadenice.fr -u "${{ secrets.REGISTRY_USER }}" --password-stdin
          # Promote the image already built and validated on staging for this commit
          docker pull registry.acadenice.fr/formation-hub/bridge:${{ github.sha }}
          docker tag registry.acadenice.fr/formation-hub/bridge:${{ github.sha }} registry.acadenice.fr/formation-hub/bridge:${{ github.ref_name }}
          docker push registry.acadenice.fr/formation-hub/bridge:${{ github.ref_name }}
      - name: Deploy via SSH
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.PROD_HOST }}
          username: corentin
          key: ${{ secrets.PROD_SSH_KEY }}
          script: |
            set -e
            cd /opt/formation-hub
            git fetch --tags && git checkout ${{ github.ref_name }}
            export BRIDGE_IMAGE=registry.acadenice.fr/formation-hub/bridge:${{ github.ref_name }}
            docker compose -f compose.yml -f compose.prod.yml pull
            docker compose -f compose.yml -f compose.prod.yml up -d
            ./scripts/healthcheck.sh
      - name: Update CHANGELOG
        # Placeholder: commit and push CHANGELOG in an automated PR if not already done
        run: echo "TODO update CHANGELOG"
      - name: Notify Slack
        uses: slackapi/slack-github-action@v1
        with:
          payload: '{"text":"PROD deployed: ${{ github.ref_name }}"}'
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
          # Assumes SLACK_WEBHOOK is an incoming-webhook URL
          SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
```
## 4. Release strategy
### 4.1 Semver convention
| Type | When | Example |
|------|------|---------|
| MAJOR | Breaking change | v1.x.x → v2.0.0 (incompatible Baserow schema change) |
| MINOR | New backward-compatible feature | v1.2.x → v1.3.0 (new bridge endpoint) |
| PATCH | Bug fix / security fix | v1.2.3 → v1.2.4 |
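
As an illustration of the convention, the next tag can be computed from the current one. This is a minimal sketch: `bump_version` is a hypothetical helper, not a script from the repo, and it assumes plain `vX.Y.Z` tags without pre-release suffixes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compute the next semver tag.
# Usage: bump_version vX.Y.Z major|minor|patch
bump_version() {
  local current="${1#v}" kind="$2"
  local major minor patch rest
  # Split X.Y.Z with parameter expansion only (no external tools)
  major="${current%%.*}"
  rest="${current#*.}"
  minor="${rest%%.*}"
  patch="${rest#*.}"
  case "$kind" in
    major) major=$((major + 1)); minor=0; patch=0 ;;  # breaking change
    minor) minor=$((minor + 1)); patch=0 ;;           # backward-compatible feature
    patch) patch=$((patch + 1)) ;;                    # bug or security fix
    *) echo "usage: bump_version vX.Y.Z major|minor|patch" >&2; return 1 ;;
  esac
  echo "v${major}.${minor}.${patch}"
}
```

For instance, `bump_version v1.2.3 minor` prints `v1.3.0`, matching the MINOR row of the table.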
### 4.2 Release process
```
1. PRs merged into main → automatic staging deploy
2. Staging QA (UX checklist + E2E run)
3. If OK:
   - Update CHANGELOG.md (move "Unreleased" → version)
   - git tag -a v1.2.3 -m "Release v1.2.3 - features:..."
   - git push origin v1.2.3
4. The deploy-prod GitHub Action is triggered
5. Approval review (Yan or Corentin)
6. Deploy runs
7. Post-deploy: 30 min of monitoring
8. If there is an issue: rollback (see section 6)
```
### 4.3 CHANGELOG format
```markdown
## [Unreleased]
### Added
- Feature X
### Changed
- Y
### Fixed
- Z
## [1.2.3] - 2026-06-15
### Added
- Endpoint /personnes/:id/timeline
### Fixed
- Recompute the training rollup after a cancelled assignment
```
Convention: [Keep a Changelog](https://keepachangelog.com).
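
Moving "Unreleased" to a version at release time can be scripted. The sketch below is one possible approach, not the repo's tooling: `promote_unreleased` is a hypothetical name, and it assumes the file contains a line that is exactly `## [Unreleased]`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn the current "Unreleased" section into a dated
# version section and leave a fresh, empty "Unreleased" heading on top.
# Usage: promote_unreleased CHANGELOG.md 1.2.4 2026-06-20
promote_unreleased() {
  local file="$1" version="$2" date="$3"
  local tmp
  tmp="$(mktemp)"
  awk -v ver="$version" -v d="$date" '
    $0 == "## [Unreleased]" && !seen {
      print "## [Unreleased]"          # new empty section for future changes
      print "## [" ver "] - " d        # old entries now belong to this version
      seen = 1
      next
    }
    { print }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}
```

The entries that used to sit under "Unreleased" end up under the new version heading without being touched.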
## 5. Database migration
### 5.1 Strategy
- **Baserow**: schema changes go through the Baserow UI or the API. Version the schemas in `baserow/schemas/*.json` (periodic export).
- **Bridge service**: no database of its own in Phase 2 (stateless). If a dedicated Postgres is added later, use versioned Drizzle migrations in `bridge/migrations/`.
### 5.2 Pre-migration checklist
```
[ ] FRESH backup of Baserow + Docmost Postgres before the migration
[ ] Migration tested on staging with realistic anonymized data
[ ] Post-migration integrity check on staging
[ ] Rollback plan documented (revert schema OR restore backup)
[ ] Team notified: maintenance window scheduled
[ ] Prod migration in a low-traffic slot (early morning or weekend)
```
### 5.3 During the migration
```bash
# 1. Stop services (avoid concurrent writes)
docker compose stop docmost baserow
# 2. Explicit backup
make backup
# 3. Run the migration script (manually or via a bridge command)
./scripts/migrate.sh v1.2.3
# 4. Integrity check
./scripts/healthcheck.sh
./scripts/verify-rollups.sh
# 5. Restart services
docker compose -f compose.yml -f compose.prod.yml up -d
# 6. Post-migration smoke test
curl -fsS https://wiki.acadenice.fr/api/health
```
### 5.4 Business communication
Before any migration that affects prod:
- Email to admins 48 h in advance
- Docmost banner "maintenance scheduled on X from Y to Z"
- Slack #internal at the start and at the end
## 6. Detailed rollback strategy
| Scenario | Action | Estimated duration |
|----------|--------|--------------------|
| **Critical bug after a prod deploy** (major regression) | Redeploy the previous version: `git tag v1.2.3-rollback v1.2.2 && git push origin v1.2.3-rollback`, which triggers deploy-prod on the stable version | 5-10 min |
| **Schema migration breaks rollups** | 1) `docker compose stop` 2) Restore the Docmost Postgres from backup 3) Restore Baserow data 4) Redeploy the stable version | 30-60 min |
| **Credentials compromised** | 1) Revoke API tokens 2) Rotate secrets in `.env.prod` 3) Redeploy 4) Audit logs 5) Communicate if data leaked | 1-4 h |
| **VPS down** (provider issue) | Manual failover to a backup VPS, or wait for the provider | depends on the incident |
| **Minor bug on staging** | Hotfix on main + staging redeploy. No prod tag. | 10-20 min |
### 6.1 Pre-prod rollback test
Monthly, on staging:
1. Deploy a version
2. Simulate a bug (deliberately failing healthcheck)
3. Redeploy the previous version
4. Verify that everything works
5. Log the test in the ops journal
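
Step 3 needs to know which version came before the current one. A possible sketch, assuming release tags follow `vX.Y.Z` and that GNU `sort -V` is available; `previous_tag` is a hypothetical helper that reads candidate tags on stdin so it can be fed by `git tag --list 'v*'`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the release tag that precedes the given one.
# Tags that are not plain vX.Y.Z (for example v1.2.3-rollback) are ignored.
# Usage: git tag --list 'v*' | previous_tag v1.2.3
previous_tag() {
  local current="$1"
  grep -E '^v[0-9]+\.[0-9]+\.[0-9]+$' | sort -V | awk -v cur="$current" '
    $0 == cur { found = 1; exit }   # stop at the current tag
    { prev = $0 }                   # remember the last tag seen before it
    END {
      if (!found || prev == "") exit 1
      print prev
    }
  '
}
```

The function returns a non-zero status when the current tag is unknown or has no predecessor, so a rollback script can stop instead of deploying an empty ref.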
## 7. Environment-specific configuration
### 7.1 Variables per environment
| Variable | local | staging | prod |
|----------|-------|---------|------|
| `DOCMOST_URL` | http://localhost:3000 | https://wiki.staging.acadenice.fr | https://wiki.acadenice.fr |
| `BASEROW_URL` | http://localhost:8080 | https://baserow.staging.acadenice.fr | https://baserow.acadenice.fr |
| `BRIDGE_URL` (Phase 2) | http://localhost:4000 | https://bridge.staging.acadenice.fr | https://bridge.acadenice.fr |
| `LOG_LEVEL` | debug | info | warn |
| `BACKUP_S3_BUCKET` | (none) | (none) | s3://acadenice-formation-hub-backup |
| `SENTRY_DSN` (Phase 3+) | (none) | https://...sentry.io/staging | https://...sentry.io/prod |
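
Before a deploy, it can help to check that an env file defines the variables from the table. The sketch below assumes simple `NAME=value` lines in the file; `missing_vars` is a hypothetical helper, not one of the repo's scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the variables that are absent or empty in an env file.
# Returns 0 when every variable is set, 1 otherwise.
# Usage: missing_vars .env.staging DOCMOST_URL BASEROW_URL LOG_LEVEL
missing_vars() {
  local file="$1"; shift
  local name missing=0
  for name in "$@"; do
    # A variable counts as set only if something follows the "="
    if ! grep -Eq "^${name}=.+" "$file"; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

Variables that are legitimately unset in an environment, such as `BACKUP_S3_BUCKET` outside prod, are simply left out of the argument list.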
### 7.2 Secret management workflow
```
1. Generate a new secret (random, 32+ chars)
2. Store it in pass / 1Password / internal Vault
3. Set it as a GitHub Secret (environment-scoped)
4. Set it in .env.staging / .env.prod on the server (perms 600, owner root)
5. Test deploy
6. Retire the old secret (revoke it on the service side)
```
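
Steps 1 and 4 can be sketched in shell. This is one option among several: both function names are hypothetical, the 40-character length is an arbitrary choice above the 32-character minimum, and `/dev/urandom` is assumed to be available (Linux).

```shell
#!/usr/bin/env bash
# Hypothetical helper for step 1: print a random 40-character alphanumeric secret.
generate_secret() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 40
  echo
}

# Hypothetical helper for step 4: restrict an env file to its owner (perms 600).
# Changing the owner to root additionally requires: sudo chown root:root FILE
lock_env_file() {
  chmod 600 "$1"
}
```

Restricting the alphabet to letters and digits avoids quoting problems in `.env` files, at the cost of slightly less entropy per character than a full base64 alphabet.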
### 7.3 Rotation frequency
| Secret type | Rotation frequency |
|-------------|--------------------|
| External API tokens (Outline, etc.) | Yearly |
| DB passwords | Quarterly |
| JWT signing keys | Quarterly |
| Deploy SSH keys | Yearly, or when someone leaves |
| Backup encryption keys | Kept out of band, rotated only if compromised |
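
Whether a secret is due for rotation can be checked from its last rotation date. A minimal sketch, assuming GNU `date` (for `-d`) and treating "quarterly" as 90 days and "yearly" as 365 days; `rotation_due` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) when a secret last rotated on LAST
# is at least MAX_DAYS old as of TODAY (defaults to the current date).
# Usage: rotation_due LAST(YYYY-MM-DD) MAX_DAYS [TODAY(YYYY-MM-DD)]
rotation_due() {
  local last="$1" max_days="$2" today="${3:-$(date -u +%F)}"
  local last_s today_s
  # Work in UTC so daylight-saving changes do not shorten a day
  last_s="$(date -u -d "$last" +%s)" || return 2
  today_s="$(date -u -d "$today" +%s)" || return 2
  [ $(( (today_s - last_s) / 86400 )) -ge "$max_days" ]
}
```

A cron job could call it for each entry of the table, for example `rotation_due 2026-01-01 90 && echo "rotate DB password"`.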
## 8. Pre-deploy checklist (per release)
```
[ ] CI green on the PR
[ ] Staging E2E tests pass
[ ] CHANGELOG.md up to date
[ ] Data migration documented if the schema changes
[ ] Recent pre-prod rollback test (< 1 month)
[ ] No critical PR still open
[ ] Recent backup (< 24 h) verified
[ ] Approving reviewer available (Yan or Corentin)
[ ] No business-critical time slot (classes in session / data-entry deadline)
```
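
The "recent backup" item lends itself to an automated check. The sketch below assumes backups are written as regular files into a single directory and that `find` supports `-mmin` and `-quit` (GNU and BSD do); `has_recent_backup` is a hypothetical helper and the directory in the usage line is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when DIR contains at least one file modified
# within the last MAX_MINUTES (default 1440, i.e. 24 h).
# Usage: has_recent_backup /path/to/backups [MAX_MINUTES]
has_recent_backup() {
  local dir="$1" max_minutes="${2:-1440}"
  [ -n "$(find "$dir" -type f -mmin "-$max_minutes" -print -quit 2>/dev/null)" ]
}
```

This only proves that a fresh file exists; verifying that the backup actually restores remains the job of the monthly backup test.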
## 9. Post-deploy validation
### 9.1 Automated smoke tests (script)
```bash
#!/usr/bin/env bash
# scripts/smoke-test.sh
# Usage: TOKEN=<api-token> ./scripts/smoke-test.sh <env-url>
set -euo pipefail
ENV_URL=$1 # https://wiki.acadenice.fr or the staging URL
# TOKEN is read from the environment (API token of the smoke-test account)
: "${TOKEN:?TOKEN must be set}"
# 1. Healthcheck
curl -fsS "$ENV_URL/api/health" > /dev/null
# 2. Admin login (test credentials)
curl -fsS -X POST -H "Content-Type: application/json" \
  -d '{"email":"smoke@acadenice.fr","password":"..."}' \
  "$ENV_URL/api/auth.email" > /dev/null
# 3. Read the test page
curl -fsS "$ENV_URL/api/documents.info" -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"id":"smoke-test-page"}' > /dev/null
# 4. Search
curl -fsS "$ENV_URL/api/documents.search" -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query":"smoke"}' > /dev/null
echo "Smoke tests OK"
```
### 9.2 Manual post-deploy checklist
```
[ ] Automated smoke tests green
[ ] Docmost web login OK
[ ] Baserow web login OK
[ ] Recent wiki pages accessible
[ ] Test entry in Baserow OK (hours delivered or intervention)
[ ] Mermaid diagrams render correctly on a test page
[ ] Container logs: no errors in the last 5 minutes
[ ] System metrics: CPU < 50%, RAM < 70% (after the initial load)
```
### 9.3 Watch period
For 30 minutes after a prod deploy:
- Logs actively monitored
- Uptime monitoring metrics watched
- On anomaly: trigger rollback
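
Part of the watch period can be automated by polling a check command until the window ends. This is a rough sketch rather than the project's monitoring setup: `watch_health` is a hypothetical helper, and the health URL in the usage note is the one already used in the smoke tests.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a check command every INTERVAL_S seconds for
# DURATION_S seconds; stop with status 1 as soon as one check fails.
# Usage: watch_health DURATION_S INTERVAL_S COMMAND [ARGS...]
watch_health() {
  local duration="$1" interval="$2"; shift 2
  local end=$(( $(date +%s) + duration ))
  while [ "$(date +%s)" -lt "$end" ]; do
    if ! "$@"; then
      echo "anomaly detected: trigger rollback" >&2
      return 1
    fi
    sleep "$interval"
  done
  return 0
}
```

A 30-minute watch polling every 30 seconds would be `watch_health 1800 30 curl -fsS -o /dev/null https://wiki.acadenice.fr/api/health`. It complements, and does not replace, reading the logs.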
## 10. Open questions
- [ ] Image registry: GitHub Container Registry, registry.acadenice.fr (to be deployed), or self-hosted Harbor?
- [ ] Off-site backup: OVH Object Storage / Backblaze / S3? Choice depends on price + data sovereignty
- [ ] Sentry for error tracking (Phase 3+)? Self-hosted or SaaS?
- [ ] CI runner: GitHub-hosted (cost) or self-hosted runner on an Acadenice VPS?
- [ ] Deploy notifications: Slack, Teams, email? All three?