
6. Operational Documentation

6.1 Deployment procedures

Front-end (built today)

The app is a static export deployed to S3 behind CloudFront via AWS CDK (Python). The CDK app builds the site itself.

# 1. Install app dependencies (CDK runs next build during deploy)
npm install

# 2. Regenerate documentation artefacts (see build order below)
npm run gen:docs       # per-screen Markdown from the help registry
npm run build:docs     # docs site data module from /docs Markdown
npm run verify:help    # assert every walkthrough anchor resolves

# 3. Deploy infrastructure (builds out/, uploads, invalidates CDN)
cd infrastructure
python3 -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
cdk deploy             # or ./deploy.sh

cdk deploy runs next build (which, with output: "export", emits out/), uploads out/ to the private S3 bucket (which CloudFront reads through Origin Access Control), and issues a CloudFront /* invalidation so that visitors get the latest build. To inspect the synthesized stack without rebuilding the site, run cdk synth --app "python3 app.py" once out/ already exists.
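The "out/ already exists" precondition for cdk synth can be checked before the command runs. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and it assumes a finished static export always contains a root index.html.

```python
from pathlib import Path


def export_ready(out_dir: Path) -> bool:
    """Return True if out_dir looks like a finished static export.

    Assumption: a completed export contains at least a root index.html.
    """
    return out_dir.is_dir() and (out_dir / "index.html").is_file()


def synth_command(out_dir: Path) -> list[str]:
    """Return the cdk synth command, refusing to proceed without an export."""
    if not export_ready(out_dir):
        raise FileNotFoundError(
            f"{out_dir} is missing or has no index.html; run the build first"
        )
    # Same command as in the paragraph above, split into argv form.
    return ["cdk", "synth", "--app", "python3 app.py"]
```

The returned list can be passed to subprocess.run from the infrastructure directory; the path to out/ is left as a parameter because its location relative to that directory is not stated here.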

Documentation build order (important)

The /docs site is generated, not hand-rendered. The order matters because each step feeds the next:

flowchart LR
  A[Edit help content + /docs Markdown] --> B[npm run gen:docs]
  B --> C[npm run build:docs]
  C --> D[lib/docs-generated.ts]
  D --> E[next build -> /docs/* static pages]

1. npm run gen:docs regenerates the per-screen Markdown in /docs from components/help/content/*.
2. npm run build:docs walks docs/product/**, docs/developer/** and the per-screen Markdown, converts to HTML, builds the navigation and search index, and writes lib/docs-generated.ts.
3. next build statically renders every /docs/* page (via generateStaticParams) into out/.
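Because each step consumes the previous step's output, a wrapper that stops at the first failure keeps a stale lib/docs-generated.ts from reaching next build. The sketch below is illustrative only: run_docs_pipeline and run_step are hypothetical names, and the npm scripts are the ones listed in this section.

```python
import subprocess
from typing import Callable, Sequence

# Order taken from the build-order steps above; each step feeds the next.
DOC_STEPS: tuple[tuple[str, ...], ...] = (
    ("npm", "run", "gen:docs"),
    ("npm", "run", "build:docs"),
    ("npm", "run", "build"),  # production static build (next build) into out/
)


def run_step(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_docs_pipeline(
    steps: Sequence[Sequence[str]] = DOC_STEPS,
    runner: Callable[[Sequence[str]], int] = run_step,
) -> list[str]:
    """Run the steps in order and stop at the first failure.

    Returns the commands that completed, as strings.
    Raises RuntimeError naming the step that failed.
    """
    done: list[str] = []
    for cmd in steps:
        code = runner(cmd)
        if code != 0:
            raise RuntimeError(f"{' '.join(cmd)} failed with exit code {code}")
        done.append(" ".join(cmd))
    return done
```

The runner parameter exists so the ordering logic can be exercised without invoking npm; calling run_docs_pipeline() with the defaults runs the real commands from the repository root.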

Commit the regenerated artefacts together with the source change. See AGENTS.md for the rule that help content, walkthroughs and /docs must stay in lock-step.

Back-end (production, Planned)

Domain services, the database and the event bus deploy as separate stacks (containers or serverless). CI/CD promotes each stack through the environments with automated tests, runs migrations with a forward-only, reversible strategy, and cuts over with blue-green or canary releases for zero downtime. The static front-end is configured with the environment's API base URL.

6.2 Environment configuration

| Variable | Scope | Purpose |
| --- | --- | --- |
| `NEXT_PUBLIC_API_BASE_URL` | Front-end (public) | Base URL the `lib/data.ts` seam calls in production |
| `CERTIFICATE_ARN` | Infra (`config.py`) | ACM certificate in us-east-1 for CloudFront |
| `HOSTED_ZONE_NAME` / `HOSTED_ZONE_ID` | Infra (`config.py`) | Route 53 zone for the domain |
| `DOMAIN_NAME` | Infra (`config.py`) | Public hostname (today `mock.globalclinic.app`) |
| Database URL, KMS key ids, provider keys | Back-end (secrets store) | Never in source or the static bundle; server-side only |
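The constraints in the table (certificate in us-east-1, hostname inside the hosted zone) can be checked before a deploy. This is a hedged sketch rather than the contents of config.py: config_problems is a hypothetical helper, it assumes the values arrive as a string mapping, and it assumes all four infrastructure values are required.

```python
from typing import Mapping

# Assumption: all four infrastructure values from the table are required.
REQUIRED_INFRA_VARS = (
    "CERTIFICATE_ARN",
    "HOSTED_ZONE_NAME",
    "HOSTED_ZONE_ID",
    "DOMAIN_NAME",
)


def config_problems(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the config passes."""
    problems = [f"{name} is not set" for name in REQUIRED_INFRA_VARS if not env.get(name)]

    # ACM ARNs have the form arn:aws:acm:<region>:<account>:certificate/<id>.
    arn = env.get("CERTIFICATE_ARN", "")
    parts = arn.split(":")
    if arn and (len(parts) < 6 or parts[2] != "acm" or parts[3] != "us-east-1"):
        problems.append("CERTIFICATE_ARN must be an ACM certificate in us-east-1")

    # The public hostname should be the zone apex or a name inside the zone.
    domain = env.get("DOMAIN_NAME", "")
    zone = env.get("HOSTED_ZONE_NAME", "")
    if domain and zone and not (domain == zone or domain.endswith("." + zone)):
        problems.append("DOMAIN_NAME is not inside HOSTED_ZONE_NAME")

    return problems
```

If the values are read from the process environment, config_problems(os.environ) would be the natural call; secrets such as the database URL are deliberately out of scope here because they never belong in this layer.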

Configuration principles: only public values use the NEXT_PUBLIC_* prefix; every secret lives in the per-environment secrets manager; the bucket name is derived as gc-mocks-<account>-<region_without_dashes>, so it never needs manual naming. Environments (dev, staging, production) are isolated in separate accounts or behind strict boundaries.
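The derived bucket name follows directly from the pattern above. A minimal sketch, assuming the account id and region are available as plain strings (derive_bucket_name is a hypothetical name, not necessarily the one used in the CDK app):

```python
def derive_bucket_name(account: str, region: str) -> str:
    """Build gc-mocks-<account>-<region_without_dashes> from its two inputs."""
    return f"gc-mocks-{account}-{region.replace('-', '')}"


# Placeholder account id, for illustration only.
print(derive_bucket_name("123456789012", "us-east-1"))
# prints: gc-mocks-123456789012-useast1
```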

6.3 Monitoring and observability

Front-end / CDN. CloudFront and S3 metrics (requests, 4xx and 5xx, cache hit ratio, origin latency); real-user monitoring for web vitals; synthetic checks on the marketing site and /docs.
Back-end (Planned). Structured logs, metrics and distributed traces per service. Golden signals (latency, traffic, errors, saturation) per service and per integration adapter.
Business and SLO dashboards. Stage SLA performance and breaches (from stage owner/slaTarget), lead-to-case funnel by corridor and channel, time-to-first-response and time-to-quote, escrow release latency and reconciliation breaks, visa submission and grant rates, and communication delivery rates.
Alerting. Page on money-path failures (escrow, refunds, reconciliation mismatch), visa SLA breaches, integration circuit-breaker trips, auth anomaly spikes, and screening hits. Route by severity to the on-call rotation.
Audit and security monitoring. Audit events and security signals flow to monitoring; anomalies (cross-tenant access attempts, failed-auth spikes) alert.
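The stage SLA dashboard reduces to comparing time spent in a stage against that stage's target. The sketch below shows the calculation under stated assumptions: slaTarget is taken to be a number of hours, entered_at is a hypothetical timestamp field, and the Stage type is illustrative rather than the application's model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable


@dataclass(frozen=True)
class Stage:
    name: str
    owner: str
    sla_target_hours: float  # assumption: slaTarget is expressed in hours
    entered_at: datetime     # hypothetical: when the case entered this stage


def sla_breached(stage: Stage, now: datetime) -> bool:
    """True when the case has been in the stage longer than its target."""
    return now - stage.entered_at > timedelta(hours=stage.sla_target_hours)


def breaches_by_owner(stages: Iterable[Stage], now: datetime) -> dict[str, int]:
    """Count current breaches per stage owner, for a dashboard or an alert."""
    counts: dict[str, int] = {}
    for stage in stages:
        if sla_breached(stage, now):
            counts[stage.owner] = counts.get(stage.owner, 0) + 1
    return counts
```

Passing now explicitly, instead of reading the clock inside the function, keeps the calculation deterministic for tests and for replaying historical data.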

6.4 Backup and recovery

Front-end. The static site is reproducible from next build, so the bucket is intentionally disposable and recovery is a redeploy. Keep the source repository and the CDK app as the source of truth.
Database (Planned). Automated, encrypted backups with point-in-time recovery; periodic restore drills to validate backups; cross-region copies for disaster recovery.
Object storage (documents, Planned). Versioning and cross-region replication; lifecycle rules aligned to the retention schedule; deletes are soft and audited.
Ledger and audit. Append-only and backed up with the database; immutability preserved across restores.
Targets. Define and test RPO and RTO per data class; money and clinical data carry the strictest targets. Document the runbook for each recovery scenario.
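Testing RPO and RTO comes down to two comparisons: how old the newest recoverable backup is, and how long a restore drill took. The sketch below is a minimal illustration with hypothetical function names; the targets are passed in because this section does not fix specific values.

```python
from datetime import datetime, timedelta


def rpo_met(last_backup_at: datetime, now: datetime, rpo: timedelta) -> bool:
    """True when data newer than the last backup spans no more than the RPO."""
    return now - last_backup_at <= rpo


def rto_met(restore_started_at: datetime, restore_finished_at: datetime,
            rto: timedelta) -> bool:
    """True when a restore drill completed within the RTO."""
    return restore_finished_at - restore_started_at <= rto
```

A restore drill would record its start and finish times and feed them to rto_met, with the strictest targets applied to money and clinical data as stated above.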

6.5 Incident response procedures

flowchart LR
  D[Detect: alert / report] --> T[Triage + severity]
  T --> C[Contain]
  C --> M[Mitigate / restore]
  M --> R[Resolve + verify]
  R --> P[Postmortem + actions]

Severity. SEV1 (patient safety or money at risk, or a breach) pages 24/7; SEV2 (journey-blocking) is answered within the stage SLA; SEV3 (degraded) is handled the next business day.
Roles. An incident commander coordinates; a communications lead handles internal and patient or clinic comms; subject experts mitigate. The Medical Director is engaged for any clinical-safety incident.
Data breach. A suspected breach triggers the breach workflow and the regulatory notification timelines in the compliance section (for example GDPR's 72-hour rule). Preserve audit logs; do not destroy evidence.
Money incidents. Place affected escrow tranches on hold; reconcile against the payment partner; never improvise a release.
Postmortem. Blameless, with timeline, root cause, and tracked corrective actions. The repository provides an incident-response workflow for triage and postmortem.
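The severity rules above can be expressed as a small classifier so that triage is consistent across responders. This is a sketch under stated assumptions: the Incident fields and the classify function are hypothetical, and anything that is neither SEV1 nor journey-blocking is treated as SEV3.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    SEV1 = "SEV1"
    SEV2 = "SEV2"
    SEV3 = "SEV3"


@dataclass(frozen=True)
class Incident:
    # Hypothetical triage flags, set by whoever triages the alert or report.
    patient_safety_risk: bool = False
    money_at_risk: bool = False
    suspected_breach: bool = False
    journey_blocking: bool = False


# Response expectations as stated in the severity definitions.
RESPONSE = {
    Severity.SEV1: "page on-call 24/7",
    Severity.SEV2: "respond within the stage SLA",
    Severity.SEV3: "next business day",
}


def classify(incident: Incident) -> Severity:
    """Map triage flags to a severity, checking the most severe case first."""
    if incident.patient_safety_risk or incident.money_at_risk or incident.suspected_breach:
        return Severity.SEV1
    if incident.journey_blocking:
        return Severity.SEV2
    # Assumption: everything else counts as degraded service.
    return Severity.SEV3
```

Checking SEV1 conditions first means an incident that is both journey-blocking and a money risk is never under-classified.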

6.6 Support and troubleshooting

Support uses the contextual help system and the edge-case playbooks. Common issues and first steps:

| Symptom | Likely cause | First step |
| --- | --- | --- |
| Visa "Submit" stays disabled | A linked document is not yet verified | Open Documents; confirm the linked item is uploaded and verified; the visa item then reads Verified |
| Document upload rejected | File type or size, or failed scan | Re-upload a valid file within type and size limits |
| Payment failed at funding | Method or currency not valid for the corridor; provider decline | Retry or switch method; confirm no charge occurred; escalate to Finance if ambiguous |
| Travel option unavailable | Provider inventory changed | Travel Desk proposes an equivalent; selection stays editable |
| `/docs` page missing after a content change | `build:docs` not run before `next build` | Run `gen:docs` then `build:docs`, then rebuild |
| Walkthrough step points at nothing | A UI element or its `data-tour` anchor moved | Run `verify:help`; fix the anchor or step in the ScreenDoc |
| Clinic not visible on the marketplace | Clinic `status` is not `verified` | Complete onboarding and verification |

Escalation: clinical to the Medical Director (24/7), money to Finance, visa or travel to the relevant desk, security or data to the incident process. Every support interaction that touches a patient case is consent-gated and audited.
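For the "walkthrough step points at nothing" row in the table above, it helps to picture the kind of check that verify:help performs. The sketch below is not the script's actual implementation: it assumes anchors appear as double-quoted data-tour attributes in rendered markup, and the function names are hypothetical.

```python
import re

# Assumption: anchors are written as data-tour="name" in the markup.
DATA_TOUR = re.compile(r'data-tour="([^"]+)"')


def anchors_in_markup(markup: str) -> set[str]:
    """Collect every data-tour anchor name present in the markup."""
    return set(DATA_TOUR.findall(markup))


def unresolved_steps(step_anchors: list[str], markup: str) -> list[str]:
    """Return the walkthrough anchors with no matching element, in step order."""
    present = anchors_in_markup(markup)
    return [anchor for anchor in step_anchors if anchor not in present]
```

A non-empty result names exactly the steps to fix, either by restoring the anchor on the moved element or by updating the step in the ScreenDoc.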

6.7 Everyday commands

npm run dev          # local dev server
npm run typecheck    # tsc --noEmit
npm run verify:help  # validate walkthrough anchors
npm run gen:docs     # regenerate per-screen /docs from help content
npm run build:docs   # build the /docs site data module (lib/docs-generated.ts)
npm run build        # production static build into out/