System overview
A local-first Docker stack (cloud-portable) that moves every chart from intake to audit in one workflow.
The workflow
Coders code synthetic charts (ICD-10-CM diagnoses + procedures). The system validates against format / sequencing / comorbidity rules. Supervisors audit and return structured feedback. Managers watch a live queue/metrics dashboard.
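To make the "validate and point back at the row" idea concrete, here is a minimal sketch of a format rule and a sequencing rule. The `CodeRow` shape, the rule names, and the "principal diagnosis must be sequenced first" rule are assumptions for illustration, not the platform's actual rule set. The regex checks only the general ICD-10-CM code shape, not whether the code exists.

```typescript
// Hypothetical row shape; the real chart_codes columns may differ.
interface CodeRow {
  code: string;
  sequence: number;
  isPrincipal: boolean;
}

interface RuleError {
  rowIndex: number; // index of the offending row, so the UI can highlight it
  rule: "format" | "sequencing";
  message: string;
}

// General ICD-10-CM shape: letter, digit, alphanumeric, then an optional
// dot followed by 1-4 alphanumerics (for example E11.9 or S72.001A).
const ICD10CM_SHAPE = /^[A-Z]\d[A-Z0-9](\.[A-Z0-9]{1,4})?$/;

function validateRows(rows: CodeRow[]): RuleError[] {
  const errors: RuleError[] = [];

  rows.forEach((row, rowIndex) => {
    if (!ICD10CM_SHAPE.test(row.code)) {
      errors.push({ rowIndex, rule: "format", message: `Malformed code: ${row.code}` });
    }
    // Assumed sequencing rule: the principal diagnosis must be sequenced first.
    if (row.isPrincipal && row.sequence !== 1) {
      errors.push({ rowIndex, rule: "sequencing", message: "Principal diagnosis must have sequence 1" });
    }
  });

  return errors;
}

const validationErrors = validateRows([
  { code: "E11.9", sequence: 2, isPrincipal: true },
  { code: "e119x!", sequence: 1, isPrincipal: false },
]);
```

Each error carries the `rowIndex` it came from, which is what lets a form attach the message to the right input row.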
Three roles
Default-deny RBAC. Coders see assigned charts and code them; supervisors audit and give feedback; managers import, assign, and watch metrics. Tenant scope is always applied on top of role.
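A default-deny check with tenant scope layered on top might look roughly like the sketch below. The action names and the `Actor` shape are hypothetical; the per-chart assignment check for coders is left out for brevity.

```typescript
type Role = "manager" | "supervisor" | "coder";
type Action =
  | "chart:code"
  | "chart:audit"
  | "chart:import"
  | "chart:assign"
  | "metrics:view";

// Explicit allow-list (hypothetical action names). Anything not listed is denied.
const ALLOWED: Record<Role, ReadonlySet<Action>> = {
  coder: new Set<Action>(["chart:code"]),
  supervisor: new Set<Action>(["chart:audit"]),
  manager: new Set<Action>(["chart:import", "chart:assign", "metrics:view"]),
};

interface Actor {
  role: Role;
  orgId: string;
}

function can(actor: Actor, action: Action, resourceOrgId: string): boolean {
  // Tenant scope first: a matching role never grants cross-tenant access.
  if (actor.orgId !== resourceOrgId) return false;
  // Unknown roles arriving at runtime fall through to deny.
  return ALLOWED[actor.role]?.has(action) ?? false;
}
```

Because the table is an allow-list, adding a new action grants it to nobody until a role is explicitly given it.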
Synthetic, yet HIPAA-grade
All patient data is generated by Synthea — no real PHI. The platform still enforces access control, audit logging, anti-enumeration auth, rate limiting and at-rest encryption posture for PHI-equivalent handling.
Component map
Request path top to bottom: browser → TLS proxy → API / SSR → data & jobs.
Layers & data flow
Controllers stay thin; services hold logic; only the server layer touches Postgres.
| Layer | Location | Responsibility |
|---|---|---|
| Presentation | apps/web | Server Components for first paint of queue/metrics; Client Components + TanStack Query for live data; SSE subscription for the dashboard (sub-second updates, <1s). All design values come from packages/tokens. |
| API | apps/api/src/modules | NestJS controllers → services. Cross-cutting via guards/interceptors: AuthGuard (JWT cookie), RolesGuard (RBAC), RateLimitGuard, AuditInterceptor, SentryGlobalFilter. |
| Data access | apps/api/src/server | The only layer permitted to import the pg pool. Every scoped query goes through withOrgScope(orgId, fn) / withUserScope(userId, fn) (INV-1). |
| Persistence | db/migrations | PostgreSQL 17. Forward-only numbered SQL migrations. JSONB for stored FHIR chart payloads. |
| Async / jobs | apps/api/src/jobs | BullMQ on Redis. Processors: import (Synthea bundle → charts), outbox-relay (publish events), email (drain pending), metrics (recompute rollups). |
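
The source names `withOrgScope(orgId, fn)` but does not show its body. The sketch below is one plausible shape, under the assumption that scoped statements always take the org id as `$1`. The `Queryable` interface is a stand-in for a pg client so the example runs without a database, and `createWithOrgScope` is a hypothetical factory.

```typescript
// Minimal stand-in for a pg client so the sketch runs without a database.
interface Queryable {
  query(text: string, values: unknown[]): Promise<{ rows: unknown[] }>;
}

type ScopedQuery = (text: string, values?: unknown[]) => Promise<unknown[]>;

function createWithOrgScope(db: Queryable) {
  return async function withOrgScope<T>(
    orgId: string,
    fn: (query: ScopedQuery) => Promise<T>,
  ): Promise<T> {
    if (!orgId) throw new Error("withOrgScope requires an orgId");

    const query: ScopedQuery = async (text, values = []) => {
      // Assumed convention: $1 is always the org id, and every scoped
      // statement must filter on it.
      if (!/org_id\s*=\s*\$1\b/.test(text)) {
        throw new Error("Scoped query must filter on org_id = $1");
      }
      const res = await db.query(text, [orgId, ...values]);
      return res.rows;
    };

    return fn(query);
  };
}
```

Callers never pass the org id as a query parameter themselves; the wrapper prepends it, so a forgotten tenant filter fails loudly instead of leaking rows.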
Coding-loop request path
A coder opens an assigned chart → enters codes with rationale → validation rules run (format / sequencing / comorbidity) and map errors back to the offending row → coder submits → chart enters the supervisor audit queue → supervisor scores, leaves findings + training feedback, and approves or returns. Every state-changing action writes an append-only, hash-chained audit_log row.
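The hash-chained audit_log can be illustrated with a small in-memory sketch. The field set and the JSON-array encoding fed to SHA-256 are assumptions; the source specifies only that each row carries prev_hash and row_hash.

```typescript
import { createHash } from "node:crypto";

interface AuditRow {
  action: string;
  actorId: string;
  prevHash: string; // row_hash of the previous row, or GENESIS for the first row
  rowHash: string;
}

const GENESIS = "0".repeat(64);

function hashRow(prevHash: string, action: string, actorId: string): string {
  // Encoding the fields as a JSON array keeps field boundaries unambiguous.
  return createHash("sha256")
    .update(JSON.stringify([prevHash, action, actorId]))
    .digest("hex");
}

function append(log: readonly AuditRow[], action: string, actorId: string): AuditRow[] {
  const prevHash = log.length > 0 ? log[log.length - 1].rowHash : GENESIS;
  const row: AuditRow = { action, actorId, prevHash, rowHash: hashRow(prevHash, action, actorId) };
  return [...log, row]; // append-only: existing rows are never mutated
}

function verifyChain(log: readonly AuditRow[]): boolean {
  let expectedPrev = GENESIS;
  for (const row of log) {
    if (row.prevHash !== expectedPrev) return false;
    if (row.rowHash !== hashRow(row.prevHash, row.action, row.actorId)) return false;
    expectedPrev = row.rowHash;
  }
  return true;
}
```

Editing or removing any earlier row changes its hash, which breaks every later prev_hash link, so a verification pass detects the tampering.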
Stack & dependencies
Pinned versions; external SaaS is placeholder-gated and degrades gracefully when keys are absent.
| Dependency | Version | Role |
|---|---|---|
| Node.js | 22 LTS | Runtime |
| Next.js / React | 15.x / 19.x | Frontend (App Router) + UI |
| @tanstack/react-query | 5.x | Data fetching / cache |
| Tailwind CSS | 3.4.x | Styling (token-bound) |
| NestJS | 11.x | Backend framework |
| @nestjs/bullmq + bullmq | 11.x / 5.x | Background jobs |
| pg | 8.x | Postgres driver (raw, parameterized) |
| ioredis | 5.x | Redis client |
| zod | 3.x | Validation (shared schemas) |
| argon2 | 0.41.x | Password hashing (native; never substitute — INV-14) |
| jose | 5.x | JWT sign/verify |
| pino | 9.x | Structured JSON logging |
| PostgreSQL / Redis / Nginx | 17 / 7 / 1.27 | Database / queue+cache / TLS proxy |
| Playwright | 1.x | E2E — one spec per workflow |
| resend · dd-trace · @sentry/* · aws-sdk | gated | Email / APM / errors / secrets — placeholder-gated |
Reliability patterns
Dual-writes, inbound webhooks and email all avoid the "wrote one side, lost the other" failure mode.
Transactional outbox (INV-5)
- A domain mutation and its outbox_events row are written in one DB transaction.
- The outbox-relay worker publishes at-least-once.
- Consumers are idempotent, so replays are safe.
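Why at-least-once delivery plus idempotent consumers is safe can be shown with an in-memory sketch. It models only the relay and consumer side; the single-transaction write of the mutation and its outbox row is not shown. All names here are hypothetical.

```typescript
interface OutboxEvent {
  id: string;
  type: string;
  publishedAt: number | null;
}

// Consumer side: remembers handled event ids so a replay is a no-op.
class IdempotentConsumer {
  private readonly seen = new Set<string>();
  readonly handled: string[] = [];

  handle(event: OutboxEvent): void {
    if (this.seen.has(event.id)) return; // duplicate delivery, ignore
    this.seen.add(event.id);
    this.handled.push(event.type);
  }
}

// Relay side: publish first, mark as published second. A crash between the
// two steps causes a re-publish on the next tick, hence at-least-once.
function relayTick(
  outbox: OutboxEvent[],
  publish: (event: OutboxEvent) => void,
  crashBeforeMark = false,
): void {
  for (const event of outbox) {
    if (event.publishedAt !== null) continue;
    publish(event);
    if (crashBeforeMark) return; // simulate the worker dying here
    event.publishedAt = Date.now();
  }
}
```

Marking before publishing would flip the failure mode to at-most-once, which can silently drop events; publishing first and tolerating duplicates is the safer trade.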
Webhook verify-first (INV-4, INV-6)
- Verify HMAC-SHA256 over the raw body before any DB read.
- Dedupe via processed_events UNIQUE(event_id).
- Only then dispatch to the service.
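The three steps above can be sketched with Node's built-in crypto module. The hex signature encoding and the `event_id` payload field are assumptions, and a `Set` stands in for the processed_events table.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Stand-in for the processed_events table and its UNIQUE(event_id) constraint.
const processedEvents = new Set<string>();

function verifySignature(rawBody: Buffer, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

type Outcome = "rejected" | "duplicate" | "dispatched";

function handleWebhook(
  rawBody: Buffer,
  signatureHex: string,
  secret: string,
  dispatch: (payload: unknown) => void,
): Outcome {
  // 1. Verify over the raw bytes before parsing or touching any store.
  if (!verifySignature(rawBody, signatureHex, secret)) return "rejected";

  // 2. Dedupe on the event id.
  const payload = JSON.parse(rawBody.toString("utf8")) as { event_id: string };
  if (processedEvents.has(payload.event_id)) return "duplicate";
  processedEvents.add(payload.event_id);

  // 3. Only now hand the payload to the service.
  dispatch(payload);
  return "dispatched";
}
```

Verifying against the raw bytes matters because re-serializing parsed JSON can change whitespace or key order and so invalidate a legitimate signature.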
Email outbox (guaranteed delivery)
- Never sync-send inside a request handler.
- Insert into pending_emails in the triggering txn.
- Drain with capped backoff → DLQ on repeated failure.
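"Capped backoff, then DLQ" reduces to a small decision function. The base delay, cap, and attempt limit below are illustrative values, not the platform's configuration.

```typescript
interface RetryPolicy {
  baseMs: number;
  capMs: number;
  maxAttempts: number;
}

type NextStep = { kind: "retry"; delayMs: number } | { kind: "dlq" };

// failedAttempts is the number of failed sends so far (1 after the first failure).
function nextStep(failedAttempts: number, policy: RetryPolicy): NextStep {
  if (failedAttempts >= policy.maxAttempts) return { kind: "dlq" };
  // Exponential growth, clamped so a long outage cannot push delays unbounded.
  const delayMs = Math.min(policy.capMs, policy.baseMs * 2 ** (failedAttempts - 1));
  return { kind: "retry", delayMs };
}

// Illustrative values only.
const emailRetryPolicy: RetryPolicy = { baseMs: 1000, capMs: 60000, maxAttempts: 8 };
```

With these values the delays run 1 s, 2 s, 4 s and so on up to the 60 s cap, and the eighth failure moves the row to the dead-letter queue.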
Worker topology
No LLM agents at runtime — "orchestration" is the BullMQ worker topology. The API process enqueues jobs and never blocks a request on email/import. The worker process (separate container, same image) runs all processors plus the outbox/email relays on a fixed interval. SSE streams queue counters from the API, backed by a short-TTL Redis cache refreshed by the metrics job.
Data model
FKs scoped by org_id throughout. Authoritative DDL lives in db/migrations.
| Table | Contents |
|---|---|
| organizations | |
| users | role: manager, supervisor, or coder · UNIQUE(lower(email)) [INV-3] |
| charts | synthetic encounter · JSONB payload · specialty · difficulty · status · assignee; status: draft, pending_audit, completed, or returned [ADR-016] |
| chart_codes | ICD-10 dx/proc · rationale · sequence · is_principal |
| validation_results | format / sequencing / comorbidity rule outcomes |
| audits | supervisor review · score · decision |
| audit_findings | structured findings on an audit |
| feedback | training feedback to the coder |
| audit_log | append-only · hash-chained (prev_hash + row_hash) [INV-12] |
| outbox_events | transactional outbox for dual-write [INV-5] |
| processed_events | inbound webhook idempotency · UNIQUE(event_id) [INV-4] |
| pending_emails | email outbox (drain with backoff → DLQ) |
| import_jobs | Synthea / FHIR import job tracking |
| code_reference | ICD-10-CM seed (public-domain descriptors) |
Key decisions (ADRs)
The choices that shaped the buildout, with the alternative that was rejected and why.
| Decision | Choice | Alternative | Why |
|---|---|---|---|
| Code set, v1 | ICD-10-CM only (procedures by code) | Full CPT descriptors | CPT is AMA license-restricted; bundling descriptors would breach the license (ADR-002) |
| DB access | Raw pg + params + scope wrappers | Prisma / TypeORM | Keeps the tenant-scope boundary mechanically lintable (ADR-003) |
| Real-time | SSE | WebSocket | Push is one-way (server→client); simpler over existing HTTPS (ADR-004) |
| Dual-write | Transactional outbox + polling relay | Debezium CDC | No extra infra at medium scale (ADR-005) |
| Password hash | argon2id (native) | bcryptjs | Native primitive, no substitution (ADR-006) |
| Sessions | httpOnly cookie JWT (jose), CSRF-guarded | Header bearer tokens | Avoids JS-readable token theft (ADR-007) |
| Deploy | Local Docker stack (cloud-portable) | Direct cloud | Externals unprovisioned at build time; gated + graceful (ADR-008) |
| Monorepo PM | npm workspaces | pnpm | pnpm absent in build env; Node 22 + npm 10 present (ADR-012) |
ADR-016 — fixed read-only coding panel
UI e2e tests exposed that the client checked invented statuses (in_progress, unassigned), whereas the schema defines draft|pending_audit|completed|returned. As a result canEdit never matched a coder's draft chart → permanent read-only panel. Fixed: canEdit is now isCoder && (draft|returned).
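A minimal sketch of the corrected check follows. The status union comes from the schema values listed above; the function signature and the `Role` type are assumptions.

```typescript
type ChartStatus = "draft" | "pending_audit" | "completed" | "returned";
type Role = "manager" | "supervisor" | "coder";

// Statuses in which a coder may still change codes.
const EDITABLE: ReadonlySet<ChartStatus> = new Set<ChartStatus>(["draft", "returned"]);

function canEdit(role: Role, status: ChartStatus): boolean {
  return role === "coder" && EDITABLE.has(status);
}
```

Typing the status as a closed union means an invented value such as "in_progress" fails at compile time rather than silently never matching.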
ADR-015 — ambiguous org_id 500
The charts-list query LEFT JOINs users; org_id exists on both tables, so WHERE org_id = $1 was ambiguous → Postgres error → 500 on the coder queue. Fixed by qualifying every filter column with the c. alias.
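The shape of the fix, as a sketch: the selected columns and the join condition are hypothetical, and only the alias-qualified filters reflect the ADR.

```typescript
// Hypothetical column names; only the aliasing pattern matters here.
function chartsListQuery(orgId: string, assigneeId: string) {
  const text = `
    SELECT c.id, c.status, u.email AS assignee_email
    FROM charts c
    LEFT JOIN users u ON u.id = c.assignee_id
    WHERE c.org_id = $1      -- qualified: users also has an org_id column
      AND c.assignee_id = $2`;
  return { text, values: [orgId, assigneeId] };
}
```

An unqualified `org_id` is rejected by Postgres as an ambiguous column reference once both joined tables carry that column, which is why the error appeared only after the join was added.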
Architectural invariants
14 rules — 12 machine-checked by scripts/invariant-lint.mjs, 2 boundary-audited at the Drift Detection Gate. Changing a rule means amending §9 + invariants.json + an ADR together; never disabling one to pass a gate.
| ID | Rule | Check |
|---|---|---|
| INV-1 | No raw pg pool construction/use (getPool(/new Pool(/pool.query() outside apps/api/src/server/**. Modules use server-layer helpers + the txn client from withTransaction. | forbidden-pattern |
| INV-2 | A design-token file is the single source of color/type/spacing, copied verbatim from DESIGN-TEMPLATE. | required-file |
| INV-3 | users enforces unique (case-insensitive) email to prevent duplicate-account enumeration. | unique-constraint |
| INV-4 | Webhook processed_events has UNIQUE(event_id) so replays are idempotent. | unique-constraint |
| INV-5 | outbox_events table exists (transactional outbox for dual-write). | required-file |
| INV-6 | In every webhook controller, signature verification precedes any DB read. | boundary-order |
| INV-7 | Auth rate-limit guard precedes the password-hash compare in the login path. | boundary-order |
| INV-8 | No hard-coded secrets / placeholder credentials committed in source. | forbidden-pattern |
| INV-9 | .env.example exists and enumerates every env var. | required-file |
| INV-10 | UI coverage: every screen route renders, every non-internal endpoint is referenced by UI source, every workflow has an e2e spec to its terminal step. | ui-coverage |
| INV-11 | Health endpoint exists and reports dependency status. | required-file |
| INV-12 | Audit log is append-only & hash-chained; no UPDATE/DELETE on audit_log anywhere. | forbidden-pattern |
| INV-13 | Reverse proxy + Node header-buffer sizing present (nginx large_client_header_buffers / Node max-http-header-size). | manual |
| INV-14 | argon2 (not bcryptjs/argon2-browser) is the password hash primitive; no substitution. | forbidden-pattern |
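
A forbidden-pattern check such as INV-1 can be approximated in a few lines. This is a sketch, not the contents of scripts/invariant-lint.mjs: the regexes and the `lintInv1` name are hypothetical, derived only from the patterns and path named in the table.

```typescript
interface Violation {
  file: string;
  line: number;
  pattern: string;
}

// Patterns named by INV-1. Hypothetical regexes; the real lint may differ.
const FORBIDDEN = [/\bgetPool\(/, /\bnew Pool\(/, /\bpool\.query\(/];
const ALLOWED_PREFIX = "apps/api/src/server/";

function lintInv1(file: string, source: string): Violation[] {
  if (file.startsWith(ALLOWED_PREFIX)) return []; // the server layer is exempt

  const violations: Violation[] = [];
  source.split("\n").forEach((text, index) => {
    for (const re of FORBIDDEN) {
      if (re.test(text)) {
        violations.push({ file, line: index + 1, pattern: re.source });
      }
    }
  });
  return violations;
}
```

A line-based regex scan is crude (it would also flag matches inside comments), but it is cheap to run on every commit and errs on the side of reporting.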
Threat model
STRIDE per trust boundary — the mitigation is wired into the architecture, not bolted on.
| Boundary | Threat | Mitigation |
|---|---|---|
| Auth | Spoofing / repudiation | Anti-enumeration (uniform shape + timing), rate-limit fires before hash compare (INV-7), audit-logged |
| Chart access | Elevation / disclosure | Default-deny tenant + role guards; cross-tenant isolation tests |
| Audit log | Tampering | Append-only, hash-chained (prev_hash + row_hash), verified by a job (INV-12) |
| Webhook ingress | Spoofing / replay | HMAC verify-first + processed_events idempotency (INV-4, INV-6) |
| Outbox | Consistency / dual-write | Single-txn write + idempotent consumers (INV-5) |
| Secrets / at-rest | Disclosure | Secrets Manager / KMS, TLS-only, encrypted Postgres volume |
Security Audit Gate (ADR-014)
- AUTH-1 (critical, fixed): registration with an existing email no longer authenticates as the existing account; it returns a shape-identical 201 with no session, preserving anti-enumeration without account takeover.
- CFG-1 (fixed): assertProdSecrets() blocks production boot on weak/missing JWT_SECRET/WEBHOOK_SIGNING_SECRET.
- DEP-1 (accepted): transitive postcss XSS is dev-toolchain only, not runtime-reachable; tracked for the next Next.js bump.
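A boot-time guard in the spirit of CFG-1 could look like the sketch below. The source does not define "weak", so the minimum length and the placeholder list are assumptions, as is passing the environment in as a parameter (done here to keep the function testable).

```typescript
const REQUIRED_SECRETS = ["JWT_SECRET", "WEBHOOK_SIGNING_SECRET"] as const;
const MIN_LENGTH = 32; // assumed threshold for "weak"
const PLACEHOLDERS = new Set(["changeme", "secret", "placeholder"]); // assumed list

function assertProdSecrets(env: Record<string, string | undefined>): void {
  if (env.NODE_ENV !== "production") return; // only production boot is blocked

  const problems: string[] = [];
  for (const key of REQUIRED_SECRETS) {
    const value = env[key];
    if (!value) {
      problems.push(`${key} is missing`);
    } else if (value.length < MIN_LENGTH || PLACEHOLDERS.has(value.toLowerCase())) {
      problems.push(`${key} is too weak`);
    }
  }

  if (problems.length > 0) {
    throw new Error(`Refusing to boot: ${problems.join("; ")}`);
  }
}
```

Collecting every problem before throwing means an operator sees all misconfigured secrets in one failed boot instead of fixing them one restart at a time.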
Buildout waves
Built as a single orchestrated run across 5 dependency-ordered waves (ADR-011) — one consistent contract layer authored centrally, each feature run as its own iteration.
- F-01 scaffold, config, health, migrations
- F-02 tokens, app shell, landing
- packages/shared + packages/tokens
- F-03 Auth + RBAC
- F-04 Users + seed
- F-05 Synthea import
- F-06 Queue + assign
- F-07 Chart + coding panel
- F-08 Validation engine
- F-09 Audit + feedback
- F-10 Metrics / SSE / export
- F-11 Notifications · F-12 Audit-log
- F-13 Outbox + webhooks
- F-14 invariant-lint + smoke + Claude scaffold
- lint:invariants: must pass
- npm test: validation engine + signatures
- smoke-test.sh: full workflow matrix
- Playwright: e2e per workflow
Deployment
One docker compose up brings the whole stack to https://localhost.
| Service | Image | Role |
|---|---|---|
| db | postgres:17-alpine | Database (encrypted volume pgdata) |
| redis | redis:7-alpine | Queue + SSE cache |
| api | (build) | NestJS REST + SSE |
| worker | (same image) | import · outbox · email · metrics |
| web | (build) | Next.js SSR |
| nginx | nginx:1.27-alpine | TLS termination / reverse proxy |
    npm install
    npm run build -w @smartcoders/shared
    bash scripts/gen-certs.sh
    docker compose up -d --build   # → https://localhost
    bash scripts/smoke-test.sh
Required env: DATABASE_URL, REDIS_URL, JWT_SECRET, SESSION_COOKIE_NAME, APP_URL, WEBHOOK_SIGNING_SECRET. Optional/gated: RESEND_API_KEY, SENTRY_DSN, DD_*, AWS_*. Every var is enumerated in .env.example.