morethanadiagnosis-hub/openspec/changes/2025-11-17-data-model-v1/proposal.md
Claude 0afa81280e
feat(openspec): propose foundational infrastructure specs
Add three critical infrastructure proposals for implementation readiness:

1. Data Model v1 (Consolidated Schema)
   - All entities from approved features (User, Profile, Forum, Blog, etc.)
   - Field-level data classification (Public/PII/PHI)
   - Relationships, indexes, and retention policies
   - DSR support and migration strategy
   - Target: openspec/specs/data-model.md

2. Authentication & Authorization System
   - OAuth2/OIDC with PKCE for secure auth
   - RBAC (member, moderator, admin roles)
   - Pseudonym support for privacy
   - MFA (TOTP), password reset, account lockout
   - Audit logging for compliance
   - Target: openspec/specs/architecture.md

3. Design System & Component Library
   - Unified components for Android/iOS/Web parity
   - WCAG 2.2 AA+ accessibility built-in
   - Design tokens (colors, typography, spacing)
   - Theming (light/dark/high contrast)
   - Platform-specific adaptations (RN, Next.js)
   - Target: openspec/specs/architecture.md

These proposals unblock implementation work across all features.

/review areas=backend,security,compliance,accessibility,mobile,web,design
2025-11-18 00:17:49 +00:00


Proposal: Data Model v1 (Consolidated Schema)

Status: draft
Authors: Architecture Team, Data Team
Owners: Architecture Lead, Data Lead, Compliance Lead
Created: 2025-11-17
Scope: spec
Related: openspec/specs/data-model.md

Summary

  • Consolidate all entity schemas from approved feature specs into a unified data model with field-level data classification, relationships, and migration strategy.

Motivation

  • Ensure consistent data modeling across all features before implementation begins.
  • Establish clear PHI/PII boundaries and retention policies at the schema level.
  • Enable efficient backend development with well-defined entities and relationships.

Goals / Non-Goals

  • Goals: consolidated entity schemas, field-level data classes (Public/PII/PHI), relationships/foreign keys, indexing strategy, retention/soft-delete rules, migration versioning.
  • Non-Goals: vendor-specific schema syntax (use portable DDL concepts); performance tuning (covered in implementation).

Requirements

Functional

  • All entities from approved specs: User, Profile, ForumCategory, ForumThread, ForumPost, ForumReaction, ForumReport, BlogPost, PodcastEpisode, TributeEntry, Resource, MerchProduct, Order, Consent, etc.
  • Relationships clearly defined with foreign keys and cascade rules.
  • Indexes for common query patterns (user lookups, thread pagination, search, etc.).

Privacy & Compliance

  • Every field tagged with data class: Public, PII, or PHI.
  • PHI fields isolated where possible; encryption requirements noted.
  • Retention policies per entity (e.g., soft-delete window, hard-delete rules).
  • DSR support: export and delete operations mapped to entities.
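
One way to make "export and delete operations mapped to entities" concrete is a small registry that records, per entity, which column links rows to the requesting user and what a DSR delete does to them. The following is a minimal sketch, not the approved design: `DsrRule`, `DeleteAction`, `DSR_RULES`, `export_plan`, and `delete_plan` are hypothetical names, and only three entities are shown, with actions taken from the Relationships and Retention sections of this proposal.

```python
from dataclasses import dataclass
from enum import Enum


class DeleteAction(Enum):
    HARD_DELETE = "hard_delete"  # remove the rows outright
    ANONYMIZE = "anonymize"      # keep the rows, detach them from the user
    RETAIN = "retain"            # keep until a legal retention period ends


@dataclass(frozen=True)
class DsrRule:
    entity: str        # entity name from the Data Model section
    user_column: str   # column that links a row to the requesting user
    export: bool       # include rows in a DSR export?
    on_delete: DeleteAction


# Illustrative subset (assumption); a full mapping would list every
# user-linked entity in the data model.
DSR_RULES = [
    DsrRule("Profile", "user_id", True, DeleteAction.HARD_DELETE),
    DsrRule("ForumPost", "author_id", True, DeleteAction.ANONYMIZE),
    DsrRule("Order", "user_id", True, DeleteAction.RETAIN),
]


def export_plan(rules):
    """Return the (entity, user_column) pairs a DSR export must query."""
    return [(r.entity, r.user_column) for r in rules if r.export]


def delete_plan(rules):
    """Group entity names by the action a DSR delete applies to them."""
    plan = {action: [] for action in DeleteAction}
    for rule in rules:
        plan[rule.on_delete].append(rule.entity)
    return plan
```

A registry of this shape also gives the DSR tests something to assert against: any entity with a user foreign key that is missing from the registry is a coverage gap.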

Data Model

Core Entities

  • User: id, email (PII), created_at, updated_at, deleted_at
  • Profile: id, user_id (FK), display_name (PII), pseudonym, pronouns, avatar_url (PII), bio, health_journey (PHI, private by default), consent_flags, created_at, updated_at
  • ForumCategory: id, name, description, order, created_at
  • ForumThread: id, category_id (FK), author_id (FK User), title, pinned, locked, created_at, updated_at
  • ForumPost: id, thread_id (FK), author_id (FK User), parent_post_id (FK ForumPost, nullable), content (may contain PHI), deleted_at, created_at, updated_at
  • ForumReaction: id, post_id (FK), user_id (FK), emoji_code, created_at
  • ForumReport: id, post_id (FK), reporter_id (FK User), reason, status, moderator_notes, resolved_at, created_at
  • BlogPost: id, author_id (FK User), title, slug, content, published_at, created_at, updated_at
  • PodcastEpisode: id, title, description, audio_url, duration, published_at, created_at
  • TributeEntry: id, author_id (FK User), subject_name, memorial_text (may contain PHI), published, created_at, updated_at
  • Resource: id, title, slug, content, access_tier (public/members), tags, created_at, updated_at
  • MerchProduct: id, name, description, price, stock_count, created_at, updated_at
  • Order: id, user_id (FK), total, status, shipping_address (PII), created_at, updated_at
  • Consent: id, user_id (FK), consent_type, granted, granted_at, revoked_at

Relationships

  • User → Profile (1:1, cascade delete)
  • User → ForumPost (1:N, soft-delete user → anonymize posts)
  • User → ForumThread (1:N)
  • ForumCategory → ForumThread (1:N)
  • ForumThread → ForumPost (1:N, cascade delete)
  • ForumPost → ForumReaction (1:N, cascade delete)
  • ForumPost → ForumReport (1:N)
  • User → BlogPost (1:N)
  • User → TributeEntry (1:N)
  • User → Order (1:N)
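
The cascade rules above can be exercised with a small in-memory database. This is a sketch using Python's standard `sqlite3` module rather than Postgres, and it covers only four tables. The table and column names (`users`, `profiles`, `forum_threads`, `forum_posts`) are assumptions derived from the entity list, and `ON DELETE SET NULL` on `author_id` is one possible way to keep posts once the author row is hard-deleted; the proposal itself does not prescribe it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE profiles (
    id           INTEGER PRIMARY KEY,
    -- UNIQUE makes User -> Profile 1:1; CASCADE removes it with the user.
    user_id      INTEGER NOT NULL UNIQUE REFERENCES users(id) ON DELETE CASCADE,
    display_name TEXT
);
CREATE TABLE forum_threads (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE forum_posts (
    id        INTEGER PRIMARY KEY,
    thread_id INTEGER NOT NULL REFERENCES forum_threads(id) ON DELETE CASCADE,
    -- Assumption: posts outlive their author, so the link is nulled.
    author_id INTEGER REFERENCES users(id) ON DELETE SET NULL,
    content   TEXT NOT NULL
);
""")

conn.execute("INSERT INTO users (id, email) VALUES (1, 'a@example.org')")
conn.execute("INSERT INTO profiles (user_id, display_name) VALUES (1, 'A')")
conn.execute("INSERT INTO forum_threads (id, title) VALUES (1, 'Welcome')")
conn.execute(
    "INSERT INTO forum_posts (thread_id, author_id, content) VALUES (1, 1, 'hello')"
)

# Hard-deleting the user removes the profile but keeps the post.
conn.execute("DELETE FROM users WHERE id = 1")
profiles_left = conn.execute("SELECT COUNT(*) FROM profiles").fetchone()[0]
post_author = conn.execute("SELECT author_id FROM forum_posts").fetchone()[0]

# Deleting the thread removes its posts.
conn.execute("DELETE FROM forum_threads WHERE id = 1")
posts_left = conn.execute("SELECT COUNT(*) FROM forum_posts").fetchone()[0]
```

After this runs, `profiles_left` and `posts_left` are both 0 and `post_author` is `None`.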

Data Classification Summary

  • Public: ForumCategory, PodcastEpisode, Resource (public tier), MerchProduct, BlogPost (published)
  • PII: User.email, Profile.display_name, Order.shipping_address, Profile.avatar_url
  • PHI: Profile.health_journey, ForumPost.content (context-dependent), TributeEntry.memorial_text (context-dependent)
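
Because the requirement is that every field carries a data class, the summary above could be kept in a machine-checkable form. The sketch below is illustrative only: `DataClass`, `ENTITY_FIELDS`, `CLASSIFICATION`, and `untagged_fields` are hypothetical names, and only the User and Profile entities are included, with field lists copied from Core Entities and tags copied from the summary.

```python
from enum import Enum


class DataClass(Enum):
    PUBLIC = "Public"
    PII = "PII"
    PHI = "PHI"


# Field lists for two entities, copied from the Core Entities section.
ENTITY_FIELDS = {
    "User": ["id", "email", "created_at", "updated_at", "deleted_at"],
    "Profile": [
        "id", "user_id", "display_name", "pseudonym", "pronouns",
        "avatar_url", "bio", "health_journey", "consent_flags",
        "created_at", "updated_at",
    ],
}

# Tags taken from the classification summary; all other fields are
# deliberately left untagged here so the check has something to report.
CLASSIFICATION = {
    "User.email": DataClass.PII,
    "Profile.display_name": DataClass.PII,
    "Profile.avatar_url": DataClass.PII,
    "Profile.health_journey": DataClass.PHI,
}


def untagged_fields(entity_fields, classification):
    """Return 'Entity.field' names that have no data class assigned."""
    return sorted(
        f"{entity}.{field}"
        for entity, fields in entity_fields.items()
        for field in fields
        if f"{entity}.{field}" not in classification
    )
```

Running `untagged_fields(ENTITY_FIELDS, CLASSIFICATION)` in CI would surface fields such as `Profile.pseudonym` and `Profile.bio` that the summary does not yet classify.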

Indexing Strategy

  • User: email (unique), created_at
  • Profile: user_id (unique FK)
  • ForumThread: category_id, author_id, created_at, updated_at
  • ForumPost: thread_id, author_id, created_at
  • BlogPost: slug (unique), author_id, published_at
  • Resource: slug (unique), access_tier, tags (GIN/array index)
  • Order: user_id, created_at
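
Two of the entries above can be checked in a few lines: a unique index rejecting a duplicate email, and a query on `thread_id` using its index. This sketch uses in-memory SQLite as a stand-in for Postgres, and the index and table names are assumptions. The GIN index on `Resource.tags` is Postgres-specific and is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    email      TEXT NOT NULL,
    created_at TEXT
);
CREATE TABLE forum_posts (
    id         INTEGER PRIMARY KEY,
    thread_id  INTEGER NOT NULL,
    author_id  INTEGER,
    created_at TEXT
);
CREATE UNIQUE INDEX idx_users_email ON users (email);
CREATE INDEX idx_forum_posts_thread_id ON forum_posts (thread_id);
""")

# The unique index turns a duplicate email into an IntegrityError.
conn.execute("INSERT INTO users (email) VALUES ('a@example.org')")
try:
    conn.execute("INSERT INTO users (email) VALUES ('a@example.org')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# EXPLAIN QUERY PLAN reports which index, if any, a query would use.
# The human-readable detail text is the fourth column of each row.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM forum_posts WHERE thread_id = ?", (1,)
).fetchall()
plan_text = " ".join(row[3] for row in plan_rows)
uses_index = "idx_forum_posts_thread_id" in plan_text
```

The same pattern (assert on the query plan for each critical path) could back the query performance metrics listed under Observability, though plan output differs between database engines.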

Retention & Soft-Delete

  • User: soft-delete (deleted_at); 90-day window before hard-delete; anonymize posts.
  • ForumPost: soft-delete (deleted_at); 90-day window; on user delete → replace author with "[deleted]".
  • BlogPost, TributeEntry: indefinite retention unless user requests DSR delete.
  • Order: 7-year retention for compliance (tax/commerce), then hard-delete.
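
The retention rules above reduce to a few date comparisons. The following sketch makes them explicit under stated assumptions: the function names are hypothetical, the 7-year Order period is counted from `created_at` (the proposal does not say which timestamp starts the clock), and a row becomes eligible exactly when the window has fully elapsed.

```python
from datetime import datetime, timedelta, timezone

SOFT_DELETE_WINDOW = timedelta(days=90)
ORDER_RETENTION_YEARS = 7


def hard_delete_due(deleted_at, now):
    """True once a soft-deleted row has passed the 90-day window."""
    if deleted_at is None:  # row was never soft-deleted
        return False
    return now - deleted_at >= SOFT_DELETE_WINDOW


def order_purge_due(created_at, now):
    """True once an order is at least 7 calendar years old."""
    try:
        cutoff = created_at.replace(year=created_at.year + ORDER_RETENTION_YEARS)
    except ValueError:
        # created_at was Feb 29 and the target year is not a leap year.
        cutoff = created_at.replace(
            year=created_at.year + ORDER_RETENTION_YEARS, day=28
        )
    return now >= cutoff


def anonymize_post(post):
    """Return a copy of a post dict with the author replaced by "[deleted]"."""
    return {**post, "author_id": None, "author_label": "[deleted]"}


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
recently_deleted = hard_delete_due(now - timedelta(days=89), now)
window_elapsed = hard_delete_due(now - timedelta(days=90), now)
```

Here `recently_deleted` is `False` and `window_elapsed` is `True`. Passing `now` in explicitly, rather than reading the clock inside the functions, is what lets the retention tests simulate time-based purges.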

Migration Strategy

  • Versioned migrations (e.g., Alembic, Flyway, or similar).
  • Idempotent scripts that are safe to re-run, each paired with a down migration so a release can be rolled back.
  • Seed data for initial categories, default consents, sample resources.
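
To illustrate what versioned, re-runnable migrations mean in practice, here is a toy runner built on the standard `sqlite3` module. It is not Alembic or Flyway and is not meant to replace them; `MIGRATIONS`, `migrate`, and the `schema_migrations` table are hypothetical names. The seed row ("General") is an invented example category.

```python
import sqlite3

# Ordered (version, SQL) pairs. The ForumCategory "order" field is stored
# as sort_order here because ORDER is a reserved word in SQL.
MIGRATIONS = [
    (1, """CREATE TABLE forum_categories (
               id INTEGER PRIMARY KEY,
               name TEXT NOT NULL UNIQUE,
               sort_order INTEGER NOT NULL DEFAULT 0
           )"""),
    (2, "INSERT INTO forum_categories (name, sort_order) VALUES ('General', 1)"),
]


def migrate(conn):
    """Apply migrations not yet recorded; return the versions applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    newly_applied = []
    for version, sql in MIGRATIONS:
        if version in applied:
            continue  # already recorded, so re-running is a no-op
        # `with conn` commits on success and rolls back pending changes on error.
        with conn:
            conn.execute(sql)
            conn.execute(
                "INSERT INTO schema_migrations (version) VALUES (?)", (version,)
            )
        newly_applied.append(version)
    return newly_applied
```

Calling `migrate(conn)` twice on the same connection applies versions 1 and 2 the first time and nothing the second time, which is the property that makes repeated staging deploys safe.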

Security & Threat Model

  • Encryption at rest: PII/PHI fields encrypted at the database level or the application level.
  • Access controls: RBAC enforced at API layer; row-level security (RLS) for multi-tenancy if needed.
  • Audit logging: all mutations on PHI/PII entities logged (excluding PHI content itself).
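
A simple way to log mutations without logging PHI content is to record which fields changed but never their values. The sketch below assumes a hypothetical `audit_record` helper and record shape; neither is defined elsewhere in this proposal.

```python
import json
from datetime import datetime, timezone


def audit_record(actor_id, entity, entity_id, changes):
    """Build a log-safe audit entry for a mutation.

    `changes` maps field names to new values. Only the field names are
    kept, so PHI/PII content never reaches the audit log.
    """
    return {
        "at": datetime.now(timezone.utc).isoformat(),
        "actor_id": actor_id,
        "entity": entity,
        "entity_id": entity_id,
        "changed_fields": sorted(changes),  # names only, never values
    }


record = audit_record(
    actor_id=7,
    entity="Profile",
    entity_id=42,
    changes={"health_journey": "free-text health details", "bio": "hello"},
)
serialized = json.dumps(record)
```

Here `record["changed_fields"]` is `["bio", "health_journey"]` and the free-text value does not appear anywhere in `serialized`. Dropping all values, not only those of fields tagged PHI, is the conservative choice: it stays correct even if a field's classification changes later.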

Observability & Telemetry

  • Schema change tracking in migration logs.
  • Query performance metrics for critical paths (thread list, user profile fetch).

Test Plan

  • Unit tests for schema constraints (foreign keys, unique indexes).
  • Integration tests for cascade deletes and soft-delete behavior.
  • DSR export/delete tests: verify all user data is captured or purged.
  • Retention tests: simulate time-based purges.

Migration / Rollout Plan

  • Deploy schema migrations to staging; run dry-run seeds.
  • Validate with read-only queries before enabling writes.
  • Rollback plan: revert migration, restore from backup if needed.

Risks & Mitigations

  • Schema drift across features → enforce single source of truth in data-model.md.
  • PHI leakage → code reviews and automated scanning for PHI in logs.

Alternatives Considered

  • NoSQL vs relational: choosing relational (Postgres) for strong consistency, relationships, and mature tooling.
  • Schema-per-feature vs unified: choosing unified for easier joins and data integrity.

Work Breakdown

  1. Document all entity schemas in data-model.md
  2. Generate migration scripts from schema
  3. Seed test data for development
  4. Validate DSR export/delete coverage
  5. Review with compliance and security teams

Acceptance Criteria

  • All approved feature entities documented in data-model.md with field-level data classes.
  • Relationships and indexes defined.
  • Retention and soft-delete rules specified.
  • DSR export/delete coverage verified.
  • Sign-off from compliance and security teams.

Open Questions

  • Which database vendor? Postgres is assumed; should the schema be allowed to rely on vendor-specific features (e.g., GIN indexes)?
  • Multi-region replication strategy (future proposal)?
  • Real-time sync requirements for forum (WebSocket state management)?

Slash Commands

  • /review areas=backend,compliance,security,data
  • /apply spec=openspec/specs/data-model.md
  • /archive link=<PR>