Plan: CRM Conformance Suite

Why

Atomo's README calls the CRM the flagship app that "drives platform evolution," but until recently nothing enforced that — every Rust integration test used a synthetic 2–3 field schema. The first test that ran the real services/crm-service/schema.ts through the platform (crates/atomo/tests/crm_dogfood.rs) immediately surfaced four silent bugs that toy schemas could never reach (enum→JSONB, array NOT NULL, validation regex only matching single quotes, and validation never enforced in the data layer).

This plan turns the CRM from a demo into an executable specification: a conformance suite of integration tests, all driven by the real schema.ts, that systematically walk Atomo's capability surface. If a platform change breaks the flagship, a test goes red.

Framing (honest): the CRM can validate most of the backend, but not all. A few capabilities (multi-tenant, OAuth, CLI, SDK offline sync) need their own harnesses because the CRM can't naturally express them. So: CRM as the primary conformance driver, plus targeted supplementary harnesses for what it structurally can't reach.

Progress (as of 2026-05-31)

Phases A, SEC, B, C, and D are complete. The conformance pass is done.

CI-verified (run 26718854343): the manual workflow_dispatch CI ran the whole suite in a clean environment against pgvector/pgvector:pg16 — Test Suite ✅ (10m18s: cargo test --workspace + DB-gated --ignored --test-threads=1) and macOS Build ✅. The only failure was Linting (cargo fmt --all --check) from accumulated formatting drift — fixed (461ac9f). Because CI has pgvector, the AI/pgvector path (D2) is exercised there (the local infra block doesn't apply in CI).

Outcome so far: 9 silent gaps fixed (listed below), 2 security holes closed, 2 capabilities verified already-working, all driven by the real CRM schema:

Silent gap fixed	Phase	Layer
enum field → JSONB column	(dogfood)	codegen
array field `NOT NULL` no default	(dogfood)	codegen
validation regex single-quote-only	(dogfood)	parse
validation never enforced in data layer	(dogfood)	data
explicit `tableName` ignored	A1	codegen
RBAC access rules never parsed → allow-all	S1	parse
projection: delete never removed rows	B2	runtime
projection: numeric stored as `""`	B2	runtime
pagination cache-key collision (page 2 == page 1)	C2	runtime

Security holes closed: unauthenticated /graphql/ws (S2), tenant scoping non-functional (S3). Verified already-working: audit (B4), relationship include (C1).

Key insight (now addressed): the bugs clustered in the parse/codegen layer — anything reading a declaration from the export const schema metadata (validation, RBAC, tableName, and the latent relationships gap) was broken the same way, because each feature had its own fragile brace-walk pass. Resolved: a single parse_model_metadata now extracts tableName / validation / access / relationships from each model block via one shared brace-balanced sub_block + top_level_entries helper. The three duplicate passes were deleted; relationship resolution is now schema-driven (the C1 latent gap fell out for free). This kills the recurring bug class at the source.
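
A minimal sketch of what a brace-balanced `sub_block` + `top_level_entries` pair can look like. The helper names come from the paragraph above; the signatures and bodies are assumptions for illustration, not the actual atomo parser. The sketch deliberately ignores braces and commas inside string literals (for example a regex containing `{2,}`), which a real parser would have to handle.

```rust
/// Returns the body of the `{ ... }` block that directly follows `key:`,
/// or None if the key is absent or its value is not a block.
/// (Hypothetical signature; matches `key:` by plain substring search.)
fn sub_block<'a>(src: &'a str, key: &str) -> Option<&'a str> {
    let needle = format!("{key}:");
    let after_key = src.find(&needle)? + needle.len();
    // The value must start with `{`; a scalar such as `tableName: "contact"` is not a block.
    let rest = src[after_key..].trim_start();
    if !rest.starts_with('{') {
        return None;
    }
    let open = src.len() - rest.len();
    let mut depth = 0usize;
    for (i, ch) in rest.char_indices() {
        match ch {
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    // Slice excludes both the opening and the closing brace.
                    return Some(&src[open + 1..open + i]);
                }
            }
            _ => {}
        }
    }
    None // unbalanced braces
}

/// Splits a block body into `key: value` pairs, cutting only on commas
/// that sit at nesting depth 0, so nested blocks stay in one piece.
fn top_level_entries(body: &str) -> Vec<(&str, &str)> {
    let mut segments = Vec::new();
    let mut depth = 0i32;
    let mut start = 0usize;
    for (i, ch) in body.char_indices() {
        match ch {
            '{' | '[' => depth += 1,
            '}' | ']' => depth -= 1,
            ',' if depth == 0 => {
                segments.push(&body[start..i]);
                start = i + 1;
            }
            _ => {}
        }
    }
    segments.push(&body[start..]);
    segments
        .into_iter()
        .filter_map(|seg| seg.split_once(':'))
        .map(|(k, v)| (k.trim(), v.trim()))
        .collect()
}
```

Because every declaration (tableName, validation, access, relationships) goes through the same two functions, a nesting bug would show up in all of them at once instead of silently in one feature-specific pass.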

Deferred backlog (each documented inline below): data-layer RBAC auto-enforcement was the headline deferral. The client.enforce_access(model, action, role) seam exists and is tested, and request-path enforcement has since landed through the *_checked variants (see the RBAC entry under "Remaining backlog"). The unchecked create/update/delete calls still skip the check by design, as the trusted/system API.

Recently closed (this pass): SDK SubscriptionBuilder filtering; S3a subscription tenant-filter; B2a projection rebuild-replay; item3 workflow Mutation step executes via an injected GraphQL executor; item6 OAuth token round-trip tested against a mock IdP; item5/S3b per-user tenant binding + header validation; item4 RBAC unchecked variants documented as the trusted/system API (request path goes through *_checked).

Remaining backlog, with the honest blocker for each (these are not minimal-code items):

RBAC — ✅ data-layer enforcement landed (*_checked); unchecked variants are the documented trusted/system API (seeding/migrations). No further lockdown without breaking legit system use.
Workflow JS steps — Mutation now runs via injected executor (item3); Plugin step was removed [Note: WASM plugin system was removed; replaced by actions & workers]. JS steps (sales-pipeline.yml) still need a JS step runtime.
S3c event-store tenant scoping + PG row-level-security — design fork: RLS needs a per-connection SET app.tenant_id read by CREATE POLICY, but the shared pool means a half-implementation could leak across pooled connections; safe impl = per-tx set/reset + generated policies. It's defense-in-depth atop the working app-layer scoping (S3/S3a/S3b).
AI/pgvector (D2): needs the pgvector extension + an embedding provider; not available locally, but it runs in CI (the CI Postgres is pgvector/pgvector, and the Test Suite there is green). crm_ai.rs now provides the CRM-driven assertion (embed Contact notes → cosine search ranks nearest); it is #[ignore]d locally, so the path is verified in CI only.
OAuth token round-trip (D3): ✅ closed by item6. exchange_code → get_user_info is tested against a mock IdP (token_round_trip_against_mock_idp), on top of the already-tested authorize-URL slice.
CLI migrate/codegen smoke (D4) — heavier (DB / full parser); init is smoke-tested.
exists: validation rule — deferred to FK constraints (DB enforces referential integrity).
LOW-risk cache polish (find_unique uncached, no eviction, Debug keys).

Status legend

✅ conformance-tested via CRM — proven against the real schema
🟡 synthetic-only — an integration test exists, but on a toy schema, not the CRM
🔴 GAP — investigated and found broken/silently dropping a real-schema declaration
🔬 read-only — code read/exists, but no integration test (treat as "unverified")
❌ no test

Coverage map

Capability	CRM can drive?	Status	Notes
Schema→codegen→migrations	yes	✅	dogfood fixed enum/array; A1 honors `tableName`; enums still emit junk tables (pre-existing, noted)
CRUD	yes	✅	`crm_dogfood` + `integration_test`
Validation rules	yes	✅	data-layer enforced on create + update (update-aware via `validate_partial`); `exists:` deferred to FKs
Relationships (belongsTo/hasMany)	yes	✅	C1: `include` resolves contact.company + contact.deals (dogfood). Now schema-driven — `resolve_includes` reads the declared `relationships` block (unified parser), so a rel whose name ≠ model resolves too (`schema_driven_include_resolves_renamed_relationship`)
Soft delete / restore / hard delete	yes	✅	C2: full delete→trash→restore lifecycle via CRM Deals
Pagination + where/orderBy	yes	✅ fixed	C2: orderBy+limit+offset via CRM; fixed cache-key collision (page 2 returned page 1)
Event sourcing + replay	yes	✅	C3: Deal Created→Updated→Updated→Deleted reconstructs via `entity_history` (`crm_deal_event_history_replays`); confirms B2 delete-event id fix
GraphQL resolvers	yes	🟡	`http_e2e`, synthetic
Subscriptions (WebSocket)	yes	✅	S2 auth (connection_init JWT + read-gating) + S3a tenant-filter; SDK `SubscriptionBuilder` now filters by model+event-type (`stream_filters_by_model_and_event_type`)
RBAC enforcement	yes	✅	S1 parse + decide seam; data-layer enforced via `*_checked` (GraphQL routes through them). Unchecked variants = documented trusted/system API (item4)
Audit logging	yes	✅	B4: model-agnostic listener works through CRM models (`test_crm_mutation_audited_with_actor`) — already worked, no fix
Workflows	yes	🟡 mostly	B1 YAML + HTTP step; item3: `Mutation` step now runs via injected GraphQL executor (`mutation_step_runs_via_injected_executor`). Plugin step removed; JS steps still placeholders
Actions & workers	yes	✅	action dispatcher, worker CRUD API, typed SDK, lifecycle test
Caching (TTL + invalidation)	yes	✅	C4: populate + invalidate-on-create confirmed via CRM (dogfood 7b). LOW polish deferred (find_unique uncached, no eviction)
CQRS projections / aggregate	yes	✅	B2: Deleted removes rows; non-string columns via `value_to_text`; B2a rebuild now replays from `event_log` (`projection_correctness` 2 tests)
AI / pgvector	partial	✅ CI	D2/item5: `crm_ai.rs` embeds Contact notes → cosine search ranks nearest (needs pgvector; runs in CI, `#[ignore]` locally)
Multi-tenant (RLS)	yes	🟡 core+	S3+D1+S3a+S3b: tenant_id col; read/write/subscription scoping; per-user binding (users.tenant_id) + header validation (`tenant_header_validation`). S3c PG-RLS = documented design fork
OAuth/OIDC	no (needs mock IdP)	✅	D3 + item6: authorize-URL unit-tested; token round-trip (exchange_code → get_user_info) tested against a mock IdP (`token_round_trip_against_mock_idp`)
Rate limiting	infra	✅	`middleware.rs`
CLI (init/dev/migrate/codegen)	no (process-level)	🟡 init	D4: `init` scaffold smoke-tested via the built binary; migrate/codegen/dev deferred (heavier)
SDK offline queue/sync	no (client harness)	❌	types only
Admin UI	via E2E	🟡	Playwright (timeline, kanban) — may use demo fallback

Investigation findings (2026-05-31, parallel discovery + spot-verified)

Read-only investigation of the 5 biggest unknowns. Every capability probed has at least one HIGH-risk silent gap — same pattern as the dogfood bugs: the platform parses/accepts a schema declaration, then silently drops/skips/mismaps it. Two findings spot-verified by direct read (RBAC parse gap, tenant_id column gap); the rest are subagent reports with file:line and should be reconfirmed by the conformance test that targets them.

Ranked by risk × correctness/security impact:

#	Capability	Worst gap (file:line)	Class	Risk
1	RBAC	access rules never parsed from `export const schema`; `Model.access` always `None`; `check_access` defaults allow-all (`graphql.rs:49-53`)	SECURITY	🔴 HIGH
2	Subscriptions auth	`/graphql/ws` mounted with no auth middleware (`handlers.rs:253`)	SECURITY	🔴 HIGH
3	Multi-tenant	no `tenant_id` column generated (`schema.rs:29-75`); reads/writes fail or leak; no RLS; no header→user check	SECURITY	🔴 HIGH
4	Workflows	YAML never loaded (`lib.rs:148`), schema mismatch, steps are no-ops (`workflow.rs:230-260`)	CORRECTNESS (facade)	🔴 HIGH
5	Projections	Deleted never removes rows; numeric→`""`; rebuild = truncate-no-replay	CORRECTNESS (data)	🔴 HIGH
6	Cache	find_unique uncached, no eviction, Debug-format keys	PERF	🟢 LOW

Three are SECURITY holes (#1 RBAC bypass, #2 unauth WebSocket, #3 tenant bypass/leak) — any authenticated (or for #2, unauthenticated) client can read/modify all data. These jump the queue. Two are CORRECTNESS holes (#4 workflows are facade, #5 projections silently corrupt the read model). The shared root cause of #1 (and the earlier validation bug) is the same parser gap: only the defineModel DSL format is parsed for access/validation, not the export const schema format the real CRM (and the docs' own examples) use.

Phases

Each phase grows crm_dogfood (or sibling CRM-driven tests) and ends with the platform demonstrably running its flagship for that capability. These are bug-fix phases, not just test-add phases — the investigation proved the features don't work, so fixing what the test targets is the bulk of the work.

Phase A — Unblocker (must go first, solo)

[x] A1. Honor explicit tableName (✅ done — Model.table_name, parsed via parse_table_names, used in sql_builder::table_name + migrations; falls back to pluralized name). crm_dogfood now creates company/contact/deal (not companys). tableName parsing is covered by the parses_and_enforces_access_rules unit test (its fixture sets tableName: "contact").
- Migration-drift assessment: the 7 hand-written CRM migrations are stale artifacts of an older, buggier codegen — they predate this session's fixes. With tableName honored the table names now match (contact/company/deal), but the hand-written SQL also has: version INTEGER + per-table indexes (platform generates neither), no deleted_at (so soft-delete would break on them), enum-as-table junk (companysize with _enum_value_0..5), and UUID ids on block tables vs TEXT on core. Recommendation: retire the hand-written migrations in favor of platform-generated ones rather than reconcile column-by-column — but that's a separate, riskier change to the CRM service's migration history (deferred, not done).
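
As a rough illustration of the A1 behaviour, the lookup can be sketched as "explicit name wins, otherwise pluralize". The `Model` struct here is a reduced stand-in, and the naive append-an-`s` pluralizer is an assumption inferred from the `companys` table name mentioned above; the real `sql_builder::table_name` may differ.

```rust
/// Reduced stand-in for the schema model; only the fields this sketch needs.
struct Model {
    name: String,
    /// Set when the schema declares an explicit `tableName`.
    table_name: Option<String>,
}

/// Explicit `tableName` wins; otherwise fall back to a naive plural
/// (assumed here to be lowercase name + "s", which is how `companys` arises).
fn table_name(model: &Model) -> String {
    match &model.table_name {
        Some(explicit) => explicit.clone(),
        None => format!("{}s", model.name.to_lowercase()),
    }
}
```

With the explicit name honored, `Company` maps to `company`; without it, the fallback produces `companys`, which is the mismatch the hand-written migrations ran into.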

Phase SEC — Security holes (jumped the queue; do right after A1)

[x] S1. RBAC (✅ GraphQL path done): parse_access_rules extracts access from the export-const-schema format (the bug: only defineModel was parsed); attached in a sixth parse pass. New shared seam AccessControl::decide(action, role) -> AccessDecision (in atomo_schema) handles public/authenticated/pipe-OR; graphql.rs::check_access refactored onto it. Tests: parses_and_enforces_access_rules (unit) + test_rbac_viewer_denied_create_admin_allowed (e2e: viewer denied, admin allowed). CAVEAT at the time of S1: data-layer client.create/update/delete did not enforce, because it had no role context (only actor user_id), so direct SDK/internal callers bypassed the rules even though the GraphQL boundary was closed. Since closed by item4: the data layer enforces via the *_checked variants that GraphQL routes through, and the unchecked variants remain as the documented trusted/system API.
[x] S2. WebSocket auth (✅ done): /graphql/ws now routes to an authenticated handler (graphql_ws_handler) that verifies a JWT from the connection_init payload ({"authorization":"Bearer <jwt>"} / bare token) and injects UserRoleCtx/UserIdCtx — rejects missing/invalid tokens. Second layer: model_changes resolver now takes ctx and gates by the model's read rule via AccessControl::decide (errors on Forbidden/NeedsAuth). Test: test_subscription_requires_auth_role (no role → rejected; role → stream stays open).
[x] S3. Multi-tenant — core done (✅): generate_migrations now emits a nullable tenant_id TEXT column on every table, so the pre-existing scope_by_tenant (reads) + create-resolver injection (writes) finally work — they failed before because the column never existed. Nullable = backward-compatible for single-tenant (no TenantCtx → NULL, no scoping). x-tenant-id is now only honored for authenticated requests (was: anyone could claim any tenant). Test: test_two_tenant_isolation (A and B each see only their own rows). Deferred (documented, not done):
- [x] S3a. Subscription tenant-filtering (✅ done): model_changes now filters events by the subscriber's TenantCtx (injected from the WS connection_init tenant field); None = unscoped. Cross-tenant real-time leak closed.
- [x] S3b. Per-user tenant binding (✅ done, item5): users.tenant_id now exists and the x-tenant-id header is validated against it (tenant_header_validation), so an authenticated user can no longer claim an arbitrary tenant. Before this there was no tenant_id on users to validate the header against.
- [ ] S3c. Event-store + PG RLS — events carry no tenant; no row-level-security policies generated (defense-in-depth beyond app-layer WHERE).
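
The decision seam from S1 can be sketched as a pure function over a rule string and an optional role. This is a simplified illustration, not the atomo_schema code: `Forbidden` and `NeedsAuth` are named in S2 above, while the `Allow` variant, the free-function shape, and passing the rule string directly (instead of looking it up by action) are assumptions.

```rust
#[derive(Debug, PartialEq)]
enum AccessDecision {
    Allow,
    NeedsAuth,
    Forbidden,
}

/// Decides access for one rule string such as "public", "authenticated",
/// or a pipe-separated role list like "admin|sales".
/// `role` is None for an unauthenticated caller.
fn decide(rule: &str, role: Option<&str>) -> AccessDecision {
    match (rule.trim(), role) {
        // Public rules never require a caller identity.
        ("public", _) => AccessDecision::Allow,
        // Every other rule needs an authenticated caller first.
        (_, None) => AccessDecision::NeedsAuth,
        ("authenticated", Some(_)) => AccessDecision::Allow,
        // Pipe-OR: any listed role is sufficient.
        (roles, Some(role)) => {
            if roles.split('|').any(|allowed| allowed.trim() == role) {
                AccessDecision::Allow
            } else {
                AccessDecision::Forbidden
            }
        }
    }
}
```

Keeping the decision free of I/O is what lets the GraphQL resolver, the WebSocket read-gate, and the data layer share one implementation and one set of unit tests.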

Phase B — Correctness holes

[~] B1. Workflows — engine fixed, CRM yml deferred (partial): YAML loading added (load_workflows now parses .json/.yaml/.yml into the Workflow struct via serde); the Http step now actually performs the request (was a no-op log) and records http_status. Tests: deal_update_event_finds_workflow (trigger wiring) + http_step_actually_sends_request (real HTTP to a local listener). DEFERRED — the CRM's own sales-pipeline.yml still cannot run: its steps are inline JavaScript (await sendNotification(...), throw new Error(...)) with type: validation|action|data_transformation — a shape the engine has no execution model for. Making it run needs a JS step runtime; this is a standalone feature not dependent on the removed plugin system [Note: WASM plugin system was removed; replaced by actions & workers].
- [~] B1a. Mutation step now runs via injected executor (was silent no-op "success" — a facade). Plugin step was removed. JS step runtime still TODO.
- [ ] B1b. A JS-step execution model so the CRM's literal sales-pipeline.yml runs.
[x] B2. Projections (corruption fixed; rebuild handled in B2a): (1) ✅ Deleted now removes the projection row: soft_delete gained RETURNING id and delete_many emits a Deleted event per affected id (was empty data → row never removed). (2) ✅ non-string columns stored correctly: projection binds via value_to_text (was as_str().unwrap_or_default() → numerics became ""). Test: projection_correctness (numeric value stored as "50000"; delete removes the row).
- [x] B2a. Rebuild now replays (✅ done): TableProjection::rebuild TRUNCATEs then replays the model's events from event_log via the same handle_event path (test rebuild_replays_from_event_log). Was truncate-only (silently emptied the read model).
[x] B3. Update-aware validation (✅ done): validate_partial only checks rules for fields present in the patch, enforced in update_many after before_update. A stage-only update no longer trips title: required, but setting title: "" is still rejected. Tests: 3 unit (partial_update_*, full_validate_still_requires_absent_field) + dogfood partial-update assertion. exists:<table>,<col> stays a documented no-op — referential integrity is the DB's job (FK constraints); a sync validator can't query, and an async pass would duplicate the FK.
[x] B4. Audit-on-CRM-mutation (✅ done — already worked): the boot audit listener is model-agnostic (subscribes to the event stream, records any model_name with the actor), so it handles CRM models correctly with no fix needed. test_crm_mutation_audited_with_actor proves a Contact create + update are both audited with op + actor sales-7. First capability that was not silently broken — only needed CRM-driven proof.
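
The projection fixes in B2/B2a can be illustrated with an in-memory model. Everything here is a simplified assumption: `Value` stands in for the JSON value type, `Event::Saved` collapses Created/Updated into one variant, and a `HashMap` replaces the projection table. Only the names `value_to_text`, `handle_event`, and `rebuild` come from the text above.

```rust
use std::collections::HashMap;

/// Tiny stand-in for a JSON value.
enum Value {
    Text(String),
    Int(i64),
    Bool(bool),
}

impl Value {
    /// Mirrors the old binding path: only strings yield Some.
    fn as_str(&self) -> Option<&str> {
        match self {
            Value::Text(s) => Some(s),
            _ => None,
        }
    }
}

/// Renders every variant as text, so numerics are no longer lost.
fn value_to_text(v: &Value) -> String {
    match v {
        Value::Text(s) => s.clone(),
        Value::Int(n) => n.to_string(),
        Value::Bool(b) => b.to_string(),
    }
}

enum Event {
    Saved { id: String, value: Value },
    Deleted { id: String },
}

#[derive(Default)]
struct Projection {
    rows: HashMap<String, String>,
}

impl Projection {
    fn handle_event(&mut self, event: &Event) {
        match event {
            Event::Saved { id, value } => {
                self.rows.insert(id.clone(), value_to_text(value));
            }
            // Requires the event to carry the id; an empty Deleted event cannot remove anything.
            Event::Deleted { id } => {
                self.rows.remove(id);
            }
        }
    }

    /// Truncate, then replay the log through the same handler used live.
    fn rebuild(&mut self, log: &[Event]) {
        self.rows.clear();
        for event in log {
            self.handle_event(event);
        }
    }
}
```

Replaying through `handle_event` rather than a separate rebuild path means a rebuilt read model cannot drift from the one maintained incrementally.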

Phase C — Data-pipeline polish (CRM-native, lower risk)

[x] C1. Relationship resolution (✅ works for CRM): resolve_includes resolves both contact.company (belongsTo) and contact.deals (hasMany) as nested objects/arrays. Proven in crm_dogfood (step 5b). Latent gap (found here, since fixed): resolution was convention-based. It inferred the related model from the relationship name ({rel}Id → capitalize(rel)), NOT from the schema's declared relationships block ({type, model, foreignKey}). The CRM worked only because its relationship names align with model names; a relationship whose name differs from its target model (e.g. owner: { model: "User" }) would have resolved to the wrong/nonexistent model. The unified parser now extracts the relationships block, so resolve_includes is schema-driven and a renamed relationship resolves too (schema_driven_include_resolves_renamed_relationship).
[x] C2. Soft-delete/restore/pagination/orderBy via CRM (✅ done — found+fixed a real bug): the dogfood now exercises orderBy(value DESC)+limit+offset and the full soft-delete→trash→restore lifecycle on Deals. Bug caught: the find_many cache key was {where}{orderBy} and omitted limit/offset, so two queries differing only in pagination collided — page 2 returned page 1's rows. A silent correctness hazard for every paginated view (Kanban, lists). Fixed: key now includes limit+offset. (Soft-delete/restore/orderBy themselves worked.)
[x] C3. Event sourcing + replay (✅ done, works): crm_deal_event_history_replays drives a Deal Created → Updated → Updated → Deleted and reconstructs it exactly via EventStore::entity_history. No fix needed, but it validates the B2 delete fix in a second context: entity_history filters by data->>'id', so before B2 (empty delete events) the Deleted event would have been invisible to history. replay/entity_history themselves worked. (Note: this is event log/history, distinct from the projection rebuild-replay covered by B2a.)
[x] C4. Cache conformance (✅ done): find_many populates the read cache and a create invalidates it — the next identical query returns fresh rows incl. the new Deal (dogfood step 7b). No fix needed (the real cache bug was the C2 pagination-key collision, already fixed). Deferred LOW-risk polish: find_unique uncached, no background eviction, Debug-format keys.
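
The C2 cache-key collision is easy to reproduce in isolation. The sketch below is illustrative only: the `FindMany` struct, both function names, and the exact key format are assumptions; the point is that a key built from where + orderBy alone cannot distinguish pages.

```rust
/// Hypothetical, reduced view of a find_many query.
#[derive(Default)]
struct FindMany {
    where_clause: String,
    order_by: String,
    limit: Option<u32>,
    offset: Option<u32>,
}

/// Key shape before the fix: pagination is not part of the key.
fn cache_key_before_fix(model: &str, q: &FindMany) -> String {
    format!("{model}:{}{}", q.where_clause, q.order_by)
}

/// Key shape after the fix: limit and offset participate in the key.
fn cache_key(model: &str, q: &FindMany) -> String {
    format!(
        "{model}:{}{}:limit={:?}:offset={:?}",
        q.where_clause, q.order_by, q.limit, q.offset
    )
}
```

Two queries that differ only in `offset` produce the same key with the first function and different keys with the second, which is exactly the "page 2 returned page 1" symptom and its fix.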

Phase D — Supplementary harnesses (what CRM can't reach alone)

[x] D1. Multi-tenant isolation harness (✅ done): test_two_tenant_isolation now also asserts tenant B's update-all and delete-all only touch B's rows — tenant A's note survives unmodified. Read AND write scoping proven.
[~] D2. AI/pgvector: INFRA-BLOCKED locally (documented, not testable here): pg_available_extensions shows no vector extension locally, and EmbeddingStore::init does CREATE EXTENSION ... vector with an embedding vector(1536) column; real embeddings also need an OpenAI key. Requires pgvector installed + an embedding provider. Deferred to an environment that has them rather than a hollow test; CI is that environment (its Postgres is pgvector/pgvector), where crm_ai.rs runs while staying #[ignore]d locally.
[x] D3. OAuth (✅ done): authorize_url_contains_required_params unit test proves the authorization URL carries client_id / response_type=code / CSRF state / encoded redirect+scopes. The full token round-trip (code→token→userinfo), originally deferred for lack of a mock IdP, is now covered by item6 (token_round_trip_against_mock_idp).
[x] D4. CLI smoke test (✅ done): cli_smoke.rs invokes the built atomo-cli binary's init my-app --template crm and asserts the scaffold (atomo/schema.ts with a CRM model, package.json). migrate/codegen smoke are heavier (DB / full parser) — deferred.
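
The isolation property that D1 asserts (reads, update-all, and delete-all stay inside the caller's tenant; no tenant means no scoping) can be modelled in memory. This is a sketch under assumptions: the `Note` struct and the function names are hypothetical, and a `Vec` stands in for the table that the real `scope_by_tenant` filters with a WHERE clause.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Note {
    id: u32,
    /// Nullable column: None models a single-tenant row.
    tenant_id: Option<String>,
    body: String,
}

/// None = unscoped (single-tenant mode); Some(t) = only rows owned by t.
fn in_scope(row: &Note, tenant: Option<&str>) -> bool {
    match tenant {
        None => true,
        Some(t) => row.tenant_id.as_deref() == Some(t),
    }
}

fn find_many<'a>(rows: &'a [Note], tenant: Option<&str>) -> Vec<&'a Note> {
    rows.iter().filter(|r| in_scope(r, tenant)).collect()
}

/// Returns how many rows were rewritten.
fn update_all(rows: &mut [Note], tenant: Option<&str>, body: &str) -> usize {
    let mut touched = 0;
    for row in rows.iter_mut().filter(|r| in_scope(r, tenant)) {
        row.body = body.to_string();
        touched += 1;
    }
    touched
}

/// Returns how many rows were removed.
fn delete_all(rows: &mut Vec<Note>, tenant: Option<&str>) -> usize {
    let before = rows.len();
    rows.retain(|r| !in_scope(r, tenant));
    before - rows.len()
}
```

Routing reads and both bulk writes through one `in_scope` predicate is the in-memory analogue of applying the same tenant filter to every generated statement.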

Cross-cutting (do alongside, not after)

[x] CI: the DB-gated suite runs in the manual workflow_dispatch job as the "CRM Conformance release gate" (--ignored --test-threads=1); auto-triggers stay off for cost. Also fixed the release artifact paths (atomo → atomo-cli).
[x] Roadmap honesty: roadmap.md Status Overview + README Phase 2 corrected — RBAC (GraphQL-only), multi-tenant/workflows/AI downgraded to 🟡/[~] with a conformance-status note.

Gaps found by the conformance pass — and how they were resolved

This section is a historical record: what the investigation found broken, and the fix. (Originally written present-tense as open gaps; updated as each was closed.)

SECURITY: RBAC was fully bypassed. Access rules were never parsed from export const schema (only the defineModel DSL form), so every model defaulted to allow-all. ✅ Fixed in S1 (parse_access_rules + shared AccessControl::decide); data-layer enforcement followed via the *_checked variants (item4).
SECURITY: WebSocket /graphql/ws was unauthenticated — anyone could subscribe to all model changes. ✅ Fixed in S2 (connection_init JWT + read-gating).
SECURITY: multi-tenant was non-functional + leaky. No tenant_id column was generated; subscriptions leaked; header unvalidated. ✅ Fixed in S3/S3a/S3b/D1 (column generated; read/write/subscription scoping; auth-gated header; per-user binding + header validation). PG-RLS (S3c) still TODO.
Workflows were a facade — CRM's sales-pipeline.yml never loaded; steps were no-ops. 🟡 Partially fixed in B1 (YAML loads; HTTP step executes). JS-step workflows still TODO.
Projections silently corrupted — deletes never removed rows; numeric fields → ""; rebuild lost data. ✅ Fixed in B2/B2a (delete-event ids; value_to_text; rebuild replays).
tableName was ignored → company became companys; drifted from hand-written migrations. ✅ Fixed in A1 (honored). Retiring the stale hand-written migrations is the remaining cleanup.
Validation wasn't enforced on update (partial updates would wrongly trip required). ✅ Fixed in B3 (validate_partial). exists: stays a documented no-op (FKs cover it).
GraphQL keeps its own inline validation copy; the data layer now also validates (harmless dup; consolidate eventually).
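
To make the B3 distinction concrete, here is a minimal sketch of full versus partial validation. Only the name `validate_partial` and the `required` behaviour come from this document; the rule representation, the `validate` counterpart, and the error strings are assumptions, and only the `required` rule is modelled.

```rust
use std::collections::HashMap;

/// Returns an error when a required value is missing or blank.
fn required_error(field: &str, value: Option<&str>) -> Option<String> {
    match value {
        Some(v) if !v.trim().is_empty() => None,
        _ => Some(format!("{field}: required")),
    }
}

/// Full validation (create): every rule runs, absent fields included.
fn validate(rules: &[(&str, &str)], record: &HashMap<&str, &str>) -> Vec<String> {
    let mut errors = Vec::new();
    for &(field, rule) in rules {
        if rule == "required" {
            errors.extend(required_error(field, record.get(field).copied()));
        }
    }
    errors
}

/// Partial validation (update): a rule runs only if the patch touches its field.
fn validate_partial(rules: &[(&str, &str)], patch: &HashMap<&str, &str>) -> Vec<String> {
    let mut errors = Vec::new();
    for &(field, rule) in rules {
        if rule == "required" && patch.contains_key(field) {
            errors.extend(required_error(field, patch.get(field).copied()));
        }
    }
    errors
}
```

A patch of `{ stage: "won" }` passes `validate_partial` even though `title` is required, while a patch of `{ title: "" }` is still rejected; running `validate` on the same stage-only patch would report the missing title.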

Caveats / cost

DB-gated tests are slow (~20s each with fuel-metered plugins); a full run is minutes. Keep it manual-dispatch, not per-push.
http_e2e tests share one atomo_test DB and FAIL under parallel execution (they seed users / create tables and clobber each other) — run with --test-threads=1. Same shared-DB-singleton constraint that prevents parallel implementation. Worth fixing with per-test DBs/schemas eventually.
Disk is finite; watch target/ size.
This is a multi-week effort — correct if the goal is a trustworthy platform; the wrong call if the near-term goal is shipping features fast. That's a product decision.
Findings are mostly subagent reports with file:line; RBAC + tenant_id were spot-verified by direct read. Reconfirm each via its conformance test before trusting — a couple may be partially inaccurate. Do not treat "implemented" as "working" until a test says so.
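The docs/roadmap claimed several of these as "✅ implemented" / "✅ completed"; those claims were misleading. The roadmap.md Status Overview and README Phase 2 have since been corrected (see Cross-cutting); any remaining claim should be corrected as each item is fixed+tested.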
