Data Integrity System

The Data Integrity System is an automated framework for detecting and fixing data inconsistencies in the Convex database. It validates denormalized fields, relational consistency, and calculated values across 12+ tables.

Architecture Overview

┌─────────────────────────────────────────────────────────────────────┐
│                    INTEGRITY AUDIT SYSTEM                           │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  ┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐  │
│  │   Validators     │  │    Checkers      │  │  Fix Mutations   │  │
│  │   (lib/)         │  │    (checks/)     │  │    (fixes/)      │  │
│  │                  │  │                  │  │                  │  │
│  │ - Milestone      │  │ - run actions    │  │ - milestoneXyz   │  │
│  │ - OBS Cascade    │  │   per check      │  │ - obsCascade     │  │
│  │ - Time Sum       │  │ - fetch & run    │  │ - commitmentSum  │  │
│  │ - Name Sync      │  │   validators     │  │ - nameConsistency│  │
│  └──────────────────┘  └──────────────────┘  └──────────────────┘  │
│                                │                                    │
│                                ▼                                    │
│                        ┌──────────────────┐                         │
│                        │     Runner       │                         │
│                        │   (runner.ts)    │                         │
│                        │                  │                         │
│                        │ - runAll         │                         │
│                        │ - runCheck       │                         │
│                        │ - runFix         │                         │
│                        │ - runCheckAndFix │                         │
│                        └──────────────────┘                         │
│                                │                                    │
│         ┌──────────────────────┼──────────────────────┐            │
│         ▼                      ▼                      ▼            │
│  ┌─────────────┐       ┌──────────────┐      ┌────────────┐        │
│  │   Store     │       │   Discord    │      │  Cron Job  │        │
│  │   Reports   │       │   Alerts     │      │  (daily)   │        │
│  └─────────────┘       └──────────────┘      └────────────┘        │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

File Structure

packages/convex/integrityAudit/
├── index.ts                    # Main exports
├── runner.ts                   # Orchestration (runAll, runCheck, runFix, runCheckAndFix)
├── discord.ts                  # Discord bot API integration
├── queries.ts                  # Internal queries for data fetching
├── mutations.ts                # Mutation handlers
├── checks/
│   ├── utils.ts                # Helper: fetchAllPages, auditQueries
│   ├── milestoneConsistency.ts
│   ├── obsCascade.ts
│   ├── commitmentTimeSum.ts
│   ├── totalTimeConsistency.ts
│   ├── designElementNameConsistency.ts
│   ├── coordinatorNameConsistency.ts
│   ├── assigneeNameConsistency.ts
│   └── issueNameConsistency.ts
└── fixes/
    ├── types.ts                # FixResult types
    ├── milestoneConsistencyFix.ts
    ├── obsCascadeFix.ts
    ├── commitmentTimeSumFix.ts
    └── *NameFix.ts             # Name sync fixes

Validators

Core Validators

| Validator | What It Validates | Error Type |
| --- | --- | --- |
| Milestone Consistency | dods.milestoneNumber matches issues.milestoneNumber | milestone_mismatch |
| OBS Cascade | When an issue is OBS, related dodNotes.isObsolete must be true | obs_cascade_missing, obs_cascade_orphan |
| Commitment Time Sum | Sum of dodNotes.estimateTime equals assigneeReservations.currentCommitmentTime | commitment_sum_mismatch |
| Total Time Consistency | Calculated total times match expected values | total_time_mismatch |
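As an illustration of what a core validator might look like, here is a minimal sketch of the milestone check written as a pure function. The row shapes are assumptions: in particular, the `issueId` field linking a DOD to its issue is hypothetical, and the error object simply mirrors the fields of the integrityCheckResults table. The real validators in lib/ may be structured differently.

```typescript
// Hypothetical row shapes; only the fields this check needs.
interface IssueRow { _id: string; milestoneNumber: number }
interface DodRow { _id: string; issueId: string; milestoneNumber: number }

// Mirrors the error object stored in integrityCheckResults.
interface IntegrityError {
  type: string;
  message: string;
  recordId: string;
  expected?: string;
  actual?: string;
}

// Reports every DOD whose milestoneNumber differs from its parent issue's.
function validateMilestoneConsistency(issues: IssueRow[], dods: DodRow[]): IntegrityError[] {
  const milestoneByIssue = new Map<string, number>();
  for (const issue of issues) milestoneByIssue.set(issue._id, issue.milestoneNumber);

  const errors: IntegrityError[] = [];
  for (const dod of dods) {
    const expected = milestoneByIssue.get(dod.issueId);
    // A DOD whose issue cannot be found is a different kind of problem; skip it here.
    if (expected === undefined) continue;
    if (dod.milestoneNumber !== expected) {
      errors.push({
        type: "milestone_mismatch",
        message: `DOD ${dod._id}: expected ${expected}, actual ${dod.milestoneNumber}`,
        recordId: dod._id,
        expected: String(expected),
        actual: String(dod.milestoneNumber),
      });
    }
  }
  return errors;
}
```

Keeping the validator free of database calls means a checker can fetch the rows however it likes and the comparison logic stays easy to unit test.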

Name Consistency Validators

| Validator | Tables Affected | Field Synced |
| --- | --- | --- |
| Design Element Name | issues, solutions, dods, assigneeReservations, milestones, dodNotifications, dodNotes | designElementName |
| Coordinator Name | issues, solutions, milestones, assigneeReservations | coordinatorName |
| Assignee Name | dods, dodNotes | assigneeUserName, assigneeName |
| Issue Name | dods, solutions, dodNotes | issueName |

Running Checks

Run All Checks

typescript
await ctx.runAction(api.integrityAudit.runner.runAll, {
  milestoneNumber: 74,        // Optional: scope to milestone
  sendDiscordAlert: true,     // Optional: notify Discord
});

Run Single Check

typescript
await ctx.runAction(api.integrityAudit.runner.runCheck, {
  checkName: "milestoneConsistency",
  milestoneNumber: 74,
});

Run Check with Auto-Fix

typescript
// Step 1: Dry run (simulate)
const { checkResult, fixResult } = await ctx.runAction(
  api.integrityAudit.runner.runCheckAndFix,
  {
    validator: "designElementNameConsistency",
    autoFix: true,
    dryRun: true,  // Simulate only
  }
);

// Step 2: Review results, then apply
const { fixResult: actual } = await ctx.runAction(
  api.integrityAudit.runner.runCheckAndFix,
  {
    validator: "designElementNameConsistency",
    autoFix: true,
    dryRun: false,  // Apply fixes
  }
);

Fix Mutations

Dry-Run Pattern

All fixes follow a dry-run-first pattern:

  1. Run fix with dryRun: true to simulate
  2. Review results (fixed, failed, skipped counts)
  3. Run with dryRun: false to apply
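The three steps above can be sketched as a single function that counts what it would do and only writes when dryRun is false. This is an assumption-laden illustration, not the code in fixes/: `runNameFix`, `NameMismatch`, and the `applyPatch` callback are hypothetical names, and the callback is synchronous here only to keep the sketch short (a real database write would be awaited). The counters mirror the integrityFixResults fields.

```typescript
// Counters mirror the fields of the integrityFixResults table.
interface FixResult {
  processed: number;
  fixed: number;
  failed: number;
  skipped: number;
  dryRun: boolean;
}

// Hypothetical input: one entry per record the check flagged.
interface NameMismatch { recordId: string; expected: string; actual: string }

// `applyPatch` is a hypothetical stand-in for the real database write.
function runNameFix(
  mismatches: NameMismatch[],
  dryRun: boolean,
  applyPatch: (recordId: string, expected: string) => void,
): FixResult {
  const result: FixResult = { processed: 0, fixed: 0, failed: 0, skipped: 0, dryRun };
  for (const m of mismatches) {
    result.processed++;
    // Nothing to change: the record already holds the expected value.
    if (m.expected === m.actual) {
      result.skipped++;
      continue;
    }
    try {
      // In dry-run mode the write is skipped, but the record still counts as fixable.
      if (!dryRun) applyPatch(m.recordId, m.expected);
      result.fixed++;
    } catch {
      result.failed++;
    }
  }
  return result;
}
```

Because both modes share one code path, the counts from the dry run predict the real run, except for writes that fail at apply time.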

System Attribution

Every fix records a system user in lastChangedBy, so automated changes remain distinguishable in the audit trail:

typescript
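// Every patch carries attribution fields alongside the corrected value:
const patch = {
  designElementName: expected,                              // corrected value
  lastChangedBy: "system:integrity-fix:designElementName", // system user for the audit trail
  updatedAt: Date.now(),
};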

Batched Operations

Fixes handle large datasets via pagination:

typescript
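// Start from the first page; a null cursor means "no position yet".
let cursor: string | null = null;
let isDone = false;

while (!isDone) {
  // Fetch the next batch of 100 records.
  const page = await ctx.db
    .query("issues")
    .paginate({ cursor, numItems: 100 });

  for (const record of page.page) {
    // Process each record
  }

  // Advance to the next page, or stop once the table is exhausted.
  cursor = page.continueCursor;
  isDone = page.isDone;
}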
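The same loop can be factored into a reusable helper. The sketch below is one possible shape for a `fetchAllPages`-style utility; the file tree lists a helper by that name in checks/utils.ts, but its real signature is not shown here, so the `fetchPage` callback and the `makeArrayPager` test double are assumptions made for illustration.

```typescript
// Shape mirrors the page object used in the loop above: page, continueCursor, isDone.
interface PageResult<T> {
  page: T[];
  continueCursor: string | null;
  isDone: boolean;
}

// Hypothetical helper: calls `fetchPage` repeatedly until the source reports isDone.
async function fetchAllPages<T>(
  fetchPage: (cursor: string | null, numItems: number) => Promise<PageResult<T>>,
  numItems = 100,
): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  let isDone = false;
  while (!isDone) {
    const result: PageResult<T> = await fetchPage(cursor, numItems);
    all.push(...result.page);
    cursor = result.continueCursor;
    isDone = result.isDone;
  }
  return all;
}

// In-memory stand-in for a paginated query, used only to exercise the helper.
function makeArrayPager<T>(rows: T[]) {
  return async (cursor: string | null, numItems: number): Promise<PageResult<T>> => {
    const start = cursor === null ? 0 : Number(cursor);
    const end = Math.min(start + numItems, rows.length);
    return {
      page: rows.slice(start, end),
      continueCursor: String(end),
      isDone: end >= rows.length,
    };
  };
}
```

Note that this collects every row in memory, which is convenient for checks but may not suit very large tables; processing each page as it arrives is the alternative.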

Discord Integration

Setup

Required environment variables:

bash
DISCORD_BOT_TOKEN=your_bot_token
DISCORD_INTEGRITY_CHANNEL_ID=123456789012345678
DISCORD_INTEGRITY_MENTION_USER_ID=987654321098765432
CONVEX_DASHBOARD_URL=https://dashboard.convex.dev/t/team/project/deployment
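The design principles state that missing environment variables should skip Discord rather than crash. One way to get that behavior is to validate the variables up front and return a result the caller can branch on. This is a sketch under assumptions: `readDiscordConfig` and the result type are hypothetical, and it treats all four variables as mandatory because the list above calls them required.

```typescript
// The four variables listed above.
const REQUIRED_VARS = [
  "DISCORD_BOT_TOKEN",
  "DISCORD_INTEGRITY_CHANNEL_ID",
  "DISCORD_INTEGRITY_MENTION_USER_ID",
  "CONVEX_DASHBOARD_URL",
] as const;

type DiscordVar = (typeof REQUIRED_VARS)[number];
type DiscordConfig = Record<DiscordVar, string>;

// Either a complete config, or the names of the variables that are missing.
type ConfigResult =
  | { ok: true; config: DiscordConfig }
  | { ok: false; missing: DiscordVar[] };

// Never throws: an unset or empty variable is reported, not raised.
function readDiscordConfig(env: Record<string, string | undefined>): ConfigResult {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) return { ok: false, missing };

  const config = {} as DiscordConfig;
  for (const name of REQUIRED_VARS) config[name] = env[name] as string;
  return { ok: true, config };
}
```

A caller would pass `process.env`, log the `missing` names when `ok` is false, and continue the audit without sending an alert.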

Alert Format

Discord receives styled embeds with:

  • Status emoji per check (pass/fail/warning)
  • Record counts and duration
  • Error details (max 5 shown)
  • Button linking to Convex dashboard

Example alert:

@YourName Data Integrity Audit Complete (Milestone 75)

✅ Milestone Consistency
Checked: 847 records | Duration: 234ms

❌ Total Time Consistency
Checked: 523 records | Duration: 189ms
Errors (2):
• DOD abc123: expected 40, actual 35
• DOD def456: expected 20, actual 25

[🔍 View in Convex Dashboard]
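To make the "max 5 shown" rule concrete, here is a rough sketch of how one check's section of the alert could be rendered as text. It is not the code in discord.ts: `formatCheckSummary` and `CheckSummary` are hypothetical, the real alert uses Discord embeds rather than a plain string, and the trailing "and N more" line is an assumption about how truncation might be signaled.

```typescript
// Hypothetical view of one check result; field names follow integrityCheckResults.
interface CheckSummary {
  label: string;
  isValid: boolean;
  checkedRecords: number;
  duration: number; // milliseconds
  errors: { message: string }[];
}

const MAX_ERRORS_SHOWN = 5;

// Renders the status line, the counts, and at most MAX_ERRORS_SHOWN error bullets.
function formatCheckSummary(check: CheckSummary): string {
  const lines = [
    `${check.isValid ? "✅" : "❌"} ${check.label}`,
    `Checked: ${check.checkedRecords} records | Duration: ${check.duration}ms`,
  ];
  if (check.errors.length > 0) {
    lines.push(`Errors (${check.errors.length}):`);
    for (const err of check.errors.slice(0, MAX_ERRORS_SHOWN)) {
      lines.push(`• ${err.message}`);
    }
    // Assumption: tell the reader how many errors were left out.
    const hidden = check.errors.length - MAX_ERRORS_SHOWN;
    if (hidden > 0) lines.push(`… and ${hidden} more`);
  }
  return lines.join("\n");
}
```

Capping the list keeps the message readable when a check fails on many records; the full error list is still available in the stored check results.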

Automatic Scheduling

A daily cron job runs all checks at 6 AM UTC:

typescript
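// packages/convex/crons.ts
import { cronJobs } from "convex/server";
import { internal } from "./_generated/api";

const crons = cronJobs();
crons.daily(
  "integrity-audit-daily",
  { hourUTC: 6, minuteUTC: 0 },
  internal.integrityAudit.runner.runAll,
  { sendDiscordAlert: true }
);

export default crons;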

Storage Schema

Check Results

typescript
integrityCheckResults: defineTable({
  runId: v.string(),
  timestamp: v.number(),
  validator: v.string(),
  isValid: v.boolean(),
  checkedRecords: v.number(),
  duration: v.number(),
  errors: v.array(v.object({
    type: v.string(),
    message: v.string(),
    recordId: v.string(),
    expected: v.optional(v.string()),
    actual: v.optional(v.string()),
  })),
  expiresAt: v.number(),  // 30-day retention
})

Fix Results

typescript
integrityFixResults: defineTable({
  runId: v.string(),
  timestamp: v.number(),
  validator: v.string(),
  processed: v.number(),
  fixed: v.number(),
  failed: v.number(),
  skipped: v.number(),
  dryRun: v.boolean(),
  expiresAt: v.number(),
})
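Both tables carry an expiresAt timestamp to support the 30-day retention. The arithmetic behind it can be sketched as follows; `computeExpiresAt` and `isExpired` are hypothetical helper names, and how expired rows are actually cleaned up is not described here.

```typescript
const RETENTION_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// expiresAt is the run timestamp plus 30 days, in milliseconds since the epoch.
function computeExpiresAt(timestamp: number): number {
  return timestamp + RETENTION_DAYS * MS_PER_DAY;
}

// A row is eligible for cleanup once its expiry time has been reached.
function isExpired(row: { expiresAt: number }, now: number): boolean {
  return row.expiresAt <= now;
}
```

For example, a result written with `timestamp: Date.now()` would store `expiresAt: computeExpiresAt(Date.now())`.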

Quick Reference

| Action | Function | Args |
| --- | --- | --- |
| Run all checks | runner.runAll | { milestoneNumber?, sendDiscordAlert? } |
| Run one check | runner.runCheck | { checkName, milestoneNumber? } |
| Fix validator | runner.runFix | { validator, dryRun? } |
| Check + fix | runner.runCheckAndFix | { validator, autoFix, dryRun? } |

Design Principles

  1. Dry-Run First - Always test fixes before applying
  2. Batched Operations - Handle 10k+ records via pagination
  3. System Attribution - All fixes marked for audit trail
  4. Graceful Degradation - Missing env vars skip Discord, don't crash
  5. 30-Day Retention - Results auto-expire
  6. No Hard Deletes - Use soft-delete only
  7. Modular Validators - Each check is independent
  8. Type Safety - Full TypeScript with strict mode
