Chapter 15

Reconciliation-Driven Validation

Reconciliation is the process of comparing Oracle and PostgreSQL to confirm that they contain the same data. It is not a final step run once before cutover. It is a continuous process that runs from the moment the full load completes until the moment of cutover — and continues after cutover for a defined period to confirm the switched system is still correct.

This chapter covers the four reconciliation modes, how tolerances are configured, and how the pipeline classifies mismatches as blocking or advisory.

15.1 What Reconciliation Validates

Reconciliation validates four properties:

Completeness: Every row in Oracle exists in PostgreSQL. No rows were lost in the full load or missed by CDC.

Correctness: For every row that exists in both systems, the column values are identical (or within defined tolerance for numeric and timestamp types).

Structural consistency: The PostgreSQL schema correctly represents the Oracle schema — column counts, data types, constraint definitions.

Temporal consistency: Changes made in Oracle after the full load anchor SCN have been replicated to PostgreSQL through CDC. The two systems represent the same point in time.

Reconciliation does not validate:

  • Application behavior after cutover
  • Performance characteristics of PostgreSQL under load
  • Index effectiveness for specific query patterns

These are cutover readiness items (Chapter 19), not reconciliation items.

15.2 Reconciliation Modes

Four reconciliation modes address different aspects of data consistency at different cost/coverage tradeoffs:

Mode 1: Count Reconciliation

The simplest mode: compare the row count in Oracle against the row count in PostgreSQL.

-- Oracle
SELECT COUNT(*) FROM account;
-- Returns: 847,331

-- PostgreSQL
SELECT COUNT(*) FROM account;
-- Returns: 847,331

Count reconciliation is fast (milliseconds for indexed tables), runs continuously, and detects missing rows immediately. It cannot detect value-level errors: counts can match while individual rows hold wrong values.

Product note: Count reconciliation runs every 60 seconds per table by default. If CDC lag is < 30 seconds and count matches within the configured tolerance, the table is marked GREEN. Count delta outside tolerance: YELLOW (advisory) or RED (blocking), based on configuration.
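The classification rule in the product note can be sketched as follows. This is a minimal illustration of the semantics described above; the function name and the GREEN/YELLOW/RED strings are assumptions, not the product's actual API.

```python
# Sketch of one count-reconciliation run, following the defaults above:
# GREEN needs low CDC lag and a delta within tolerance; an
# out-of-tolerance delta is YELLOW or RED depending on configuration.
def classify_count(oracle_count, pg_count, cdc_lag_seconds,
                   tolerance_abs=0, blocking=True):
    delta = abs(oracle_count - pg_count)
    if delta <= tolerance_abs:
        # Counts agree; still require the lag to be low before GREEN.
        return "GREEN" if cdc_lag_seconds < 30 else "YELLOW"
    # Count delta outside tolerance: severity is configuration-driven.
    return "RED" if blocking else "YELLOW"
```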

Mode 2: Hash Reconciliation

Hash reconciliation computes a hash of every row's data and compares aggregated hashes between Oracle and PostgreSQL.

-- Oracle: row-level hash (STANDARD_HASH with MD5, so both sides use
-- the same algorithm -- ORA_HASH output would never match PostgreSQL's
-- MD5 -- and the same canonical string format)
SELECT LOWER(RAWTOHEX(STANDARD_HASH(
  TO_CHAR(account_id) || '|' || NVL(user_name, '') || '|' || NVL(email, '') ||
  '|' || NVL(status, '') || '|' || TO_CHAR(created_at, 'YYYY-MM-DD HH24:MI:SS'),
  'MD5'
))) AS row_hash
FROM account
ORDER BY account_id;

-- PostgreSQL: row-level hash (MD5 over the identical canonical string;
-- the timestamp format must match Oracle's second precision exactly)
SELECT MD5(
  account_id::TEXT || '|' || COALESCE(user_name, '') || '|' || COALESCE(email, '') ||
  '|' || COALESCE(status, '') || '|' || TO_CHAR(created_at, 'YYYY-MM-DD HH24:MI:SS')
) AS row_hash
FROM account
ORDER BY account_id;

The aggregated hash (XOR or concatenation of all row hashes) detects value-level differences without comparing every row individually. If the aggregated hash matches, all rows match (barring hash collisions, which are vanishingly unlikely for 128-bit hashes). If it differs, a binary search over hash ranges identifies which rows differ.

Cost: Full table scan on both Oracle and PostgreSQL. For a 100-million-row table, this is expensive (minutes to hours). Hash reconciliation is run on a schedule (every 6 hours during parallel running) rather than continuously.

Product note: Pulsaride Transform computes hash reconciliation in segments: the table is divided into hash segments (typically 1% of rows each), and the hash for each segment is computed and compared. If a segment hash matches, the segment is skipped. If a segment hash differs, the segment is drilled down to individual rows. This segment-based approach reduces the comparison to only the rows that actually differ.
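The segment drill-down can be sketched in pipeline code roughly as follows. This is a simplified model, assuming segments are held as primary-key-to-row maps; the real implementation would stream rows rather than materializing them.

```python
import hashlib

def row_hash(row):
    # Canonical '|'-joined string -> MD5, mirroring the SQL above.
    canonical = "|".join("" if v is None else str(v) for v in row)
    return hashlib.md5(canonical.encode()).hexdigest()

def segment_hash(rows):
    # Order-independent aggregate: XOR the 128-bit row hashes together.
    agg = 0
    for r in rows:
        agg ^= int(row_hash(r), 16)
    return agg

def differing_keys(oracle_seg, pg_seg):
    # Segments whose aggregate hashes agree are skipped entirely;
    # only a differing segment is drilled down to individual rows.
    if segment_hash(oracle_seg.values()) == segment_hash(pg_seg.values()):
        return []
    return sorted(
        pk for pk in oracle_seg.keys() | pg_seg.keys()
        if row_hash(oracle_seg.get(pk, ())) != row_hash(pg_seg.get(pk, ()))
    )
```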

Mode 3: Sample Reconciliation

Sample reconciliation reads a random sample of rows from Oracle and compares them column-by-column to the corresponding rows in PostgreSQL.

-- Oracle: random 1% sample
SELECT account_id, user_name, email, status, balance, created_at
FROM account SAMPLE(1)
ORDER BY account_id;

-- PostgreSQL: fetch the sampled IDs
SELECT account_id, user_name, email, status, balance, created_at
FROM account
WHERE account_id IN (:sampled_ids)
ORDER BY account_id;

Sample reconciliation is faster than hash reconciliation (reads 1% of rows rather than 100%) and detects value-level errors in the sample. It does not guarantee that rows outside the sample are correct.

For high-value tables (financial transactions, customer records), sample reconciliation can be configured with a higher sample rate (10%, 25%) or can target specific high-risk row segments (recently updated rows, rows that were rejected during the full load and re-loaded).

Product note: Pulsaride Transform's sample reconciliation is query-based — it does not store the sample. On each run, a new random sample is drawn from Oracle, the same row IDs are fetched from PostgreSQL, and the column-by-column comparison is performed in the pipeline process. This means each run tests a different set of rows.
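In pipeline code, one sample-reconciliation run looks roughly like this. It is a sketch: the two fetcher callables stand in for the Oracle and PostgreSQL queries shown above.

```python
def sample_reconcile(fetch_oracle_sample, fetch_pg_rows, columns, rate=1.0):
    # Draw a fresh random sample from Oracle, fetch the same primary
    # keys from PostgreSQL, and compare column-by-column.
    oracle_rows = fetch_oracle_sample(rate)        # {pk: {col: value}}
    pg_rows = fetch_pg_rows(sorted(oracle_rows))   # {pk: {col: value}}
    mismatches = []
    for pk, orow in oracle_rows.items():
        prow = pg_rows.get(pk)
        if prow is None:
            mismatches.append((pk, "<missing in PostgreSQL>"))
            continue
        mismatches.extend((pk, col) for col in columns
                          if orow[col] != prow[col])
    return mismatches
```

Because each run draws a new sample, repeated runs gradually cover more of the table, but no single run proves rows outside its sample correct.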

Mode 4: Aggregate Reconciliation

Aggregate reconciliation computes business-meaningful aggregates over the data and compares them:

-- Oracle aggregates
SELECT
  status,
  COUNT(*) AS row_count,
  SUM(balance) AS total_balance,
  AVG(balance) AS avg_balance,
  MIN(created_at) AS earliest_account,
  MAX(created_at) AS latest_account
FROM account
GROUP BY status;

-- PostgreSQL aggregates (same query)
SELECT
  status,
  COUNT(*) AS row_count,
  SUM(balance) AS total_balance,
  AVG(balance) AS avg_balance,
  MIN(created_at) AS earliest_account,
  MAX(created_at) AS latest_account
FROM account
GROUP BY status;

Aggregate reconciliation answers business-level questions: "Is the total balance in PostgreSQL equal to the total balance in Oracle?" This is the reconciliation mode that the business understands and trusts, even if the technical details of hash or sample modes are opaque.

Product note: Aggregate reconciliation queries are defined in the pipeline YAML, allowing the migration team to add domain-specific aggregates that reflect the business KPIs most important to the cutover decision:

reconciliation:
  tables:
    - name: account
      aggregates:
        - name: total_balance_by_status
          query: |
            SELECT status, COUNT(*), SUM(balance)
            FROM account
            GROUP BY status
          tolerancePercent: 0.001   # 0.001% tolerance for floating-point rounding
        - name: active_account_count
          query: |
            SELECT COUNT(*) FROM account WHERE status = 'ACTIVE'
          toleranceAbsolute: 0      # exact match required
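The tolerance check implied by tolerancePercent and toleranceAbsolute can be sketched as follows. This illustrates the semantics suggested by the YAML, not the product's actual code; the assumption is that tolerancePercent is read as a percentage of the Oracle-side value.

```python
def aggregate_within_tolerance(oracle_value, pg_value,
                               tolerance_percent=None,
                               tolerance_absolute=None):
    # toleranceAbsolute: 0 demands an exact match; tolerancePercent is
    # interpreted relative to the Oracle value.
    delta = abs(oracle_value - pg_value)
    if tolerance_absolute is not None:
        return delta <= tolerance_absolute
    if tolerance_percent is not None:
        if oracle_value == 0:
            return delta == 0
        return delta <= abs(oracle_value) * tolerance_percent / 100.0
    return delta == 0
```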

15.3 Configuring Tolerances

Not all mismatches are errors. Some mismatches are expected and acceptable:

Expected mismatches during CDC catch-up:

  • Rows that were modified in Oracle after the full load anchor SCN but before CDC has applied the changes to PostgreSQL. These produce a temporary count or value mismatch that resolves as CDC catches up.
  • New rows inserted in Oracle during CDC catch-up that have not yet been applied to PostgreSQL.

Floating-point rounding differences:

  • Oracle NUMBER(12,6) and PostgreSQL NUMERIC(12,6) may produce slightly different results for arithmetic operations due to implementation differences in rounding. Typically < 0.0001% difference.

Timestamp microsecond differences:

  • Oracle DATE has second precision. PostgreSQL TIMESTAMP has microsecond precision. A date value of 2024-01-15 14:30:00 in Oracle becomes 2024-01-15 14:30:00.000000 in PostgreSQL — technically the same, but comparison code that does not account for this will flag it as a mismatch.
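A comparison routine that accounts for the precision gap normalizes before comparing. The sketch below truncates the PostgreSQL side to second precision; values migrated from Oracle DATE should carry zero microseconds anyway, so this only suppresses the false positive described above.

```python
from datetime import datetime

def dates_equal(oracle_dt, pg_ts):
    # Oracle DATE has second precision, so truncate the PostgreSQL
    # timestamp to seconds before comparing.
    return oracle_dt == pg_ts.replace(microsecond=0)
```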

Tolerance configuration:

reconciliation:
  globalTolerances:
    countDeltaAbsolute: 0          # zero count difference (outside CDC window)
    countDeltaPercent: 0           # zero percent difference
    numericDeltaPercent: 0.0001    # 0.0001% for numeric columns
    timestampDeltaSeconds: 0       # zero for TIMESTAMP, handled by type mapping

  cdcCatchupTolerances:
    # Applied while CDC lag > lagThresholdSeconds
    lagThresholdSeconds: 300       # 5-minute lag = catch-up mode
    countDeltaAbsolute: 1000       # allow up to 1000 row difference during catch-up
    countDeltaPercent: 0.1         # allow up to 0.1% difference during catch-up
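Selecting which tolerance set applies can be sketched as follows. The dictionary keys mirror the YAML above; the helper itself is illustrative, not the product's implementation.

```python
GLOBAL_TOL = {"countDeltaAbsolute": 0, "countDeltaPercent": 0.0}
CATCHUP_TOL = {"lagThresholdSeconds": 300,
               "countDeltaAbsolute": 1000, "countDeltaPercent": 0.1}

def count_within_tolerance(oracle_count, pg_count, cdc_lag_seconds):
    # The looser catch-up tolerances apply only while CDC lag exceeds
    # the configured threshold; otherwise the global (zero) limits hold.
    tol = (CATCHUP_TOL
           if cdc_lag_seconds > CATCHUP_TOL["lagThresholdSeconds"]
           else GLOBAL_TOL)
    delta = abs(oracle_count - pg_count)
    pct = 100.0 * delta / oracle_count if oracle_count else 0.0
    return (delta <= tol["countDeltaAbsolute"]
            and pct <= tol["countDeltaPercent"])
```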

15.4 Blocking vs Advisory Mismatches

Reconciliation findings are classified into two severity levels:

Blocking mismatches: Mismatches that must be resolved before cutover can proceed. Examples:

  • Count difference > 0 when CDC lag < 30 seconds
  • Aggregate total_balance difference > configured tolerance
  • Hash mismatch in a financial transaction table

Advisory mismatches: Mismatches that are expected, documented, or below tolerance. They are recorded in the reconciliation report but do not block cutover. Examples:

  • Count difference of 42 rows when CDC lag is 4 minutes (expected, will close as lag decreases)
  • Floating-point balance difference of 0.00003% (within tolerance)
  • NULL vs empty string difference in a column that has the normalizeEmpty: true rule

Threshold configuration:

reconciliation:
  tables:
    - name: account
      blockingThresholds:
        countDeltaAbsolute: 0
        aggregateDeltaPercent:
          total_balance: 0.0001
      advisoryThresholds:
        countDeltaAbsolute: 100
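One plausible reading of these thresholds (an assumption — the exact semantics are product-defined): a delta at or below the blocking threshold passes, one at or below the advisory threshold is recorded as advisory, and anything larger blocks cutover.

```python
def classify_mismatch(delta, blocking_limit=0, advisory_limit=100):
    # Default limits mirror the account configuration above.
    if delta <= blocking_limit:
        return "GREEN"      # within the blocking threshold: passes
    if delta <= advisory_limit:
        return "YELLOW"     # advisory: recorded, does not block cutover
    return "RED"            # blocking: must be resolved before cutover
```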

Product note: The monitoring dashboard displays blocking mismatches in RED and advisory mismatches in YELLOW. A table is GREEN when all reconciliation modes pass within their configured thresholds. The cutover command checks that zero tables are in RED before proceeding.

15.5 Reconciliation Scheduling During Parallel Running

During the parallel running phase (after CDC has caught up to real-time), reconciliation runs on a staggered schedule to balance coverage against load:

Every 60 seconds:  Count reconciliation for all tables
Every 6 hours:     Hash reconciliation for tier-1 tables (financial, customer)
Every 12 hours:    Hash reconciliation for tier-2 tables (reference, configuration)
Every 24 hours:    Hash reconciliation for tier-3 tables (audit, history)
Continuously:      Sample reconciliation (1% sample, rolling over all tables in 4-hour cycle)
Before cutover:    Full hash reconciliation for all blocking-tier tables
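The schedule above can be expressed as a tier-to-interval map. This is illustrative; a real scheduler would also stagger start times to spread load across tables.

```python
COUNT_INTERVAL_SECONDS = 60                  # count recon: all tables
HASH_INTERVAL_HOURS = {1: 6, 2: 12, 3: 24}   # per-tier hash cadence

def hash_recon_due(tier, hours_since_last_run):
    # A tier-N table is due when its configured interval has elapsed.
    return hours_since_last_run >= HASH_INTERVAL_HOURS[tier]
```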

The pre-cutover full hash reconciliation is the final gate. It runs as part of the cutover checklist and must complete with zero blocking mismatches before the cutover command is available.

15.6 Example: The ledger_entry Reconciliation Sequence

The ledger_entry table has 78 million rows and is the financial heart of the application. It accumulates one row per financial event. Reconciliation configuration:

reconciliation:
  tables:
    - name: ledger_entry
      tier: 1
      aggregates:
        - name: total_credits
          query: SELECT SUM(amount) FROM ledger_entry WHERE entry_type = 'CREDIT'
          tolerancePercent: 0
        - name: total_debits
          query: SELECT SUM(amount) FROM ledger_entry WHERE entry_type = 'DEBIT'
          tolerancePercent: 0
        - name: row_count_by_type
          query: SELECT entry_type, COUNT(*) FROM ledger_entry GROUP BY entry_type
          toleranceAbsolute: 0
      blockingThresholds:
        countDeltaAbsolute: 0
        aggregateDeltaPercent:
          total_credits: 0
          total_debits: 0

Day 1 of parallel running: count reconciliation shows a delta of 41,221 rows (PostgreSQL has 41,221 fewer rows than Oracle). CDC lag is 6 hours. Status: YELLOW (advisory — within CDC catch-up tolerance). The delta is consistent with the CDC lag × average insert rate.

Day 2: CDC lag is 18 seconds. Count delta is now 12 rows. Status: YELLOW (advisory — 12 rows likely in-flight from CDC).

Day 2, 4 hours later: Count delta is 0. Hash reconciliation runs. Segment 73 of 100 shows a hash mismatch. Drill-down identifies 3 rows with a balance discrepancy of $0.01 each. Investigation: Oracle stores AMOUNT as NUMBER(15,4) and PostgreSQL stores it as NUMERIC(15,4), but a mid-migration normalization rule applied a rounding operation that introduced a $0.01 difference in 3 rows.

Status: RED (blocking — zero tolerance on financial amounts).

Resolution: the 3 rows are identified, the correct amounts are fetched from Oracle, and compensating writes are applied to PostgreSQL. Hash reconciliation re-runs: PASS. Status: GREEN.

Day 3, pre-cutover: Full hash reconciliation runs on all 78 million rows. Duration: 14 minutes. Result: PASS. The ledger_entry table is cleared for cutover.
