Chapter 1: Why Pulsaride Transform Exists

Most migration programs start with the wrong picture of the problem. The teams running them think they are buying or building a schema conversion utility. In practice, they are taking responsibility for moving a live operational system from one database engine to another without losing data, breaking application semantics, or spending six months in manual cleanup after cutover.

That gap between the imagined problem and the real one is why Pulsaride Transform exists.

Pulsaride Transform is not meant to be a thin translation layer from Oracle DDL to PostgreSQL DDL. It is meant to be a migration product: a system that can assess a source estate, express transformations declaratively, execute full load and CDC safely, validate convergence continuously, and support an operator all the way to cutover.

The need for such a product becomes obvious as soon as we study the reconciliation and CDC design material behind it. The hard parts are not cosmetic. They are structural:

  • Oracle and PostgreSQL do not behave the same on empty strings, identifiers, timestamps, sequences, and procedural code.
  • A CDC feed is not the truth by itself. It is only one input into a broader migration workflow.
  • A row can be technically valid and still impossible to apply because its parent has not arrived yet.
  • A migration can look green at the schema level while failing at the reconciliation level.
  • A pipeline that cannot explain why data is waiting, diverging, or replaying is not production-ready.

This chapter sets the frame for the rest of the book: what the product must really do, why naïve migration tooling is not enough, and which failure modes shaped the architecture.

1.1 The real scope of an Oracle -> PostgreSQL migration product

The real scope of a migration product is much larger than translation.

At minimum, the product must own six responsibilities:

  1. Assessment: It must inspect the Oracle estate before migration begins: schemas, tables, sequences, views, packages, triggers, constraints, indexes, data volumes, and dependency graphs.
  2. Transformation: It must express how data and structures move from Oracle semantics to PostgreSQL semantics, ideally through reusable declarative rules rather than one-off scripts.
  3. Movement: It must execute both the initial full load and the incremental CDC flow, with restartability and observability.
  4. Reconciliation: It must prove that source and target converge, not merely assume that they do.
  5. Cutover: It must support the operational switch from source to target with explicit readiness criteria.
  6. Recovery: It must make failures diagnosable and replayable instead of turning them into manual archaeology.

This is why a real product ends up needing components that do not look like classic conversion tooling:

  • a durable staging area
  • a dependency model
  • a retry and replay model
  • a validation model
  • a metrics and alerting model
  • a cutover readiness model

The design artifacts behind Pulsaride Transform already point in this direction. The CDC architecture assumes a staging_event table that acts as a durable event journal, a state machine to track row-level progress, dependency resolution for parent-child ordering, and reconciliation services that detect drift and trigger replay. That is not a schema converter. That is the backbone of a migration operating system.
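The row-level state machine can be sketched as a small set of statuses and allowed transitions. The status names below are the ones this chapter uses later (WAITING_DEP, WAITING_REF, ERROR); the transition table itself is an illustrative assumption, not the product's exact model.

```python
from enum import Enum

class RowStatus(Enum):
    STAGED = "STAGED"            # persisted in staging_event, not yet applied
    WAITING_DEP = "WAITING_DEP"  # blocked on an unpromoted parent row
    WAITING_REF = "WAITING_REF"  # blocked on missing reference data
    ERROR = "ERROR"              # apply failed; eligible for retry or dead-letter
    PROMOTED = "PROMOTED"        # written to the target table

# Illustrative transition table: which statuses a row may move to next.
ALLOWED_TRANSITIONS = {
    RowStatus.STAGED: {RowStatus.WAITING_DEP, RowStatus.WAITING_REF,
                       RowStatus.ERROR, RowStatus.PROMOTED},
    RowStatus.WAITING_DEP: {RowStatus.PROMOTED, RowStatus.ERROR},
    RowStatus.WAITING_REF: {RowStatus.PROMOTED, RowStatus.ERROR},
    RowStatus.ERROR: {RowStatus.STAGED},  # replay re-stages the event
    RowStatus.PROMOTED: set(),            # terminal state
}

def can_transition(current: RowStatus, target: RowStatus) -> bool:
    """Return True if the state machine permits this transition."""
    return target in ALLOWED_TRANSITIONS[current]
```

The point of an explicit table like this is operational: every blocked or failed row has a named state an operator can query, rather than disappearing into a generic retry loop.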

The scope also has to include what migration teams usually underestimate:

  • data quality defects already present in the source
  • source semantics encoded in triggers and packages rather than tables
  • runtime timing problems between full load and CDC
  • referential timing issues between child rows and parent rows
  • operational proof that the system is actually ready for cutover

If the product does not own these concerns, they do not disappear. They simply get outsourced to shell scripts, spreadsheets, manual SQL, and war-room decisions.

That is how migrations become fragile.

1.2 Why schema conversion is not enough

Schema conversion is necessary, but it is nowhere near sufficient.

A converted schema can be syntactically correct and still be operationally wrong. Consider a few common examples:

The same type does not mean the same behavior

An Oracle NUMBER column without clear precision may end up as numeric, bigint, or integer in PostgreSQL. A poor choice can create silent truncation, broken indexes, or application overflows later.

Oracle DATE contains time information. PostgreSQL date does not. A careless mapping drops behavior, not just bytes.
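A minimal sketch of what a type-mapping decision looks like, assuming declared precision and scale are available from the Oracle catalog. The cutoffs below are illustrative heuristics, not Pulsaride Transform's actual rules; a real mapping must also be checked against actual data ranges, not just declared metadata.

```python
def map_oracle_number(precision, scale):
    """Suggest a PostgreSQL type for Oracle NUMBER(precision, scale).

    Illustrative heuristic only: a careless default to integer or bigint
    risks silent truncation or overflow later.
    """
    if precision is None:            # unconstrained NUMBER: keep full fidelity
        return "numeric"
    if scale and scale > 0:          # fractional values need numeric
        return f"numeric({precision},{scale})"
    if precision <= 9:
        return "integer"             # fits in 32 bits
    if precision <= 18:
        return "bigint"              # fits in 64 bits
    return f"numeric({precision},0)"

# Oracle DATE carries time-of-day; mapping it to PostgreSQL date drops behavior.
ORACLE_DATE_TARGET = "timestamp(0)"  # illustrative choice that preserves seconds
```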

The database accepts different values

Oracle treats the empty string as NULL in many contexts. PostgreSQL does not. A migrated unique constraint or application query can change behavior even if the column name and apparent values look identical.
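Because Oracle stores the empty string as NULL, a transform rule has to make the target-side decision explicit rather than inherit it from engine defaults. A minimal sketch of such a per-column policy, with a hypothetical flag name:

```python
def normalize_empty_string(value, treat_empty_as_null=True):
    """Normalize an Oracle-sourced string value for PostgreSQL.

    Oracle stores '' as NULL, so an extracted empty string is ambiguous on
    the PostgreSQL side. The policy flag (a hypothetical name) makes the
    decision explicit per column instead of silently changing constraint
    and query behavior after migration.
    """
    if value == "" and treat_empty_as_null:
        return None
    return value
```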

The source logic lives outside the table definition

Business behavior is often encoded in:

  • PL/SQL packages
  • triggers
  • sequence-populating logic
  • audit tables
  • jobs and schedulers

A clean PostgreSQL DDL file tells us nothing about whether the target system preserves those behaviors.

Runtime order matters

Even if the target schema is perfect, applying rows in the wrong order still fails. A child row may be structurally valid and still be impossible to write because the parent has not been promoted yet. This is exactly why the reconciliation/CDC design introduces statuses such as WAITING_DEP and WAITING_REF.

The shape of the CDC feed matters

The architecture documented for Pulsaride Transform assumes AWS DMS emits JSON with the post-operation state only. That means:

  • there may be no full before image
  • reconciliation cannot rely on perfect event history
  • delete handling needs explicit strategy
  • replay must be idempotent
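With post-image-only events, idempotent replay usually reduces to last-write-wins keyed by a monotonically increasing source position. The sketch below applies events to an in-memory stand-in for a target table; the field names (`pk`, `op`, `scn`, `data`) are hypothetical, not the actual DMS payload shape.

```python
def apply_event(target: dict, event: dict) -> bool:
    """Idempotently apply a post-image CDC event to an in-memory target.

    `target` maps primary key -> row. Each event carries the full
    post-operation state plus a monotonically increasing source position
    (`scn` here, a hypothetical field). Replaying the same event, or an
    older one, is a no-op, so the whole stream can be replayed safely.
    Returns True if the target changed.
    """
    pk = event["pk"]
    if event["op"] == "D":
        return target.pop(pk, None) is not None
    current = target.get(pk)
    if current is not None and current["scn"] >= event["scn"]:
        return False                                  # stale or duplicate: skip
    target[pk] = {"scn": event["scn"], **event["data"]}
    return True
```

In a real pipeline the same comparison becomes a conditional upsert against the target table, but the invariant is identical: applying an event twice must leave the target unchanged.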

Again, none of this is solved by schema conversion.

The migration is judged on outcomes, not generated files

No stakeholder will say the migration succeeded because the DDL compiled. They will ask:

  • Is the target complete?
  • Is the target consistent?
  • Are writes still flowing?
  • Can we explain the drift?
  • Can we cut over safely?

Schema conversion helps with one part of one of those questions. This is precisely where most DMS-based pipelines and schema conversion utilities stop — and where the real migration problem begins.

That is why Pulsaride Transform has to treat schema conversion as one module inside a wider product, not as the product itself.

1.3 Why CDC, full load, cutover, and reconciliation must be designed together

Migration teams often design these pieces separately:

  • full load as a batch concern
  • CDC as a streaming concern
  • reconciliation as a reporting concern
  • cutover as an operational checklist

That separation is convenient for slide decks and dangerous in production.

These concerns are coupled by the same timeline.

The real migration timeline

In a live migration, the usual shape is:

  1. capture a source watermark
  2. start or prepare CDC
  3. execute the full load
  4. replay overlap changes
  5. continue live replication
  6. validate convergence
  7. cut over
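The watermark handoff between steps 1-4 can be sketched as a classification of each incoming change against the captured position. The positions are illustrative (for Oracle sources they would typically be SCNs), and the three-way split is a simplification of the real overlap handling.

```python
def classify_change(change_position: int, watermark: int,
                    full_load_done: bool) -> str:
    """Decide how a CDC change relates to the full-load watermark.

    Changes at or before the watermark are already included in the
    snapshot; changes captured while the full load is still running fall
    in the overlap window and must be replayed on top of the loaded data;
    later changes flow as live replication.
    """
    if change_position <= watermark:
        return "in-snapshot"      # already captured by the full load
    if not full_load_done:
        return "overlap-replay"   # buffer, then replay after load completes
    return "live"
```

The reason overlap-replay must be idempotent is visible here: a change can legitimately be present both in the snapshot and in the buffered stream, and applying it twice must be harmless.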

If these steps are not designed as one system, the handoffs become unsafe.

Full load without CDC creates a data gap

If a row changes while the initial load is still running, the target becomes stale immediately unless CDC is capturing the overlap window correctly.

CDC without reconciliation creates false confidence

A green consumer lag graph does not prove target correctness. Messages can be acknowledged, staged, retried, dead-lettered, or replayed while lag still looks acceptable.

Reconciliation without replay creates operator pain

If validation can detect drift but the product cannot target a replay, the only recovery path is manual SQL or a partial re-migration.

Cutover without integrated readiness criteria is theater

Manual go/no-go checklists with no automated gate enforcement are the root cause of cutovers that overrun their window and end in partial rollbacks. You cannot make a safe go/no-go decision without combining:

  • source-target data convergence
  • CDC lag
  • unresolved dependencies
  • dead-letter backlog
  • sequence state
  • constraint readiness
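An automated gate over signals like these can be sketched as a pure function from metrics and thresholds to a decision. The metric and threshold names below are illustrative assumptions; the point is that the gate is computed from evidence, and a "no" always comes with its reasons.

```python
def cutover_ready(metrics: dict, thresholds: dict) -> tuple:
    """Evaluate go/no-go signals against explicit thresholds.

    Returns (ready, failures): the decision plus every failing reason,
    so a negative answer is always explainable to the operator.
    """
    failures = []
    if metrics["cdc_lag_seconds"] > thresholds["max_cdc_lag_seconds"]:
        failures.append("CDC lag above threshold")
    if metrics["unresolved_dependencies"] > thresholds["max_unresolved_dependencies"]:
        failures.append("unresolved dependency backlog")
    if metrics["dead_letter_count"] > thresholds["max_dead_letter_count"]:
        failures.append("dead-letter backlog")
    if metrics["reconciliation_mismatches"] > thresholds["max_reconciliation_mismatches"]:
        failures.append("reconciliation outside tolerance")
    return (not failures, failures)
```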

That is why the architecture behind Pulsaride Transform emphasizes a staged state model and periodic reconciliation. It recognizes that migration correctness is not a single moment. It is a continuously maintained condition.

The product therefore has to design these elements together:

  • full load watermarking
  • CDC ingestion and idempotent replay
  • dependency resolution
  • reconciliation signals
  • cutover readiness criteria

If one of those is left out of the product and delegated to manual process, the whole chain weakens.

1.4 The migration failure modes this book is built around

This book is not organized around abstract best practices. It is organized around concrete failure modes that repeatedly appear in real Oracle -> PostgreSQL migrations and are visible in the reconciliation-oriented architecture work.

Failure mode 1: "The schema migrated, therefore the data will work"

It will not. Semantics drift through types, default values, sequences, empty strings, case handling, and procedural logic. Most teams assume they will validate this at the end. Experience shows consistently that migrations without continuous reconciliation do not know they have a problem until cutover.

Failure mode 2: "CDC is running, therefore the target is current"

Not necessarily. The pipeline may have:

  • rows stuck in WAITING_DEP
  • rows stuck in WAITING_REF
  • rows looping through ERROR
  • replay debt hidden behind acceptable Kafka lag

Failure mode 3: "Counts match, therefore the migration is correct"

Counts are a weak signal. Two systems can have the same row count and still disagree on:

  • business keys
  • timestamps
  • status flags
  • denormalized columns
  • sequence-backed identifiers
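A per-key checksum comparison catches exactly what counts cannot: rows that exist on both sides but disagree in content. A minimal sketch, assuming rows have already been fetched into dictionaries keyed by their business key:

```python
import hashlib

def row_checksum(row: dict, columns: list) -> str:
    """Hash the compared columns of one row in a stable order."""
    material = "|".join(repr(row.get(c)) for c in columns)
    return hashlib.sha256(material.encode()).hexdigest()

def find_divergent_keys(source_rows, target_rows, key, columns):
    """Return keys present on both sides whose checksums differ.

    Keys missing from one side are a separate (and easier) signal;
    this function isolates the disagreements a bare COUNT(*) never sees.
    """
    src = {r[key]: row_checksum(r, columns) for r in source_rows}
    tgt = {r[key]: row_checksum(r, columns) for r in target_rows}
    return sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k])
```

In practice the hashing is pushed down into SQL on each engine to avoid moving full tables, but the normalization concern is the same: both sides must hash identical representations of timestamps, numerics, and NULLs, or the checksums diverge for cosmetic reasons.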

Failure mode 4: "A retry mechanism is enough"

Blind retries do not fix dependency cycles, bad mappings, or missing reference data. They only turn deterministic failure into delayed deterministic failure.

Failure mode 5: "A dead-letter queue is an edge-case bucket"

In immature migration products, the dead-letter queue becomes the real backlog of unresolved product gaps:

  • unsupported data shapes
  • missing transform rules
  • incorrect delete semantics
  • ordering assumptions that do not hold

If dead letters are not converted into product improvements, the product stops learning.

Failure mode 6: "Parallelism is always better"

Higher concurrency can improve throughput, but it can also amplify:

  • hot partitions
  • ordering loss
  • lock contention
  • duplicate work
  • uneven backlog by table

Failure mode 7: "Cutover is a calendar event"

Cutover is not safe because the weekend has been reserved. It is safe only when the product can prove:

  • the backlog is drained
  • unresolved dependencies are near zero or explained
  • reconciliation is within tolerance
  • recovery paths are understood

Failure mode 8: "Manual fixes are acceptable because this is a one-time migration"

This is one of the most expensive lies in data migration. Manual fixes accumulate in notebooks and SQL files, and then the next wave starts from scratch. A product exists precisely to convert those fixes into reusable capabilities.

Failure mode 9: "We can discover data behavior after cutover"

By then, every issue becomes more expensive:

  • business users see broken records
  • rollback windows shrink
  • trust drops
  • manual intervention expands

This book is built around avoiding those failure modes by design, not by heroics.

1.5 A tour of the product: assessment, transform, CDC, validation, cutover

Pulsaride Transform can be understood as five connected layers.

Assessment

Assessment answers one question: what are we actually migrating?

The product must inspect:

  • Oracle schemas and objects
  • data volumes and table hotness
  • PK/FK structure
  • sequences and identity patterns
  • triggers, packages, and jobs
  • type risks and incompatible constructs

The goal is not to produce a static report. It is to build the migration plan:

  • which objects can be migrated automatically
  • which need declarative transform rules
  • which need manual treatment
  • which dependencies define load order and replay logic

Transform

Transform answers the question: how should source data and behaviors be represented in PostgreSQL?

This includes:

  • schema mapping
  • column normalization
  • type conversion
  • lookup enrichment
  • deduplication
  • foreign-key-aware staging logic
  • procedural logic replacement where needed

The product should prefer declarative mappings and reusable steps over custom code because migration debt compounds quickly.

CDC

CDC answers the question: how do we keep the target moving while the source is still live?

Based on the architecture explored so far, this layer includes:

  • ingestion from a DMS-emitted Kafka stream
  • durable persistence to staging
  • dependency-aware resolution
  • promotion into target tables
  • idempotent replay
  • operator-visible state transitions

CDC is not just a connector. It is a controlled propagation system.

Validation

Validation answers the question: how do we know the target is converging correctly?

This layer must combine:

  • row counts
  • checksums
  • sample-based diffs
  • dependency backlog analysis
  • dead-letter analysis
  • sequence sanity checks
  • cutover readiness signals

Validation is where the product proves credibility. It is also where reconciliation findings are turned back into roadmap items.

Cutover

Cutover answers the final question: when is it safe to switch?

The product needs an explicit cutover model:

  • readiness criteria
  • lag thresholds
  • data quality thresholds
  • sequence reset checks
  • unresolved backlog policy
  • rollback strategy

A mature migration product does not merely move data. It supports decision-making under risk.

That is the real purpose of Pulsaride Transform.

It exists because Oracle -> PostgreSQL migration is not a translation problem. It is a controlled convergence problem across structure, data, time, and operations. The rest of this book explains how to solve that problem without falling back to ad hoc scripts and migration folklore.

Why This Matters in Practice

This chapter defines the practical criteria for the rest of the book.

If the reader sees Oracle -> PostgreSQL migration as a schema-conversion exercise, every later capability in Pulsaride Transform will look excessive. If the reader understands that migration is really a controlled convergence problem, then staging, replay, reconciliation, dependency handling, and cutover readiness stop looking like nice-to-haves and start looking like the core product.

That shift matters because it changes the architecture and the way the migration is approached. The right comparison is not "DDL converter versus DDL converter." The right comparison is "manual migration program with scattered scripts versus a solution that turns repeated migration pain into reusable capabilities."

Where Generic Approaches Fall Short

Generic migration tools usually do one or two things well:

  • convert schema objects
  • move rows from A to B

What they miss is the operational middle:

  • how to keep target data convergent while source data is still changing
  • how to explain blocked rows and replay debt
  • how to validate semantic correctness rather than just technical completion
  • how to make cutover a productized decision instead of a war-room judgment

That gap is exactly where most migration cost hides. The license may look cheap, but the missing operating model gets rebuilt manually through scripts, project-specific logic, and human coordination.

How Pulsaride Transform Helps

Pulsaride Transform is positioned as a migration product, not a converter, by combining five layers that reinforce each other:

  • assessment to discover real source-system scope
  • transform to express semantic conversion declaratively
  • CDC and full load orchestration to maintain convergence
  • validation and reconciliation to make drift visible and actionable
  • cutover readiness to support operational switching with evidence

In practice, this means the product is designed to absorb the recurring problems that usually escape tooling:

  • parent-child timing
  • replay and recovery
  • semantic drift
  • visibility into unresolved backlog
  • migration-readiness decisions

This is the architectural spine that the following chapters make concrete.

What This Means for Migration Teams

A migration lead does not need tooling just to obtain converted SQL. What matters is reducing program risk, shortening the custom-engineering phase, and improving confidence before cutover.

This chapter therefore sets a practical standard for the rest of the book: the more Pulsaride Transform can absorb assessment, transformation, replay, validation, and cutover discipline, the less the migration depends on fragile heroics and hard-to-scale expert intervention.

1.6 Why Not Build It Yourself?

Most teams underestimate migration complexity by an order of magnitude. The instinct to build is natural — the source schema is already understood, the team knows Java or Python, and the problem looks like it reduces to a JDBC SELECT and a COPY command. The reality is that the first home-built migration pipeline typically runs 3–4× the original timeline estimate, not because the engineers are not capable, but because the four structural problems (CDC overlap correctness, referential timing, reconciliation feedback loops, cutover gate enforcement) do not reveal themselves until the pipeline is confronted with real production data at scale.

If your current plan is “we’ll validate at the end,” you have already lost control of the migration. End-of-migration validation is not validation — it is archaeology on a system that has been running without oversight.

There are three common alternatives to Pulsaride Transform. Each is legitimate for some contexts. Here is an honest account of what each one provides, what it does not, and where the cost accumulates.

Option A — Build an Internal Pipeline

What you get: Full control over every technical decision. No external dependency. Potentially tight integration with your existing infrastructure.

What you do not get: The CDC-reconciliation-cutover chain is the hard part. It takes months to assemble and an incident during parallel running to debug correctly. Dependency resolution, staging as a durable audit trail, idempotent replay, and automated cutover gates are not features you add at the end — they are architecture decisions that have to be correct from the beginning.

The hidden cost: Every engineer hour spent building migration infrastructure is an engineer hour not spent on the application. The pipeline becomes a maintenance burden that outlives the migration and has no commercial incentive to improve. The second migration on the same estate starts nearly from scratch because the pipeline was built for the first schema, not for a repeatable capability.

Real signal: Teams that build their own migration pipelines average 3–4× the original timeline for the first migration. The second migration on a self-built pipeline typically takes almost as long as the first.

Option B — AWS DMS + Debezium (Assembled Tooling)

What you get: Solid CDC infrastructure. Well-documented connectors. Cloud-managed components with SLAs.

What you do not get: DMS handles data movement. It does not handle reconciliation, dependency ordering, staging as an audit trail, auto-generated compensating writes, or cutover readiness gates. Debezium solves the change capture problem — not the migration problem. You still need to build the integration layer between them, and that layer is where most of the structural complexity lives.

The hidden cost: Integration overhead between tools, gaps in reconciliation coverage, no pattern-to-rule automation. You end up building the surrounding system anyway. The only question is whether you discover that before or after the first failed cutover.

DMS can be used as the CDC layer inside a Pulsaride Transform deployment. The two are not mutually exclusive, but DMS alone is not a migration product.

Option C — External Consultancy

What you get: A team with migration experience who will run the project. They have seen schemas like yours before and can move quickly.

What you do not get: Knowledge transfer. When the engagement ends, everything the consultants learned about your system — the hidden foreign keys, the trigger behavior, the edge cases in CDC replay, the normalization rules that cleared 4,000 reconciliation discrepancies — leaves with them.

The hidden cost: Every Oracle estate has a second migration in it. A consultancy engagement answers the current problem; it does not compound. The next wave starts from zero again, with a new team, a new timeline estimate, and the same structural discoveries.

Consultancies are effective for execution support on a well-defined migration. They are not a substitute for a product that learns.

The Positioning Close

Pulsaride Transform is built to leave knowledge behind. Every reconciliation finding that resolves into a rule makes the next migration faster. Every normalization pattern that gets added to the rule library reduces the pre-cutover cleanup time on the next schema. The product compounds across migrations in a way that an assembled toolchain or a consultancy engagement structurally cannot.

If you are deciding now, the relevant questions are:

  • How many Oracle schemas does your organization have to migrate?
  • How much of your engineering team’s time can you afford to spend building and maintaining pipeline infrastructure?
  • What happens when the migration engineer who built the internal pipeline leaves or moves to a different project?

For a single small migration with a clean schema and a team that wants total control, building in-house is defensible. For anything with referential complexity, PL/SQL surface area, or more than one wave of migrations ahead, the build cost consistently exceeds the product cost within the first migration.
