Chapter 7

Sequences, Identity, and Key Generation

Every primary key was generated somewhere. In Oracle, it was almost certainly a SEQUENCE — an explicit database object that the application calls to obtain the next integer. In PostgreSQL, the same role can be played by a SEQUENCE, a SERIAL column, or an IDENTITY column. The behavioral differences between these approaches, and the specific problems that arise when a full data load is followed by application writes that hit the same sequence, make key generation one of the most operationally important details in a migration.

Get it wrong and the first application write after cutover produces a duplicate key violation on the primary key. This failure is loud, unambiguous, and happens within seconds of cutover. Get it right and it never appears.

7.1 Oracle SEQUENCE Semantics

Oracle sequences are standalone database objects that generate monotonically increasing (or decreasing) integers. The essential parameters:

-- Oracle sequence definition
CREATE SEQUENCE account_seq
  START WITH 1
  INCREMENT BY 1
  NOCACHE
  NOCYCLE
  NOMAXVALUE;

Applications call account_seq.NEXTVAL to obtain the next value and account_seq.CURRVAL to read the current value within the same session.

CACHE

The CACHE parameter tells Oracle to pre-generate and cache a block of sequence values in memory. A sequence with CACHE 20 pre-generates 20 values; these are served from memory without a dictionary write until the cache is exhausted, then the next block is generated.

The operational implication: if the database restarts while cached values are in memory, those values are lost. After restart, the sequence continues from the next uncached block. A sequence that started at 1 with CACHE 20 might produce: 1, 2, 3, ..., 20, 21, ..., 40, [restart], 61, 62, ... with gap 41–60 permanently missing.

Applications that assume sequence values are gapless (using them as contiguous record numbers, for example) will behave incorrectly after any cache flush. This is a pre-existing Oracle behavior that migration does not introduce but must preserve.

NOCACHE sequences write to the data dictionary on every call. They are slower (every NEXTVAL involves a dictionary row update) but produce gapless sequences in the absence of transaction rollbacks (which always create gaps regardless of CACHE).

NOCYCLE vs MAXVALUE

A sequence with NOCYCLE raises ORA-08004: sequence exhausted when it reaches MAXVALUE. A sequence with CYCLE wraps back to MINVALUE. Most production Oracle sequences are NOCYCLE with NOMAXVALUE (or a very large MAXVALUE like 9999999999999999999). Migration must preserve this property.

NEXTVAL and the Application Pattern

Oracle applications typically use sequences in one of two patterns:

Pattern A: NEXTVAL in INSERT

INSERT INTO account (account_id, user_name, created_at)
VALUES (account_seq.NEXTVAL, :user_name, SYSDATE);

Pattern B: NEXTVAL before INSERT

SELECT account_seq.NEXTVAL INTO v_account_id FROM DUAL;
INSERT INTO account (account_id, user_name, created_at)
VALUES (v_account_id, :user_name, SYSDATE);
-- v_account_id used in subsequent code (e.g., to insert child rows)

Both patterns require the application to call the PostgreSQL sequence or identity mechanism after migration.

7.2 PostgreSQL Sequence Semantics

PostgreSQL sequences are also standalone objects with similar parameters:

-- PostgreSQL sequence
CREATE SEQUENCE account_seq
  START WITH 1
  INCREMENT BY 1
  NO CYCLE;

The function call syntax replaces Oracle's .NEXTVAL syntax:

-- PostgreSQL equivalent of account_seq.NEXTVAL
SELECT nextval('account_seq');

-- Equivalent of account_seq.CURRVAL
SELECT currval('account_seq');

In PostgreSQL, currval returns the most recently returned value from nextval in the current session — same session-scoped semantics as Oracle's CURRVAL.

SERIAL and BIGSERIAL

SERIAL is a PostgreSQL shorthand that creates a sequence, sets its ownership to the column, and sets the column default to nextval('sequence_name'):

CREATE TABLE account (
  account_id SERIAL PRIMARY KEY,
  user_name  TEXT
);
-- Equivalent to:
CREATE SEQUENCE account_account_id_seq;
CREATE TABLE account (
  account_id INTEGER NOT NULL DEFAULT nextval('account_account_id_seq') PRIMARY KEY,
  user_name  TEXT
);
ALTER SEQUENCE account_account_id_seq OWNED BY account.account_id;

BIGSERIAL does the same with BIGINT.

IDENTITY Columns (PostgreSQL 10+)

GENERATED ALWAYS AS IDENTITY and GENERATED BY DEFAULT AS IDENTITY are the SQL-standard way to auto-generate primary keys, recommended over SERIAL for new schemas:

CREATE TABLE account (
  account_id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  user_name  TEXT
);

GENERATED ALWAYS prevents explicit INSERT of a value into account_id (unless OVERRIDING SYSTEM VALUE is used). GENERATED BY DEFAULT allows explicit inserts (matching Oracle sequence behavior where the application can insert any value into the PK column).

Product note: Pulsaride Transform migrates Oracle sequences as PostgreSQL SEQUENCE objects (not SERIAL or IDENTITY), and the DDL sets the column default to nextval('sequence_name'). This preserves the explicit sequence object that can be independently managed and reset, rather than burying it in column metadata. Application code using Pattern A or Pattern B must be updated to use nextval('account_seq') in place of account_seq.NEXTVAL.

7.3 The Sequence Value Drift Problem

The most operationally significant sequence issue in migration is sequence value drift after full load.

Here is what happens:

  1. Oracle account_seq has produced values 1 through 847,331 before the full load begins.
  2. The full load copies all 847,331 rows from Oracle account to PostgreSQL account, including the account_id values.
  3. The PostgreSQL account_seq was created with START WITH 1 and has never been called.
  4. After cutover, the application inserts a new account and calls nextval('account_seq'), which returns 1.
  5. The INSERT fails with duplicate key value violates unique constraint "pk_account" because account_id = 1 already exists.

This failure is deterministic and will happen on the first insert after cutover unless the sequence is advanced.

The Fix: Sequence Advancement After Full Load

After the full load completes, for every sequence-driven primary key column, the corresponding PostgreSQL sequence must be advanced to a value greater than the maximum currently in the table.

-- Find the maximum account_id loaded from Oracle
SELECT MAX(account_id) FROM account;
-- Returns: 847331

-- Advance the PostgreSQL sequence to avoid conflict
-- Add a safety margin (e.g., +1000) to account for in-flight Oracle transactions
-- that may not yet be in the full load snapshot
SELECT setval('account_seq', 848331);

-- Verify
SELECT last_value FROM account_seq;
-- Returns: 848331

The safety margin accounts for Oracle writes that occurred between the full load snapshot timestamp and the moment CDC catches up to real-time. If Oracle advanced the sequence to 848,000 during the CDC catch-up window, a PostgreSQL sequence reset to 847,332 would still produce duplicates during CDC replay.

Product note: Pulsaride Transform includes an automated sequence advancement step that runs after the full load and before the post-load DDL (constraint creation). For each table with a sequence-backed primary key, the step:

  1. Queries MAX(pk_column) from the target table
  2. Queries the current Oracle sequence value from DBA_SEQUENCES.LAST_NUMBER
  3. Sets the PostgreSQL sequence to MAX(oracle_last_number, max_target_pk) + safety_margin

The safety_margin is configurable (default: 10,000) and should be set to a value larger than the maximum number of Oracle inserts expected during the CDC catch-up window. For high-volume tables (>1,000 inserts/hour), increase the margin proportionally.

# pulsaride-transform configuration
sequenceAdvancement:
  enabled: true
  safetyMargin: 10000
  tables:
    - table: account
      pkColumn: account_id
      sequence: account_seq
    - table: order_header
      pkColumn: order_id
      sequence: order_header_seq

7.4 Oracle SEQUENCE Features Without Direct Equivalents

ORDER Clause (Oracle RAC)

Oracle sequences in a RAC environment can specify ORDER to guarantee that NEXTVAL calls across all RAC nodes are strictly monotonically increasing. Without ORDER, each RAC node may serve values from its own cache block, so the sequence is ascending per-node but not globally ordered.

PostgreSQL has no RAC equivalent, so this distinction does not apply. PostgreSQL sequences are served from a single sequence object and are globally monotonically increasing.

SEQUENCE in TRIGGER (Oracle Pattern)

Oracle applications commonly use a BEFORE INSERT trigger to populate the primary key from a sequence, allowing the application to insert without specifying the PK:

-- Oracle trigger
CREATE OR REPLACE TRIGGER account_bi
  BEFORE INSERT ON account
  FOR EACH ROW
BEGIN
  IF :NEW.account_id IS NULL THEN
    SELECT account_seq.NEXTVAL INTO :NEW.account_id FROM DUAL;
  END IF;
END;

This pattern is compatible with PostgreSQL using a PL/pgSQL trigger, but the recommended migration approach is to eliminate the trigger and instead set the column default:

-- PostgreSQL: column default replaces trigger
ALTER TABLE account ALTER COLUMN account_id SET DEFAULT nextval('account_seq');

Applications that insert with account_id = NULL or omit the column will receive a sequence value from the default. The trigger approach works but adds per-row trigger overhead on every insert.

Product note: Pulsaride Transform detects the sequence-in-trigger pattern during trigger analysis (covered in Chapter 8) and flags it as TRIGGER_SEQUENCE_PATTERN. For these triggers, the recommended resolution is column default replacement, not trigger migration. The assessment report includes a generated ALTER TABLE statement for the operator's review.

7.5 UUID Keys

Some Oracle schemas use VARCHAR2 primary keys populated with GUIDs generated by SYS_GUID() or by the application using UUID libraries. These migrate directly to PostgreSQL UUID or TEXT columns.

-- Oracle
CREATE TABLE session_token (
  token_id   VARCHAR2(36) PRIMARY KEY DEFAULT SYS_GUID(),
  user_id    NUMBER NOT NULL,
  expires_at DATE NOT NULL
);
-- Note: SYS_GUID() returns raw bytes, not formatted UUID
-- PostgreSQL (using uuid-ossp extension)
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

CREATE TABLE session_token (
  token_id   UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
  user_id    INTEGER NOT NULL,
  expires_at TIMESTAMP WITH TIME ZONE NOT NULL
);

Oracle's SYS_GUID() returns a RAW(16) value. When stored in VARCHAR2(36), it is typically hex-encoded without dashes (32 characters) rather than UUID format with dashes (36 characters). During migration, the full load must normalize these values — converting the Oracle hex string to UUID format or preserving them as-is and using TEXT instead of UUID in PostgreSQL.

Product note: Pulsaride Transform detects columns named *_ID or *_GUID with VARCHAR2(32) or VARCHAR2(36) type and examines sample values to determine whether they are SYS_GUID values. If so, a UUID_COLUMN finding is emitted with two options: migrate as TEXT (preserving exact values) or migrate as UUID (converting format). The TEXT option is safer and is the default.

7.6 Sequence Gaps Are Normal; Plan for Them

Applications that treat sequence values as dense counters — displaying "Order #847331" and expecting the next to be "Order #847332" — will encounter gaps. Gaps arise from:

  • Transaction rollbacks (NEXTVAL is never rolled back, even if the INSERT is)
  • CACHE flush on restart
  • Manual sequence advancement during migration
  • Load testing or development inserts that were rolled back

This is pre-existing Oracle behavior. Migration does not introduce gaps, but the sequence advancement step after full load creates a deliberate gap between the last loaded value and the sequence start point. This gap is visible in reporting dashboards and can confuse operations teams who are not prepared for it.

Communication checkpoint: Before cutover, brief the operations team on the expected sequence gap. Document the pre-migration last value, the safety margin, and the post-advancement starting value. A post-cutover "missing order numbers" incident that is actually expected sequence advancement can consume significant incident response time.

7.7 Example: Multi-Table Sequence Advancement in a Financial Schema

A financial application uses sequences for five tables: account, transaction, payment, ledger_entry, and reconciliation_batch.

Oracle sequence states at full load snapshot time:

Sequence Last Oracle Value Max in Target Table
account_seq 847,331 847,290 (in-flight delta)
transaction_seq 24,891,044 24,886,201
payment_seq 6,102,887 6,101,990
ledger_entry_seq 78,440,321 78,423,100
recon_batch_seq 4,421 4,421

The safety margin is 10,000. The advancement step sets:

SELECT setval('account_seq',      857331);   -- 847331 + 10000
SELECT setval('transaction_seq',  24901044); -- 24891044 + 10000
SELECT setval('payment_seq',       6112887); -- 6102887 + 10000
SELECT setval('ledger_entry_seq', 78450321); -- 78440321 + 10000
SELECT setval('recon_batch_seq',    14421);  -- 4421 + 10000

During CDC catch-up (which covers the window between full load snapshot and real-time), Oracle inserts arrive in the staging area. CDC events for account show values up to 847,416 (85 new rows post-snapshot). The sequence was advanced to 857,331 — well clear of the in-flight values. CDC replay applies those rows to PostgreSQL using their Oracle-generated primary keys (not calling the sequence), so the sequence advance does not conflict with CDC replay. The sequence only activates when the application writes to PostgreSQL after cutover, at which point it starts at 857,331, safely beyond all replayed values.

This example illustrates why the safety margin must account for the entire CDC catch-up window, not just the instant of full load completion.

← Previous

Chapter 6Constraints, Indexes, and DDL Migration

Next →

Chapter 8PL/SQL to PL/pgSQL