Sequences, Identity, and Key Generation
Every primary key was generated somewhere. In Oracle, it was almost certainly a SEQUENCE — an explicit database object that the application calls to obtain the next integer. In PostgreSQL, the same role can be played by a SEQUENCE, a SERIAL column, or an IDENTITY column. The behavioral differences between these approaches, and the specific problems that arise when a full data load is followed by application writes that hit the same sequence, make key generation one of the most operationally important details in a migration.
Get it wrong and the first application write after cutover produces a duplicate key violation on the primary key. This failure is loud, unambiguous, and happens within seconds of cutover. Get it right and it never appears.
7.1 Oracle SEQUENCE Semantics
Oracle sequences are standalone database objects that generate monotonically increasing (or decreasing) integers. The essential parameters:
-- Oracle sequence definition
CREATE SEQUENCE account_seq
START WITH 1
INCREMENT BY 1
NOCACHE
NOCYCLE
NOMAXVALUE;
Applications call account_seq.NEXTVAL to obtain the next value and account_seq.CURRVAL to read the current value within the same session.
CACHE
The CACHE parameter tells Oracle to pre-generate and cache a block of sequence values in memory. A sequence with CACHE 20 pre-generates 20 values; these are served from memory without a dictionary write until the cache is exhausted, then the next block is generated.
The operational implication: if the database restarts while cached values are in memory, those values are lost. After restart, the sequence continues from the next uncached block. A sequence that started at 1 with CACHE 20 might produce: 1, 2, 3, ..., 20, 21, ..., 40, [restart], 61, 62, ... with gap 41–60 permanently missing.
Applications that assume sequence values are gapless (using them as contiguous record numbers, for example) will behave incorrectly after any cache flush. This is a pre-existing Oracle behavior that migration does not introduce but must preserve.
NOCACHE sequences write to the data dictionary on every call. They are slower (every NEXTVAL involves a dictionary row update) but produce gapless sequences in the absence of transaction rollbacks (which always create gaps regardless of CACHE).
NOCYCLE vs MAXVALUE
A sequence with NOCYCLE raises ORA-08004: sequence exhausted when it reaches MAXVALUE. A sequence with CYCLE wraps back to MINVALUE. Most production Oracle sequences are NOCYCLE with NOMAXVALUE (or a very large MAXVALUE like 9999999999999999999). Migration must preserve this property.
NEXTVAL and the Application Pattern
Oracle applications typically use sequences in one of two patterns:
Pattern A: NEXTVAL in INSERT
INSERT INTO account (account_id, user_name, created_at)
VALUES (account_seq.NEXTVAL, :user_name, SYSDATE);
Pattern B: NEXTVAL before INSERT
SELECT account_seq.NEXTVAL INTO v_account_id FROM DUAL;
INSERT INTO account (account_id, user_name, created_at)
VALUES (v_account_id, :user_name, SYSDATE);
-- v_account_id used in subsequent code (e.g., to insert child rows)
Both patterns require the application to call the PostgreSQL sequence or identity mechanism after migration.
7.2 PostgreSQL Sequence Semantics
PostgreSQL sequences are also standalone objects with similar parameters:
-- PostgreSQL sequence
CREATE SEQUENCE account_seq
START WITH 1
INCREMENT BY 1
NO CYCLE;
The function call syntax replaces Oracle's .NEXTVAL syntax:
-- PostgreSQL equivalent of account_seq.NEXTVAL
SELECT nextval('account_seq');
-- Equivalent of account_seq.CURRVAL
SELECT currval('account_seq');
In PostgreSQL, currval returns the most recently returned value from nextval in the current session — same session-scoped semantics as Oracle's CURRVAL.
SERIAL and BIGSERIAL
SERIAL is a PostgreSQL shorthand that creates a sequence, sets its ownership to the column, and sets the column default to nextval('sequence_name'):
CREATE TABLE account (
account_id SERIAL PRIMARY KEY,
user_name TEXT
);
-- Equivalent to:
CREATE SEQUENCE account_account_id_seq;
CREATE TABLE account (
account_id INTEGER NOT NULL DEFAULT nextval('account_account_id_seq') PRIMARY KEY,
user_name TEXT
);
ALTER SEQUENCE account_account_id_seq OWNED BY account.account_id;
BIGSERIAL does the same with BIGINT.
IDENTITY Columns (PostgreSQL 10+)
GENERATED ALWAYS AS IDENTITY and GENERATED BY DEFAULT AS IDENTITY are the SQL-standard way to auto-generate primary keys, recommended over SERIAL for new schemas:
CREATE TABLE account (
account_id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
user_name TEXT
);
GENERATED ALWAYS prevents explicit INSERT of a value into account_id (unless OVERRIDING SYSTEM VALUE is used). GENERATED BY DEFAULT allows explicit inserts (matching Oracle sequence behavior where the application can insert any value into the PK column).
Product note: Pulsaride Transform migrates Oracle sequences as PostgreSQL SEQUENCE objects (not SERIAL or IDENTITY), and the DDL sets the column default to nextval('sequence_name'). This preserves the explicit sequence object that can be independently managed and reset, rather than burying it in column metadata. Application code using Pattern A or Pattern B must be updated to use nextval('account_seq') in place of account_seq.NEXTVAL.
7.3 The Sequence Value Drift Problem
The most operationally significant sequence issue in migration is sequence value drift after full load.
Here is what happens:
- Oracle
account_seqhas produced values 1 through 847,331 before the full load begins. - The full load copies all 847,331 rows from Oracle
accountto PostgreSQLaccount, including theaccount_idvalues. - The PostgreSQL
account_seqwas created withSTART WITH 1and has never been called. - After cutover, the application inserts a new account and calls
nextval('account_seq'), which returns1. - The INSERT fails with
duplicate key value violates unique constraint "pk_account"becauseaccount_id = 1already exists.
This failure is deterministic and will happen on the first insert after cutover unless the sequence is advanced.
The Fix: Sequence Advancement After Full Load
After the full load completes, for every sequence-driven primary key column, the corresponding PostgreSQL sequence must be advanced to a value greater than the maximum currently in the table.
-- Find the maximum account_id loaded from Oracle
SELECT MAX(account_id) FROM account;
-- Returns: 847331
-- Advance the PostgreSQL sequence to avoid conflict
-- Add a safety margin (e.g., +1000) to account for in-flight Oracle transactions
-- that may not yet be in the full load snapshot
SELECT setval('account_seq', 848331);
-- Verify
SELECT last_value FROM account_seq;
-- Returns: 848331
The safety margin accounts for Oracle writes that occurred between the full load snapshot timestamp and the moment CDC catches up to real-time. If Oracle advanced the sequence to 848,000 during the CDC catch-up window, a PostgreSQL sequence reset to 847,332 would still produce duplicates during CDC replay.
Product note: Pulsaride Transform includes an automated sequence advancement step that runs after the full load and before the post-load DDL (constraint creation). For each table with a sequence-backed primary key, the step:
- Queries
MAX(pk_column)from the target table - Queries the current Oracle sequence value from
DBA_SEQUENCES.LAST_NUMBER - Sets the PostgreSQL sequence to
MAX(oracle_last_number, max_target_pk) + safety_margin
The safety_margin is configurable (default: 10,000) and should be set to a value larger than the maximum number of Oracle inserts expected during the CDC catch-up window. For high-volume tables (>1,000 inserts/hour), increase the margin proportionally.
# pulsaride-transform configuration
sequenceAdvancement:
enabled: true
safetyMargin: 10000
tables:
- table: account
pkColumn: account_id
sequence: account_seq
- table: order_header
pkColumn: order_id
sequence: order_header_seq
7.4 Oracle SEQUENCE Features Without Direct Equivalents
ORDER Clause (Oracle RAC)
Oracle sequences in a RAC environment can specify ORDER to guarantee that NEXTVAL calls across all RAC nodes are strictly monotonically increasing. Without ORDER, each RAC node may serve values from its own cache block, so the sequence is ascending per-node but not globally ordered.
PostgreSQL has no RAC equivalent, so this distinction does not apply. PostgreSQL sequences are served from a single sequence object and are globally monotonically increasing.
SEQUENCE in TRIGGER (Oracle Pattern)
Oracle applications commonly use a BEFORE INSERT trigger to populate the primary key from a sequence, allowing the application to insert without specifying the PK:
-- Oracle trigger
CREATE OR REPLACE TRIGGER account_bi
BEFORE INSERT ON account
FOR EACH ROW
BEGIN
IF :NEW.account_id IS NULL THEN
SELECT account_seq.NEXTVAL INTO :NEW.account_id FROM DUAL;
END IF;
END;
This pattern is compatible with PostgreSQL using a PL/pgSQL trigger, but the recommended migration approach is to eliminate the trigger and instead set the column default:
-- PostgreSQL: column default replaces trigger
ALTER TABLE account ALTER COLUMN account_id SET DEFAULT nextval('account_seq');
Applications that insert with account_id = NULL or omit the column will receive a sequence value from the default. The trigger approach works but adds per-row trigger overhead on every insert.
Product note: Pulsaride Transform detects the sequence-in-trigger pattern during trigger analysis (covered in Chapter 8) and flags it as TRIGGER_SEQUENCE_PATTERN. For these triggers, the recommended resolution is column default replacement, not trigger migration. The assessment report includes a generated ALTER TABLE statement for the operator's review.
7.5 UUID Keys
Some Oracle schemas use VARCHAR2 primary keys populated with GUIDs generated by SYS_GUID() or by the application using UUID libraries. These migrate directly to PostgreSQL UUID or TEXT columns.
-- Oracle
CREATE TABLE session_token (
token_id VARCHAR2(36) PRIMARY KEY DEFAULT SYS_GUID(),
user_id NUMBER NOT NULL,
expires_at DATE NOT NULL
);
-- Note: SYS_GUID() returns raw bytes, not formatted UUID
-- PostgreSQL (using uuid-ossp extension)
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE session_token (
token_id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
user_id INTEGER NOT NULL,
expires_at TIMESTAMP WITH TIME ZONE NOT NULL
);
Oracle's SYS_GUID() returns a RAW(16) value. When stored in VARCHAR2(36), it is typically hex-encoded without dashes (32 characters) rather than UUID format with dashes (36 characters). During migration, the full load must normalize these values — converting the Oracle hex string to UUID format or preserving them as-is and using TEXT instead of UUID in PostgreSQL.
Product note: Pulsaride Transform detects columns named *_ID or *_GUID with VARCHAR2(32) or VARCHAR2(36) type and examines sample values to determine whether they are SYS_GUID values. If so, a UUID_COLUMN finding is emitted with two options: migrate as TEXT (preserving exact values) or migrate as UUID (converting format). The TEXT option is safer and is the default.
7.6 Sequence Gaps Are Normal; Plan for Them
Applications that treat sequence values as dense counters — displaying "Order #847331" and expecting the next to be "Order #847332" — will encounter gaps. Gaps arise from:
- Transaction rollbacks (NEXTVAL is never rolled back, even if the INSERT is)
- CACHE flush on restart
- Manual sequence advancement during migration
- Load testing or development inserts that were rolled back
This is pre-existing Oracle behavior. Migration does not introduce gaps, but the sequence advancement step after full load creates a deliberate gap between the last loaded value and the sequence start point. This gap is visible in reporting dashboards and can confuse operations teams who are not prepared for it.
Communication checkpoint: Before cutover, brief the operations team on the expected sequence gap. Document the pre-migration last value, the safety margin, and the post-advancement starting value. A post-cutover "missing order numbers" incident that is actually expected sequence advancement can consume significant incident response time.
7.7 Example: Multi-Table Sequence Advancement in a Financial Schema
A financial application uses sequences for five tables: account, transaction, payment, ledger_entry, and reconciliation_batch.
Oracle sequence states at full load snapshot time:
| Sequence | Last Oracle Value | Max in Target Table |
|---|---|---|
account_seq |
847,331 | 847,290 (in-flight delta) |
transaction_seq |
24,891,044 | 24,886,201 |
payment_seq |
6,102,887 | 6,101,990 |
ledger_entry_seq |
78,440,321 | 78,423,100 |
recon_batch_seq |
4,421 | 4,421 |
The safety margin is 10,000. The advancement step sets:
SELECT setval('account_seq', 857331); -- 847331 + 10000
SELECT setval('transaction_seq', 24901044); -- 24891044 + 10000
SELECT setval('payment_seq', 6112887); -- 6102887 + 10000
SELECT setval('ledger_entry_seq', 78450321); -- 78440321 + 10000
SELECT setval('recon_batch_seq', 14421); -- 4421 + 10000
During CDC catch-up (which covers the window between full load snapshot and real-time), Oracle inserts arrive in the staging area. CDC events for account show values up to 847,416 (85 new rows post-snapshot). The sequence was advanced to 857,331 — well clear of the in-flight values. CDC replay applies those rows to PostgreSQL using their Oracle-generated primary keys (not calling the sequence), so the sequence advance does not conflict with CDC replay. The sequence only activates when the application writes to PostgreSQL after cutover, at which point it starts at 857,331, safely beyond all replayed values.
This example illustrates why the safety margin must account for the entire CDC catch-up window, not just the instant of full load completion.