Quickstart

Get your first CSV → PostgreSQL migration running in under 10 minutes. The complete, runnable example is at pulsaride-example/quickstart/ in this repository — clone it, start docker compose, run the app.

What you will build

A Spring Boot application that reads a CSV file, applies field-level transformations and a row filter declared in a YAML config, and upserts the result into PostgreSQL. Re-runs are safe — INSERT … ON CONFLICT DO UPDATE guarantees idempotency.

The same pattern extends naturally to Oracle → PostgreSQL migrations, CDC mode, multi-table runs, and deferred FK resolution. The quickstart strips everything back to the core moving parts.
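The idempotency claim rests on PostgreSQL's native upsert. As a sketch, the statement the writer issues is shaped roughly like this (column names follow the field mapping in step 2; the exact SQL the library generates may differ):

```sql
-- Illustrative upsert shape: a re-run hits ON CONFLICT and updates in place,
-- so running the migration twice produces the same final table state.
INSERT INTO products (id, name, price, category)
VALUES ($1, $2, $3, $4)
ON CONFLICT (id) DO UPDATE SET
    name     = EXCLUDED.name,
    price    = EXCLUDED.price,
    category = EXCLUDED.category;
```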

1. Add the dependency

<dependency>
  <groupId>com.pulsaride</groupId>
  <artifactId>pulsaride-transform</artifactId>
  <version>2.0.0-SNAPSHOT</version>
</dependency>

Requires Java 21+ and Spring Boot 3.x.

2. Write the migration YAML

Create migration/01-products.yaml. One file per table. This is the single source of truth for the mapping — no Java code encodes these rules.

name:         "products-migration"
version:      "1.0"
target_table: "products"

# Only rows matching this predicate are written to the target.
# Supports: =, !=, <, >, IN, NOT IN, IS NULL, IS NOT NULL, AND, OR
filter: "STATUS = 'ACTIVE'"

sources:
  - name: src
    type: csv
    file: "data/products.csv"   # filesystem path, relative to working directory

fields:
  # Direct pass-through: source column → target column
  - name:   "id"
    source: "PRODUCT_ID"

  # Expression: built-in functions (upper, lower, trim, concat, nvl, hash, …)
  - name:       "name"
    expression: "upper(PRODUCT_NAME)"

  - name:   "price"
    source: "UNIT_PRICE"

  - name:   "category"
    source: "CATEGORY"

reject_policy: "SKIP"   # SKIP | FAIL | ABORT
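To make the mapping concrete, here are a few illustrative CSV rows (invented sample values — the actual file ships with the example at data/products.csv):

```csv
PRODUCT_ID,PRODUCT_NAME,UNIT_PRICE,CATEGORY,STATUS
1,widget,9.99,tools,ACTIVE
2,gadget,4.50,tools,DISCONTINUED
3,sprocket,2.25,parts,ACTIVE
```

Row 2 fails the filter and is skipped; rows 1 and 3 land in products with PRODUCT_NAME uppercased into the name column.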

3. Configure application.yml

The migration.tables list declares every table to migrate and the path to its YAML config. The datasource is standard Spring Boot — no Pulsaride-specific connection properties.

spring:
  datasource:
    url:      jdbc:postgresql://localhost:5432/mydb
    username: ${DB_USER:user}
    password: ${DB_PASS:password}
  main:
    web-application-type: none   # no HTTP server needed for batch

migration:
  tables:
    - table:  products
      config: migration/01-products.yaml
  batch-size: 500

4. Three Java files — no business logic in Java

| File | Lines | Purpose |
|------|-------|---------|
| QuickstartApp.java | ~10 | @SpringBootApplication @EnablePulsarideTransform @ConfigurationPropertiesScan |
| MigrationProperties.java | ~30 | Binds the migration.* YAML block via @ConfigurationProperties |
| MigrationRunner.java | ~100 | CommandLineRunner — the wiring between library classes |
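For orientation, MigrationProperties.java is a thin binding class. A sketch follows — the field names are assumed to mirror the migration.* keys above, and in the real file the record carries Spring's @ConfigurationProperties annotation, which this plain-Java sketch omits:

```java
import java.util.List;

// Sketch of the configuration binding, not the actual source file.
// In the real application this record is annotated with
// @ConfigurationProperties(prefix = "migration") so Spring Boot populates it
// from application.yml (relaxed binding maps batch-size -> batchSize).
record MigrationProperties(List<TableSpec> tables, int batchSize) {

    // One entry per table to migrate: target table name + path to its YAML config.
    record TableSpec(String table, String config) {}
}
```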

MigrationRunner calls three library methods. Everything else — filtering, transformation, upsert logic — runs inside the library:

// 1. Parse the YAML config into a typed object
MigrationConfig config = MigrationLoader.load(Path.of("migration/01-products.yaml"));

// 2. Build the processor: applies filter + field mappings + expressions
PulsarideEventProcessor processor =
    new PulsarideEventProcessor(config, new DebeziumEventNormalizer());

// 3. For each source row: transform, then write if not filtered out
for (Map<String, Object> row : readCsv(config)) {
    Map<String, Object> out = processor.process(row, null);
    if (out != null) {   // null = excluded by the filter: rule
        writer.upsertAll(config.getTarget_table(), List.of(out), List.of("id"));
    }
}
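Conceptually, the filter and field rules declared in step 2 reduce to simple per-row logic. The following is a plain-Java restatement for illustration only — the real work happens inside PulsarideEventProcessor:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration of what the YAML in step 2 asks the processor to do to one row.
// NOT the library's implementation -- just the declared semantics made explicit.
class RowTransform {
    static Map<String, Object> apply(Map<String, Object> row) {
        // filter: "STATUS = 'ACTIVE'" -- non-matching rows are dropped (returns null)
        if (!"ACTIVE".equals(row.get("STATUS"))) {
            return null;
        }
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("id", row.get("PRODUCT_ID"));         // direct copy
        out.put("name",                               // expression: upper(PRODUCT_NAME)
                String.valueOf(row.get("PRODUCT_NAME")).toUpperCase());
        out.put("price", row.get("UNIT_PRICE"));      // direct copy
        out.put("category", row.get("CATEGORY"));     // direct copy
        return out;
    }
}
```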

5. Run it

# 1. Start PostgreSQL (schema is initialised from sql/01-init.sql)
docker compose up -d

# 2. Run the migration
mvn spring-boot:run

Expected output:

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Pulsaride Quickstart — [products]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  CSV 'data/products.csv' — 20 row(s) loaded
  [products] read=20, written=16, filtered-out=4
  ✓ products — 16 rows written

16 rows written, 4 silently filtered out (STATUS != 'ACTIVE'). Re-run the migration and you get the same result, with no duplicate rows.

YAML quick reference

| Key | Description |
|-----|-------------|
| target_table | Target table name in PostgreSQL |
| filter | Row predicate — non-matching rows are skipped silently |
| sources[].type | csv \| jdbc \| staging \| event |
| sources[].file | Path to CSV file (type: csv only) |
| sources[].table | Source table name (type: jdbc only) |
| fields[].name | Target column name |
| fields[].source | Source column — direct copy |
| fields[].expression | Transform expression: upper(X), trim(X), nvl(X,'default'), hash(X,'SHA-256'), … |
| reject_policy | SKIP (default) \| FAIL \| ABORT |
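As a sketch of what the expression built-ins are expected to do, here are plain-Java stand-ins for three of them (assumed semantics, not the library's own code):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.HexFormat;

// Assumed semantics of three built-in expression functions.
// Stand-ins for illustration only, not the library's implementation.
class Expressions {
    // nvl(X, 'default'): X unless X is null, else the default.
    static Object nvl(Object x, Object fallback) {
        return x != null ? x : fallback;
    }

    // upper(X): uppercase a string value (trim/lower/concat follow the same shape).
    static String upper(String x) {
        return x == null ? null : x.toUpperCase();
    }

    // hash(X, 'SHA-256'): hex digest of the UTF-8 bytes of X.
    static String hash(String x, String algorithm) throws Exception {
        MessageDigest md = MessageDigest.getInstance(algorithm);
        return HexFormat.of().formatHex(md.digest(x.getBytes(StandardCharsets.UTF_8)));
    }
}
```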

Next steps

  • JDBC source (Oracle / PostgreSQL) — change sources[].type to jdbc and add a table: or sql: field. Wire a second DataSource bean for the source DB. See Spring Integration.
  • FK dependencies across tables — declare depends_on: in the YAML; the library defers rows with unresolved FK references and replays them automatically.
  • Full framework (health, DLQ, CDC, monitoring) — add @EnablePulsarideTransform to your main class and follow the healthcare-migration-service example, which extends this exact pattern.
  • YAML Reference — all fields, filter operators, expressions, lookups, and declarative steps.
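For the JDBC-source bullet, the change is confined to the sources block of the migration YAML. A sketch, with key names taken from the quick reference above (illustrative, not a verified schema):

```yaml
sources:
  - name: src
    type: jdbc
    table: "PRODUCTS"   # source table in the Oracle/PostgreSQL source DB
    # or, instead of table:, a custom query:
    # sql: "SELECT * FROM PRODUCTS"
```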