Quickstart
Get your first CSV → PostgreSQL migration running in under 10 minutes. The complete, runnable example lives at `pulsaride-example/quickstart/` in this repository — clone it, start Docker Compose, and run the app.
What you will build
A Spring Boot application that reads a CSV file, applies field-level transformations and a row filter declared in a YAML config, and upserts the result into PostgreSQL. Re-runs are safe — INSERT … ON CONFLICT DO UPDATE guarantees idempotency.
The same pattern extends naturally to Oracle → PostgreSQL migrations, CDC mode, multi-table runs, and deferred FK resolution. The quickstart removes everything except the core moving parts.
1. Add the dependency
```xml
<dependency>
  <groupId>com.pulsaride</groupId>
  <artifactId>pulsaride-transform</artifactId>
  <version>2.0.0-SNAPSHOT</version>
</dependency>
```
Requires Java 21+ and Spring Boot 3.x.
2. Write the migration YAML
Create migration/01-products.yaml. One file per table. This is the single source of truth for the mapping — no Java code encodes these rules.
name: "products-migration"
version: "1.0"
target_table: "products"
# Only rows matching this predicate are written to the target.
# Supports: =, !=, <, >, IN, NOT IN, IS NULL, IS NOT NULL, AND, OR
filter: "STATUS = 'ACTIVE'"
sources:
- name: src
type: csv
file: "data/products.csv" # filesystem path, relative to working directory
fields:
# Direct pass-through: source column → target column
- name: "id"
source: "PRODUCT_ID"
# Expression: built-in functions (upper, lower, trim, concat, nvl, hash, …)
- name: "name"
expression: "upper(PRODUCT_NAME)"
- name: "price"
source: "UNIT_PRICE"
- name: "category"
source: "CATEGORY"
reject_policy: "SKIP" # SKIP | FAIL | ABORT3. Configure application.yml
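For orientation, `data/products.csv` starts with a header row naming the source columns referenced above. The rows below are invented for illustration — the real file in the example repo contains 20 rows. Note that `STATUS` never reaches the target table; it only feeds `filter:`:

```csv
PRODUCT_ID,PRODUCT_NAME,UNIT_PRICE,CATEGORY,STATUS
1001,widget classic,9.99,tools,ACTIVE
1002,gizmo pro,24.50,tools,DISCONTINUED
```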
3. Configure application.yml
The `migration.tables` list declares every table to migrate and the path to its YAML config. The datasource is standard Spring Boot — no Pulsaride-specific connection properties.
```yaml
spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/mydb
    username: ${DB_USER:user}
    password: ${DB_PASS:password}
  main:
    web-application-type: none   # no HTTP server needed for batch

migration:
  tables:
    - table: products
      config: migration/01-products.yaml
  batch-size: 500
```
4. Three Java files — no business logic in Java

| File | Lines | Purpose |
|---|---|---|
| `QuickstartApp.java` | ~10 | `@SpringBootApplication`, `@EnablePulsarideTransform`, `@ConfigurationPropertiesScan` |
| `MigrationProperties.java` | ~30 | Binds the `migration.*` YAML block via `@ConfigurationProperties` (sketched below) |
| `MigrationRunner.java` | ~100 | `CommandLineRunner` — the wiring between library classes |
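A minimal sketch of what `MigrationProperties` could look like, assuming a Spring Boot 3 record with constructor binding — the property names mirror `application.yml` above, but the actual class in the example repo may differ:

```java
import java.util.List;
import org.springframework.boot.context.properties.ConfigurationProperties;

// Hypothetical sketch — binds the migration.* block shown above.
// Picked up by the @ConfigurationPropertiesScan on QuickstartApp.
@ConfigurationProperties(prefix = "migration")
public record MigrationProperties(
        List<TableEntry> tables,   // one entry per table to migrate
        int batchSize) {           // relaxed binding maps batch-size → batchSize

    // table = target table name, config = path to its migration YAML
    public record TableEntry(String table, String config) {}
}
```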
MigrationRunner calls three library methods. Everything else — filtering, transformation, upsert logic — runs inside the library:
```java
// 1. Parse the YAML config into a typed object
MigrationConfig config = MigrationLoader.load(Path.of("migration/01-products.yaml"));

// 2. Build the processor: applies filter + field mappings + expressions
PulsarideEventProcessor processor =
        new PulsarideEventProcessor(config, new DebeziumEventNormalizer());

// 3. For each source row: transform, then write if not filtered out
for (Map<String, Object> row : readCsv(config)) {
    Map<String, Object> out = processor.process(row, null);
    if (out != null) {   // null = excluded by the filter: rule
        writer.upsertAll(config.getTarget_table(), List.of(out), List.of("id"));
    }
}
```
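The loop above calls a `readCsv` helper that the snippet elides. A minimal sketch, assuming a header row and no quoted commas — the example's `MigrationRunner.java` holds the real version:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper — naive CSV parsing, no quoting support.
static List<Map<String, Object>> readCsv(MigrationConfig config) throws IOException {
    Path file = Path.of("data/products.csv"); // the real runner reads this from the config's source entry
    List<String> lines = Files.readAllLines(file);
    String[] header = lines.get(0).split(",");
    List<Map<String, Object>> rows = new ArrayList<>();
    for (String line : lines.subList(1, lines.size())) {
        String[] cells = line.split(",", -1);   // -1 keeps trailing empty cells
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            row.put(header[i].trim(), i < cells.length ? cells[i].trim() : null);
        }
        rows.add(row);
    }
    return rows;
}
```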
5. Run it

```bash
# 1. Start PostgreSQL (schema is initialised from sql/01-init.sql)
docker compose up -d

# 2. Run the migration
mvn spring-boot:run
```
Expected output:
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Pulsaride Quickstart — [products]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
CSV 'data/products.csv' — 20 row(s) loaded
[products] read=20, written=16, filtered-out=4
✓ products — 16 rows written
```
16 rows written, 4 silently filtered out (STATUS != 'ACTIVE'). Re-run the migration: same result, no duplicates.
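The idempotency comes from the upsert named at the top. A sketch of the statement shape, assuming the writer keys on `id` as in the runner code above — the actual SQL is generated inside the library:

```sql
-- Sketch of the generated upsert; column list follows the quickstart mapping.
INSERT INTO products (id, name, price, category)
VALUES (?, ?, ?, ?)
ON CONFLICT (id) DO UPDATE
SET name     = EXCLUDED.name,
    price    = EXCLUDED.price,
    category = EXCLUDED.category;
```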
YAML quick reference
| Key | Description |
|---|---|
| `target_table` | Target table name in PostgreSQL |
| `filter` | Row predicate — non-matching rows are skipped silently |
| `sources[].type` | `csv` \| `jdbc` \| `staging` \| `event` |
| `sources[].file` | Path to the CSV file (`type: csv` only) |
| `sources[].table` | Source table name (`type: jdbc` only) |
| `fields[].name` | Target column name |
| `fields[].source` | Source column — direct copy |
| `fields[].expression` | Transform expression: `upper(X)`, `trim(X)`, `nvl(X,'default')`, `hash(X,'SHA-256')`, … |
| `reject_policy` | `SKIP` (default) \| `FAIL` \| `ABORT` |
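To illustrate the expression syntax, a hedged snippet using only the functions listed above — the column names (`EMAIL`, `REGION`, `SSN`) are invented for the example:

```yaml
fields:
  - name: "email"
    expression: "lower(EMAIL)"              # normalise before load
  - name: "region"
    expression: "nvl(REGION, 'UNKNOWN')"    # default for missing values
  - name: "ssn_hash"
    expression: "hash(SSN, 'SHA-256')"      # pseudonymise sensitive data
```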
Next steps
- JDBC source (Oracle / PostgreSQL) — change `sources[].type` to `jdbc` and add a `table:` or `sql:` field (sketched after this list). Wire a second `DataSource` bean for the source DB. See Spring Integration.
- FK dependencies across tables — declare `depends_on:` in the YAML; the library defers rows with unresolved FK references and replays them automatically.
- Full framework (health, DLQ, CDC, monitoring) — add `@EnablePulsarideTransform` to your main class and follow the `healthcare-migration-service` example, which extends this exact pattern.
- YAML Reference — all fields, filter operators, expressions, lookups, and declarative steps.
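To make the first two items concrete, a sketch of how a second table's YAML might look with a JDBC source and an FK dependency. Only `type: jdbc`, `table:`, and `depends_on:` are taken from the list above; every other name and the exact `depends_on` format are illustrative — see the YAML Reference for the authoritative schema:

```yaml
# Illustrative sketch only — not copied from the example repo.
name: "orders-migration"        # hypothetical second table
target_table: "orders"
depends_on: ["products"]        # defer rows until referenced products exist

sources:
  - name: src
    type: jdbc                  # was: csv
    table: "LEGACY.ORDERS"      # source table instead of a file
```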