YAML Reference

Complete schema for migration.yaml rules files.

Top-level fields

FieldTypeRequiredDescription
namestringnoHuman-readable migration name
versionstringnoRules version, surfaced in reports
sourceslistyesOne or more source tables/queries
joinslistnoJoin definitions between sources
fieldslistnoOutput field mappings (default: SELECT *)
filterstringnoRow-level filter predicate
target_tablestringnoTarget table name (override with CLI)
validationobjectnoValidation presets and settings

sources[]

sources:
  - name: orders       # logical name / alias
    table: ORDERS      # physical table name (quoted automatically)
    alias: o           # SQL alias (optional, defaults to name)
    sql: "SELECT ..."  # inline SQL override (optional)

joins[]

joins:
  - from: o                      # source alias
    to: customers                # target alias
    on: "o.CUSTOMER_ID = c.ID"   # raw ON clause (passed through)
    type: LEFT                   # INNER (default) | LEFT | RIGHT | FULL

fields[]

fields:
  - name: order_id            # output column name
    source: o.ORDER_ID        # source alias.column (quoted automatically)

  - name: total
    expression: "o.AMOUNT * 1.2"  # inline SQL expression or evaluator expression
    target_type: decimal(10,2)    # optional cast

  - name: status
    source: o.STATUS
    expression: "normalize_text(?1)"  # use ?1 for first source value

Supported expressions

ExpressionDescription
? / ?1Identity / first value
? * nArithmetic (*, +, -, /)
upper(?) / lower(?)Case transform
trim(?)Strip whitespace
normalize_text(?)trim + lowercase
coalesce(?1, ?2, 'fallback')First non-null/non-blank
nullif(?1, ?2)Null if equal
to_int(?) / to_long(?)Numeric casts
to_date(?) / to_timestamp(?)Temporal casts (ISO-8601)
to_boolean(?)Boolean coerce (y/n/1/0/true/false)
case when ?1 = 'A' then ?2 else 'N' endSimple CASE

filter

Supports: =, !=, <, <=, >, >=, IN, NOT IN, IS NULL, IS NOT NULL, AND, OR, parentheses, ISO date/time literals.

filter: "o.STATUS IN ('A','B') AND o.REGION != 'TEST' AND o.CREATED_AT > '2024-01-01'"

validation

validation:
  table_size: medium           # small | medium | large (auto-sets buckets/sample)
  buckets: 64                  # override bucket count for L1
  sample: 0.02                 # override sample fraction for L2
  count_tolerance: 0           # allowed row count diff
  key_columns: [order_id]      # pk columns for sampling hash
  presets:
    small:
      buckets: 16
      sample: 0.05
      count_tolerance: 0
    medium:
      buckets: 64
      sample: 0.02
    large:
      buckets: 128
      sample: 0.01