How to override a stream's primary key properties or incremental replication key in Meltano?
Asked Answered
C

2

6

There are times when we want to override a stream's key properties (primary key) or it's incremental replication key. What's the best way to do this in Meltano?

Cobnut answered 29/3, 2022 at 22:43 Comment(0)
C
8

In your Meltano yaml file, you can add a table-key-properties and/or replication-key override using the metadata extra config.

Docs reference: https://docs.meltano.com/concepts/plugins#metadata-extra

Example yaml code:

extractors:
- name: tap-postgres
  metadata:
    some_stream_id:
      table-key-properties: [id]
      replication-key: created_at
      replication-method: INCREMENTAL

This example sets the primary key to id and the replication key to created_at.

Update regarding wildcards (2022-03-30)

Per @visch's comment, wildcards can also be used in stream names to match multiple streams at once. Such as in these example:

extractors:
- name: tap-postgres
  metadata:
    "*":
      # set all streams to "full table" mode
      replication-method: FULL_TABLE

And:

extractors:
- name: tap-postgres
  metadata:
    "*":
      # ignore primary keys for all streams
      table-key-properties: []
Cobnut answered 29/3, 2022 at 22:43 Comment(3)
For testing and just starting with singer or a new tap I like to use metadata: '*': replication-method: FULL_TABLE As that way I can just see the data and figure out the replication stuff later. Realized the comment is hard to read, gitlab.com/vischous/oracle2mssql/-/blob/master/oracle2mssql/… is much easierShe
Thanks, @visch! I've updated my answer using this example.Cobnut
@Cobnut can you please correct your answer? Apparently, it was renamed from key-properties to table-key-properties. At least in version 2.7.0Gag
B
1

After having followed @visch suggestion by using

  metadata:
    "*":
      replication-method: FULL_TABLE

from the logs (when running Meltano in debug mode with --log-level=debug) in discovery mode you should see something like:

metadata node for tap_stream_id 'mydb-mytable'

From the above you can get the correct collection name and then structure the meltano.yml accordingly:

extractors:
- name: tap-postgres

  select:
  - mydb-mytable.*

  metadata:
    mydb-mytable:
      table-key-properties: [id]
      replication-method: INCREMENTAL
      replication-key: created_at

In particular it was very important for me to add the select field as otherwise I was getting this error message:

ERROR message=Invalid replication method None! 
replication method needs to be either FULL_TABLE or INCREMENTAL 
Bathroom answered 24/4, 2023 at 16:15 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.