# ingestion
During ingestion, how would I go about iterating through the Datasets that all came from one Postgres cluster? i.e. the Datasets reflect the tables, but I want to be able to update all the necessary tables during ingestion with one script. The main issue I see is that another Postgres cluster, with its own `postgres` database, could output the same URN (`urn:li:dataPlatform:postgres,main.public.customers,PROD`). How do I handle this overlap in DataHub so I can remember which Datasets come from which Postgres cluster?
@faint-television-78785 you can solve your particular problem using Swaroop's advice. If you still need to iterate through all the datasets and make changes during ingestion, you can use a Transformer: https://datahubproject.io/docs/metadata-ingestion/transformers/
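For reference, transformers are configured directly in the ingestion recipe. A minimal sketch using the built-in `simple_add_dataset_tags` transformer to tag everything from one cluster so it can be found and updated as a group later (host, server, and tag names are placeholders):

```yaml
# Hypothetical recipe sketch: tag every dataset ingested from this
# Postgres cluster with a cluster-specific tag.
source:
  type: postgres
  config:
    host_port: "cluster-a.example.com:5432"  # placeholder host
    database: postgres
transformers:
  - type: simple_add_dataset_tags
    config:
      tag_urns:
        - "urn:li:tag:cluster-a"  # placeholder tag
sink:
  type: datahub-rest
  config:
    server: "http://localhost:8080"  # placeholder server
```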
Thank you @loud-island-88694 @modern-artist-55754, that answers my question! If the `platform_instance` field were added to the sample configs in `metadata-ingestion/examples/recipes`, it might help users.
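For anyone landing here later, such a sample recipe might look like the sketch below (host name and instance name are placeholders). With a distinct `platform_instance` per cluster, two clusters' identically named tables get distinct URNs instead of colliding:

```yaml
# Hypothetical example: platform_instance disambiguates otherwise
# identical tables coming from different Postgres clusters.
source:
  type: postgres
  config:
    host_port: "cluster-a.example.com:5432"  # placeholder host
    database: postgres
    platform_instance: cluster_a  # unique per cluster
```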
Glad it helped! Feel free to open a PR with an example recipe - will help others encountering this!