# getting-started
n
Hey Team... For a PoC project I spun up MSK, OpenSearch, RDS (Postgres), and Schema Registry, and ran the Docker processes `datahub-frontend` and `datahub-gms`. After that we ingested metadata from a local Redshift cluster using a DataHub ingestion recipe (source as `redshift` and sink as `datahub-kafka`). I was under the impression I might have to run the MCE process to pick up the messages from Kafka and push them to GMS, but that wasn't necessary; I could already see the metadata in the frontend and the datastore (Postgres). Could someone explain this? Is metadata-ingestion pushing to the `datahub-gms` API endpoint, or does `datahub-gms` read from Kafka?
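(For context, the recipe being described would look roughly like this; a minimal sketch where every host, credential, and endpoint is an illustrative placeholder, not a value from this thread:)

```yaml
# Hypothetical DataHub ingestion recipe: Redshift source -> Kafka sink.
# All hosts, credentials, and URLs below are placeholders.
source:
  type: redshift
  config:
    host_port: my-redshift-cluster.example.com:5439   # placeholder cluster endpoint
    database: dev
    username: datahub_user
    password: ${REDSHIFT_PASSWORD}                    # read from environment

sink:
  type: datahub-kafka
  config:
    connection:
      bootstrap: broker.msk.example.com:9092          # placeholder MSK broker
      schema_registry_url: http://schema-registry.example.com:8081
```

(Such a recipe would then be run with `datahub ingest -c recipe.yml`.)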
s
There are settings to disable those processes inside GMS. By default, the two Kafka consumers do run inside GMS.
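(For reference, in the quickstart docker-compose these toggles are exposed as environment variables on the GMS container; a minimal sketch, assuming the `MAE_CONSUMER_ENABLED`/`MCE_CONSUMER_ENABLED` variable names and image tag of the time — verify against your GMS version:)

```yaml
# Hypothetical docker-compose excerpt: toggling the consumers embedded in GMS.
services:
  datahub-gms:
    image: linkedin/datahub-gms:head        # image name/tag is an assumption
    environment:
      - MAE_CONSUMER_ENABLED=true           # process MetadataAuditEvents inside GMS (default)
      - MCE_CONSUMER_ENABLED=true           # process MetadataChangeEvents inside GMS (default)
      # Set these to false when running standalone mae/mce consumer containers instead.
```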
n
I see dependencies on mae-consumer and mce-consumer here - https://github.com/linkedin/datahub/blob/master/metadata-service/war/build.gradle#L10 That's definitely convenient. Cool!
e
We wanted to make it easy to deploy while keeping the ability to spin them up in different pods for future scalability! In Helm, this is triggered by this flag: https://github.com/acryldata/datahub-helm/blob/master/charts/datahub/values.yaml#L63 By default, it is set to false, meaning that we spin up the consumers within the GMS pod!
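(Roughly, the linked section of `values.yaml` looks like this; the exact key name, `datahub_standalone_consumers_enabled`, is an assumption based on the chart at the time — check the linked line for the current name:)

```yaml
# Hypothetical excerpt from datahub-helm values.yaml; key name is an assumption.
global:
  datahub_standalone_consumers_enabled: false  # false => consumers run inside the GMS pod
```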
n
Thanks @early-lamp-41924!