rapid-sundown-8805
07/07/2021, 1:18 PM
Federated Metadata Serving
DataHub comes with a single metadata service (gms) as part of the open source repository. However, it also supports federated metadata services which can be owned and operated by different teams –– in fact, that is how LinkedIn runs DataHub internally. The federated services communicate with the central search index and graph using Kafka, to support global search and discovery while still enabling decoupled ownership of metadata. This kind of architecture is very amenable to companies who are implementing data mesh.
Do you have an example architecture for this kind of setup? What is it about having a central metadata repository that goes against data mesh principles? Is it the downstream integrations (MCE events etc.)?
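For illustration, a minimal sketch of the Kafka leg described above, assuming the Python acryl-datahub package: a team-owned, federated service emits metadata change events (MCEs) to the central Kafka bus, and the shared consumers keep the global search index and graph current. The broker/schema-registry addresses and the dataset are made-up assumptions, not an official reference setup.
```
# Hedged sketch: a federated, team-owned service publishing an MCE to the
# central Kafka bus. All addresses and names below are assumptions.
from datahub.emitter.kafka_emitter import DatahubKafkaEmitter, KafkaEmitterConfig
from datahub.metadata.schema_classes import (
    DatasetPropertiesClass,
    DatasetSnapshotClass,
    MetadataChangeEventClass,
)

config = KafkaEmitterConfig.parse_obj(
    {
        "connection": {
            "bootstrap": "central-kafka:9092",  # shared, central bus
            "schema_registry_url": "http://central-schema-registry:8081",
        }
    }
)
emitter = DatahubKafkaEmitter(config)

# Each team describes only the datasets it owns; the central consumers
# index everything, which is what enables global search and discovery.
mce = MetadataChangeEventClass(
    proposedSnapshot=DatasetSnapshotClass(
        urn="urn:li:dataset:(urn:li:dataPlatform:hive,team_a.orders,PROD)",
        aspects=[DatasetPropertiesClass(description="Orders table owned by team A")],
    )
)
emitter.emit_mce_async(mce, lambda err, msg: err and print(f"emit failed: {err}"))
emitter.flush()
```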
mammoth-bear-12532
steep-van-9393
07/09/2021, 9:09 AM
ambitious-airline-8020
07/09/2021, 11:38 AM
mysql-setup container logs - hope it helps
sticky-television-18623
07/09/2021, 2:14 PM
rich-policeman-92383
07/12/2021, 12:26 PM
crooked-toddler-8683
07/12/2021, 8:38 PM
rich-policeman-92383
07/13/2021, 5:48 AM
rapid-sundown-8805
07/13/2021, 7:48 AM
ambitious-airline-8020
07/13/2021, 8:29 AM
In the Historical roadmap, under the No-code Metadata Model Additions part, I see that "No need to write any code (in GraphQL or UI) to visualize metadata" is not checked, even though it sits in the Historical section. Does that mean it was abandoned, or just delayed?
brief-lizard-77958
07/13/2021, 8:46 AM
Task :metadata-ingestion:installDev FAILED
FAILURE: Build failed with an exception.
* What went wrong:
Execution failed for task ':metadata-ingestion:installDev'.
> Process 'command 'venv/bin/pip'' finished with non-zero exit value 1
Has anyone encountered a similar problem?
Edit: I had to separately install python-ldap, which can't be installed the standard way on Ubuntu (https://stackoverflow.com/a/4768467/7615751).
astonishing-yak-92682
07/13/2021, 4:22 PM
curved-magazine-23582
07/14/2021, 3:49 AM
Starting upgrade with id NoCodeDataMigration...
Cleanup has not been requested.
Skipping Step 1/7: RemoveAspectV2TableStep...
Executing Step 2/7: GMSQualificationStep...
Completed Step 2/7: GMSQualificationStep successfully.
Executing Step 3/7: UpgradeQualificationStep...
-- V1 table exists
-- V1 table has 8011 rows
-- V2 table exists
-- V2 table has 2 rows
-- Since V2 table has records, we will not proceed with the upgrade.
-- If V2 table has significantly less rows, consider running the forced upgrade.
Failed to qualify upgrade candidate. Aborting the upgrade...
Step with id UpgradeQualificationStep requested an abort of the in-progress update. Aborting the upgrade...
Upgrade NoCodeDataMigration completed with result ABORTED. Exiting...
How do I do that recommended forced upgrade in this case?
jolly-honey-27198
07/14/2021, 8:06 AM
acceptable-architect-70237
07/14/2021, 4:45 PM
better-orange-49102
07/15/2021, 6:21 AM
square-activity-64562
07/15/2021, 5:41 PM
square-activity-64562
07/15/2021, 5:44 PM
If global.datahub_standalone_consumers_enabled = true, then the standalone consumers get deployed even when datahub-mae-consumer.enabled = false. Looking at these property names, it gets confusing to work out what is supposed to happen. Should I keep
global.datahub_standalone_consumers_enabled = false
datahub-mae-consumer.enabled = true
datahub-mce-consumer.enabled = true
or should all 3 be set to true?
square-activity-64562
07/15/2021, 6:17 PM
How does the datahub ingest command mentioned in https://datahubproject.io/docs/metadata-ingestion find DataHub's Kafka or REST endpoint? The use case is that I am thinking of running it via Jenkins for now: Jenkins will create a pod in the jenkins namespace of our K8s cluster, while DataHub is in the apps namespace of the same cluster. So I am not sure how to configure datahub ingest so that it knows the location of datahub-gms and the frontend.
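A minimal sketch, assuming the acryl-datahub package: datahub ingest does not auto-discover the endpoint; the recipe's sink block names the GMS address explicitly, so a Jenkins pod can point at the gms service's cross-namespace DNS name. The same recipe structure can be driven from Python; the service DNS name and the postgres source below are assumptions.
```
# Hedged sketch of a recipe driven programmatically; `datahub ingest -c recipe.yml`
# reads the same source/sink structure from YAML. Names below are assumptions.
from datahub.ingestion.run.pipeline import Pipeline

pipeline = Pipeline.create(
    {
        "source": {
            "type": "postgres",  # hypothetical source; any supported source works
            "config": {
                "host_port": "analytics-db.example.com:5432",
                "database": "analytics",
                "username": "datahub",
                "password": "<from-jenkins-credentials>",
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {
                # Cross-namespace K8s DNS: <service>.<namespace>.svc.cluster.local.
                # Substitute whatever `kubectl get svc -n apps` shows for gms.
                "server": "http://datahub-datahub-gms.apps.svc.cluster.local:8080",
            },
        },
    }
)
pipeline.run()
pipeline.raise_from_status()  # fail the Jenkins build if ingestion failed
```
gifted-arm-43579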
07/16/2021, 6:55 AM
ambitious-airline-8020
07/16/2021, 8:27 AM
clean-furniture-99495
07/16/2021, 9:11 AM
square-activity-64562
07/16/2021, 9:52 AM
square-activity-64562
07/16/2021, 2:04 PM
square-activity-64562
07/16/2021, 2:06 PM
square-activity-64562
07/16/2021, 6:31 PM
curved-magazine-23582
07/18/2021, 11:35 PM
GMS logs:
17:12:07.872 [qtp544724190-3515] INFO c.l.m.r.entity.EntityResource - GET urn:li:corpuser:datahub
17:12:07.875 [pool-9-thread-1] INFO c.l.metadata.filter.LoggingFilter - GET /entities/urn%3Ali%3Acorpuser%3Adatahub - get - 200 - 3ms
17:12:07.882 [I/O dispatcher 1] INFO c.l.m.k.e.ElasticsearchConnector - Successfully feeded bulk request. Number of events: 1 Took time ms: -1
17:12:08.359 [qtp544724190-3397] INFO c.l.m.r.entity.EntityResource - BATCH GET [urn:li:corpuser:datahub]
17:12:08.363 [pool-9-thread-1] INFO c.l.metadata.filter.LoggingFilter - GET /entities?ids=List(urn%3Ali%3Acorpuser%3Adatahub) - batchGet - 200 - 4ms
salmon-cricket-21860
07/19/2021, 3:44 AM
How can I change the topic name DataHubUsageEvent_v1? I was able to modify the other topic names, but failed to change this one even with the DATAHUB_USAGE_EVENT_NAME env variable. It seems `DataHubUsageEvent_v1` is automatically created when user activities occur:
DataHubUsageEvent_v1
catalog-datahub-fmce
catalog-datahub-mae
catalog-datahub-mce
catalog-datahub-usage # created by kafka-setup w/ `DATAHUB_USAGE_EVENT_NAME` ENV
__consumer_offsets
_schemas
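A minimal sketch for double-checking that behavior, assuming the confluent-kafka Python package and a reachable broker (the broker address and topic name are assumptions): list what actually exists, and pre-create the renamed usage topic. If DataHubUsageEvent_v1 still reappears afterwards, some producer is still configured with the default name and is triggering broker auto-create.
```
# Hedged sketch: inspect and pre-create Kafka topics. Names are assumptions.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "broker:9092"})

# Show every topic the broker knows about (like kafka-topics --list).
metadata = admin.list_topics(timeout=10)
for name in sorted(metadata.topics):
    print(name)

# Pre-create the renamed usage topic so auto-create never has to name it.
futures = admin.create_topics(
    [NewTopic("catalog-datahub-usage", num_partitions=1, replication_factor=1)]
)
for topic, future in futures.items():
    try:
        future.result()  # raises if creation failed (e.g. topic already exists)
        print(f"created {topic}")
    except Exception as exc:
        print(f"{topic}: {exc}")
```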
square-activity-64562
07/21/2021, 7:16 PM
some-microphone-33485
07/21/2021, 7:17 PM