Currently, is there a way for Datahub to allow pus...
# getting-started
s
Currently, is there a way for Datahub to accept pushed data instead of pulling data (using crawlers)? What we are looking for is APIs that allow outside systems to edit data in Datahub (such as adding a new Dataset, editing an existing CorpUser, etc.). As I understand it, we should not allow other systems to access GMS directly, so we are wondering how people are doing the data push right now. Thanks!
@fancy-advantage-41244 @hallowed-dinner-34937
m
If you can edit those "outside systems", they can make calls to GMS directly to ingest. As you said, that isn't recommended. The better solution is to have those systems emit MCEs (MetadataChangeEvents) as well.
We've shipped some crawlers by default, but you can always edit any system to emit MCEs in real time, whenever the data actually changes.
h
Can you explain what you mean by "emit MCEs"?
m
It's a Kafka event
see also
s
Does that mean we need to have other systems to subscribe to Kafka?
m
Not subscribe (read); they write to it
yes
those example crawler scripts we have do just that
s
I see, thanks for the quick reply. We will take a look
b
@some-crayon-90964 Adding on here -- Essentially you'd need extensions / hooks present in those systems you want to push from that are able to write a message to the MetadataChangeEvent Kafka topic containing their metadata 🙂
s
Thanks for the information @big-carpet-38439
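For anyone reading later, the push approach described above (a hook in the source system that writes a message to the MetadataChangeEvent Kafka topic) could look roughly like the sketch below. Everything specific here is an assumption for illustration: the topic name, the `build_event` and `emit` helpers, and the JSON payload, which is a simplified stand-in and not the real MCE schema. The shipped example crawler scripts remain the reference for the actual schema and serialization.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

# Assumption: the topic name may differ in your deployment; check its config.
MCE_TOPIC = "MetadataChangeEvent"


def build_event(urn: str, description: str) -> Dict[str, Any]:
    """Build a hypothetical, simplified change event (NOT the real MCE schema)."""
    return {"urn": urn, "description": description}


def emit(
    send: Callable[[str, bytes], Any],
    event: Dict[str, Any],
    topic: str = MCE_TOPIC,
) -> bytes:
    """Serialize the event and hand it to a producer's send(topic, value)."""
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    send(topic, payload)
    return payload


# In-memory stand-in for a Kafka producer, so the sketch runs without a broker.
sent: List[Tuple[str, bytes]] = []


def fake_send(topic: str, value: bytes) -> None:
    sent.append((topic, value))


# A hook in the outside system would call this whenever its metadata changes.
emit(fake_send, build_event("urn:example:dataset:orders", "Daily order facts"))
```

The producer is passed in as a plain `send(topic, value)` callable so the outside system only writes to Kafka and never talks to GMS. In a real setup you would pass the send method of an actual Kafka client instead of `fake_send`, and serialize the event the way the example crawler scripts do.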