Thread
#getting-started

    some-crayon-90964

    1 year ago
    Currently, is there a way for DataHub to accept pushed data instead of pulling data (using crawlers)? What we are looking for are APIs that allow outside systems to edit data in DataHub (such as adding a new Dataset, editing an existing CorpUser, etc.). As we understand it, we should not allow other systems to access GMS directly, so we are wondering how people are doing data push right now. Thanks!
    @fancy-advantage-41244 @hallowed-dinner-34937

    microscopic-receptionist-23548

    1 year ago
    If you can edit those "outside systems", they can make calls to GMS directly to ingest. As you said, that isn't suggested. The more ideal solution is to have those systems emit MCEs themselves
    So we've shipped some crawlers by default, but you can always edit any system to emit MCEs when data actually changes, in real time

    hallowed-dinner-34937

    1 year ago
    Can you explain what you mean by "emit MCEs"?

    microscopic-receptionist-23548

    1 year ago
    It's a Kafka event
    see also

    some-crayon-90964

    1 year ago
    Does that mean we need to have other systems subscribe to Kafka?

    microscopic-receptionist-23548

    1 year ago
    not subscribe (read), but write to it
    yes
    those example crawler scripts we have do just that

    some-crayon-90964

    1 year ago
    I see, thanks for the quick reply. We will take a look

    big-carpet-38439

    1 year ago
    @some-crayon-90964 Adding on here -- Essentially you'd need extensions / hooks present in those systems you want to push from that are able to write a message to the MetadataChangeEvent Kafka topic containing their metadata 🙂
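    To make the push flow above concrete, here is a minimal sketch of a hook that an outside system could call when its data changes. This is illustrative only: the topic name (`MetadataChangeEvent_v4`) and the payload shape are assumptions, not DataHub's exact schema -- real deployments serialize MetadataChangeEvent records with Avro via the schema registry, so check your DataHub version's topic names and schemas before using anything like this.

    ```python
    import json

    # Hypothetical topic name; confirm against your deployment's configuration.
    MCE_TOPIC = "MetadataChangeEvent_v4"


    def build_dataset_mce(platform: str, name: str, description: str) -> dict:
        """Build a simplified MCE-like payload proposing a dataset upsert.

        The nested structure here is a sketch of the idea (a snapshot with a
        URN plus metadata aspects), not DataHub's exact Avro schema.
        """
        urn = f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},PROD)"
        return {
            "proposedSnapshot": {
                "urn": urn,
                "aspects": [
                    {"datasetProperties": {"description": description}},
                ],
            }
        }


    def emit_mce(producer, mce: dict) -> None:
        """Write one event to the MCE topic.

        `producer` is any Kafka producer with a produce/flush interface
        (e.g. confluent_kafka.Producer), injected so the hook stays testable.
        """
        producer.produce(MCE_TOPIC, value=json.dumps(mce).encode("utf-8"))
        producer.flush()
    ```

    The outside system would call `build_dataset_mce` + `emit_mce` from its own change hooks (e.g. after a table is created or altered), and DataHub's MCE consumer job then picks the event off the topic and writes it into GMS -- which is exactly the indirection that avoids calling GMS directly.
    
    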

    some-crayon-90964

    1 year ago
    Thanks for the information @big-carpet-38439