# datahub-flyte
  • f

    freezing-airport-6809

    02/09/2023, 6:47 PM
    Hello all, @dry-ability-69144 here is interested in a DataHub and Flyte integration. In the past, @jolly-whale-9142 and @colossal-painter-70298 have also worked on it, and there are a few other folks in the community. I have also added folks from LinkedIn who might be interested (I do not know yet). @thankful-minister-83577 from my team was working with some folks, but he has been out for the past 2 weeks and will be out for the next 2. Eventually, though, he can help a lot. There is a sample that we created some time ago with some community folks: https://github.com/unionai/flyteevents-datahub. This is a prototype. It uses Flyte Events egress and replicates the events to DataHub. You have to run the `flytelineage` script.
    Problems
    • It is not resilient; it does not handle failures well.
    • We would ideally love this to be a framework, so that folks can add new plugins if needed, for example for Amundsen or for an internal catalog like the one Spotify has.
    • Currently it simply listens to all events, keeps them in memory per workflow, and then replicates them. This is not needed; it could use Flyte remote to fetch all the data only on receiving a terminal event.
    • No one uses this in production.
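The second and third problems above (a plugin framework, and fetching data only on a terminal event instead of buffering every event) can be sketched roughly as follows. This is a minimal illustration and not the prototype's code: `WorkflowEvent`, `CatalogSink`, `Replicator`, `InMemorySink`, and the `fetch_execution` callable are all hypothetical names, and the set of terminal phases is an assumption that should be checked against your Flyte version. In a real deployment `fetch_execution` would presumably wrap a Flyte remote lookup.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol

# Assumed terminal workflow phases (hypothetical list; verify against Flyte).
TERMINAL_PHASES = frozenset({"SUCCEEDED", "FAILED", "ABORTED", "TIMED_OUT"})


@dataclass(frozen=True)
class WorkflowEvent:
    """Hypothetical, simplified view of a workflow execution event."""
    execution_id: str
    phase: str


class CatalogSink(Protocol):
    """Plugin interface: one implementation per catalog (DataHub, Amundsen, ...)."""

    def publish(self, execution: dict) -> None: ...


class InMemorySink:
    """Toy sink used for illustration; a real plugin would call the catalog."""

    def __init__(self) -> None:
        self.published: List[dict] = []

    def publish(self, execution: dict) -> None:
        self.published.append(execution)


class Replicator:
    """Ignores intermediate events; fetches and publishes only on terminal ones."""

    def __init__(self, fetch_execution: Callable[[str], dict],
                 sinks: List[CatalogSink]) -> None:
        self._fetch = fetch_execution
        self._sinks = sinks

    def handle(self, event: WorkflowEvent) -> bool:
        if event.phase not in TERMINAL_PHASES:
            return False  # nothing is buffered for non-terminal events
        execution = self._fetch(event.execution_id)
        for sink in self._sinks:
            sink.publish(execution)
        return True
```

Because no per-workflow state is kept between events, a restarted listener loses nothing except events that arrive while it is down, which would still need a durable queue or a reconciliation pass to address the resilience problem.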
  • f

    freezing-airport-6809

    02/09/2023, 6:48 PM
    We are not DataHub experts. I think there is also a channel in the DataHub community about integrating with Flyte.
  • c

    colossal-painter-70298

    02/10/2023, 6:15 AM
    We chatted a lot about this with @thankful-minister-83577 last fall, and got to a point where I would have needed to start experimenting with implementing some rudimentary support for emitting different types of events from Flyte. Life, however, got in the way, and I have not been able to work on this since. The idea we were discussing was indeed to emit events from Flyte that would then be translated into DataHub/Amundsen/OpenLineage/etc. events by a catalog-specific translator service. The event needs have been mapped out (to some extent) in this document, but it might be quite a tall order for me to start implementing these in Flyte in my current situation. Any help on this front would be highly appreciated. I'd be happy to work on a reference event translator (for DataHub), though!
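The translator idea described above (one neutral event emitted by Flyte, one translator per catalog) might look something like the sketch below. It is an assumption-heavy illustration: the input event fields (`execution_id`, `project`, `domain`, `workflow`, `phase`, `occurred_at`), the phase mapping, the `producer` string, and all function names are hypothetical, and the output is only loosely shaped like an OpenLineage run event; it is not validated against the OpenLineage spec.

```python
from datetime import datetime, timezone
from typing import Callable, Dict

# Assumed mapping from workflow phases to OpenLineage-style event types.
PHASE_TO_EVENT_TYPE = {
    "RUNNING": "START",
    "SUCCEEDED": "COMPLETE",
    "FAILED": "FAIL",
    "TIMED_OUT": "FAIL",
    "ABORTED": "ABORT",
}


def to_openlineage_like(event: dict) -> dict:
    """Translate a hypothetical neutral event into an OpenLineage-like dict."""
    return {
        "eventType": PHASE_TO_EVENT_TYPE.get(event["phase"], "OTHER"),
        "eventTime": event["occurred_at"].isoformat(),
        "run": {"runId": event["execution_id"]},
        "job": {
            "namespace": f'{event["project"]}.{event["domain"]}',
            "name": event["workflow"],
        },
        "producer": "flyte-event-translator-sketch",  # placeholder value
    }


# Registry of catalog-specific translators; other catalogs would be added here.
TRANSLATORS: Dict[str, Callable[[dict], dict]] = {
    "openlineage": to_openlineage_like,
}


def translate(target: str, event: dict) -> dict:
    """Dispatch a neutral event to the translator registered for `target`."""
    try:
        translator = TRANSLATORS[target]
    except KeyError:
        raise ValueError(f"no translator registered for {target!r}") from None
    return translator(event)
```

Keeping the neutral event as plain data means each catalog translator can be developed and tested without a running Flyte or catalog instance.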
    ❤️ 1
  • d

    dry-ability-69144

    02/16/2023, 8:24 PM
    After a great call with @colossal-painter-70298, we ran into some open questions that I believe we can work out together. Some of them: How and where does Flyte send its callback functions? Is it possible to send a callback in OpenLineage format? How can we capture it on DataHub's side? Fredrik tried using a REST call, but we believe that's not the best way to do it... There were some other considerations that Fredrik can help me remember. From there, I think we can open a new issue on Flyte's repository, is that right?
  • d

    dry-ability-69144

    03/06/2023, 4:39 PM
    I had a chat with @thankful-minister-83577 last week, and we talked about syncing our agendas to see if we can talk this week. @colossal-painter-70298, when will you be available to talk? Let's try to do it at a time when Yee can participate as well.
  • d

    dry-ability-69144

    03/08/2023, 6:07 PM
    @thankful-minister-83577 here
  • d

    dazzling-kilobyte-30463

    04/25/2023, 1:43 AM
    @dazzling-kilobyte-30463 has left the channel
  • a

    average-finland-92144

    02/27/2025, 6:19 PM
    archived the channel