# integrate-flyte-datahub
  • h

    high-hospital-85984

    05/27/2022, 3:55 PM
    In order to not grow the one thread any longer than it already is, let's move the discussion regarding Datahub support in Flyte to this channel. We had a first chat with @modern-pilot-97597 today, where I explained what we (as users of both Flyte and Datahub) would like to see from the integration. We threw around some ideas, which would need some further distilling on the Flyte end, but the problem as a whole seems feasible to solve. I'll let the Flyte peeps explain the functionality they're planning that could support this use case.
  • b

    billions-lawyer-59647

    06/10/2022, 5:51 AM
    @high-hospital-85984 are there any updates on this?
  • b

    billions-lawyer-59647

    06/10/2022, 5:51 AM
    what are the next steps
  • b

    billions-lawyer-59647

    06/10/2022, 5:52 AM
    Yee is out for this week and next
  • h

    high-hospital-85984

    06/10/2022, 7:36 AM
    Not that I know of, @billions-lawyer-59647. But I think you guys had some ideas about proto events that could potentially be used for this? Has that evolved in any way? Otherwise I think it would be nice to get an RFC on Flyte's side for how to support this feature. I'm super happy to partake as a "user" stakeholder, and once the plan is in place we can most probably allocate some resources to help with the implementation where we can.
  • b

    billions-lawyer-59647

    06/10/2022, 4:20 PM
    cc @billions-apartment-18513 / @modern-pilot-97597, who will know more. Let me follow up.
  • m

    modern-pilot-97597

    07/05/2022, 4:08 PM
    from jonathan lamiel following the meeting this morning
    https://www.metaintegration.net/ this is the data catalog I was referring to. Most big ETL vendors are selling it as their data catalog (I was at Talend). There we had a two-way integration: Meta Integration has crawlers that harvest datasets from a bit everywhere, and we used that within our tool too (kind of a dataset reference). For data lineage we were mainly sending events directly to Meta Integration through its API at runtime, nothing really fancy.
    https://www.alation.com/ this was our biggest competitor
    And the one we are using internally, I don’t have access to it…, but it’s Onetrust
  • h

    high-hospital-85984

    07/06/2022, 6:48 PM
    Realised we haven't linked the document (very much WIP still): https://docs.google.com/document/d/1YMZkJr77Kg0IIJssFxs-QFMf2mpSEHIIP0hTVokqywY/edit
  • m

    modern-pilot-97597

    07/15/2022, 6:51 PM
    hey - thinking more about the suggestions in the doc, it sounds like you’re advocating for a 100% manually user-produced event stream. but a lot of things really do happen behind the scenes, is there really nothing useful to datahub that can come from the Flyte platform? for example, flyte tasks/workflows. at registration time, when a new version of a task or a task is registered for the first time, is there nothing useful that can be sent to datahub? flyte can emit an event for the creation of each task/workflow.
  • m

    modern-pilot-97597

    07/15/2022, 6:53 PM
    wrt the manual part of it, for sure that needs to be part of the experience
  • m

    modern-pilot-97597

    07/15/2022, 6:55 PM
    will add a section for that.
  • b

    best-lamp-53937

    08/02/2023, 2:09 AM
    @strong-architect-67189
  • b

    bulky-shoe-65107

    10/16/2023, 12:35 AM
    has renamed the channel from "flyte-datahub-integration" to "integrate-flyte-datahub"