A complete solution for open data platforms, enterprise data catalogs, data lakes and data management. Open source, mature, fully-featured and production ready.

DataHub

• *Artifact storage*: The Pods store two kinds of data:
*Metadata:* Experiments, jobs, pipeline runs, and single scalar metrics. Metric data is aggregated for the purpose of sorting and filtering. Kubeflow Pipelines stores the metadata in a MySQL database.

Hi <@U02EHPQ70V9>: you would need to write an adapter in Kubeflow that emits metadata out to DataHub, which would give you visibility across the Kubeflow ecosystem and beyond. <@U01NPT4C3KQ> <@U01SH78UJNS> and co have thought about how to do this for MLflow and have an open PR as well! <https://github.com/linkedin/datahub/pull/2725>
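
For illustration only, here is a rough sketch of what such an adapter could look like. Everything in it is an assumption: `PipelineRun` is a made-up stand-in for a Kubeflow Pipelines run record, the event layout and key format are invented for the example (they are not DataHub's schema or URN format), and `sink` is a placeholder for whatever real emitter you would plug in. It only shows the general "read run metadata, map it to an event, hand it to an emitter" shape.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable


@dataclass
class PipelineRun:
    """Hypothetical run record; NOT the real Kubeflow Pipelines API object."""
    run_id: str
    pipeline_name: str
    experiment: str
    status: str
    metrics: Dict[str, float] = field(default_factory=dict)


def run_to_event(run: PipelineRun, platform: str = "kubeflow") -> dict:
    """Map a run record to a generic metadata event.

    The field names and key format are illustrative assumptions,
    not DataHub's actual metadata model.
    """
    return {
        "entityType": "pipelineRun",
        "entityKey": f"{platform}:{run.pipeline_name}:{run.run_id}",
        "properties": {
            "experiment": run.experiment,
            "status": run.status,
            "metrics": dict(run.metrics),
        },
    }


def emit_runs(runs: Iterable[PipelineRun], sink: Callable[[str], None]) -> int:
    """Serialize each run as JSON and pass it to `sink`; return the count sent."""
    sent = 0
    for run in runs:
        sink(json.dumps(run_to_event(run), sort_keys=True))
        sent += 1
    return sent


if __name__ == "__main__":
    example = PipelineRun(
        run_id="run-1",
        pipeline_name="train",
        experiment="baseline",
        status="Succeeded",
        metrics={"accuracy": 0.9},
    )
    # print stands in for a real emitter (e.g. an HTTP or Kafka client).
    emit_runs([example], print)
```

In a real adapter, `sink` would be swapped for an actual DataHub emitter and the event would have to follow DataHub's own metadata model; the MLflow PR linked above is the better reference for how that mapping is done in practice.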

<@UV0M2EB8Q> I will look into that. Can you recommend folks who could help me?

Hi <@U02EHPQ70V9>: besides <@U01NPT4C3KQ> you can also talk to <@U01U69UJNUF> about this

Hi, indeed, we'll be happy to answer your questions if we can. (FYI, we'll finally be updating the MLflow PR soon too!)