• Artifact storage: The Pods store two kinds of data:
Metadata: Experiments, jobs, pipeline runs, and single scalar metrics. Metric data is aggregated for the purpose of sorting and filtering. Kubeflow Pipelines stores the metadata in a MySQL database.
mammoth-bear-12532
09/15/2021, 6:26 PM
Hi @cuddly-postman-70897: you would need to write an adapter in Kubeflow that emits metadata to DataHub, which would give you visibility across the Kubeflow ecosystem and beyond. @witty-florist-25216, @glamorous-policeman-50505, and co. have thought about how to do this for MLflow and have an open PR as well! https://github.com/linkedin/datahub/pull/2725
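For illustration only, here is a rough sketch of what such an adapter could look like. Everything in it is an assumption: `PipelineRun`, `to_metadata_event`, and `emit_runs` are hypothetical names, the event fields are not DataHub's actual schema, and the `emit` callback stands in for whatever client ends up sending events to DataHub. It only shows the general shape: read run metadata, translate it into a neutral event, hand it to an emitter.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable


@dataclass
class PipelineRun:
    """Hypothetical, simplified view of a Kubeflow Pipelines run."""
    run_id: str
    experiment: str
    pipeline: str
    metrics: Dict[str, float] = field(default_factory=dict)


def to_metadata_event(run: PipelineRun) -> dict:
    """Translate a run into a generic metadata event.

    The field names are illustrative, not DataHub's real schema.
    """
    return {
        "platform": "kubeflow",
        "entity_type": "pipeline_run",
        "id": f"{run.experiment}/{run.pipeline}/{run.run_id}",
        "properties": {"experiment": run.experiment, "pipeline": run.pipeline},
        "metrics": dict(run.metrics),
    }


def emit_runs(runs: Iterable[PipelineRun], emit: Callable[[dict], None]) -> int:
    """Send one event per run through `emit` and return how many were sent.

    `emit` is a placeholder for a real client call (assumption).
    """
    count = 0
    for run in runs:
        emit(to_metadata_event(run))
        count += 1
    return count


if __name__ == "__main__":
    sent = []  # stand-in sink that just collects events in memory
    runs = [PipelineRun("run-1", "churn", "train", {"auc": 0.91})]
    emit_runs(runs, sent.append)
    print(sent[0]["id"])
```

Keeping the translation step separate from the emitter means the same mapping could be reused whether events go over REST, Kafka, or into a file for debugging.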
cuddly-postman-70897
09/15/2021, 6:33 PM
@mammoth-bear-12532 I will look into that. Can you recommend folks who could help me?
mammoth-bear-12532
09/29/2021, 5:54 AM
Hi @cuddly-postman-70897: besides @witty-florist-25216, you can also talk to @chilly-holiday-80781 about this.
👍 1
witty-florist-25216
09/29/2021, 6:15 AM
Hi, indeed, we'll be happy to answer your questions if we can. (We'll finally be updating the MLflow PR soon too, FYI!)