# ingestion
c
Hello! I'm new to DataHub and I want to ask how to automate data lineage outside of Apache Airflow. In our case, we're using Informatica to handle our entire ETL pipeline. Is there a way to automate the lineage between the tasks? Thank you!
h
DataHub doesn't support automatic lineage extraction from Informatica yet. I'd recommend filing a feature request here: https://feature-requests.datahubproject.io/
c
Thank you for the response. Do you know of a programmatic way to implement lineage "semi-automatically", such as with a Python script?
h
Here are some Python examples for implementing lineage: https://datahubproject.io/docs/lineage/sample_code/ I believe the example lineage_dataset_job_dataset.py would be most relevant for you.
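To give a rough idea of what a "semi-automatic" script could look like, here is a minimal sketch in the spirit of that sample. It assumes you can list each Informatica mapping with its source and target tables yourself (by hand or by parsing an export). The `Mapping` class, the `"informatica"` orchestrator name, the table names, and the `http://localhost:8080` server address are all placeholders for illustration, not something DataHub provides for Informatica. The URN helpers only build strings in DataHub's URN format, and the emit step needs the `acryl-datahub` package installed.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    """One Informatica mapping (hypothetical shape, filled in by hand or from an export)."""
    workflow: str
    name: str
    sources: list  # list of (platform, table_name) tuples
    targets: list  # list of (platform, table_name) tuples


def dataset_urn(platform, name, env="PROD"):
    # Builds a dataset URN string in DataHub's URN format.
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


def data_job_urn(orchestrator, flow_id, job_id, cluster="PROD"):
    # Builds a data job URN string; the job is nested inside its data flow.
    return f"urn:li:dataJob:(urn:li:dataFlow:({orchestrator},{flow_id},{cluster}),{job_id})"


def plan_lineage(mappings, orchestrator="informatica"):
    """Turn mappings into one dataset -> job -> dataset record per mapping."""
    return [
        {
            "job_urn": data_job_urn(orchestrator, m.workflow, m.name),
            "inputs": [dataset_urn(p, n) for p, n in m.sources],
            "outputs": [dataset_urn(p, n) for p, n in m.targets],
        }
        for m in mappings
    ]


def emit_lineage(plans, gms_server="http://localhost:8080"):
    """Send each record to DataHub, following lineage_dataset_job_dataset.py."""
    # Imported lazily so the planning code above runs without acryl-datahub.
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import DataJobInputOutputClass

    emitter = DatahubRestEmitter(gms_server)
    for plan in plans:
        aspect = DataJobInputOutputClass(
            inputDatasets=plan["inputs"],
            outputDatasets=plan["outputs"],
        )
        emitter.emit_mcp(
            MetadataChangeProposalWrapper(entityUrn=plan["job_urn"], aspect=aspect)
        )


# Example input: one made-up mapping that loads a Snowflake table from Oracle.
MAPPINGS = [
    Mapping(
        workflow="wf_daily_sales",
        name="m_load_orders",
        sources=[("oracle", "sales.orders")],
        targets=[("snowflake", "dwh.public.orders")],
    )
]
PLANS = plan_lineage(MAPPINGS)

if __name__ == "__main__":
    for plan in PLANS:
        print(plan)
    # emit_lineage(PLANS)  # uncomment once a DataHub server is reachable
```

Keeping the planning step separate from the emit step lets you review the generated URNs before anything is written to DataHub.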
c
Awesome! Thank you so much for your help