# ingestion
i
Hi, I'm trying to ingest lineage data following the recently updated guide here: https://github.com/linkedin/datahub/tree/master/metadata-ingestion#using-datahubs-airflow-lineage-backend-recommended

I've set up the hook in step 1 for the REST endpoint at http://localhost:8080, and I've also added the backend config to my airflow.cfg file. Running the DAG in step 3 seems to have no effect in Datahub - https://github.com/linkedin/datahub/blob/master/metadata-ingestion/examples/airflow/lineage_backend_demo.py

Using the Datahub Emitter works fine though: https://github.com/linkedin/datahub/blob/master/metadata-ingestion/examples/airflow/lineage_backend_demo.py

I see the first one is the recommended approach, which uses inlets and outlets. Do the targets of those inlets/outlets have to exist in Datahub already for them to work? Also, when using this approach, should I see any activity in the Airflow log indicating that data has been sent to Datahub? I don't see anything, but I do when using the Datahub Emitter.
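For reference, here's roughly what my step-3 DAG looks like - a minimal sketch based on my reading of the lineage_backend_demo.py example, so the DAG id, connection id (`datahub_rest_default`), and import paths are just what I took from the guide and may not match your setup:

```python
# Assumes acryl-datahub[airflow] is installed, a datahub_rest connection named
# "datahub_rest_default" points at http://localhost:8080, and airflow.cfg has the
# [lineage] backend set to datahub_provider.lineage.datahub.DatahubLineageBackend
# (copied from the guide, if I copied it right).
from datetime import timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.utils.dates import days_ago

from datahub_provider.entities import Dataset  # inlet/outlet entity type from the guide

with DAG(
    "datahub_lineage_backend_demo",
    default_args={"owner": "airflow"},
    start_date=days_ago(1),
    schedule_interval=timedelta(days=1),
    catchup=False,
) as dag:
    # inlets/outlets are plain Dataset references (platform + table name);
    # as I understand it, the lineage backend picks them up when the task runs.
    run_task = BashOperator(
        task_id="run_data_task",
        bash_command="echo 'this is where the real data job would run'",
        inlets=[
            Dataset("snowflake", "mydb.schema.tableA"),
            Dataset("snowflake", "mydb.schema.tableB"),
        ],
        outlets=[Dataset("snowflake", "mydb.schema.tableC")],
    )
```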
b
Hi, did you manage to ingest lineage data with the recommended approach? I’m struggling to grasp the concept of inlets and outlets. Are they hardcoded values, or should they somehow be initialised in the DAGs? What is the naming convention behind them? And, as you asked, do they need to exist in Datahub already for them to work? Hoping you’ll see this reply 😄, thank you!
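My current (possibly wrong) reading of the demo DAG is that the inlets/outlets are just hardcoded `Dataset("platform", "name")` references declared on the operator, and that each pair resolves to a DataHub dataset URN. A tiny illustration of that naming, using the URN builder from the datahub package (my assumption that this matches what the backend produces):

```python
# Sketch only: shows the URN that a Dataset("snowflake", "mydb.schema.tableA")
# inlet/outlet would presumably map to in DataHub.
from datahub.emitter.mce_builder import make_dataset_urn

print(make_dataset_urn(platform="snowflake", name="mydb.schema.tableA"))
# -> urn:li:dataset:(urn:li:dataPlatform:snowflake,mydb.schema.tableA,PROD)
```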