Hi folks, So I was trying to setup Datahub and I w...
# getting-started
m
Hi folks, I was trying to set up DataHub and got it working (using Docker containers). Now I'm trying to set up an Airflow lineage DAG in Airflow, which is also running in containers, on a different port. But when I add the DAG file linked below, I get a ModuleNotFoundError. Can anyone help me figure out what I'm doing wrong that makes the DAG import fail? Or is it an issue with the Python module used by DataHub? The DAG I'm trying to import: https://github.com/linkedin/datahub/blob/master/metadata-ingestion/src/datahub_provider/example_dags/mysql_sample_dag.py
g
Have you made sure that acryl-datahub[mysql] is installed in your airflow environment? Could you paste the full details of the error here?
pasted the wrong link, my bad
By installing it in the "airflow environment", do you mean I should install it in the airflow-webserver container? I ask because when I run Airflow, it starts multiple containers.
g
Yes, the acryl-datahub pip package will need to be installed in each of your airflow containers (this is actually standard for airflow, not datahub specific)
Generally people use a custom base image that combines airflow with all the dependencies they need
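For anyone hitting the same ModuleNotFoundError, a quick way to confirm which containers are missing the package is to run a small import check inside each one. This is only a sketch: the helper name `missing_modules` is made up for illustration, and the module names `datahub` and `datahub_provider` are assumptions (the second is taken from the path of the linked DAG).

```python
import importlib.util


def missing_modules(names):
    """Return the module names that this Python interpreter cannot locate."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise instead of returning None in some edge cases
            found = False
        if not found:
            missing.append(name)
    return missing


# Assumed top-level packages provided by the acryl-datahub pip package.
required = ["datahub", "datahub_provider"]
print("missing:", missing_modules(required))
```

Running this with the Python interpreter of the webserver, scheduler, and worker containers should show which of them still need `pip install 'acryl-datahub[mysql]'`; an empty list means both modules were found in that container.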