# troubleshoot
s
Hey Team, I’ve been trying to run ingestion with a Python script like this one - https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/examples/library/programatic_pipeline.py Does it work when config_dict contains env variables instead of explicitly inserted values? Something like this?
from datahub.ingestion.run.pipeline import Pipeline

# The pipeline configuration is similar to the recipe YAML files provided to the CLI tool.
pipeline = Pipeline.create(
    {
        "source": {
            "type": "mysql",
            "config": {
                "username": "user",
                "password": "pass",
                "database": "db_name",
                "host_port": "localhost:3306",
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "${DATAHUB_GMS_URL}"},
        },
    }
)

# Run the pipeline and report the results.
pipeline.run()
pipeline.pretty_print_summary()
I’ve been trying to run it like this, but it always throws an error, and I’m wondering if by any chance there’s a way to do it?
b
What's wrong with using os.environ["myvar"] to pull in the variable values in this case?
s
I think that way is clunky - I don’t want my configs to mix in Python calls. If anyone is looking for a solution, I recommend using
from datahub.configuration.config_loader import load_config_file
and loading the config from a file path. Then even if the file contains env variables written as $SOME_KEY, they will be substituted with the environment values. Topic closed.