# getting-started
a
Hi Team,
```
from datahub.ingestion.run.pipeline import Pipeline

# The pipeline configuration is similar to the recipe YAML files provided to the CLI tool.
pipeline = Pipeline.create(
    {
        "source": {
            "type": "mysql",
            "config": {
                "username": "user",
                "password": "pass",
                "database": "db_name",
                "host_port": "localhost:3306",
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://localhost:8080"},
        },
    }
)

# Run the pipeline and report the results.
pipeline.run()
pipeline.pretty_print_summary()
```
I'm using the above code to ingest data into DataHub. Is there a way to add column descriptions while ingesting the data?
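Not from the thread, but for context: below is a minimal sketch of setting a column description directly with the DataHub Python emitter, assuming a hypothetical dataset (db_name.my_table) and column (my_column). It writes the editableSchemaMetadata aspect, the same one the UI edits; constructor defaults can vary between DataHub versions, so treat this as a starting point rather than the exact API.

```
from datahub.emitter.mce_builder import make_dataset_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import (
    EditableSchemaFieldInfoClass,
    EditableSchemaMetadataClass,
)

emitter = DatahubRestEmitter(gms_server="http://localhost:8080")

# Hypothetical dataset and column names; swap in your own.
dataset_urn = make_dataset_urn(platform="mysql", name="db_name.my_table", env="PROD")
field = EditableSchemaFieldInfoClass(
    fieldPath="my_column",
    description="Human-readable description of my_column",
)

# Emit one EditableSchemaMetadata aspect carrying the field description.
emitter.emit(
    MetadataChangeProposalWrapper(
        entityUrn=dataset_urn,
        aspect=EditableSchemaMetadataClass(editableSchemaFieldInfo=[field]),
    )
)
```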
l
Hey there 👋 I'm The DataHub Community Support bot. I'm here to help make sure the community can best support you with your request. Let's double check a few things first:
✅ There's a lot of good information on our docs site: www.datahubproject.io/docs. Have you searched there for a solution?
✅ It's not uncommon that someone has run into your exact problem before in the community. Have you searched Slack for similar issues?
Did you find a solution to your issue?
a
No, I couldn't find a solution.
m
Where do the descriptions come from?
a
I have an SAP table and it has descriptions for the columns
m
I'm not sure about the SAP connector, but the Snowflake connector, for example, can extract descriptions of tables and columns
a
ok, actually I'm trying to ingest standard SAP tables from MSSQL into DataHub. I extracted the column descriptions from the browser, and I want to add those descriptions during ingestion.
m
Where do the descriptions reside right now? In the SAP table? In MSSQL?
a
They're in an Excel file
m
You should use this as a separate ingestion: https://datahubproject.io/docs/generated/ingestion/sources/csv/
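The page above includes an example CSV. As a rough sketch of a row that would set a column description, here is one using hypothetical SAP names (table MARA, column MATNR), with the header taken from that docs example (verify it against the page, since the expected columns can change between versions):

```
resource,subresource,glossary_terms,tags,owners,ownership_type,description,domain
"urn:li:dataset:(urn:li:dataPlatform:mssql,db_name.dbo.MARA,PROD)",MATNR,,,,,"Material Number",
```

When subresource is a column name, the description applies to that column rather than to the whole table.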
a
Thank you. But to do so I'd need to use the CLI, which requires an access token, and the access token feature is disabled in my DataHub. My DataHub is deployed with Docker on a Linux VM, so I chose the Python approach.
m
You can do exactly the same thing through python pipeline
a
can you explain how?
m
In your python code, create another pipeline to ingest the csv.
a
```
from datahub.ingestion.run.pipeline import Pipeline

pipeline = Pipeline.create(
    {
        "source": {
            "type": "csv-enricher",
            "config": {
                "filename": "file path",
                "should_overwrite": "false",
                "delimiter": ",",
                "array_delimiter": "|",
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://localhost:8080"},
        },
    }
)

# Run the pipeline and report the results.
pipeline.run()
pipeline.pretty_print_summary()
```
Is this the approach you are talking about??
m
Yes, that's correct. Make sure you populate it with real values.
a
ok, thank you. I got this error: ConfigurationError: Cannot read remote file file/path/.csv, error: No connection adapters were found for 'file/path/.csv'. What does it mean and how do I resolve it?
m
You have an invalid config. It's trivial; you should be able to debug it. Your variables aren't set properly.
a
This is the code, and for it I'm getting ConfigurationError: Cannot read remote file file:///C:/Users/kallu/.vscode/SAP_TABLES_STRUCTURE(MYK).csv, error: No connection adapters were found for 'file:///C:/Users/kallu/.vscode/SAP_TABLES_STRUCTURE(MYK).csv'
m
Do you have the correct format?
Go to the page I sent earlier; there is an example of the CSV file
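As an aside: the "No connection adapters were found" message is what the requests library raises when it is handed a URL scheme it cannot fetch, which suggests the file:// prefix is the problem here. A guess at the fix, assuming the source can read a plain local path (the path is the one from the error; adjust as needed):

```
# Hypothetical corrected source config: a plain Windows path, no file:// scheme.
source_config = {
    "type": "csv-enricher",
    "config": {
        "filename": "C:/Users/kallu/.vscode/SAP_TABLES_STRUCTURE(MYK).csv",
        "delimiter": ",",
        "array_delimiter": "|",
    },
}
```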
a
Ok
a
Thanks for all the help @modern-artist-55754! Let me know if this is still affecting you after you’ve followed Steve’s suggestions @average-lock-95905
a
@astonishing-answer-96712 Yeah it helped.