# ingestion
a
I'm ingesting data from the dbt catalog into DataHub, but it is failing with "ERROR {datahub.ingestion.run.pipeline:52} - failed to write record with workunit urn:li:dataset". I don't understand the issue here. Appreciate any help!
g
Hey @astonishing-mechanic-42915! Let me see here.. do you see other workunits ingested successfully? I wonder if this is the connector trying to ingest a table without a name
a
Hey @green-football-43791, it seems ingestion is failing for all the workunits.
g
got it, all with the same error message? is there anything else in the error logs?
a
yes, all with the same error, and I see {'e': JSONDecodeError('Expecting value: line 1 column 1 (char 0)')} at the end
g
could you try writing the dbt source output to a file rather than to datahub? if that is successful, we may be able to inspect the mces and see how they are misformatted
you could follow this as an example:
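roughly, the recipe would look something like this (the paths below are just placeholders for your dbt artifacts, and exact option names may vary by datahub version):
source:
  type: "dbt"
  config:
    manifest_path: "./target/manifest.json"
    catalog_path: "./target/catalog.json"
    target_platform: "postgres"
sink:
  type: "file"
  config:
    filename: "./dbt_mces.json"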
a
sure, thank you @green-football-43791
I guess it is working with the file option.
Sink report: {'failures': [], 'records_written': 2360, 'warnings': []}
g
oh, great. would you mind sharing some snapshots in the file?
if there is sensitive information, feel free to x out any identifying names
the structure would be the key issue
a
here you go
g
ok- that seems good
I wonder if the issue is with your sink.
Could you try emitting a sample mce file there?
this should always work ^
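roughly, a recipe like this, with the path adjusted to your local checkout (the filename below is a placeholder, and you can keep whatever datahub-rest sink you already use):
source:
  type: "file"
  config:
    filename: "./examples/mce_files/bootstrap_mce.json"
sink:
  type: "datahub-rest"
  config:
    server: '...'   # your existing datahub-rest server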
a
it is the same error.
[2021-06-03 18:34:39,018] ERROR {datahub.ingestion.run.pipeline:52} - failed to write record with workunit file:///Users/vnallasami/Docker/datahub/metadata-ingestion/examples/mce_files/bootstrap_mce.json:0 with Expecting value: line 1 column 1 (char 0) and info {}
g
Hmm- it seems like there may be an issue with your connection to GMS
how are you running Datahub?
is everything local on your computer?
are you deployed in kubernetes?
a
I'm using preinstalled docker containers for datahub
g
using
./quickstart
?
a
yes
g
i see, can you try running
datahub check local-docker
?
let's make sure your environment is in a good state
a
It seems that the containers are in a good state, and I was able to ingest sample data and access the web UI
g
interesting- you saw the error ingesting sample data earlier
was that resolved, or was the data ingested regardless of the error?
a
It was done after I set up the containers
g
how recently was that?
a
sample data ingestion is working, I tried it just now
./docker/ingestion/ingestion.sh
g
I see- so the error you saw before is gone? did you change anything?
a
It is working for only sample data, not for dbt
g
i wonder if there is a difference in the sink you have for dbt
what are the configurations for the sink?
a
sample data ingestion is working from the beginning
sink:
  type: "datahub-rest"
  config:
    server: 'http://localhost:9002'
I guess the sink configuration looks good
g
ah- what if you change it to 8080?
9002 is the datahub-frontend, but GMS handles ingestion
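i.e. for a default local quickstart the sink block becomes:
sink:
  type: "datahub-rest"
  config:
    server: 'http://localhost:8080'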
a
ok
g
frontend is 9002 and gms is 8080
a
Wow, that was the issue 🙂. really appreciate your patience and help! thank you so much!
🚀 1
g
wahoo!
great to hear 😄
how does your data look? is it what you expect?
a
pipeline finished successfully but I don't see anything in the web UI
g
you might need to give it a minute for it to be indexed by search
a
got it, it shows up now
g
great!
a
thank you @green-football-43791
g
happy to help 🙂
I would also recommend checking out your datasets' lineage graphs by clicking the graph button in the top right corner of a dbt dataset's page
dbt sources generally produce high-quality lineage graphs
a
Just verified it, it is working. thank you so much 🙂
🎉 1