# troubleshoot
f
Hi all, is there a way to set a domain for a dataset via Python MCPW? I couldn't find anything in the examples :)
It is a separate aspect that you can send as shown above.
f
Awesome. Thanks for the link!
b
Let us know if you have any additional questions!
f
Hi @big-carpet-38439 I'm good for now, thank you! Although I'm wondering why it is not possible to emit such basic data as schema/fields, description, tags, domain, etc. all at once for a dataset. I don't want to end up in some fundamental debate here, but we're using AWS Lambda functions to push data from a system to DataHub, so every additional emit comes at a cost.
s
You can emit multiple MCPWs in a single process. There is no restriction that requires a separate process for each emit.
i
DatahubRestEmitter().emit takes only a single instance of MCPW? How can we send bulk requests?
f
@square-activity-64562 can you give an example of how to do this? All I've seen is single emits (even via recipes, judging by the logs on the backend side).
s
A for loop inside the same process that calls DatahubRestEmitter().emit, or even sequential calls like this:
emitter.emit(mcp1)
emitter.emit(mcp2)
I understand this is not calling the bulk ingest endpoint but instead /entities?action=ingest or /aspects/?action=ingestProposal
Changing the CLI to send to the bulk ingest endpoint, and adding bulk ingest for proposals, has been discussed for a while, but something or other kept coming up. If you would like to contribute and fast-track bulk endpoint usage, this doc might help: https://datahubproject.io/docs/metadata-ingestion/developing/ and you can ask in the #contribute channel for any help required.
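For reference, the loop suggested above could look roughly like the sketch below. It reuses one emitter for several proposals inside a single process (for example, one Lambda invocation); each proposal still goes out as its own request, so this is not a bulk call. The names `Emitter`, `emit_all` and `example_with_datahub` are hypothetical helpers made up for illustration, as are the `hive` platform, the `db.orders` dataset, the `sales` domain and the server URL. The DataHub imports are the ones I'd expect from the `acryl-datahub` package; recent SDK versions infer `entityType` and `aspectName` from the URN and aspect, while older ones may need them passed explicitly.

```python
from typing import Any, Iterable, Protocol


class Emitter(Protocol):
    """Anything with an emit(item) method, e.g. DatahubRestEmitter."""

    def emit(self, item: Any) -> None: ...


def emit_all(emitter: Emitter, proposals: Iterable[Any]) -> int:
    """Emit each proposal in turn over the same emitter; return the count."""
    count = 0
    for proposal in proposals:
        emitter.emit(proposal)  # one request per proposal, not a bulk call
        count += 1
    return count


def example_with_datahub(gms_server: str = "http://localhost:8080") -> int:
    """Illustration only: needs the acryl-datahub package and a reachable GMS."""
    # Imported lazily so emit_all() above works without DataHub installed.
    from datahub.emitter.mce_builder import make_dataset_urn, make_domain_urn
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import DatasetPropertiesClass, DomainsClass

    # Placeholder dataset and domain; replace with your own identifiers.
    dataset_urn = make_dataset_urn(platform="hive", name="db.orders", env="PROD")
    aspects = [
        DatasetPropertiesClass(description="Orders placed via the web shop"),
        DomainsClass(domains=[make_domain_urn("sales")]),
    ]
    # One proposal per aspect, all pointing at the same dataset.
    proposals = [
        MetadataChangeProposalWrapper(entityUrn=dataset_urn, aspect=aspect)
        for aspect in aspects
    ]
    return emit_all(DatahubRestEmitter(gms_server=gms_server), proposals)
```

Because `emit_all` only relies on an `emit` method, it can be exercised with a stand-in emitter before pointing it at a real DataHub instance.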
i
got it