great-toddler-2251
08/23/2022, 9:50 PMelegant-state-4
12/20/2022, 2:56 PMelegant-state-4
01/11/2023, 9:16 PMlimited-refrigerator-50812
01/13/2023, 3:01 PMlimited-refrigerator-50812
01/13/2023, 3:04 PMlimited-refrigerator-50812
01/13/2023, 3:09 PMlimited-refrigerator-50812
01/13/2023, 3:12 PMgreat-toddler-2251
01/13/2023, 4:17 PMgreat-toddler-2251
01/13/2023, 4:24 PMmammoth-bear-12532
mammoth-bear-12532
narrow-bear-42430
02/13/2023, 3:39 PMproud-dusk-671
03/31/2023, 10:29 AMquiet-dusk-83720
04/28/2023, 7:46 AMfancy-toddler-89669
05/01/2023, 8:13 AMlimited-refrigerator-50812
05/01/2023, 10:44 AMmammoth-bear-12532
dbt
) (https://docs.getdbt.com/docs/collaborate/govern/model-access)
limited-refrigerator-50812
05/16/2023, 8:42 AM
dbt terminology but goes beyond that, as we will see.
Scenario 1
There exists an output port that simply makes all the data available through a single SQL-based API. The corresponding output port describes how you can get access rights, under what conditions, who to contact with questions, etc. This information can also be captured in a data contract for this output port.
Scenario 2
Our company wants to do marketing research to predict patterns in customer behaviour. However, customer data contains personally identifiable information (PII) protected by regulations such as GDPR, ToS, California's data protection regulation, etc. Therefore, an output port is created that exposes only the subscription and product tables. Moreover, an anonymisation script is run to ensure that information in the subscription table cannot lead back to individuals.
Scenario 3
The company runs a service that allows customers to view their subscription details, which consumes data from this data product. The data needs to be accessible in a more timely manner than our SQL-based back-end supports, so we expose it through streaming. An output port is therefore created that exposes the data in a Kafka stream, with its own instructions, SLA, etc.
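A minimal sketch of how the three scenarios could be modelled, assuming the data is described once on the data product and each output port only adds its own consumption details. All class and field names here are hypothetical and are not part of DataHub or dbt; the table list for the streaming port is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class OutputPort:
    """One way of consuming the data product (hypothetical model)."""
    name: str
    protocol: str              # e.g. "sql" or "kafka"
    exposed_tables: list[str]  # subset of the data product's tables
    access_notes: str          # access rights, contact, SLA, permitted usage
    anonymised: bool = False


@dataclass
class DataProduct:
    """The data itself is described once, at this level."""
    name: str
    description: str
    tables: list[str]
    output_ports: list[OutputPort] = field(default_factory=list)


all_tables = ["customer", "subscription", "product"]

product = DataProduct(
    name="subscriptions",  # hypothetical name
    description="Customers, their subscriptions and the related products.",
    tables=all_tables,
    output_ports=[
        # Scenario 1: everything through a single SQL-based API.
        OutputPort("full-sql", "sql", all_tables,
                   "Access terms captured in a data contract."),
        # Scenario 2: no PII, anonymised, for marketing research.
        OutputPort("research-sql", "sql", ["subscription", "product"],
                   "Marketing research only.", anonymised=True),
        # Scenario 3: streaming; exposed tables are assumed here.
        OutputPort("subscription-stream", "kafka", all_tables,
                   "Own instructions and SLA."),
    ],
)

for port in product.output_ports:
    print(port.name, port.protocol, port.exposed_tables)
```

In this shape the table descriptions live in one place, and adding a port never requires describing the data again.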
So all three output ports expose the same data, just as different distributions of it. I would like to describe the data on one level (e.g., the dataset or the data product level) and the ways it can be consumed on the output port level. Output ports 1 and 2 expose data from the same backend/platform, but have different access rights and permitted usage (which could be captured with public/private terminology). However, output port 3 is different in a way that cannot be captured by public/private. You could address this by saying output port 3 is a different dataset and/or a different data product. However, that leads to duplicated effort, as we would then need to describe a lot of information twice.
proud-dusk-671
05/24/2023, 4:50 PMaloof-dentist-85908
05/30/2023, 10:54 AMquiet-dusk-83720
06/02/2023, 8:34 AMnutritious-orange-23137
06/20/2023, 9:45 AMnutritious-lighter-88459
07/03/2023, 3:27 PM
POST /entities/v1 endpoint along with the request body below:
[
  {
    "entityType": "dataproduct",
    "entityKeyAspect": {
      "__type": "DataProductKey",
      "id": "ikhatri-dp-openapi"
    },
    "aspect": {
      "__type": "DataProductProperties",
      "name": "OpenAPI Test DP",
      "description": "Data Product Created Via OpenAPI",
      "customProperties": {
        "creation source": "openapi",
        "created date": "3rd July, 2023"
      },
      "assets": [
        {
          "destinationUrn": "urn:li:dataset:(urn:li:dataPlatform:mysql,datahub.employee,PROD)",
          "created": {
            "time": 1688134320000,
            "actor": "urn:li:corpuser:etl",
            "impersonator": "urn:li:corpuser:jdoe"
          },
          "lastModified": {
            "time": 1688134320000,
            "actor": "urn:li:corpuser:etl",
            "impersonator": "urn:li:corpuser:jdoe"
          }
        },
        {
          "destinationUrn": "urn:li:dataset:(urn:li:dataPlatform:mysql,datahub.department,PROD)",
          "created": {
            "time": 1688134320000,
            "actor": "urn:li:corpuser:etl",
            "impersonator": "urn:li:corpuser:jdoe"
          },
          "lastModified": {
            "time": 1688134320000,
            "actor": "urn:li:corpuser:etl",
            "impersonator": "urn:li:corpuser:jdoe"
          }
        }
      ]
    }
  }
]
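For reference, a rough sketch of how a request body of the same shape could be assembled in Python rather than written by hand. The helper names, the base URL and the optional token are assumptions for illustration; only the `/entities/v1` path and the field names are taken from the body above. The sketch builds the request without sending it and does not attempt to explain the listing error described in this message.

```python
import json
import urllib.request


def build_asset(dataset_urn, time_ms, actor, impersonator=None):
    """Build one entry of the "assets" list (hypothetical helper)."""
    stamp = {"time": time_ms, "actor": actor}
    if impersonator is not None:
        stamp["impersonator"] = impersonator
    # Use separate dicts so the two stamps can be changed independently.
    return {
        "destinationUrn": dataset_urn,
        "created": dict(stamp),
        "lastModified": dict(stamp),
    }


def build_body(dp_id, name, description, dataset_urns, time_ms, actor):
    """Build the JSON array expected by POST /entities/v1."""
    return [{
        "entityType": "dataproduct",
        "entityKeyAspect": {"__type": "DataProductKey", "id": dp_id},
        "aspect": {
            "__type": "DataProductProperties",
            "name": name,
            "description": description,
            "assets": [build_asset(u, time_ms, actor) for u in dataset_urns],
        },
    }]


def build_request(base_url, body, token=None):
    """Prepare (but do not send) the POST request."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        base_url.rstrip("/") + "/entities/v1",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


body = build_body(
    "ikhatri-dp-openapi",
    "OpenAPI Test DP",
    "Data Product Created Via OpenAPI",
    [
        "urn:li:dataset:(urn:li:dataPlatform:mysql,datahub.employee,PROD)",
        "urn:li:dataset:(urn:li:dataPlatform:mysql,datahub.department,PROD)",
    ],
    1688134320000,
    "urn:li:corpuser:etl",
)
# The base URL is a placeholder; substitute your own host and API prefix.
request = build_request("http://localhost:8080/openapi", body)
print(request.get_method(), request.full_url)
```

Sending would be a matter of passing `request` to `urllib.request.urlopen`, which is left out here on purpose.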
This returns a 201 response status along with the unique urn of the created data product.
However, after this operation I am no longer able to list all data products (including the ones I created via the GraphQL API). I keep getting an "unknown error occurred" error message (PFA). Even searching doesn't work. However, when I enter the urn directly into the URL as {hostname:port}/dataProduct/urn:li:dataproduct:ikhatri-dp-openapi then I am able to view the data product. 🤔
Is there something missing in my request body, or is there an issue with the listing API used by the UI?
blue-mechanic-1369
07/25/2023, 12:33 PMhandsome-train-99822
08/17/2023, 9:21 PMmammoth-bear-12532
quiet-dusk-83720
09/05/2023, 2:10 PMgifted-bird-57147
12/12/2023, 11:01 AM
owners:
  - id: urn:li:corpGroup:512344f8-3107-4194-bbb9-0c1f127a57f6
    type: BUSINESS_OWNER
This is an existing group in our DataHub environment. But when I ingest the recipe, the corpGroup URN is not recognized and a new user URN is created: urnlicorpuserurnlicorpGroup512344f8-3107-4194-bbb9-0c1f127a57f6.
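A small sketch of a pre-ingestion sanity check for the owner ids in such a YAML file. It only inspects the URN prefix to tell users from groups; the function name is hypothetical and this is not how the DataHub CLI itself resolves owners.

```python
def owner_entity_type(owner_id: str) -> str:
    """Return "corpuser" or "corpGroup" for a well-formed owner URN.

    Hypothetical helper: raises ValueError for anything that is not of the
    form urn:li:corpuser:<id> or urn:li:corpGroup:<id>.
    """
    parts = owner_id.split(":", 3)
    if len(parts) != 4 or parts[:2] != ["urn", "li"] or not parts[3]:
        raise ValueError(f"not a URN: {owner_id!r}")
    if parts[2] not in ("corpuser", "corpGroup"):
        raise ValueError(f"not an owner URN: {owner_id!r}")
    return parts[2]


owner = "urn:li:corpGroup:512344f8-3107-4194-bbb9-0c1f127a57f6"
print(owner_entity_type(owner))
```

Running a check like this over the `owners` list before ingesting makes it easier to confirm that the id in the file really is a group URN, so any conversion to a user URN happens later in the pipeline.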
Via the UI I am able to associate a group as owner and set an ownership type. If I then use the datahub dataproduct diff function to sync the UI changes back to the yml, the ownership type is set to NONE...
Is this a known issue? I'm using CLI version 0.12.0.3 on managed Acryl v0.2.13.3.
brash-crayon-20992
02/26/2024, 3:57 PMlimited-refrigerator-50812
03/05/2024, 3:32 PM