big-carpet-38439
09/21/2021, 4:17 PM
mammoth-bear-12532
bland-orange-13353
09/22/2021, 6:57 PM
brief-insurance-68141
09/22/2021, 7:00 PM
silly-umbrella-20605
09/22/2021, 8:34 PM
nutritious-bird-77396
09/23/2021, 3:53 PM
datasets / models specific to a business domain?
witty-keyboard-20400
09/24/2021, 8:44 AM
witty-keyboard-20400
09/27/2021, 5:13 AM
brief-cricket-98290
09/27/2021, 10:31 AM
witty-keyboard-20400
09/27/2021, 1:53 PM
acceptable-architect-70237
09/27/2021, 11:11 PM
GraphService for neo4j and es separately; I didn't see any obvious performance gain with ES.
witty-keyboard-20400
09/28/2021, 10:23 AM
"urn": "urn:li:corpuser:datahub"? Considering that URNs are in the format urn:<Namespace>:<Entity Type>:<ID>, where is the entity type corpuser defined?
What is the significance of the "Snapshot" suffix, given that there is no timestamp field in this entire section?
{
  "auditHeader": null,
  "proposedSnapshot": {
    "com.linkedin.pegasus2avro.metadata.snapshot.CorpUserSnapshot": {
      "urn": "urn:li:corpuser:datahub",
      "aspects": [
        {
          "com.linkedin.pegasus2avro.identity.CorpUserInfo": {
            "active": true,
            "displayName": {
              "string": "Data Hub"
            },
            "email": "datahub@linkedin.com",
            "title": {
              "string": "CEO"
            },
            "managerUrn": null,
            "departmentId": null,
            "departmentName": null,
            "firstName": null,
            "lastName": null,
            "fullName": {
              "string": "Data Hub"
            },
            "countryCode": null
          }
        }
      ]
    }
  },
  "proposedDelta": null
}
witty-keyboard-20400
09/29/2021, 8:32 AM
"@ReadOnly should only be used to enforce that an optional field is not present. It should not be specified for a required field, making missing required field value valid."
It felt awkward.
In fact, I didn't understand these two sentences. It's hard for me to reconcile my general understanding with this seemingly new way of describing ReadOnly and CreateOnly validations.
witty-keyboard-20400
09/29/2021, 12:14 PM
curl --location --request GET 'http://localhost:8080/entities/urn%3Ali%3Achart%3Acustomers'
is for a specific type of entity: Chart.
To get a list of all the entities, I tried:
curl --location --request GET 'http://localhost:8080/entities/'
...but it resulted in error 500 with the message that the GET operation is not supported.
witty-keyboard-20400
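For reference, the percent-encoded URN in the first curl command can be produced programmatically. This is only a sketch using the Python standard library: the helper name entity_url is hypothetical, the base URL is copied from the curl commands above, and it only covers fetching a single entity by URN, not listing all entities.

```python
from urllib.parse import quote

# Host and port copied from the curl commands in the message above.
GMS_BASE = "http://localhost:8080"


def entity_url(urn: str, base: str = GMS_BASE) -> str:
    """Build the /entities/<urn> URL for one entity (hypothetical helper)."""
    # safe="" percent-encodes every reserved character, including ":", "(", "," and ")".
    return f"{base}/entities/{quote(urn, safe='')}"


print(entity_url("urn:li:chart:customers"))
# prints http://localhost:8080/entities/urn%3Ali%3Achart%3Acustomers
```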
09/29/2021, 12:21 PM
witty-keyboard-20400
09/29/2021, 12:42 PM
urn:li:dataset:(urn:li:dataPlatform:foo,bar,PROD)
I understand the initial parts:
urn: just a prefix for this sort of notation.
li: namespace.
dataset: entity type.
Are the remaining parts, (urn:li:dataPlatform:foo,bar,PROD), a notation for an Entity ID, an Aspect, or something else? What are the 3 arguments?
witty-keyboard-20400
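The DatasetKey aspect that appears in a GET response later in this log has the fields platform, name and origin, which suggests how the parenthesized part maps onto key fields. The sketch below assumes the three arguments appear in exactly that order; parse_dataset_urn and the DatasetKey tuple are hypothetical names for illustration, not DataHub APIs.

```python
from typing import NamedTuple


class DatasetKey(NamedTuple):
    # Field names mirror the DatasetKey aspect shown later in this log;
    # their order inside the URN is an assumption.
    platform: str
    name: str
    origin: str


def parse_dataset_urn(urn: str) -> DatasetKey:
    """Split urn:li:dataset:(<platform>,<name>,<origin>) into its parts."""
    prefix = "urn:li:dataset:("
    if not (urn.startswith(prefix) and urn.endswith(")")):
        raise ValueError(f"not a dataset URN: {urn!r}")
    body = urn[len(prefix):-1]
    # The platform is itself a URN without commas, so split it off first;
    # the origin is the last comma-separated item.
    platform, rest = body.split(",", 1)
    name, origin = rest.rsplit(",", 1)
    return DatasetKey(platform, name, origin)


print(parse_dataset_urn("urn:li:dataset:(urn:li:dataPlatform:foo,bar,PROD)"))
```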
09/30/2021, 12:22 PM
namespace com.linkedin.metadata.key
@Aspect = {
  "name": "dashboardKey",
}
record DashboardKey {
  @Searchable = {
    ...
  }
  dashboardTool: string
  dashboardId: string
}
The Urn representation of the Key shown above would be:
urn:li:dashboard:(<tool>,<id>)
Question: in the line just above, where is the 3rd component, i.e. dashboard, declared as a type: in the Key Aspect dashboardKey or in the entity definition itself?
witty-keyboard-20400
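To make the shape of the question concrete: in the URNs quoted in this log, the entity type sits outside the key fields, and only the part after it is derived from the key record. The sketch below illustrates that split. The function key_to_urn and the sample values "looker" and "123" are hypothetical, and the sketch says nothing about where DataHub itself declares the entity name.

```python
def key_to_urn(entity_type: str, *key_fields: str) -> str:
    """Assemble a URN from an entity type and the ordered key field values."""
    if not key_fields:
        raise ValueError("at least one key field is required")
    if len(key_fields) == 1:
        # Single-field keys appear without parentheses, e.g. urn:li:corpuser:datahub
        return f"urn:li:{entity_type}:{key_fields[0]}"
    # Multi-field keys are written as a parenthesized, comma-separated tuple.
    return f"urn:li:{entity_type}:({','.join(key_fields)})"


# dashboardTool="looker" and dashboardId="123" are made-up sample values.
print(key_to_urn("dashboard", "looker", "123"))
# prints urn:li:dashboard:(looker,123)
```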
09/30/2021, 2:00 PM
witty-keyboard-20400
10/01/2021, 11:09 AM
{
  "auditHeader": null,
  "proposedSnapshot": {
    "com.linkedin.pegasus2avro.metadata.snapshot.CorpUserSnapshot": {
      "urn": "urn:li:corpuser:datahub",
      "aspects": [
        {
          "com.linkedin.pegasus2avro.identity.CorpUserInfo": {
            "active": true,
            "displayName": {
              "string": "Data Hub"
            },
            "email": "datahub@linkedin.com",
            "title": {
              "string": "CEO"
            },
            "fullName": {
              "string": "Data Hub"
            }
          }
        }
      ]
    }
  },
  "proposedDelta": null
},
CorpUserKey (with a field username) is the Key Aspect for the entity CorpUserSnapshot (as in the definition of CorpUserSnapshot.pdl). But I don't see any username field or value in this JSON element.
Could anyone help me understand this anomaly?
@mammoth-bear-12532 @big-carpet-38439
witty-keyboard-20400
10/01/2021, 2:02 PM
"owners" : [ "urn:li:corpuser:fbar", "urn:li:corpuser:bfoo" ],
in the file DatasetUrn.pdl.
What does it mean to have 2 specific "owners" here inside the schema of DatasetUrn, and with IDs like "fbar" and "bfoo" at that?
witty-keyboard-20400
10/04/2021, 12:20 PM
bootstrap_mce.json regarding datasets appearing under browse path "prod":
"com.linkedin.pegasus2avro.metadata.snapshot.DatasetSnapshot": {
  "urn": "urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)",
  "aspects": [
    {
      "com.linkedin.pegasus2avro.common.Ownership": {
        "owners": [
          {
            "owner": "urn:li:corpuser:jdoe",
            "type": "DATAOWNER",
            "source": null
          },
          {
            "owner": "urn:li:corpuser:datahub",
            "type": "DATAOWNER",
            "source": null
          }
        ],
        "lastModified": {
          "time": 1581407189000,
          "actor": "urn:li:corpuser:jdoe",
          "impersonator": null
        }
      }
    },
    {
      "com.linkedin.pegasus2avro.dataset.UpstreamLineage": {
For the above-mentioned Hive dataset, there is no BrowsePaths aspect of the following kind, the way the Kafka and HDFS DatasetSnapshots have one:
"com.linkedin.pegasus2avro.common.BrowsePaths": {
  "paths": ["/prod/kafka/Sample..."]
}
How is the path /prod/hive/SampleHiveDataset appearing on the UI?
OTOH, I see several ML snapshots mentioning browse paths, but those don't appear on the UI, e.g.
"com.linkedin.pegasus2avro.metadata.snapshot.MLFeatureTableSnapshot": {
  "urn": "urn:li:mlFeatureTable:(urn:li:dataPlatform:feast,test_feature_table_no_labels)",
  "aspects": [
    {
      "com.linkedin.pegasus2avro.common.BrowsePaths": {
        "paths": ["/feast/test_feature_table_no_labels"]
      }
    },
What are the criteria by which some are displayed under browse paths on the UI?
gray-barista-29387
10/04/2021, 10:52 PM
kind-dawn-17532
10/04/2021, 11:00 PM
bland-orange-13353
10/06/2021, 1:25 PM
clean-monitor-43741
10/07/2021, 4:24 PM
ERROR: no matching manifest for linux/arm64/v8 in the manifest list entries.
Can someone point me in the right direction on how to quickstart using that postgres-setup instead?
victorious-stone-56510
10/09/2021, 5:43 PM
fast-winter-10784
10/11/2021, 4:59 PM
agreeable-hamburger-38305
10/11/2021, 5:29 PM
witty-keyboard-20400
10/12/2021, 5:34 AM
witty-keyboard-20400
10/12/2021, 12:38 PM
curl 'http://localhost:8080/entities/urn:li:dataset:(urn:li:dataPlatform:cg,kv_entity,PROD)'
I see only this much:
{
  "value": {
    "com.linkedin.metadata.snapshot.DatasetSnapshot": {
      "urn": "urn:li:dataset:(urn:li:dataPlatform:cg,kv_entity,PROD)",
      "aspects": [
        {
          "com.linkedin.metadata.key.DatasetKey": {
            "origin": "PROD",
            "name": "kv_entity",
            "platform": "urn:li:dataPlatform:cg"
          }
        }
      ]
    }
  }
}
However, when I created this cg dataset snapshot, I had this structure:
{
  "auditHeader": null,
  "proposedSnapshot": {
    "com.linkedin.pegasus2avro.metadata.snapshot.DatasetSnapshot": {
      "urn": "urn:li:dataset:(urn:li:dataPlatform:cg,kv_entity,PROD)",
      "aspects": [
        {
          "com.linkedin.pegasus2avro.common.BrowsePaths": {
            "paths": ["/prod/cg/kv_entity"]
          }
        },
        {
          "com.linkedin.pegasus2avro.dataset.DatasetProperties": {
            "description": "kv_entity collections",
            "tags": [],
            "customProperties": {
              "db_cluster_setup_confluence_link": "https://<....wiki link here...>",
              "doc_author": "abc.aaa@example.com"
            }
          }
        },
        {
          "com.linkedin.pegasus2avro.common.Ownership": {
            "owners": [
              {
                "owner": "urn:li:corpuser:cg_owner",
                "type": "DATAOWNER",
                "source": null
              }
            ],
            "lastModified": {
              "time": 1633345222224,
              "actor": "urn:li:corpuser:cg_owner",
              "impersonator": null
            }
          }
        },
        {
          "com.linkedin.pegasus2avro.common.InstitutionalMemory": {
            "elements": [
              {
                "url": "https://wiki link to cg",
                "description": "Business Requirements for CG",
                "createStamp": {
                  "time": 1581407189000,
                  "actor": "urn:li:corpuser:cg_owner",
                  "impersonator": null
                }
              }
            ]
          }
        },
        {
          "com.linkedin.pegasus2avro.schema.SchemaMetadata": {
            "schemaName": "kv_entity",
            "platform": "urn:li:dataPlatform:cg",
            "version": 0,
            "created": {
              "time": 1581407189000,
              "actor": "urn:li:corpuser:cg_dev_01",
              "impersonator": null
            },
            "lastModified": {
              "time": 1581407189000,
              "actor": "urn:li:corpuser:cg_dev_02",
              "impersonator": null
            },
            "deleted": null,
            "dataset": null,
            "cluster": null,
            "hash": "",
            "platformSchema": {
              "com.linkedin.pegasus2avro.schema.KafkaSchema": {
                "documentSchema": "{\"type\":\"record\",\"name\":\"KVEntityCodes\",\"namespace\":\"com.linkedin.dataset\",\"doc\":\"KV Entity codes\",\"fields\":[{\"name\":\"tenant_id\",\"type\":[\"number\"]},....]}"
              }
            },
            "fields": [
              {
                "fieldPath": "[version=2.0].[type=int].tenant_id",
                "jsonPath": null,
                "nullable": false,
                "description": {
                  "string": "Tenant Id originated from .."
                },
                "type": {
                  "type": {
                    "com.linkedin.pegasus2avro.schema.NumberType": {}
                  }
                },
                "nativeDataType": "int",
                "globalTags": {
                  "tags": [{ "tag": "urn:li:tag:NeedsDocumentation" }]
                },
                "recursive": false
              },
              {
                ...
              }
            ]
          }
        }
      ]
    }
  }
}
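One way to pin down exactly which aspects are missing from the GET response is to compare the aspect names in the two payloads. This is a rough sketch, not a DataHub utility: aspect_names is a hypothetical helper, and the two dictionaries are abbreviated stand-ins for the payloads above with the aspect bodies left empty.

```python
from typing import Dict, List


def aspect_names(snapshot_union: Dict) -> List[str]:
    """Return short aspect names from a {"<SnapshotType>": {...}} wrapper."""
    (snapshot,) = snapshot_union.values()  # exactly one snapshot type is expected
    # Each aspect is a single-key object; keep only the last segment of the key.
    return [next(iter(aspect)).rsplit(".", 1)[-1] for aspect in snapshot["aspects"]]


# Abbreviated stand-ins for the ingested payload and the GET response above.
ingested = {
    "com.linkedin.pegasus2avro.metadata.snapshot.DatasetSnapshot": {
        "urn": "urn:li:dataset:(urn:li:dataPlatform:cg,kv_entity,PROD)",
        "aspects": [
            {"com.linkedin.pegasus2avro.common.BrowsePaths": {}},
            {"com.linkedin.pegasus2avro.dataset.DatasetProperties": {}},
            {"com.linkedin.pegasus2avro.common.Ownership": {}},
            {"com.linkedin.pegasus2avro.common.InstitutionalMemory": {}},
            {"com.linkedin.pegasus2avro.schema.SchemaMetadata": {}},
        ],
    }
}
returned = {
    "com.linkedin.metadata.snapshot.DatasetSnapshot": {
        "urn": "urn:li:dataset:(urn:li:dataPlatform:cg,kv_entity,PROD)",
        "aspects": [{"com.linkedin.metadata.key.DatasetKey": {}}],
    }
}

missing = sorted(set(aspect_names(ingested)) - set(aspect_names(returned)))
print(missing)
```

With the real payloads, ingested would be the value under "proposedSnapshot" and returned the value under "value".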