# getting-started
m
I am a bit confused as to whether the GMS API should expose the aspect model or not. For instance, I see two Ownership models: one that seems to be a generic aspect, assignable to any URN, and another dedicated to dataset ownership. There is a lot of repetition between them, for instance between the "generic" OwnershipType and the dataset-specific "OwnershipCategory". I initially assumed this was because the aspect model should not be exposed directly through the GMS, but rather abstracted away into payloads designed specifically for the GMS API. However, I then noticed that the GMS's dataset ownership resource actually exposes the generic Ownership aspect, not the dedicated dataset ownership payload from the API module. So now I don't fully understand whether we should abstract away the aspect-driven model in the GMS or not.
b
I can see two great questions there. Will try to document these formally later, but here are some quick answers. 1. Should an aspect be exposed via the GMS API (e.g.
/datasets/ownership
) or simply be included as part of the top level endpoint (e.g.
/datasets
)? 2. Difference between
OwnershipType
(https://github.com/linkedin/datahub/blob/master/metadata-models/src/main/pegasus/com/linkedin/common/OwnershipType.pdscO) vs
OwnershipCategory
Answers: 1. Generally, including the aspects as part of the top-level endpoint is enough for read-only use cases (as the top-level endpoint is read-only). However, if you need to update a specific aspect via the API, you'll need an aspect-specific sub-endpoint. The aspect-specific sub-endpoint also gives you the ability to retrieve older versions of the aspect and additional audit-related metadata. 2. I'm unable to find the
OwnershipCategory
model in the repo. We did have that in our internal repo as a legacy model, but for OSS
com.linkedin.common.Ownership
is the common ownership model that should be used for all entities, including datasets.
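To make the split in answer 1 concrete, here is a minimal in-memory sketch, not GMS's actual implementation. The class `InMemoryGms`, its method names, the example URNs, and the version numbering (oldest version is 0) are all hypothetical and chosen only for illustration. It shows a read-only top-level view that flattens the latest aspects, next to an aspect-specific path that accepts updates and keeps version history with audit info.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class AspectVersion:
    value: Dict[str, Any]  # the aspect payload, e.g. an Ownership record
    actor: str             # audit metadata: who wrote this version
    version: int           # assumption: 0 is the oldest version in this sketch


class InMemoryGms:
    """Hypothetical stand-in for GMS, for illustration only."""

    def __init__(self) -> None:
        # (entity URN, aspect name) -> list of versions, oldest first
        self._history: Dict[Tuple[str, str], List[AspectVersion]] = {}

    # Aspect-specific sub-endpoint (think /datasets/ownership): write path.
    def update_aspect(self, urn: str, aspect: str,
                      value: Dict[str, Any], actor: str) -> AspectVersion:
        versions = self._history.setdefault((urn, aspect), [])
        record = AspectVersion(value=value, actor=actor, version=len(versions))
        versions.append(record)
        return record

    # Aspect-specific sub-endpoint: read a given version, or the latest one.
    def get_aspect(self, urn: str, aspect: str,
                   version: Optional[int] = None) -> AspectVersion:
        versions = self._history[(urn, aspect)]
        return versions[-1] if version is None else versions[version]

    # Top-level endpoint (think /datasets): read-only, latest aspects flattened in.
    def get_entity(self, urn: str) -> Dict[str, Any]:
        entity: Dict[str, Any] = {"urn": urn}
        for (entity_urn, aspect), versions in self._history.items():
            if entity_urn == urn:
                entity[aspect] = versions[-1].value
        return entity


gms = InMemoryGms()
urn = "urn:li:dataset:example"        # made-up URN
admin = "urn:li:corpuser:admin"       # made-up actor
gms.update_aspect(urn, "ownership", {"owners": ["urn:li:corpuser:alice"]}, admin)
gms.update_aspect(urn, "ownership", {"owners": ["urn:li:corpuser:bob"]}, admin)

print(gms.get_entity(urn))                          # latest aspects, flattened
print(gms.get_aspect(urn, "ownership", version=0))  # older version plus audit info
```

In this sketch the top-level view never offers a write method, so every change goes through the aspect path, which is what lets it keep history and audit data per aspect.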
m
Hi, thanks for your swift response. Regarding 1, does that mean that we'll see significant overlap between what is "flattened" into the top-level endpoint and what is available as the "latest aspect" version in the sub-resources? As for 2, I am currently on the master branch. The common ownership is indeed in the model repo. The dataset-specific one exists directly in the API module (https://github.com/linkedin/datahub/blob/master/gms/api/src/main/pegasus/com/linkedin/dataset/Ownership.pdsc). I didn't find any uses of it, though. Either way, your answer has cleared things up a lot. Thanks!
b
1. Yes, the 2 models should be identical. Seems like we still have some legacy code for
/datasets
that we'll migrate off soon. A cleaner example would be https://github.com/linkedin/datahub/blob/master/gms/api/src/main/pegasus/com/linkedin/identity/CorpUser.pdsc#L20 vs https://github.com/linkedin/datahub/blob/master/gms/impl/src/main/java/com/linkedin/identity/rest/resources/CorpUsersEditableInfoResource.java#L17 2. Argh, that's definitely legacy stuff. Will drop those unused models soon.