# ingestion
h
Just checking to make sure I’ve understood this correctly: we can’t create tags via MCEs because the builder is not listed here: https://github.com/linkedin/datahub/blob/master/metadata-dao-impl/restli-dao/src/main/java/com/linkedin/metadata/dao/RequestBuilders.java
@green-football-43791 I threw together this PR: https://github.com/linkedin/datahub/pull/2320 Let me know if I’m missing something here.
For context, the problem I was seeing is an error like this when trying to create a new tag:
java.lang.IllegalArgumentException: com.linkedin.common.urn.TagUrn is not a supported URN type
Also, in the current implementation, ingesting tags on datasets seems to overwrite any tags set through the UI. Am I missing something?
g
Hey Fredrik, the PR looks good 👍 Looks like I had missed the Kafka ingest case. Thanks for catching!
Regarding overwriting, you're correct: at the entity level we currently have only one tag aspect. Given the current state, the approach I'd recommend is fetching the entity's tags from DataHub on writes, appending your new tag to the existing array, and writing back the newly constructed array. That said, there are some possible solutions:
1. Create a second tag aspect, "EditableGlobalTags". Edits from the UI will update this aspect, while edits from ingestion will update GlobalTags. This is the same pattern we used for adding tags to schemas so that tags wouldn't be overwritten any time the schema was re-ingested.
2. Develop a more general solution to the problem of multiple writers. As DataHub develops, I foresee an increase in writes coming from the UI. Ideally we wouldn't need to create an EditableXYZ clone of every aspect. Perhaps the solution here is to add a new flag to our metadata objects indicating whether they are the UI-editable version or not. In addition, if we could come up with a smart, more generic way of merging the editable and non-editable aspects, this would save us even more headache.
3. A third option could be exposing some APIs akin to addTag(urn) and removeTag(urn). This could free any given writer from needing to understand the bigger picture about an entity.
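As a rough illustration of the fetch, append, and write-back workaround, and of what addTag/removeTag-style helpers might look like on top of it, here is a minimal sketch. It uses a plain in-memory dict as a hypothetical stand-in for wherever the tag aspect lives; the names `TagStore`, `merge_tags`, `add_tag`, and `remove_tag` and the example URNs are assumptions for illustration, not DataHub APIs.

```python
from typing import Dict, List

# Hypothetical stand-in for the stored tag aspect: entity URN -> list of tag URNs.
# In a real setup the read and write would go through DataHub instead.
TagStore = Dict[str, List[str]]


def merge_tags(existing: List[str], new: List[str]) -> List[str]:
    """Append new tag URNs to the existing ones, keeping order and skipping duplicates."""
    merged = list(existing)
    for tag in new:
        if tag not in merged:
            merged.append(tag)
    return merged


def add_tag(store: TagStore, entity_urn: str, tag_urn: str) -> None:
    """Read the current tags, append the new one, and write the full array back."""
    current = store.get(entity_urn, [])
    store[entity_urn] = merge_tags(current, [tag_urn])


def remove_tag(store: TagStore, entity_urn: str, tag_urn: str) -> None:
    """Read the current tags, drop the given one, and write the full array back."""
    current = store.get(entity_urn, [])
    store[entity_urn] = [t for t in current if t != tag_urn]


# Example: a tag set through the UI survives a later ingestion write.
store: TagStore = {"urn:li:dataset:example": ["urn:li:tag:ui-tag"]}
add_tag(store, "urn:li:dataset:example", "urn:li:tag:ingested-tag")
remove_tag(store, "urn:li:dataset:example", "urn:li:tag:not-present")
```

Note that a read-modify-write like this is not atomic, so two writers updating the same entity at the same moment could still overwrite each other; server-side addTag/removeTag operations would avoid that.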
h
Thanks for the review. Hopefully we can get it merged soon. Thanks for the explanation! This is by no means a blocker for us at the moment, and the workaround you proposed should work well for us for the time being. Intuitively, the third option feels like the cleanest solution.
c
@ancient-egg-70238
a
Hi everyone, Is there any update on this issue? We also ran into it in v0.8.21. Thanks! https://datahubspace.slack.com/archives/CUMUWQU66/p1617193006052500?thread_ts=1617184836.049200&cid=CUMUWQU66