# getting-started
s
Hello guys, I've set up the Glossary Terms via file ingestion. Then I had to change the names of some of the glossary terms and, instead of re-ingesting the glossaries via the yaml file, I renamed them directly in the UI. However, the URNs kept the old names. How should I correct this? Thanks!
b
The URN is permanent and cannot be changed unless you delete the term, which is why the newer approach is to use a pseudo-random ID as the URN and set a display name. (If you create a term in the UI, the URN is a random UUID.)
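A minimal sketch of what such a URN could look like, using only the Python standard library. The helper name `new_term_urn` is made up for illustration; the `urn:li:glossaryTerm:` prefix is the one used elsewhere in this thread.

```python
import uuid


def new_term_urn() -> str:
    """Build a glossary term URN whose id is a random UUID, not the term name."""
    # uuid4() is random, so the id carries no meaning that could go stale.
    return f"urn:li:glossaryTerm:{uuid.uuid4()}"


term_urn = new_term_urn()
print(term_urn)
```

Because the id says nothing about the term, renaming the term later only touches the display name and the URN stays valid.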
s
And how can I do that? Is there a way of setting up this UUID paradigm? Thank you so much!
b
probably not with the existing business glossary source (which generates the URN from the term's name), but you can create terms programmatically using the emitter and, in doing so, specify a URN for the term
s
Okay, so I would ingest the glossaries using a UUID instead of a name. But to change the UUID into a name in the UI after ingesting, do you recommend that I rename the terms manually?
b
something like
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
from datahub.metadata.schema_classes import (
    ChangeTypeClass,
    GlossaryTermInfoClass,
)

# Assumption: GMS is reachable at this address; adjust for your deployment.
graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080"))

termUrn = "urn:li:glossaryTerm:XXXXX"
# Placeholder: URN of the parent glossary node (use None for a top-level term).
nodeUrn = "urn:li:glossaryNode:XXXXX"

mcp = MetadataChangeProposalWrapper(
    entityType="glossaryTerm",
    changeType=ChangeTypeClass.UPSERT,
    entityUrn=termUrn,
    aspectName="glossaryTermInfo",
    aspect=GlossaryTermInfoClass(
        definition="DESCRIPTION",
        name="NAME HERE",
        parentNode=nodeUrn,
        termSource="INTERNAL",
    ),
)
graph.emit(mcp)
where name is the display name
s
Thank you so much for the accurate answers! Now I understood the idea.
For this approach, is there a way of keeping the definition, or do I have to set it again like this? My idea was to ingest everything via yaml using the UUIDs you suggested, while keeping the same definitions as before. Is there a way of doing so? @better-orange-49102
b
are you saying you are still using the file ingestion method, but instead of having proper term names in the glossary yaml, you assigned UUIDs?
s
Yes! I've ingested all the Glossary Terms via file ingestion, replacing the meaningful names I had before with UUIDs. Then I used your script to change the names of the Glossary Terms. However, I would like to keep the same definitions as before. If I run your example script, it will replace the definition as well. Is there a way to avoid this?
b
something like (refer to examples folder for similar examples)
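from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
from datahub.metadata.schema_classes import (
    ChangeTypeClass,
    GlossaryTermInfoClass,
)

# Assumption: GMS is reachable at this address; adjust for your deployment.
graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080"))

termUrn = "urn:li:glossaryTerm:XXXXX"

# Fetch the existing aspect so the definition and other fields are preserved.
term = graph.get_aspect_v2(
    entity_urn=termUrn,
    aspect="glossaryTermInfo",
    aspect_type=GlossaryTermInfoClass,
)
if term is None:
    raise ValueError(f"No glossaryTermInfo found for {termUrn}")

# Change only the display name.
term.name = "YYYY"

mcp = MetadataChangeProposalWrapper(
    entityType="glossaryTerm",
    changeType=ChangeTypeClass.UPSERT,
    entityUrn=termUrn,
    aspectName="glossaryTermInfo",
    aspect=term,
)
graph.emit(mcp)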
you query GMS for the existing entity using get_aspect_v2 function, then update the object, then emit it back to GMS
s
Nice, you are very well prepared. Bravo. I am still trying to understand the API and everything that can be done with it. But your script works perfectly. Thank you so much!