Hello! Anybody ingested tags via python framework?...
# troubleshoot
h
Hello! Anybody ingested tags via python framework? I'm getting this error:
Caused by: java.net.URISyntaxException: Urn entity type should be 'dataset'.: urn:li:dataset:(urn:li:dataPlatform:exasol,main.dds.h_car,PROD)
The urn is correct for sure; I've used it for other examples, like profiling. Any help would be much appreciated.
emitter = DatahubRestEmitter("http://localhost:8080")

entity_urn = 'urn:li:dataset:(urn:li:dataPlatform:exasol,main.dds.h_car,PROD)'

mce = MetadataChangeEventClass(
    proposedSnapshot=TagSnapshotClass(
        urn = entity_urn,
        aspects=[
            TagKeyClass('test_tag')
        ]
    )
)

emitter.emit_mce(mce)
e
Hey! Are you trying to add a tag to a dataset?
To create a tag, you will have to use a tag urn like "urn:li:tag:Engineering".
Then a second step attaches the tag to the dataset.
Note that the GlobalTags aspect is on the dataset, so you need to create a dataset snapshot with the GlobalTags aspect, where the tags attribute is set to the urns of the tags you are trying to attach.
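A rough sketch of that second step, written as plain dicts in the same JSON MCE shape as the sample tag file further down this thread, so it runs without the datahub package. The DatasetSnapshot and GlobalTags record names and the `{"tag": <urn>}` layout are assumptions modeled on that shape, and `tag_urn` / `attach_tags_mce` are hypothetical helper names; it has not been tested against a live instance.

```python
import json


def tag_urn(name):
    """Build a tag urn, e.g. urn:li:tag:Engineering."""
    return f"urn:li:tag:{name}"


def attach_tags_mce(dataset_urn, tag_names):
    """Build an MCE dict that puts a GlobalTags aspect on a dataset.

    The snapshot urn is the *dataset* urn; the tag urns only appear
    inside the GlobalTags aspect.
    """
    return {
        "auditHeader": None,
        "proposedSnapshot": {
            # Assumed record name, following the TagSnapshot naming pattern.
            "com.linkedin.pegasus2avro.metadata.snapshot.DatasetSnapshot": {
                "urn": dataset_urn,
                "aspects": [
                    {
                        # Assumed record name and field layout.
                        "com.linkedin.pegasus2avro.common.GlobalTags": {
                            "tags": [{"tag": tag_urn(n)} for n in tag_names]
                        }
                    }
                ],
            }
        },
        "proposedDelta": None,
    }


dataset = "urn:li:dataset:(urn:li:dataPlatform:exasol,main.dds.h_car,PROD)"
mce = attach_tags_mce(dataset, ["test_tag"])
print(json.dumps(mce, indent=2))
```

In the Python framework the same structure would be built from the generated snapshot and aspect classes instead of dicts. As far as I understand, emitting an aspect replaces the stored one, so the list should include every tag the dataset is meant to keep.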
h
Ok, I see, my bad 😅 Will try that soon, thank you!
e
No worries! Let us know if you run into any other issues setting this up!
h
Great, it worked! But I've got another question.
Here you can see that tag is attached to dataset:
But if I click on it, it says that it is "applied to no entities":
Is that correct? Is a dataset not an entity?
b
That shouldn't be the case. Clicking on a dataset's tag should show a count of the datasets with that tag, and clicking on the count should return a list of those datasets.
m
@handsome-belgium-11927 there is a small bug in the UI where it searches for datasets attached to a tag using the “name” of the tag as defined in the TagInfo aspect. (Instead of the name as defined in the TagKey)
When creating tags via the UI, tag names are added to the info aspect as well, so this bug doesn’t appear.
To work around this issue for now, you will have to also emit a TagInfo aspect when you are creating the tag (and set the name to exactly the same name as the tag key). It also lets you set a description, which is nice anyway.
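A minimal sketch of that workaround as a plain dict, assuming the "TagInfo" aspect mentioned here is the TagProperties record that appears in the sample tag file later in this thread. The record names are copied from that sample; `create_tag_mce` is a hypothetical helper name, and the description text is a placeholder.

```python
import json


def create_tag_mce(tag_key, description=""):
    """Build an MCE dict that creates a tag whose display name equals its key.

    Using the same string for the urn suffix and for TagProperties.name
    is the whole point of the workaround.
    """
    return {
        "auditHeader": None,
        "proposedSnapshot": {
            "com.linkedin.pegasus2avro.metadata.snapshot.TagSnapshot": {
                "urn": f"urn:li:tag:{tag_key}",
                "aspects": [
                    {
                        "com.linkedin.pegasus2avro.tag.TagProperties": {
                            "name": tag_key,  # must match the tag key exactly
                            "description": description,
                        }
                    }
                ],
            }
        },
        "proposedDelta": None,
    }


tag_mce = create_tag_mce("test_tag", "Placeholder description")
print(json.dumps(tag_mce, indent=2))
```

Deriving both values from one argument avoids the mismatch between key and name that triggers the empty "applied to" list.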
@green-football-43791 /cc
h
Ok, great, I'll try the workaround 👌
Looks like TagInfo is not in the Python ingestion framework yet, so I guess I'll try this workaround a little later 😞
b
Are you referring to creating a fresh tag? There are sample tags to ingest:
{
  "auditHeader": null,
  "proposedSnapshot": {
    "com.linkedin.pegasus2avro.metadata.snapshot.TagSnapshot": {
      "urn": "urn:li:tag:Legacy",
      "aspects": [
        {
          "com.linkedin.pegasus2avro.tag.TagProperties": {
            "name": "Legacy",
            "description": "Indicates the dataset is no longer supported"
          }
        },
        {
          "com.linkedin.pegasus2avro.common.Ownership": {
            "owners": [
              {
                "owner": "urn:li:corpuser:jdoe",
                "type": "DATAOWNER",
                "source": null
              }
            ],
            "lastModified": {
              "time": 1581407189000,
              "actor": "urn:li:corpuser:jdoe",
              "impersonator": null
            }
          }
        }
      ]
    }
  },
  "proposedDelta": null
}
h
I don't know what happened, but now the 'Applied to X Datasets' section is working correctly even without the workaround.
g
Hey @handsome-belgium-11927! That would be because I put up a fix for this last Friday: https://github.com/linkedin/datahub/pull/3223
Name in TagProperties is now ignored, the only name referenced is the name in the tag's urn.
h
That's great, thank you, @green-football-43791! 🙌