# getting-started
l
Hi datahub team, I'm new to datahub and want to store a dataset name that is distinct from the URN. In my use case the URN is a path on S3, but the goal is for the dataset name in datahub to be a nice human-readable name. What I tried so far (in Python):
from datahub.metadata.schema_classes import DatasetPropertiesClass

aspects = [
    DatasetPropertiesClass(
        name=nice_human_readable_name,
        customProperties=properties,
        description=description,
        externalUrl=url,
    ),
]
or alternatively:
aspects = [
    DatasetPropertiesClass(
        qualifiedName=nice_human_readable_name,
        customProperties=properties,
        description=description,
        externalUrl=url,
    ),
]
The MCPs that are generated have the proper format:
connection.submit_change_proposals
[MetadataChangeProposalWrapper(entityType='dataset', changeType='UPSERT', entityUrn='urn:li:dataset:(urn:li:dataPlatform:s3,test_s3_dataset3567c322-fd92-4417-98f0-90a66e32101b,PROD)', entityKeyAspect=None, auditHeader=None, aspectName='ownership', aspect=OwnershipClass({'owners': [OwnerClass({'owner': 'urn:li:corpuser:etl', 'type': 'DATAOWNER', 'source': OwnershipSourceClass({'type': 'SERVICE', 'url': None})})], 'lastModified': AuditStampClass({'time': 1661399154, 'actor': 'urn:li:corpuser:etl', 'impersonator': None, 'message': None})}), systemMetadata=None),
MetadataChangeProposalWrapper(entityType='dataset', changeType='UPSERT', entityUrn='urn:li:dataset:(urn:li:dataPlatform:s3,test_s3_dataset3567c322-fd92-4417-98f0-90a66e32101b,PROD)', entityKeyAspect=None, auditHeader=None, aspectName='datasetProperties', aspect=DatasetPropertiesClass({'customProperties': {'here3567c322-fd92-4417-98f0-90a66e32101b': 'are some fake properties', 'that_are': 'used_for_testing'}, 'externalUrl': None, 'name': 'test_s3_dataset3567c322-fd92-4417-98f0-90a66e32101b', 'qualifiedName': None, 'description': 'This is a fake description of a dataset', 'uri': None, 'tags': []}), systemMetadata=None),
MetadataChangeProposalWrapper(entityType='dataset', changeType='UPSERT', entityUrn='urn:li:dataset:(urn:li:dataPlatform:s3,test_s3_dataset3567c322-fd92-4417-98f0-90a66e32101b,PROD)', entityKeyAspect=None, auditHeader=None, aspectName='institutionalMemory', aspect=InstitutionalMemoryClass({'elements': [InstitutionalMemoryMetadataClass({'url': 'https://www.google.com/', 'description': 'link3567c322-fd92-4417-98f0-90a66e32101b', 'createStamp': AuditStampClass({'time': 1661399154, 'actor': 'urn:li:corpuser:etl', 'impersonator': None, 'message': None})})]}), systemMetadata=None),
MetadataChangeProposalWrapper(entityType='dataset', changeType='UPSERT', entityUrn='urn:li:dataset:(urn:li:dataPlatform:s3,test_s3_dataset3567c322-fd92-4417-98f0-90a66e32101b,PROD)', entityKeyAspect=None, auditHeader=None, aspectName='globalTags', aspect=GlobalTagsClass({'tags': [TagAssociationClass({'tag': 'urn:li:tag:tag13567c322-fd92-4417-98f0-90a66e32101b', 'context': None}), TagAssociationClass({'tag': 'urn:li:tag:tag_23567c322-fd92-4417-98f0-90a66e32101b', 'context': None})]}), systemMetadata=None)]
but then I get this rather cryptic error message (see attached screenshot). Any advice appreciated! Thanks!
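(Editor's sketch, not part of the original message.) The separation the question asks for, a machine identifier in the URN and a readable label stored alongside it, can be illustrated with plain Python. The URN layout below copies the format visible in the dump above. `DatasetIdentity`, `build_dataset_urn`, the bucket path and the label are all hypothetical names made up for illustration. The DataHub SDK has its own helper, `make_dataset_urn` in `datahub.emitter.mce_builder`, which is probably the better choice in real code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetIdentity:
    """Hypothetical holder pairing a URN with a human-readable label."""
    urn: str
    display_name: str


def build_dataset_urn(platform: str, name: str, env: str = "PROD") -> str:
    # Mirrors the URN layout seen in the generated MCPs above.
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


# Assumed example values: the S3 path goes into the URN, while the
# readable label would be passed as `name` on the datasetProperties aspect.
identity = DatasetIdentity(
    urn=build_dataset_urn("s3", "my-bucket/raw/orders"),
    display_name="Raw Orders",
)
print(identity.urn)
print(identity.display_name)
```

The point of the sketch is only that the two values are independent: changing `display_name` never alters the URN, so the readable name can be edited without re-keying the dataset.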
g
This seems like a server-client version mismatch - what versions of the cli and datahub server are you using?
Also, could you try calling MetadataChangeProposalWrapper.validate() to verify that the formatting is correct?
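(Editor's sketch, not part of the original reply.) One way to act on the validation suggestion is to run `validate()` over every proposal before submitting and note which ones fail. The sketch assumes `validate()` returns a truthy value for a well-formed proposal. `find_invalid` and `FakeProposal` are hypothetical stand-ins so the example runs without a DataHub install. In real code the list would hold the `MetadataChangeProposalWrapper` objects shown earlier.

```python
from typing import Iterable, List, Optional, Protocol


class Validatable(Protocol):
    """Anything exposing validate() -> bool (assumed contract)."""
    def validate(self) -> bool: ...


def find_invalid(proposals: Iterable[Validatable]) -> List[int]:
    # Return the positions of proposals whose validate() is falsy.
    return [i for i, p in enumerate(proposals) if not p.validate()]


class FakeProposal:
    """Hypothetical stand-in for an MCP wrapper, for illustration only."""

    def __init__(self, entity_urn: Optional[str], aspect: Optional[dict]):
        self.entity_urn = entity_urn
        self.aspect = aspect

    def validate(self) -> bool:
        # Toy rule: a proposal needs both a URN and an aspect.
        return self.entity_urn is not None and self.aspect is not None


proposals = [
    FakeProposal("urn:li:dataset:(urn:li:dataPlatform:s3,example,PROD)", {"name": "Example"}),
    FakeProposal(None, {"name": "Missing URN"}),
]
bad = find_invalid(proposals)
print(bad)
```

Reporting indices rather than raising on the first failure makes it easier to see whether a single aspect (for example `datasetProperties`) is the one being rejected.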
l
Thanks so much for the tips, I’ll check in a bit and respond here
w
I've made a thread about that in another channel but never got any response: https://datahubspace.slack.com/archives/CV2UVAPPG/p1659528050363019