# integrate-tableau-datahub
b
Hi all, we are trying to load data from our Tableau server into our deployed DataHub, but we are getting this error:
Failed to extract some records due to: source produced an invalid metadata work unit:
The masked recipe and error details are in the reply below. Any help would be much appreciated.
Tableau server version: 2021.2.4
DataHub CLI version: acryl-datahub==0.8.33.1
Recipe:
```yaml
source:
  type: tableau
  config:
    # Coordinates
    connect_uri: ******
    site: ''
    projects: ["******", "******"]

    # Credentials
    username: ******
    password: ******

    # Options
    ingest_tags: True
    ingest_owner: True

sink:
  type: file
  config:
    filename: ./tableau_file.json
```
Error details:
```
Failed to extract some records due to: source produced an invalid metadata work unit: 

MetadataChangeEventClass({'auditHeader': None, 'proposedSnapshot': ChartSnapshotClass({'urn': 'urn:li:chart:(tableau,********)', 'aspects': [ChartInfoClass({'customProperties': {None: ''}, 'externalUrl': '********', 'title': 'Registrations', 'description': '', 'lastModified': ChangeAuditStampsClass({'created': AuditStampClass({'time': 1598422143000, 'actor': 'urn:li:corpuser:********', 'impersonator': None}), 'lastModified': AuditStampClass({'time': 1611067789000, 'actor': 'urn:li:corpuser:********', 'impersonator': None}), 'deleted': None}), 'chartUrl': None, 'inputs': ['urn:li:dataset:(urn:li:dataPlatform:tableau,********,PROD)', 'urn:li:dataset:(urn:li:dataPlatform:tableau,********,PROD)'], 'type': None, 'access': None, 'lastRefreshed': None}), BrowsePathsClass({'paths': ['********']}), OwnershipClass({'owners': [OwnerClass({'owner': 'urn:li:corpuser:********', 'type': 'DATAOWNER', 'source': None})], 'lastModified': AuditStampClass({'time': 0, 'actor': 'urn:li:corpuser:unknown', 'impersonator': None})})]}), 'proposedDelta': None, 'systemMetadata': SystemMetadataClass({'lastObserved': 1651049557282, 'runId': 'tableau-2022_04_27-14_22_01', 'registryName': None, 'registryVersion': None, 'properties': None})})
```
h
I believe this is due to the section `'customProperties': {None: ''}`, which arises when the sheet's `datasourceFields` contains a single entry with a `null` name (odd, but possible in the Tableau Metadata API). We should be able to fix it by adding a check somewhere here to omit such fields. Is there any chance you could confirm this by calling the Tableau Metadata API? Query body:
```graphql
{
  sheets(filter: {id: "guid from chart urn"}) {
    id
    name
    datasourceFields {
      id
      name
      description
    }
  }
}
```
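A minimal sketch of the kind of check suggested above, assuming `customProperties` is built as a mapping from field name to description. The helper name `build_custom_properties` and the input shape are hypothetical and only mirror the query above; this is not the actual DataHub source code.

```python
from typing import Any, Dict, List, Optional


def build_custom_properties(
    datasource_fields: Optional[List[Dict[str, Any]]],
) -> Dict[str, str]:
    """Map field name -> description, omitting fields with no name.

    Hypothetical helper: the real ingestion code may be structured differently.
    """
    properties: Dict[str, str] = {}
    for field in datasource_fields or []:
        name = field.get("name")
        if not name:
            # A None key is what shows up as {None: ''} in the failed work unit.
            continue
        properties[name] = field.get("description") or ""
    return properties


fields = [
    {"id": "a", "name": "Report Date", "description": None},
    {"id": "b", "name": None, "description": None},
]
print(build_custom_properties(fields))  # {'Report Date': ''}
```

Skipping the entry rather than substituting a placeholder key keeps the resulting map valid without inventing a field name.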
b
@hundreds-photographer-13496 Thanks for the reply. I ran the same query against the Metadata API, and no field has a null name:
```json
{
  "data": {
    "sheets": [
      {
        "id": "****",
        "name": "Registrations",
        "datasourceFields": [
          {
            "id": "****",
            "name": "Report Date",
            "description": null
          },
          {
            "id": "****",
            "name": "Reg Complete",
            "description": null
          },
          {
            "id": "****",
            "name": "% Reg Completions",
            "description": null
          },
          {
            "id": "****",
            "name": "Submit Mob",
            "description": null
          }
        ]
      }
    ]
  }
}
```
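For larger responses, a check like this could be automated instead of read by eye. The sketch below assumes the response has already been parsed into a Python dict with the same shape as the JSON above; `find_unnamed_fields` is a hypothetical helper, not part of DataHub or any Tableau client library.

```python
from typing import Any, Dict, List, Optional, Tuple


def find_unnamed_fields(
    response: Dict[str, Any],
) -> List[Tuple[Optional[str], Optional[str]]]:
    """Return (sheet id, field id) pairs for fields whose name is null or empty."""
    unnamed = []
    data = response.get("data") or {}
    for sheet in data.get("sheets") or []:
        for field in sheet.get("datasourceFields") or []:
            if not field.get("name"):
                unnamed.append((sheet.get("id"), field.get("id")))
    return unnamed


# Illustrative response with made-up ids; the second field has a null name.
sample = {
    "data": {
        "sheets": [
            {
                "id": "sheet-1",
                "name": "Registrations",
                "datasourceFields": [
                    {"id": "f1", "name": "Report Date", "description": None},
                    {"id": "f2", "name": None, "description": None},
                ],
            }
        ]
    }
}
print(find_unnamed_fields(sample))  # [('sheet-1', 'f2')]
```

The result may depend on which user runs the query, so it is probably worth running it with the same credentials the ingestion recipe uses.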
One more observation: there are other failed cases where `'customProperties': {None: ''}` is not present, but the browse path contains None:
'aspects': [BrowsePathsClass({'paths': ['/prod/tableau/None/None/None.eb1fa513-25c1-3164-6152-a84523442dfe']})
h
The None in the browse path is fixed in this PR: setting the browse path is skipped if any parent is None. It is yet to be merged. It is quite strange that customProperties would show up this way. Is the Tableau ingestion running as a less privileged user who does not have access to the field names?
Do you see any warnings with NODE_LIMIT_EXCEEDED in the ingestion logs?
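To illustrate the "skip if any parent is None" behavior, here is a minimal sketch. The function name `make_browse_path`, its parameters, and the path layout are assumptions loosely based on the failing path shown earlier; the actual change in the PR may look different.

```python
from typing import Optional


def make_browse_path(
    env: str,
    project: Optional[str],
    workbook: Optional[str],
    name: Optional[str],
) -> Optional[str]:
    """Build a browse path, or return None if any component is missing.

    Hypothetical helper: the caller would skip emitting the browse path
    aspect entirely when None is returned.
    """
    parts = [project, workbook, name]
    if any(part is None for part in parts):
        # Avoids paths like /prod/tableau/None/None/None...
        return None
    return "/" + "/".join([env.lower(), "tableau", *parts])


print(make_browse_path("PROD", "Sales", "Signups", "Registrations"))
# /prod/tableau/Sales/Signups/Registrations
print(make_browse_path("PROD", None, None, "Registrations"))
# None
```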
b
No, I didn't see that warning in the terminal after running the DataHub CLI.
@hundreds-photographer-13496 Thanks, the issue was with the user permissions.
I tried with the admin user and the issue was resolved.
h
Wow, thanks a lot for sharing. This is really helpful. Meanwhile, we will work on fixing the customProperties behavior for less privileged users.
m
@broad-tomato-45373 That's right: you need to run with at least the Site Admin Creator privilege, otherwise the field info will be None.
b
Yes, I only found out about that yesterday. Thanks @modern-artist-55754