# getting-started
e
Hi team, I found that the lineage is not automatically generated between different sources, such as Snowflake and PowerBI. Do we have any option to create the lineage automatically? Thanks
m
Which version of DataHub are you using? I think this has been fixed in more recent versions of DataHub.
e
Hey Steve, I deployed locally on Docker with the quickstart. I tried to find the version in docker-compose 😂
d
Hi, the version should be in the ingestion log. For example, if you're using UI ingestion, you should find something like this:
image.png
m
You probably want to run quickstart with --version "v0.10.3" to pin to a stable version.
e
Thanks guys, I think I am using v0.10.3, but I still cannot get automatic lineage between Snowflake and PBI.
m
Can you show your recipe?
Btw, your PowerBI setup needs to have the Admin API privilege.
e
This is my recipe
m
I think you don't have the proper privilege.
e
image.png
m
Enhanced api?
e
I have enabled all three tabs under the Admin API settings. The first one is set to a specific group and the other two to the entire organization.
m
These?
e
image.png
m
Hm ok
There is no lineage at all? Or just some?
e
I just ingested one report, and there is no lineage between the two data sources.
I got this lineage, but I am expecting lineage from Snowflake.
m
It should have lineage to Snowflake. I am surprised that I don't see the PowerBI dataset.
e
😂
This is the pbi workspace
m
Do you see the DHtest_airport dataset at all in datahub?
e
nah, I am not able to see that in Datahub
And this should be the snowflake table feeding the report
image.png
m
@gentle-hamburger-31302 can you help Joanna?
g
@eager-monitor-4683 Could you please post the output of the commands below?
• datahub get urn --urn <PowerBI Report URN (the report shown in your image)>
• datahub get urn --urn <PowerBI Chart URN (the chart shown in your image)>
• datahub get urn --urn <Snowflake dataset URN (the dataset shown in your image)>
e
image.png
Hey @gentle-hamburger-31302, the result is like above
g
@eager-monitor-4683 I think you first need to fix your terminal. The output should look something like this:
{
  "browsePaths": {
    "paths": [
      "/powerbi/acryl-datahub"
    ]
  },
  "chartInfo": {
    "customProperties": {
      "createdFrom": "Report",
      "datasetId": "83d39dd2-6e41-43fc-84bd-1b28fd3755be",
      "datasetWebUrl": "https://app.powerbi.com/groups/2b7bf245-9398-46ee-a2ae-77034988c1f4/datasets/83d39dd2-6e41-43fc-84bd-1b28fd3755be/details",
      "reportId": "ecb1b959-74e3-437f-a0e8-05a481571dfc"
    },
    "description": "SN DirectQuery Sales Report",
    "externalUrl": "https://app.powerbi.com/groups/2b7bf245-9398-46ee-a2ae-77034988c1f4/reports/ecb1b959-74e3-437f-a0e8-05a481571dfc",
    "inputs": [
      {
        "string": "urn:li:dataset:(urn:li:dataPlatform:powerbi,sn_direct_query_dataset.SALES,PROD)"
      }
    ],
    "lastModified": {
      "created": {
        "actor": "urn:li:corpuser:unknown",
        "time": 0
      },
      "lastModified": {
        "actor": "urn:li:corpuser:unknown",
        "time": 0
      }
    },
    "title": "SN DirectQuery Sales Report"
  },
  "chartKey": {
    "chartId": "powerbi.linkedin.com/charts/23077b0a-f0df-452b-8319-9d37ec6232cf",
    "dashboardTool": "powerbi"
  },
  "container": {
    "container": "urn:li:container:9b7b22e5f74b9bf0cc0690398769b01d"
  },
  "dataPlatformInstance": {
    "platform": "urn:li:dataPlatform:powerbi"
  },
  "status": {
    "removed": false
  }
}
Put double quotes around the URN.
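The quoting advice matters because a DataHub URN contains parentheses, which POSIX shells interpret unless the argument is quoted. A minimal Python sketch, assuming the `datahub get urn --urn` form used earlier in this thread (check `datahub get --help` for the syntax your CLI version accepts). The helper name `build_get_command` is hypothetical; the URN is the PowerBI dataset URN from the sample output above.

```python
import shlex


def build_get_command(urn: str) -> str:
    """Return a shell-safe command line for fetching one URN.

    The subcommand form is copied from this thread, not verified
    against any particular CLI version.
    """
    # shlex.join quotes every argument that contains characters the shell
    # would otherwise interpret, such as "(" and ")".
    return shlex.join(["datahub", "get", "urn", "--urn", urn])


example_urn = (
    "urn:li:dataset:(urn:li:dataPlatform:powerbi,"
    "sn_direct_query_dataset.SALES,PROD)"
)
print(build_get_command(example_urn))
```

`shlex` emits single quotes rather than double quotes. For a URN like this one, both forms protect the parentheses in a POSIX shell.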
e
image.png
image.png
{
  "browsePaths": {
    "paths": [
      "/prod/snowflake/airport_test/public"
    ]
  },
  "container": {
    "container": "urn:li:container:16579316037347bf29c2388ada137705"
  },
  "dataPlatformInstance": {
    "platform": "urn:li:dataPlatform:snowflake"
  },
  "datasetKey": {
    "name": "airport_test.public.airport_table",
    "origin": "PROD",
    "platform": "urn:li:dataPlatform:snowflake"
  },
  "datasetProperties": {
    "created": {
      "time": 1665982323479
    },
    "customProperties": {},
    "externalUrl": "https://app.snowflake.com/ap-southeast-2/servian/#/data/databases/AIRPORT_TEST/schemas/PUBLIC/table/AIRPORT_TABLE/",
    "lastModified": {
      "time": 1665982934191
    },
    "name": "AIRPORT_TABLE",
    "qualifiedName": "airport_test.public.airport_table",
    "tags": []
  },
  "schemaMetadata": {
    "created": {
      "actor": "urn:li:corpuser:unknown",
      "time": 0
    },
    "fields": [
      {
        "fieldPath": "airport_code",
        "isPartOfKey": false,
        "nativeDataType": "NUMBER(38,0)",
        "nullable": true,
        "recursive": false,
        "type": {
          "type": {
            "com.linkedin.schema.NumberType": {}
          }
        }
      },
      {
        "fieldPath": "airport_description",
        "isPartOfKey": false,
        "nativeDataType": "VARCHAR(16777216)",
        "nullable": true,
        "recursive": false,
        "type": {
          "type": {
            "com.linkedin.schema.StringType": {}
          }
        }
      }
    ],
    "hash": "",
    "lastModified": {
      "actor": "urn:li:corpuser:unknown",
      "time": 0
    },
    "platform": "urn:li:dataPlatform:snowflake",
    "platformSchema": {
      "com.linkedin.schema.MySqlDDL": {
        "tableSchema": ""
      }
    },
    "schemaName": "airport_test.public.airport_table",
    "version": 0
  },
  "status": {
    "removed": false
  },
  "subTypes": {
    "typeNames": [
      "Table"
    ]
  }
}
The last one is from the Snowflake URN.
g
@eager-monitor-4683 The chart inputs are empty; it looks like the dataset is not coming back in the request. Could you please share the debug log? You can generate it as follows:
datahub --debug ingest run -c "<recipe path>" &> /tmp/powerbi-ingest.log
Post /tmp/powerbi-ingest.log as an attachment.
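The empty-inputs observation can be checked mechanically once the JSON printed for a chart URN is saved. A minimal Python sketch, assuming the output has the same shape as the chart sample earlier in this thread (a `chartInfo` aspect whose `inputs` entries look like `{"string": "<urn>"}`). The function name `chart_input_urns` is hypothetical.

```python
import json


def chart_input_urns(get_output: str) -> list:
    """Extract upstream dataset URNs from the JSON printed for a chart URN."""
    aspects = json.loads(get_output)
    inputs = aspects.get("chartInfo", {}).get("inputs", [])
    # Each entry in the sample output is a one-key object: {"string": "<urn>"}.
    return [entry["string"] for entry in inputs if "string" in entry]


sample = """
{
  "chartInfo": {
    "inputs": [
      {"string": "urn:li:dataset:(urn:li:dataPlatform:powerbi,sn_direct_query_dataset.SALES,PROD)"}
    ],
    "title": "SN DirectQuery Sales Report"
  }
}
"""

urns = chart_input_urns(sample)
if urns:
    print("chart inputs:", urns)
else:
    print("chart has no inputs, so there is no dataset to draw lineage to")
```

An empty list here matches the symptom in this thread: with no PowerBI dataset attached to the chart, there is nothing to connect onward to Snowflake.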
e
Hey, sorry for the late reply. Please refer to the attachment below for the log.
m
Can you run
pip install 'acryl-datahub[powerbi]'
and rerun the ingestion? Your run has an error.
e
Yeah, I did run that, but I seem to be getting the same error.
powerbi-ingest.log
should I restart local docker and try?
m
Where do you run datahub ingest?
e
In the UI
m
Where do you run the pip install?
e
ummm, I run it in my local command line
I just tried again and got this log, and I noticed this: [2023-06-09 05:39:21,023] WARNING {datahub.ingestion.source.powerbi.rest_api_wrapper.powerbi_api:246} - Dataset lineage can not be ingestion because this user does not have access to the PowerBI Admin API. I checked the settings, and this PBI user is set as the owner of the service principal.
One thing: this user is on a trial Pro license in PBI. Could that be the cause of the access issue?
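Warnings like the one quoted above are easy to miss in a long debug log. A minimal Python sketch that pulls out WARNING and ERROR lines, assuming each line follows the `[timestamp] LEVEL {logger:line} - message` layout of the line quoted in this thread. The timestamp in the sample is written with colons as an assumption; the pattern accepts any text between the brackets. `find_problems` is a hypothetical helper name.

```python
import re

# Layout assumed from the single log line quoted in this thread.
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z]+)\s+"
    r"\{(?P<logger>[^}]+)\}\s+-\s+(?P<message>.*)$"
)


def find_problems(log_text: str, levels=("WARNING", "ERROR")) -> list:
    """Return the parsed fields of every log line at one of the given levels."""
    problems = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in levels:
            problems.append(match.groupdict())
    return problems


sample_log = (
    "[2023-06-09 05:39:21,023] WARNING "
    "{datahub.ingestion.source.powerbi.rest_api_wrapper.powerbi_api:246} - "
    "Dataset lineage can not be ingestion because this user does not have "
    "access to the PowerBI Admin API.\n"
)

for problem in find_problems(sample_log):
    print(problem["level"], problem["logger"], problem["message"])
```

To run it against a real log, read the file first, for example `find_problems(open("/tmp/powerbi-ingest.log").read())`.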
g
Hi @eager-monitor-4683, please check the PowerBI quick ingestion guide (https://datahubproject.io/docs/quick-ingestion-guides/powerbi/overview). The steps in its setup guide section need to be followed.
e
I did check this page and followed the steps. I just reviewed all the settings and they are all the same. The log still says: Dataset lineage can not be ingestion because this user does not have access to the PowerBI Admin API.
g
The configured client credential, i.e. the client ID and client secret, doesn't have access.
One possible reason is a misconfiguration, e.g. the Azure AD App that was added to the security group as a member is different from the one configured in the recipe.
Could you please double-check the chain Azure AD App -> Security Group -> Security Group added in PowerBI?
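One way to test the client credential independently of DataHub is to request a token with it and make a single Admin API read. A minimal standard-library Python sketch under several assumptions: the app uses the Azure AD client-credentials flow, the `admin/groups` endpoint is among those your tenant setting permits for service principals, and the helper names are hypothetical. It is not necessarily the call DataHub itself makes.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

POWERBI_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_token_request(tenant_id, client_id, client_secret):
    """Build the client-credentials token request for the Azure AD app."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": POWERBI_SCOPE,
    }).encode("ascii")
    return urllib.request.Request(url, data=form, method="POST")


def build_admin_probe(access_token):
    """Build a small Admin API read: list one workspace as admin."""
    url = "https://api.powerbi.com/v1.0/myorg/admin/groups?$top=1"
    headers = {"Authorization": f"Bearer {access_token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def probe_admin_access(tenant_id, client_id, client_secret):
    """Return the HTTP status of the Admin API probe (needs network access)."""
    token_request = build_token_request(tenant_id, client_id, client_secret)
    with urllib.request.urlopen(token_request) as resp:
        token = json.load(resp)["access_token"]
    try:
        with urllib.request.urlopen(build_admin_probe(token)) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # A 401 or 403 here suggests this app lacks Admin API access.
        return err.code
```

Calling `probe_admin_access(...)` with the same tenant ID, client ID, and client secret as in the recipe shows whether that exact app is the one the security group grants access to.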
e
Is it ok to jump on a quick call?
This is what i have in the security group
g
Let me call you on huddle
e
Thanks
Thank you so much for helping me solve the issue @gentle-hamburger-31302 @modern-artist-55754