# integrate-tableau-datahub
b
Hi guys, I ingested metadata from Snowflake and Tableau, running on most recent cli (0.10.5). What is already amazing is that lineage between Snowflake tables/views and Tableau data source objects was picked up. However, there is no column level lineage between Snowflake and Tableau. Is this expected behavior? I'm asking because one of the config options is:
extract_column_level_lineage (boolean, default: true):
When enabled, extracts column-level lineage from Tableau Datasources
Does this mean lineage between tableau data source and tableau chart, and not between external table/view and tableau data source? If so, are there any future plans to add column lineage between Snowflake and Tableau?
m
I believe that is the case. Only column lineage between tableau ds and tableau chart.
h
I believe column-level lineage is automatically extracted between the Tableau datasource and the Snowflake dataset as well. For custom SQL datasources, this feature was recently added in 0.10.5.
b
I ran ingestion using CLI 0.10.5.4 and it didn’t work, did anyone else have success with this?
h
Oh, that's strange indeed. For non-custom SQL datasources, DataHub in fact relies on Tableau GraphQL (datasource->fields->upstreamFields). I wonder if this is due to missing information in the Tableau API or due to the schema field urn casing issue. Can you retrieve the upstreamLineage aspect of the concerned Tableau datasource, using its urn, to confirm whether fineGrainedLineages are present inside it? This should help with how to retrieve the aspect: https://datahubproject.io/docs/api/restli/restli-overview/#retrieving-entity-aspects cc: @famous-waitress-64616
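For anyone following along, here is a minimal sketch of building the request URL for that aspect lookup. It assumes the Rest.li endpoint shape described in the linked docs (`/aspects/<url-encoded urn>?aspect=<name>&version=0`) and a GMS reachable at `http://localhost:8080`; both the base URL and the helper name `aspect_url` are assumptions for illustration, so adjust them to your deployment.

```python
from urllib.parse import quote


def aspect_url(gms_base: str, entity_urn: str, aspect: str = "upstreamLineage") -> str:
    """Build a URL for fetching the latest version (version=0) of one aspect.

    Assumption: the GMS exposes /aspects/<encoded urn>?aspect=<name>&version=0,
    as described in the Rest.li overview linked above.
    """
    # Percent-encode the whole urn, including ':', ',', '(' and ')'.
    encoded = quote(entity_urn, safe="")
    return f"{gms_base.rstrip('/')}/aspects/{encoded}?aspect={aspect}&version=0"


# Tableau datasource urn taken from this thread; the base URL is hypothetical.
urn = "urn:li:dataset:(urn:li:dataPlatform:tableau,c93c3d8f-f630-ee5e-3c2a-243f50f96671,PROD)"
print(aspect_url("http://localhost:8080", urn))
```

The printed URL can then be fetched with curl or any HTTP client, adding an auth header if your instance requires one.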
b
I ran the graphql query, here is the response. According to it, lineage should exist?
"fineGrainedLineages":
                [
                    {
                        "downstreamType": "FIELD",
                        "downstreams":
                        [
                            "urn:li:schemaField:(urn:li:dataset:(urn:li:dataPlatform:tableau,c93c3d8f-f630-ee5e-3c2a-243f50f96671,PROD),Fav Game)"
                        ],
                        "confidenceScore": 1.0,
                        "upstreamType": "FIELD_SET",
                        "upstreams":
                        [
                            "urn:li:schemaField:(urn:li:dataset:(urn:li:dataPlatform:snowflake,db.schema.view_name,PROD),FAV_GAME)"
                        ]
                    },
...
There are more columns; this is just the first one. But it looks like the info is there. Why is it not showing in the UI?
@hundreds-photographer-13496 Is this a bug, what do you think?
h
I'm not sure yet. This is definitely unexpected. To know whether this is a bug or a SQL parsing limitation, could you please answer the questions below?
1. Is the concerned datasource / downstream using custom SQL, or is it a simple datasource that Tableau can figure out lineage for?
2. Have you set extract_lineage_from_unsupported_custom_sql_queries: True in your Tableau recipe? (Not saying that you should set it, just figuring out what happened in your case.)
3. Your schema field urn looks like "urn:li:schemaField:(urn:li:dataset:(urn:li:dataPlatform:snowflake,db.schema.view_name,PROD),FAV_GAME)". Is the corresponding Snowflake dataset with urn urn:li:dataset:(urn:li:dataPlatform:snowflake,db.schema.view_name,PROD) already ingested in your DataHub instance? You can find out by opening this urn on your DataHub: https://<base url>/dataset/<dataset-urn>
4. If the answer to 3 is yes, does the dataset already have the FAV_GAME column? Does it show up in the UI in uppercase or in lowercase?
b
@hundreds-photographer-13496
1. The Tableau side (downstream of the Snowflake view) is a 'simple datasource': it's a published data source, no custom SQL.
2. I didn't set that in the recipe, so the default value was used.
3. The Snowflake dataset is already ingested; I can open it by copy-pasting the urn into a url like the one you sent.
4. It has the column, however it shows as lowercase in the UI.
Do you think the case of the column name is the issue here?
m
I think that is a possibility; I think different casing means a different urn. Not sure whether this behavior has changed compared to the old one.
b
I just checked via graphql directly on the SF table and there in the fine grained lineage (from another upstream SF table into it), the lower case column 'fav_game' is shown
In general, all columns ingested from Snowflake seem to be lower case, why is this not the behavior that tableau looks for when creating the bridge lineage?
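To make the suspected mismatch concrete, here is a rough sketch that takes the upstream schemaField urns from a fineGrainedLineages entry and flags any column that matches an ingested Snowflake column only when case is ignored. The helper names `split_schema_field_urn` and `casing_mismatches` are made up for illustration and are not part of DataHub; the list of ingested columns is assumed to have been copied by hand from the dataset's schema.

```python
import re

# Matches urn:li:schemaField:(<dataset urn>,<field path>)
_SCHEMA_FIELD = re.compile(r"^urn:li:schemaField:\((urn:li:dataset:\(.+\)),(.+)\)$")


def split_schema_field_urn(urn):
    """Split a schemaField urn into (dataset_urn, field_path)."""
    match = _SCHEMA_FIELD.match(urn)
    if match is None:
        raise ValueError(f"not a schemaField urn: {urn}")
    return match.group(1), match.group(2)


def casing_mismatches(upstream_field_urns, ingested_columns):
    """Return (referenced, ingested) pairs that differ only by letter case."""
    by_lower = {col.lower(): col for col in ingested_columns}
    mismatches = []
    for urn in upstream_field_urns:
        _, field = split_schema_field_urn(urn)
        actual = by_lower.get(field.lower())
        if actual is not None and actual != field:
            mismatches.append((field, actual))
    return mismatches


# Upstream urn from the response above; column list as seen on the Snowflake dataset.
upstreams = [
    "urn:li:schemaField:(urn:li:dataset:(urn:li:dataPlatform:snowflake,db.schema.view_name,PROD),FAV_GAME)"
]
print(casing_mismatches(upstreams, ["fav_game"]))  # [('FAV_GAME', 'fav_game')]
```

A non-empty result means the lineage edge points at a field urn that differs from the ingested one only by case, which would fit the explanation that different casing means a different urn.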
f
cc @gray-shoe-75895
g