# integrate-tableau-datahub
  • magnificent-lock-58916 (02/14/2023, 10:46 AM)
    Hello, is there any way to automatically build lineage between Tableau Custom SQL and the database tables (ClickHouse, for example) that are the source for that Custom SQL? I can see that DataHub automatically builds lineage between ClickHouse views and tables/dictionaries, presumably based on the view's definition. So I wonder whether Tableau Custom SQL lineage could be built automatically too, using its definition.
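    For what it's worth, newer versions of the Tableau source expose a flag aimed at exactly this; a minimal recipe sketch (the connect_uri and token values are placeholders, and the flag also appears in recipes later in this thread):

    ```yaml
    # Sketch: ask the Tableau source to parse Custom SQL definitions and
    # emit lineage to the upstream warehouse tables where it can.
    source:
        type: tableau
        config:
            connect_uri: 'https://tableau.example.com'  # placeholder
            token_name: datahub
            token_value: '${TABLEAU_TOKEN}'             # placeholder
            extract_lineage_from_unsupported_custom_sql_queries: true
            extract_column_level_lineage: true
    sink:
        type: datahub-rest
        config:
            server: 'http://localhost:8080'
    ```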
  • lively-jackal-83760 (02/22/2023, 9:59 AM)
    Hi guys, we started using Tableau ingestion and have a question. Server and client version: 0.10.0. I set ingest_tables_external=True and see what I expected: Dashboard -> Chart -> Embedded DS -> Published DS -> Vertica tables. Each Vertica table looks good, with the correct URN, like urn:li:dataset:(urn:li:dataPlatform:vertica,myDb.mySchema.myTable,PROD), but when I open the dataset details I expect to see this table in Datasets under the vertica platform. Instead I see it under prod -> tableau -> Default -> mySchema. I guess it is because of this code in tableau.py:

    ```python
    table_path = None
    if project and datasource_name:
        table_path = (
            f"{project.replace('/', REPLACE_SLASH_CHAR)}/{datasource_name}"
        )
    ```

    Is it possible to change this behaviour somehow? I want to ingest the Vertica source as usual, then ingest Tableau and have links made between the ingested Vertica tables and Tableau charts.
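    If the goal is just to stitch an already-ingested Vertica table to a Tableau dataset without waiting for a source change, here is a minimal sketch with the DataHub Python emitter; the URNs are illustrative placeholders, not connector output:

    ```python
    # Sketch: manually emit lineage from an ingested Vertica table to a
    # Tableau datasource. The datasource id below is a placeholder.
    from datahub.emitter.mce_builder import make_dataset_urn, make_lineage_mce
    from datahub.emitter.rest_emitter import DatahubRestEmitter

    vertica_table = make_dataset_urn("vertica", "myDb.mySchema.myTable", "PROD")
    tableau_datasource = (
        "urn:li:dataset:(urn:li:dataPlatform:tableau,<datasource-id>,PROD)"
    )

    emitter = DatahubRestEmitter("http://localhost:8080")
    emitter.emit(make_lineage_mce([vertica_table], tableau_datasource))
    ```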
  • orange-intern-2172 (03/13/2023, 10:03 AM)
    Does anyone here know how to stop the Tableau tokens from expiring? I need something that is permanent... I'm using the Personal access token under my profile.
  • acoustic-quill-54426 (03/13/2023, 4:11 PM)
    Cross-posting for visibility. We would be happy to go over it with you if needed. For context, this affects 552 of 1141 Custom SQL tables at my company. We have a script that uses the Tableau Metadata API to fetch the queries, parses them, and emits the lineage using a DataHub mutation. It would be great if we could have this integrated with the ingestion!
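    A rough sketch of that kind of script, assuming the Tableau Metadata API via tableauserverclient, the sqllineage parser, and the DataHub REST emitter; the GraphQL query shape, credentials, and URN construction are simplified placeholders:

    ```python
    # Sketch: fetch Custom SQL from the Tableau Metadata API, parse out the
    # source tables, and emit lineage to DataHub. Names are placeholders.
    import tableauserverclient as TSC
    from sqllineage.runner import LineageRunner
    from datahub.emitter.mce_builder import make_dataset_urn, make_lineage_mce
    from datahub.emitter.rest_emitter import DatahubRestEmitter

    QUERY = "{ customSQLTablesConnection { nodes { id query } } }"

    server = TSC.Server("https://tableau.example.com", use_server_version=True)
    auth = TSC.PersonalAccessTokenAuth("datahub", "<token-value>")
    emitter = DatahubRestEmitter("http://localhost:8080")

    with server.auth.sign_in(auth):
        result = server.metadata.query(QUERY)

    for node in result["data"]["customSQLTablesConnection"]["nodes"]:
        upstreams = [
            make_dataset_urn("snowflake", str(table).lower(), "PROD")
            for table in LineageRunner(node["query"]).source_tables()
        ]
        downstream = make_dataset_urn("tableau", node["id"], "PROD")
        emitter.emit(make_lineage_mce(upstreams, downstream))
    ```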
  • acoustic-quill-54426 (03/13/2023, 5:22 PM)
    I think we introduced a breaking change here:

    ```
    Validation error of type FieldUndefined: Field 'projectLuid' in type 'Workbook' is undefined @ 'workbooksConnection/nodes/projectLuid'
    ```

    It affects all versions prior to 2022.3: https://help.tableau.com/current/api/metadata_api/en-us/docs/meta_api_release_notes.html
  • lively-jackal-83760 (05/24/2023, 11:11 AM)
    Hi guys, I noticed some weird behavior. I used the Tableau source with the stateful_ingestion option. The first time it was fine; all the expected entities were created. The second time, my Tableau server was down and DataHub's client couldn't connect. But the ingestion didn't stop: it logged an HTTP connection error and then carried on with the stateful_ingestion magic. With no connection there is no data at all, so the client decided to remove all my Tableau data from the first run 🙂 I guess this isn't expected behavior. What do you think? Or is there an option to prevent this?
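    One guard worth checking here: stateful ingestion has a fail-safe that aborts stale-entity deletion when too large a share of previously seen entities would disappear in one run. A recipe fragment, assuming the fail_safe_threshold option in recent versions (the value is illustrative):

    ```yaml
    source:
        type: tableau
        config:
            stateful_ingestion:
                enabled: true
                remove_stale_metadata: true
                # Abort stale-entity removal if more than this percentage of
                # previously seen entities would be deleted in a single run.
                fail_safe_threshold: 20.0
    ```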
  • happy-belgium-57206 (06/06/2023, 2:08 PM)
    Hi team, I have Tableau metadata available on the file system in CSV format. Is there an API that will consume this metadata and update the DataHub backend storage (MySQL, ES, Kafka)? Or any other approach to achieve this goal? Thanks, Dharmendra
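    There is no CSV-specific Tableau API, but the Python emitter can push aspects built from any file straight to GMS, which then updates MySQL/ES/Kafka itself. A minimal sketch; the CSV columns and URN scheme are illustrative assumptions:

    ```python
    # Sketch: read dataset metadata from a CSV and emit it to DataHub GMS.
    import csv

    from datahub.emitter.mce_builder import make_dataset_urn
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import DatasetPropertiesClass

    emitter = DatahubRestEmitter("http://localhost:8080")

    with open("tableau_metadata.csv") as f:
        # Assumed columns: name, description
        for row in csv.DictReader(f):
            emitter.emit(
                MetadataChangeProposalWrapper(
                    entityUrn=make_dataset_urn("tableau", row["name"], "PROD"),
                    aspect=DatasetPropertiesClass(
                        name=row["name"],
                        description=row.get("description"),
                    ),
                )
            )
    ```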
  • miniature-painter-94073 (07/10/2023, 1:10 PM)
    Hi all, quick question: I'm super new to DataHub, just hosting locally on Docker at the moment. I've set up some Tableau ingestions via the GUI; can you programmatically ingest every site on your server with code in the backend?
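    One way to cover every site without clicking through the UI is to drive ingestion from Python: list the sites with tableauserverclient, then run one pipeline per site. A sketch, with credentials and URLs as placeholders:

    ```python
    # Sketch: enumerate all Tableau sites and run a DataHub pipeline for each.
    import tableauserverclient as TSC
    from datahub.ingestion.run.pipeline import Pipeline

    SERVER = "https://tableau.example.com"  # placeholder

    server = TSC.Server(SERVER, use_server_version=True)
    with server.auth.sign_in(TSC.TableauAuth("user", "password")):
        sites = list(TSC.Pager(server.sites))

    for site in sites:
        pipeline = Pipeline.create(
            {
                "source": {
                    "type": "tableau",
                    "config": {
                        "connect_uri": SERVER,
                        "site": site.content_url,
                        "username": "user",
                        "password": "password",
                    },
                },
                "sink": {
                    "type": "datahub-rest",
                    "config": {"server": "http://localhost:8080"},
                },
            }
        )
        pipeline.run()
        pipeline.raise_from_status()
    ```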
  • numerous-address-22061 (07/13/2023, 12:50 AM)
    I ran our Tableau ingestion and the entities are in and look good; however, it seems that none of the Embedded Data Sources were linked upstream to their Snowflake tables. Is this not an out-of-the-box feature of the Tableau ingestion? Almost every dashboard connects to Snowflake tables and we are hoping to see the lineage.
  • numerous-address-22061 (07/15/2023, 12:32 AM)
    ^ Seems to be close to working; however, I now notice that a bunch of datasets were generated by the Tableau ingestion. These seem to be parsed from the Custom SQL, but the databases are off: because the Custom SQL doesn't use a fully qualified table name (e.g. select * from `analytics.table1`;), DataHub generates a dataset called `analytics.analytics.table1` and can't figure out how to connect that Custom SQL to the actual Snowflake table, which is `database1.analytics.table1`. How can I help it along here? I'd rather it not map upstream at all than guess and generate a dataset that just sits in DataHub. Ideally I'd get it to map back to the actual Snowflake table (which is already in DataHub).
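    One knob that may help with the guessed database name is the source's lineage override map, which rewrites database names before the upstream URNs are built. A fragment, assuming the lineage_overrides option:

    ```yaml
    source:
        type: tableau
        config:
            # Rewrite the database guessed from unqualified Custom SQL
            # ("analytics") to the real Snowflake database ("database1").
            lineage_overrides:
                database_override_map:
                    analytics: database1
    ```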
  • fast-xylophone-28117 (08/02/2023, 7:28 PM)
    Hello everyone, we are trying to ingest from an on-premise Tableau Server. We made sure the user has all the permissions specified in the Tableau prerequisites here: https://datahubproject.io/docs/quick-ingestion-guides/tableau/setup, but we are still not able to ingest metadata using a UI-based ingestion recipe/scheduler. We keep getting the error below. We know for sure the username and password are correct; does anyone have an idea how to resolve this SSL certificate verification failure?
    ```
    "tableau-login": [
          "Unable to login (check your Tableau connection and credentials): HTTPSConnectionPool(host='172.25.160.82', port=443): Max retries exceeded with url: /api/2.4/auth/signin (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:997)')))"
    ```
    With the same user, we can log in from the UI fine. We also confirmed the firewall is not blocking anything; curl to the Tableau server IP address works fine from the DataHub actions pod. On the other hand, when I use the exact same recipe on the backend and run CLI-based ingestion manually, I get past this error and hit something else (some charts, tags, and projects were ingested, while some failed with this error):
    ```
    {
      "error": "Unable to emit metadata to DataHub GMS: java.lang.RuntimeException: Unknown aspect browsePathsV2 for entity container",
      "info": {
        "exceptionClass": "com.linkedin.restli.server.RestLiServiceException",
        "message": "java.lang.RuntimeException: Unknown aspect browsePathsV2 for entity container",
        "status": 500,
        "id": "urn:li:container:00eafb6262a384f1fc4e9582f576ba3d"
      }
    }
    ```
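    For the self-signed certificate itself, the Tableau source accepts an ssl_verify setting that can point at an internal CA bundle (or, less safely, disable verification). A fragment, assuming that option is present in your version; the path is a placeholder:

    ```yaml
    source:
        type: tableau
        config:
            connect_uri: 'https://172.25.160.82'
            # Preferred: trust your internal CA bundle.
            ssl_verify: '/path/to/internal-ca-bundle.pem'
            # Last resort: disable certificate verification.
            # ssl_verify: false
    ```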
  • numerous-address-22061 (08/09/2023, 4:10 PM)
    Hello, every couple of days or so I see that the Tableau ingestion has failed with this error. It seems pretty random; has anyone seen this before?
  • numerous-address-22061 (08/17/2023, 5:13 PM)
    I hate to paste this big a trace into the chat, but our Tableau ingestion has been failing for the last week, since we upgraded... has anyone seen a similar error?
    ```
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO - [2023-08-17, 05:06:24 PDT] ERROR    {datahub.entrypoints:199} - Command failed: 'NoneType' object has no attribute 'get'
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO - Traceback (most recent call last):
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/entrypoints.py", line 186, in main
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     sys.exit(datahub(standalone_mode=False, **kwargs))
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     return self.main(*args, **kwargs)
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     rv = self.invoke(ctx)
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     return _process_result(sub_ctx.command.invoke(sub_ctx))
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     return _process_result(sub_ctx.command.invoke(sub_ctx))
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     return ctx.invoke(self.callback, **ctx.params)
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     return __callback(*args, **kwargs)
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     return f(get_current_context(), *args, **kwargs)
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 448, in wrapper
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     raise e
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/telemetry/telemetry.py", line 397, in wrapper
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     res = func(*args, **kwargs)
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/utilities/memory_leak_detector.py", line 95, in wrapper
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     return func(ctx, *args, **kwargs)
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 198, in run
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     ret = loop.run_until_complete(run_ingestion_and_check_upgrade())
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     return future.result()
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 182, in run_ingestion_and_check_upgrade
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     ret = await ingestion_future
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 140, in run_pipeline_to_completion
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     raise e
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/cli/ingest_cli.py", line 132, in run_pipeline_to_completion
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     pipeline.run()
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/run/pipeline.py", line 367, in run
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     for wu in itertools.islice(
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 119, in auto_stale_entity_removal
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     for wu in stream:
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 143, in auto_workunit_reporter
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     for wu in stream:
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 208, in auto_browse_path_v2
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     for urn, batch in _batch_workunits_by_urn(stream):
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 346, in _batch_workunits_by_urn
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     for wu in stream:
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 156, in auto_materialize_referenced_tags
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     for wu in stream:
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/api/source_helpers.py", line 70, in auto_status_aspect
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     for wu in stream:
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 2590, in get_workunits_internal
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     yield from self.emit_sheets()
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 2028, in emit_sheets
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     yield from self.emit_sheets_as_charts(
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 2107, in emit_sheets_as_charts
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     project_luid: Optional[str] = self._get_workbook_project_luid(workbook)
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -   File "/usr/local/lib/python3.10/site-packages/datahub/ingestion/source/tableau.py", line 1438, in _get_workbook_project_luid
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO -     if wb.get(tableau_constant.LUID) and self.workbook_project_map.get(
    [2023-08-17, 05:06:24 PDT] {{pod_manager.py:235}} INFO - AttributeError: 'NoneType' object has no attribute 'get'
    ```
  • numerous-address-22061 (08/17/2023, 5:14 PM)
    Related to `workbook_project_map` not being set to a value?
  • brainy-musician-50192 (08/22/2023, 8:54 AM)
    Hi guys, I ingested metadata from Snowflake and Tableau, running on the most recent CLI (0.10.5). What is already amazing is that lineage between Snowflake tables/views and Tableau data source objects was picked up. However, there is no column-level lineage between Snowflake and Tableau. Is this expected behavior? I'm asking because one of the config options is:

    ```
    extract_column_level_lineage (boolean, default: true):
    When enabled, extracts column-level lineage from Tableau Datasources
    ```

    Does this mean lineage between a Tableau data source and a Tableau chart, rather than between an external table/view and a Tableau data source? If so, are there any future plans to add column lineage between Snowflake and Tableau?
  • strong-author-11562 (08/31/2023, 9:45 PM)
    Hi, I am getting this error when trying to ingest metadata from Tableau (attached):
    log.txt
  • worried-solstice-95319 (09/14/2023, 1:07 AM)
    Has anyone run into this error and found a solution? I've checked that the proper credentials are flowing through, but I'm still running into it.
  • magnificent-lock-58916 (09/19/2023, 10:33 AM)
    Hello! Can you help me figure out how to refer to Tableau entities in Airflow inlets/outlets code? So far, for other platforms, we just used names or schema+name, and it worked fine, because the schema+name combination is always unique. But in Tableau, for example, we can have many datasources with the same name. So what should we write instead? The datasource URN? Or are there other ways? If the URN is the only option, I have two additional questions:
    1. Will an entity's URN remain the same even after renaming the entity inside Tableau, if we have stateful ingestion on?
    2. Is there an easier way to find an entity's URN? If I know which datasource an Airflow task sends data to, searching for that entity in DataHub and copy-pasting its URN just doesn't feel right. And if we ever need to delete everything and ingest again from a clean state, that would likely produce different URNs and we would need to fix the Airflow pipeline outlets again.
    UPD: we tried using the Tableau project ID, which would be IDEAL, but it doesn't work.
    UPD2: decided that this question belongs in the troubleshoot channel: https://datahubspace.slack.com/archives/C029A3M079U/p1695201260343699
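    For the inlets/outlets part, a sketch of pinning a Tableau datasource by URN in a DAG, assuming the datahub_provider Airflow integration; the datasource id is a placeholder. Since the Tableau source names datasets after the datasource's stable id rather than its display name, the URN should survive renames inside Tableau:

    ```python
    # Sketch: declare a Tableau datasource as an Airflow task outlet by URN.
    from airflow.operators.bash import BashOperator
    from datahub_provider.entities import Dataset

    refresh = BashOperator(
        task_id="refresh_extract",
        bash_command="echo refresh",
        outlets=[
            # <datasource-id> is a placeholder for the Tableau datasource id.
            Dataset(platform="tableau", name="<datasource-id>", env="PROD")
        ],
    )
    ```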
  • melodic-account-56198 (09/21/2023, 4:34 AM)
    Hi, for Tableau ingestion, is the Site Administrator Explorer role necessary? We are a big organization and we cannot have access to all the Tableau data in the reports; we just need the metadata for specific organizations. Is there a workaround, or can we modify the native plugin?
  • quiet-kangaroo-60946 (10/04/2023, 5:20 PM)
    Hello! We are trying to find the best way to load external descriptions for dashboards without using the UI. Has anyone tried anything that has worked well for them?
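    One approach that avoids the UI is writing the editable-description aspect directly over REST, which is the same aspect the UI edits. A sketch, assuming the EditableDashboardPropertiesClass aspect in a recent acryl-datahub; the URN and text are placeholders:

    ```python
    # Sketch: set a dashboard's editable description without touching the UI.
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import EditableDashboardPropertiesClass

    dashboard_urn = "urn:li:dashboard:(tableau,<dashboard-id>)"  # placeholder

    emitter = DatahubRestEmitter("http://localhost:8080")
    emitter.emit(
        MetadataChangeProposalWrapper(
            entityUrn=dashboard_urn,
            aspect=EditableDashboardPropertiesClass(
                description="Description loaded from an external source.",
            ),
        )
    )
    ```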
  • quiet-arm-91745 (10/05/2023, 12:11 PM)
    Hi, how do I connect lineage between Tableau and BigQuery? I'm able to ingest Tableau data sources, but their lineage doesn't connect to BigQuery. This is my current config:
    ```yaml
    source:
        type: tableau
        config:
            stateful_ingestion:
                enabled: true
            connect_uri: 'tableau site'
            ingest_tags: true
            ingest_owner: true
            site: ""
            token_value: 'token'
            token_name: datahub
            extract_column_level_lineage: true
            extract_lineage_from_unsupported_custom_sql_queries: true
            extract_usage_stats: true
            ingest_embed_url: true
            ingest_tables_external: true
            page_size: 1
    sink:
        type: datahub-rest
        config:
            server: 'http://datahub-datahub-gms.datahub.svc.cluster.local:8080'
            max_threads: 1
    ```
    thanks in advance
  • bulky-shoe-65107 (10/16/2023, 12:42 AM)
    has renamed the channel from "integration-tableau-datahub" to "integrate-tableau-datahub"
  • damp-computer-35583 (01/18/2024, 5:42 PM)
    Hello all, I am using acryl-datahub version 0.12.1.3 via the quickstart Docker setup (based on the 0.12.0 version), and I have tried to extract column-level lineage from the Custom SQL queries, but I don't seem to be getting any of that metadata back. Here is my YAML:
    ```yaml
    source:
        type: tableau
        config:
            connect_uri: 'https://tableau.example.com'
            stateful_ingestion:
                enabled: false
            username: 'user'
            password: 'password'
            ingest_owner: true
            extract_lineage_from_unsupported_custom_sql_queries: true
            ingest_tags: true
            extract_usage_stats: true
            ingest_tables_external: true
            projects:
                - 'sales'
    ```
  • damp-computer-35583 (01/18/2024, 9:39 PM)
    I tried loading up the SQL Server metadata, but now I have two copies of the same dataset: Tableau points from the Custom SQL query to the Tableau one, and SQL Server points to the real one.
  • damp-computer-35583 (01/18/2024, 9:41 PM)
    one dataset is under the tableau container
  • damp-computer-35583 (01/18/2024, 9:41 PM)
    the other is under the sql server container:
  • damp-computer-35583 (01/18/2024, 9:42 PM)
    Is this the expected behaviour, and is there any way to get the Tableau Custom SQL to map to the real SQL Server table, with column-level lineage?
  • damp-computer-35583 (01/18/2024, 9:43 PM)
    thanks much
  • able-artist-56392 (02/08/2024, 10:00 AM)
    Hello there, tell me if my query is not in the right channel. I'm currently working on Tableau Online ingestion in DataHub v0.12.0 and I have two issues.
    First issue: on a small part of our Tableau Cloud site, I'm facing a problem with the ingestion. When the stateful_ingestion parameter is enabled, the ingestion fails with the error below:

    ```
    Message: 'Failed to commit changes for DatahubIngestionCheckpointingProvider.'
    ```

    When the stateful_ingestion parameter is disabled, the ingestion works successfully, but another problem appears: dashboards are not deleted from DataHub when they are no longer present in Tableau. I see the remove_stale_metadata parameter that allows removing deleted dashboards, but it is tied to stateful_ingestion. How can I solve this?
    Second issue: when trying to run a full ingestion of our Tableau Cloud site, the ingestion fails with this warning:

    ```
    "embeddedDatasourcesConnection": [
        "[{'locations': None, 'message': 'Showing partial results. The request exceeded the 20000 node limit. Use pagination, additional filtering, or both in the query to adjust results.', 'errorType': None, 'extensions': {'severity': 'WARNING', 'code': 'NODE_LIMIT_EXCEEDED', 'properties': {'nodeLimit': 20000}}, 'path': None}]",
    ```

    and these errors rise up at the beginning of the ingestion log file:

    ```
    {
        "error": "Unable to emit metadata to DataHub GMS",
        "info": {
            "message": "502 Server Error: Bad Gateway for url: https://datahub-gms..../aspects?action=ingestProposal",
            "id": "urn:li:chart:(tableau,...)"
        }
    },
    {
        "error": "Unable to emit metadata to DataHub GMS",
        "info": {
            "message": "403 Client Error: Forbidden for url: https://datahub-gms...../entities?action=ingest",
            "id": "urn:li:chart:(tableau,...)"
        }
    }
    ```

    (URNs and the URL of my DataHub site have been masked.) I don't have the exact ingestion time, but it takes at least 45 minutes. What could be causing these errors, and what can I do to correct them?
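    For the NODE_LIMIT_EXCEEDED warning specifically, the usual lever is the source's page_size, so that each Metadata API request stays well under the 20000-node cap. A fragment; the value is a starting point to tune, not a recommendation:

    ```yaml
    source:
        type: tableau
        config:
            # Smaller pages mean more, cheaper Metadata API requests, each
            # of which should stay under Tableau's 20000-node limit.
            page_size: 5
    ```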
  • handsome-planet-77266 (02/28/2024, 6:47 PM)
    Hi team, we just started with DataHub and successfully made a connection to Snowflake. Trying the same with Tableau Server, we are getting errors while making the connection. I am a Tableau Server admin and am using my credentials to connect, and it throws the error below. Can anyone suggest what might be missing?