# integrate-tableau-datahub
  • h

    handsome-belgium-11927

    09/30/2021, 8:05 PM
    That is what I have found out so far. In Tableau, our users mostly work with 4 entities: Workbook, Dashboard, View, and Datasource. In DataHub terms, a Dashboard is a Dashboard, a View is a Chart, and a Datasource can be represented as a Dataset (with some assumptions). There is no counterpart for a Workbook, which is an entity containing several Dashboards and Datasources. About ingestion: I'm using the tableauserverclient Python module to get the data. Tableau Server lets you get some meta-information directly from the objects (names, tags, URLs and so on), but for additional information (like the list of datasources and the SQL code) I had to download the XML files and parse them. This requires an admin user and a generated token. P.S.: I've never used Tableau before and it is causing me a lot of pain, so I may be missing something obvious 😅 Hope that starts some action
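The XML-parsing step described above could look roughly like the sketch below. It is not the author's code: the sample XML is a hypothetical, heavily trimmed workbook file, and the assumption that custom SQL sits in `<relation type="text">` elements under each `<datasource>` may not hold for every Tableau version.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical, heavily trimmed workbook XML; real workbook files are much
# larger and their layout can differ between Tableau versions.
SAMPLE_TWB = """
<workbook>
  <datasources>
    <datasource name="federated.abc123" caption="Sales">
      <connection class="postgres">
        <relation name="Custom SQL Query" type="text">SELECT id, amount FROM orders</relation>
      </connection>
    </datasource>
    <datasource name="Parameters" />
  </datasources>
</workbook>
"""


def extract_datasources(twb_xml: str) -> List[Dict]:
    """Return one dict per datasource with its name, caption and custom SQL."""
    root = ET.fromstring(twb_xml)
    results = []
    for ds in root.findall("./datasources/datasource"):
        # Assumption: custom SQL is stored as the text of <relation type="text">.
        custom_sql = [
            (rel.text or "").strip()
            for rel in ds.iter("relation")
            if rel.get("type") == "text"
        ]
        results.append(
            {
                "name": ds.get("name"),
                "caption": ds.get("caption"),
                "custom_sql": custom_sql,
            }
        )
    return results


datasources = extract_datasources(SAMPLE_TWB)
```

Each resulting dict could then be mapped to a DataHub Dataset, with the custom SQL kept for lineage.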
  • q

    quiet-kilobyte-82304

    09/30/2021, 9:05 PM
    @handsome-belgium-11927 Have you tried this? Go to your Tableau GraphiQL page
    your-tableau-url/metadata/graphiql
    and run the query below. There are a few examples here that you can use. You can then map the output to DataHub’s chart or dashboard schema. We’re trying to add some automation around this process, and I’ll be happy to share when we have something tangible. The query below should get you started with ownership and datasources.
    query ShowMeWorkbookEmbeddedFilters {
      workbooks(filter: { name: "your-workbook-name" }) {
        name
        owner {
          id
          name
        }
        embeddedDatasources {
          name
          upstreamTables {
            id
            name
          }
          upstreamDatabases {
            id
            name
          }
        }
      }
    }
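As a rough illustration of mapping the output of the query above, the sketch below flattens a response into one row per upstream table. The sample response is made up to mirror the query's shape, and `flatten_workbooks` is a hypothetical helper, not part of DataHub or Tableau.

```python
from typing import Dict, List

# Hypothetical response, shaped like the result of the workbook query above.
SAMPLE_RESPONSE = {
    "data": {
        "workbooks": [
            {
                "name": "your-workbook-name",
                "owner": {"id": "owner-1", "name": "jdoe"},
                "embeddedDatasources": [
                    {
                        "name": "Sales",
                        "upstreamTables": [{"id": "t1", "name": "orders"}],
                        "upstreamDatabases": [{"id": "d1", "name": "warehouse"}],
                    }
                ],
            }
        ]
    }
}


def flatten_workbooks(response: Dict) -> List[Dict]:
    """Flatten the nested response into one row per upstream table."""
    rows = []
    # "or" guards against keys that are present but null in the response.
    for workbook in (response.get("data") or {}).get("workbooks") or []:
        owner = (workbook.get("owner") or {}).get("name", "")
        for datasource in workbook.get("embeddedDatasources") or []:
            for table in datasource.get("upstreamTables") or []:
                rows.append(
                    {
                        "workbook": workbook.get("name", ""),
                        "owner": owner,
                        "datasource": datasource.get("name", ""),
                        "upstream_table": table.get("name", ""),
                    }
                )
    return rows


rows = flatten_workbooks(SAMPLE_RESPONSE)
```

Rows in this form carry the ownership and datasource information that a DataHub chart or dashboard record would need.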
    👀 1
  • l

    lemon-airline-17821

    10/08/2021, 5:01 PM
    Maybe it would be helpful to add the tags connected to a Tableau Workbook. We use this very often to identify the metric which is shown in the report/charts/tables.
  • h

    handsome-belgium-11927

    10/13/2021, 1:37 PM
    Is it possible to add support for Dataset->Dashboard and Dashboard->Dashboard lineage? I think it is a much easier workaround than creating a new entity (Workbook). With this feature we can create Workbooks as Dashboards and see all the relationships with other entities.
    plus1 1
  • r

    rapid-sundown-8805

    10/28/2021, 6:59 AM
    Hi all, I created a feature request on this without knowing about this channel. Our requirement is basically an ingestion plugin for Tableau Datasources, which we want our users to be able to find in DataHub alongside all other data. Dashboards are secondary for us, but it should be possible to extract them through the same REST API.
  • s

    sparse-planet-56664

    11/10/2021, 1:36 PM
    Has anyone gotten far enough to actually test this? We are currently evaluating Tableau, and it would be great to test this feature for it 🙂
  • l

    little-megabyte-1074

    11/16/2021, 5:10 PM
    @here Hello, DataHub <> Tableau enthusiasts! Very exciting news: @powerful-vase-22055 is going to take the lead on building out a Tableau integration for the broader DataHub community!! I’m here to help with requirements/tradeoffs/user stories (all of the non-dev stuff). The goal is to provide the same Dataset/Chart/Dashboard support that we have for other BI tools (Looker, Superset, Redash, etc.) to represent common dashboard/reporting/analytics patterns for Tableau users. We need your input/collaboration to make sure we get this right. Are you able to help over the next 2-4 weeks?
    1️⃣ I’ve deployed a custom integration in my organization and I will share my experience
    2️⃣ I will help in the design/implementation requirements
    3️⃣ I am available to team up and build the integration together
    1️⃣ 3
    2️⃣ 8
    3️⃣ 4
  • h

    high-family-71209

    12/06/2021, 9:53 AM
    @little-megabyte-1074 - out of curiosity: How is the progress here? 🙂
  • l

    little-megabyte-1074

    12/15/2021, 9:39 PM
    @here Hello, all! I just chatted with @powerful-vase-22055 - he’s officially kicking off Tableau design/development work this week! This is the 3rd most frequently requested feature within the community, so I’m super excited to get some movement! Here’s where we’re looking for your help over the next week:
    1. Share your custom integration code for inspiration/reference - if you have a custom Tableau <> DataHub integration in your organization, it will be very helpful to learn how you approached it; @handsome-belgium-11927 I know you had an… interesting(?) experience here, so even hearing about lessons learned would be great!
    2. Contribute to design/implementation requirements - we’d love your help understanding the core components/concepts within Tableau, what metadata is available for us to extract, how they all fit together, & how we can map it back to DataHub.
       a. We currently have 7 “concepts” listed & I added a basic template of the details we’re looking for; please feel free to add any concepts we missed!
    If you’re free to help with design/development over the next 1-2 weeks, please let me know in the thread & I’ll facilitate some working sessions. Looking forward to all of your input!!
    🌯 1
    🔥 3
    🌮 2
  • b

    brainy-wall-41694

    02/09/2022, 11:20 AM
    Hi! I would like to know if there is any place where bugs that are being fixed are being tracked. Yesterday the Beta release of Tableau Ingestion Source came out and I'm testing it. I found some problems, but most of it is already working very well. Just so I don't report issues that you guys are already working on.
  • l

    little-megabyte-1074

    02/09/2022, 6:01 PM
    @channel Hello, Tableau + DataHub fans!! As you may have seen in #announcements, we just released the Beta Tableau ingestion source in v0.8.26! This has been the most-requested ingestion source for a looooong time. Huge shoutout to @colossal-easter-99672 for providing the initial code; to @powerful-vase-22055 for the first pass; and to @big-carpet-38439 & @dazzling-judge-80093 for taking it across the finish line! We are extremely eager for quick feedback while it’s all fresh in our minds so we can iterate quickly. If you are able to test it out within the next 1-2 weeks, please do! Let’s use this channel for feedback/discussion & we will open PRs as necessary.
    🙌 4
  • b

    brainy-wall-41694

    02/09/2022, 6:24 PM
    Hi! I did some tests this morning and the main problem I got was the following:
    datahub/ingestion/source/tableau.py", line 651, in emit_sheets_as_charts
    dashboard_path = sheet.get("containedInDashboards")[0].get("path", "")
    IndexError: list index out of range
    This happens during ingestion and only with some dashboards/sheets. I still haven't figured out why only some of them have problems. Here's a more complete log:
    log.txt
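The traceback above points at indexing `[0]` on a `containedInDashboards` list that is empty, which would happen for a sheet that is not placed on any dashboard. A minimal defensive sketch follows; `first_dashboard_path` is a hypothetical helper and not necessarily how the DataHub source was eventually fixed.

```python
def first_dashboard_path(sheet: dict) -> str:
    """Return the path of the first containing dashboard, or "" if there is none."""
    # "or []" also covers the case where the key is present but null.
    dashboards = sheet.get("containedInDashboards") or []
    if not dashboards:
        return ""
    return dashboards[0].get("path", "")


with_dashboard = first_dashboard_path(
    {"containedInDashboards": [{"path": "Workbook/Dashboard1"}]}
)
without_dashboard = first_dashboard_path({"containedInDashboards": []})
```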
  • r

    rich-policeman-92383

    02/10/2022, 7:45 AM
    Hello! Connecting DataHub to Tableau using a token results in NotSignedInError: Missing Authentication Token. The same authentication token works when used from this Python code. Config used during ingestion:
    source:
      type: tableau
      config:
        connect_uri: https://mytableau.com
        site: Default
        env: "Test"
        #projects: ["default", "Project 2"]
        #username: username@acrylio.com
        #password: pass
        token_name: "root"
        token_value: "fnsdifnsdkfnsdjfn8r834hr837fb"
        #ingest_tags: True
        ingest_owner: True
        #default_schema_map:
        #  mydatabase: public
        #  anotherdatabase: anotherschema
    sink:
      type: "datahub-rest"
      config:
        server: "http://localhost:8080"
  • c

    colossal-easter-99672

    02/10/2022, 8:23 AM
    Hello, team. @powerful-vase-22055 @big-carpet-38439 @dazzling-judge-80093 Can you explain the Tableau object -> DataHub entity mapping? I'm reading the code now and see this:
    Workbook - None ?
    Dashboard - Dashboard
    Sheet - Chart
    Datasource - Dataset
    Custom SQL - Dataset
    Is that right? There are some problems with it:
    1. All users work with workbooks in Tableau, and without them users will have navigation problems.
    2. A Sheet can exist in a Workbook without a Dashboard.
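For reference, the mapping listed above can be written down as a small lookup. This is only a sketch of what the message describes, not code from the DataHub source; `datahub_entity_for` is a hypothetical helper, and the lowercase values follow DataHub's entity type names.

```python
from typing import Optional

# Mapping as listed in the message above; None means there is no
# corresponding DataHub entity (the open question for Workbook).
TABLEAU_TO_DATAHUB = {
    "Workbook": None,
    "Dashboard": "dashboard",
    "Sheet": "chart",
    "Datasource": "dataset",
    "Custom SQL": "dataset",
}


def datahub_entity_for(tableau_object: str) -> Optional[str]:
    """Look up the DataHub entity type for a Tableau object type."""
    if tableau_object not in TABLEAU_TO_DATAHUB:
        raise KeyError(f"Unknown Tableau object type: {tableau_object}")
    return TABLEAU_TO_DATAHUB[tableau_object]


sheet_entity = datahub_entity_for("Sheet")
workbook_entity = datahub_entity_for("Workbook")
```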
  • b

    brainy-wall-41694

    02/10/2022, 10:50 AM
    Hi! I would like to understand if it was a problem with my configuration or if it is some adjustment in the Tableau plugin. It seems to me that Data Source, for some reason, didn't get the name right.
  • c

    careful-insurance-60247

    02/10/2022, 3:13 PM
    Trying to ingest data off our dev tableau. Looks like the ingestion is running but no data is returned.
    [ec2-user@ip-10-16-13-173 recipes]$ datahub --debug ingest -c tableau.yml
    [2022-02-10 15:09:20,381] DEBUG    {datahub.cli.ingest_cli:70} - DataHub CLI version: 0.8.26.1
    [2022-02-10 15:09:20,385] DEBUG    {datahub.cli.ingest_cli:76} - Using config: {'source': {'type': 'tableau', 'config': {'connect_uri': '<removed>', 'site': '1053', 'token_name': 'datahub', 'token_value': '<removed>', 'ingest_tags': True, 'ingest_owner': True}}, 'sink': {'type': 'datahub-rest', 'config': {'server': '<removed>'}}}
    [2022-02-10 15:09:20,873] INFO     {tableau.endpoint.auth:37} - Signed into , <removed> as user with id a1592316-07c5-422f-9cc9-9ea5104646fa
    [2022-02-10 15:09:20,873] DEBUG    {datahub.ingestion.run.pipeline:124} - Source type:tableau,<class 'datahub.ingestion.source.tableau.TableauSource'> configured
    [2022-02-10 15:09:20,886] DEBUG    {datahub.ingestion.run.pipeline:130} - Sink type:datahub-rest,<class 'datahub.ingestion.sink.datahub_rest.DatahubRestSink'> configured
    [2022-02-10 15:09:20,886] INFO     {datahub.cli.ingest_cli:86} - Starting metadata ingestion
    [2022-02-10 15:09:20,886] INFO     {tableau.endpoint.metadata:61} - Querying Metadata API
    [2022-02-10 15:09:21,057] INFO     {tableau.endpoint.auth:53} - Signed out
    [2022-02-10 15:09:21,058] INFO     {datahub.cli.ingest_cli:88} - Finished metadata ingestion
    
    Source (tableau) report:
    {'failures': {}, 'warnings': {}, 'workunit_ids': [], 'workunits_produced': 0}
    Sink (datahub-rest) report:
    {'downstream_end_time': None,
     'downstream_start_time': None,
     'downstream_total_latency_in_seconds': None,
     'failures': [],
     'records_written': 0,
     'warnings': []}
    
    Pipeline finished successfully
  • e

    echoing-dress-35614

    02/10/2022, 3:26 PM
    Nested projects are also not ingested, only workbooks that are direct descendants of top-level projects.
    plus1 1
  • e

    echoing-dress-35614

    02/10/2022, 3:28 PM
    We have pretty heavily nested projects/folders in our Tableau sites, following a project/owner/dashboard/sheet structure.
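To support nested projects, an ingester would need to resolve each project's full path from its parent links. The sketch below assumes project records are available as a hypothetical `id -> (name, parent_id)` mapping, where top-level projects have a parent of `None`; it does not reflect how DataHub actually handles this.

```python
from typing import Dict, Optional, Tuple

# Hypothetical project records: id -> (name, parent_id).
PROJECTS: Dict[str, Tuple[str, Optional[str]]] = {
    "p1": ("Finance", None),
    "p2": ("alice", "p1"),
    "p3": ("Revenue", "p2"),
}


def project_path(
    project_id: str, projects: Dict[str, Tuple[str, Optional[str]]]
) -> str:
    """Walk parent links up to the top-level project and join the names."""
    parts = []
    seen = set()
    current: Optional[str] = project_id
    while current is not None:
        if current in seen:
            # Guard against malformed data with circular parent links.
            raise ValueError(f"Cycle detected at project {current}")
        seen.add(current)
        name, parent_id = projects[current]
        parts.append(name)
        current = parent_id
    return "/".join(reversed(parts))


nested_path = project_path("p3", PROJECTS)
```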
  • e

    echoing-dress-35614

    02/10/2022, 3:35 PM
    getting an index out of range error when ingesting a workbook from a specific project:
    myuser@PEI6433:~/projects/datahub (master #%)$ datahub --debug ingest -c tableau-patrick.yml
    [2022-02-10 09:11:29,272] DEBUG    {datahub.cli.ingest_cli:70} - DataHub CLI version: 0.8.26.1
    [...SNIP...]
        pipeline.run()
    File "/home/myuser/.local/share/virtualenvs/datahub-uFRi8XjQ/lib/python3.8/site-packages/datahub/ingestion/run/pipeline.py", line 181, in run
        for wu in itertools.islice(
    File "/home/myuser/.local/share/virtualenvs/datahub-uFRi8XjQ/lib/python3.8/site-packages/datahub/ingestion/source/tableau.py", line 807, in get_workunits
        yield from self.emit_workbooks(10)
    File "/home/myuser/.local/share/virtualenvs/datahub-uFRi8XjQ/lib/python3.8/site-packages/datahub/ingestion/source/tableau.py", line 197, in emit_workbooks
        yield from self.emit_sheets_as_charts(workbook)
    File "/home/myuser/.local/share/virtualenvs/datahub-uFRi8XjQ/lib/python3.8/site-packages/datahub/ingestion/source/tableau.py", line 651, in emit_sheets_as_charts
        dashboard_path = sheet.get("containedInDashboards")[0].get("path", "")
    
    IndexError: list index out of range
    myuser@PEI6433:~/projects/datahub (master #%)$
    tableauRunLog.txt
  • b

    brainy-wall-41694

    02/14/2022, 2:15 PM
    Hi! Another thing I noticed is that when the data sources are imported, the Tableau hierarchies are also imported. However, they come without their respective links.
    thanks ewe 1
  • b

    brainy-wall-41694

    02/14/2022, 2:34 PM
    Hi again! 😅 I saw that dashboards have the wrong URL (View in Tableau). Here's an example:
    Wrong URL: https://server/#/site//views/5_CUSTOCOVIDATNOVEMBRO/PainelCOVID
    Real URL: https://server/#/views/5_CUSTOCOVIDATNOVEMBRO/PainelCOVID
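The wrong URL above looks like the result of always inserting a `/site/<name>` segment, even when the site name is empty (as it would be for the default site). A minimal sketch of building the link conditionally, where `view_url` is a hypothetical helper rather than the DataHub implementation:

```python
def view_url(server: str, site: str, view_path: str) -> str:
    """Build a "View in Tableau" link, omitting the site segment when site is empty."""
    site_part = f"/site/{site}" if site else ""
    return f"{server.rstrip('/')}/#{site_part}/views/{view_path}"


default_site_url = view_url(
    "https://server", "", "5_CUSTOCOVIDATNOVEMBRO/PainelCOVID"
)
named_site_url = view_url("https://server", "mysite", "Workbook/Sheet1")
```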
    thanks ewe 1
  • s

    shy-parrot-64120

    02/14/2022, 8:48 PM
    Great job! It seems the current ingestor supports only the Online version of Tableau. How can I connect to an on-prem server?
    File "/usr/local/lib/python3.9/site-packages/tableauserverclient/server/server.py", line 161, in auth_token
        raise NotSignedInError(error)
    
    NotSignedInError: Missing authentication token. You must sign in first.
  • s

    shy-parrot-64120

    02/14/2022, 9:14 PM
    Another thing, based on the code: if something is wrong with the pairs of auth attributes (username + password / token_name + token_value), i.e. one item of a pair is missing, a
    "No valid authentication was found"
    error should be raised. However, even when none of the 4 attributes is set, I get
    NotSignedInError: Missing authentication token. You must sign in first.
    It seems default values are passed somehow, so authentication is attempted with unpredictable values and fails silently.
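The validation described above could be sketched as follows. `pick_auth_method` is a hypothetical helper, not the DataHub source's actual code; it treats empty strings and missing keys alike, so default placeholder values would not slip through, and it reuses the error message quoted in the message.

```python
from typing import Mapping, Optional


def pick_auth_method(config: Mapping[str, Optional[str]]) -> str:
    """Return "token" or "password" if a complete pair is set, else raise."""
    has_token = bool(config.get("token_name")) and bool(config.get("token_value"))
    has_password = bool(config.get("username")) and bool(config.get("password"))
    if has_token:
        return "token"
    if has_password:
        return "password"
    # Neither pair is complete: fail early instead of attempting sign-in.
    raise ValueError("No valid authentication was found")


method = pick_auth_method({"token_name": "root", "token_value": "secret"})
```

Failing before any sign-in attempt would surface the configuration problem directly instead of the later `NotSignedInError`.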
  • s

    shy-parrot-64120

    02/14/2022, 9:32 PM
    Also, please add the ability to set a list of sites for ingestion instead of just one.
  • e

    early-article-88153

    02/14/2022, 11:30 PM
    Hi folks! I've been playing with the Tableau source for DataHub for a couple of days now and have some feedback: 🧵
  • s

    shy-parrot-64120

    02/15/2022, 10:11 AM
    Another issue has been observed
    Source (tableau) report:
    {'workunits_produced': 0,
     'workunit_ids': [],
     'warnings': {},
     'failures': {'tableau-metadata': ["Unable to retrieve metadata from tableau. Information: Connection: workbooksConnection Error: [{'message': "
                                       "'Showing partial results. The request exceeded the 20000 node limit. Use pagination, additional filtering, or "
                                       "both in the query to adjust results.', 'extensions': {'severity': 'WARNING', 'code': 'NODE_LIMIT_EXCEEDED', "
                                       "'properties': {'nodeLimit': 20000}}}]"]}}
    Pagination is needed.
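One way to stay under the node limit is cursor-based pagination over the `workbooksConnection` named in the error. The sketch below assumes the connection exposes `nodes` and `pageInfo { hasNextPage endCursor }`; `paginate` and `fetch_page` are hypothetical names, and the fake pages stand in for real Metadata API calls.

```python
from typing import Callable, Dict, Iterator, Optional

# Assumed query shape; a real fetch_page would send this with the cursor.
PAGED_QUERY = """
query ($first: Int, $after: String) {
  workbooksConnection(first: $first, after: $after) {
    nodes { id name }
    pageInfo { hasNextPage endCursor }
  }
}
"""


def paginate(fetch_page: Callable[[Optional[str]], Dict]) -> Iterator[Dict]:
    """Yield nodes page by page until the connection reports no next page."""
    cursor: Optional[str] = None
    while True:
        connection = fetch_page(cursor)
        yield from connection.get("nodes") or []
        page_info = connection.get("pageInfo") or {}
        if not page_info.get("hasNextPage"):
            break
        cursor = page_info.get("endCursor")


# Fake pages keyed by cursor, used here instead of a live server.
FAKE_PAGES = {
    None: {
        "nodes": [{"name": "wb1"}, {"name": "wb2"}],
        "pageInfo": {"hasNextPage": True, "endCursor": "c1"},
    },
    "c1": {
        "nodes": [{"name": "wb3"}],
        "pageInfo": {"hasNextPage": False, "endCursor": None},
    },
}

names = [wb["name"] for wb in paginate(lambda cursor: FAKE_PAGES[cursor])]
```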
  • c

    careful-insurance-60247

    02/18/2022, 3:44 PM
    Seeing an issue with the tableau source.
    Traceback (most recent call last):
      File "/usr/local/bin/datahub", line 8, in <module>
        sys.exit(datahub())
      File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
        return self.main(*args, **kwargs)
      File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 1053, in main
        rv = self.invoke(ctx)
      File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 1659, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 1659, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/home/ec2-user/.local/lib/python3.7/site-packages/click/core.py", line 754, in invoke
        return __callback(*args, **kwargs)
      File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/telemetry/telemetry.py", line 196, in wrapper
        raise e
      File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/telemetry/telemetry.py", line 190, in wrapper
        res = func(*args, **kwargs)
      File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/cli/ingest_cli.py", line 87, in run
        pipeline.run()
      File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/ingestion/run/pipeline.py", line 182, in run
        self.source.get_workunits(), 10 if self.preview_mode else None
      File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/ingestion/source/tableau.py", line 807, in get_workunits
        yield from self.emit_workbooks(10)
      File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/ingestion/source/tableau.py", line 197, in emit_workbooks
        yield from self.emit_sheets_as_charts(workbook)
      File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/ingestion/source/tableau.py", line 657, in emit_sheets_as_charts
        get_field_value_in_sheet(field, "description"),
      File "/home/ec2-user/.local/lib/python3.7/site-packages/datahub/ingestion/source/tableau_common.py", line 423, in get_field_value_in_sheet
        field_value = field.get("remoteField", {}).get(field_name, "")
    AttributeError: 'NoneType' object has no attribute 'get'
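The `AttributeError` above is consistent with `remoteField` being present in the field but set to null: `dict.get(key, default)` only returns the default when the key is missing, not when its value is `None`. A minimal sketch of a null-safe lookup, where `safe_remote_field_value` is a hypothetical name and not the DataHub function:

```python
def safe_remote_field_value(field: dict, field_name: str) -> str:
    """Read a value from field["remoteField"], tolerating a null remoteField."""
    # "or {}" replaces an explicit None as well as a missing key.
    remote_field = field.get("remoteField") or {}
    return remote_field.get(field_name, "")


null_remote = safe_remote_field_value({"remoteField": None}, "description")
with_remote = safe_remote_field_value(
    {"remoteField": {"description": "Order total"}}, "description"
)
```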
  • a

    acoustic-quill-54426

    02/23/2022, 12:25 PM
    I've just started testing this integration today and it's really promising! I did stumble across the same issue as @careful-insurance-60247 and @early-article-88153 (many thanks for the thorough feedback) related to the ingestion of sheets' external URLs, and I've submitted a PR.
    thank you 2
    👍 1
  • h

    hundreds-photographer-13496

    03/07/2022, 7:47 AM
    Hello, this tableau PR was recently merged into master - https://github.com/linkedin/datahub/pull/4261 Do give it a spin. Excited to hear the feedback.
    thank you 1
  • b

    brainy-wall-41694

    03/10/2022, 9:01 PM
    Hey guys! I was running a load to test the adjustments made to the Tableau source and I noticed something a little strange. It seems the ingester has duplicated several datasources. In the image, the charts in question have only one datasource, but two appear. Another important point is that the datasource connected via ODBC doesn't have the correct field descriptions.