# integrate-tableau-datahub
c
Hey Team, I am trying to ingest Tableau Metadata and running into some snags. I've created a personal access token and my recipe looks like this...
source:
  type: tableau
  config:
    # Coordinates
    connect_uri: https://tableautest/#/home
    site:
    projects: ["HOSPITALS"]

    # Credentials
    token_name: JGTest
    token_value: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

    # Options
    ingest_tags: True
    ingest_owner: True
    default_schema_map:
      mydatabase: public
      anotherdatabase: anotherschema

sink:
    type: datahub-rest
    config:
        server: 'http://datahub-gms:8080'
I have read through the debug log but have not found anything meaningful other than the generic message at the bottom:
ConnectionError: HTTPConnectionPool(host='datahub-gms', port=8080): Max retries exceeded with url: /config (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f14da1cdf50>: Failed to establish a new connection: [Errno -2] Name or service not known'))
I've also attached my debug log. Thanks!
m
Can you resolve datahub-gms from your console?
c
Hey Steve, how would I do that?
m
Just do a normal ping from the machine where you run the ingestion. Is it a production DataHub or your local setup?
c
That's what I tried and it did not resolve
[joshua.garza@ip-10-4-64-11 quickstart]$ ping http://datahub-gms:8080
ping: http://datahub-gms:8080: Name or service not known
This is a local dev poc environment
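A side note on the ping above: `ping` expects a bare hostname, so passing the full URL `http://datahub-gms:8080` fails with "Name or service not known" whether or not `datahub-gms` itself is resolvable. A minimal Python sketch of the same resolution check, using only the standard library; the helper name `resolve_host` is made up for illustration:

```python
import socket
from urllib.parse import urlsplit


def resolve_host(url_or_host):
    """Return the IP addresses a host resolves to, or [] if the DNS lookup fails.

    Accepts either a bare hostname ("datahub-gms") or a full URL
    ("http://datahub-gms:8080"); only the hostname part is looked up.
    """
    # urlsplit only fills in .hostname when a "scheme://" prefix is present
    host = urlsplit(url_or_host).hostname if "://" in url_or_host else url_or_host
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        # Same failure class as "[Errno -2] Name or service not known"
        return []
    # Deduplicate: getaddrinfo returns one entry per socket type
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    for target in ("http://datahub-gms:8080", "localhost"):
        print(target, "->", resolve_host(target) or "does not resolve")
```

Run from the machine that executes the ingestion, an empty result for `datahub-gms` would point to the same name-resolution problem the debug log reports.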
m
You run datahub docker quickstart?
c
yes
m
And the ingestion is running outside of docker? Can you try to use localhost:8080 instead of datahub-gms?
c
yes, let me try
m
Yeah you have to run with localhost:8080, because your machine is not part of the docker network, so it cannot resolve datahub-gms host
c
InvalidSchema: No connection adapters were found for 'localhost:8080/config'
but, I'm also checking some other stuff
m
Looks like you have a different service running on port 8080, can you open browser and visit that url?
No, visit localhost:8080/config
And can you post the full log with the new ingestion?
c
That's weird, I get nothing at that url, other ingestions have been working, for example looker and snowflake
m
Ok... Can you post the full log?
And which version of datahub cli are you running?
I think your ingestions of snowflake and looker were done through UI right?
c
actually, snowflake through the UI and Looker through the cli
m
Hm, so if you use a recipe similar to the Looker one, it should work.
c
I will try
[joshua.garza@ip-10-4-64-11 quickstart]$ datahub --version
acryl-datahub, version 0.8.44
m
Can you show me your config again?
c
source:
  type: tableau
  config:
    # Coordinates
    connect_uri: https://tableautest/#/home
    site:
    projects: ["HOSPITALS"]

    # Credentials
    token_name: JGTest
    token_value: ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ

    # Options
    ingest_tags: True
    ingest_owner: True
    default_schema_map:
      mydatabase: public
      anotherdatabase: anotherschema

sink:
    type: datahub-rest
    config:
      #server: 'http://datahub-gms:8080'
      server: 'localhost:8080'
I got the recipe from the sample.
m
Ah server should be http://localhost:8080
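The earlier `InvalidSchema: No connection adapters were found` error fits this diagnosis: the `requests` library picks its connection adapter from the URL scheme, so `localhost:8080/config` without `http://` cannot be matched to one. A small sketch of a pre-flight check one could run on the sink's `server` value; `ensure_http_scheme` is a hypothetical helper, not part of DataHub:

```python
def ensure_http_scheme(server, default_scheme="http"):
    """Prefix a host:port string with a scheme unless it already has one."""
    if server.lower().startswith(("http://", "https://")):
        return server
    return f"{default_scheme}://{server}"


if __name__ == "__main__":
    print(ensure_http_scheme("localhost:8080"))         # http://localhost:8080
    print(ensure_http_scheme("http://localhost:8080"))  # http://localhost:8080
```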
c
Better error now...
[2022-09-27 02:43:19,866] ERROR {datahub.entrypoints:196} - Command failed: Failed to configure source (tableau) due to '1 validation error for TableauConfig
none is not an allowed value (type=type_error.none.not_allowed)
m
Oh you have to have site
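The `none is not an allowed value` error is consistent with the bare `site:` key in the recipe: in YAML, a key with no value parses as null, which becomes `None` once loaded into Python. The sketch below mimics that check on an already-parsed recipe; `find_null_fields` is a hypothetical helper, and the dict is a hand-written stand-in for the parsed YAML. One assumption to verify against the docs for your DataHub version: an empty string (`site: ""`) is what selects the default site on an on-prem Tableau Server.

```python
def find_null_fields(config, prefix=""):
    """Return dotted paths of keys whose value is None, recursing into dicts."""
    missing = []
    for key, value in config.items():
        path = f"{prefix}{key}"
        if value is None:
            missing.append(path)
        elif isinstance(value, dict):
            missing.extend(find_null_fields(value, prefix=f"{path}."))
    return missing


if __name__ == "__main__":
    # Stand-in for the parsed recipe: `site:` with no value becomes None
    recipe = {
        "source": {
            "type": "tableau",
            "config": {"connect_uri": "https://tableautest", "site": None},
        }
    }
    print(find_null_fields(recipe))  # ['source.config.site']
```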
c
okay, let me see what that should be for an on prem instance
I just added acryl as the site name, and this time the ingestion at least attempted to run. A lot better progress now...
[joshua.garza@ip-10-4-64-11 quickstart]$ datahub ingest -c tableau_test.yaml
[2022-09-27 02:49:09,984] INFO {datahub.cli.ingest_cli:179} - DataHub CLI version: 0.8.44
[2022-09-27 02:49:10,042] INFO {datahub.ingestion.run.pipeline:165} - Sink configured successfully. DataHubRestEmitter: configured to talk to http://localhost:8080
[2022-09-27 02:49:10,110] ERROR {datahub.ingestion.source.tableau:254} - HTTPSConnectionPool(host='tableautest', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa725943450>: Failed to establish a new connection: [Errno -2] Name or service not known'))
[2022-09-27 02:49:10,110] INFO {datahub.ingestion.run.pipeline:190} - Source configured successfully.
[2022-09-27 02:49:10,112] INFO {datahub.cli.ingest_cli:126} - Starting metadata ingestion
[2022-09-27 02:49:10,180] INFO {datahub.cli.ingest_cli:147} - Finished metadata ingestion
Cli report: {'cli_entry_location': '/home/joshua.garza/.local/lib/python3.7/site-packages/datahub/__init__.py', 'cli_version': '0.8.44', 'os_details': 'Linux-4.14.287-215.504.amzn2.x86_64-x86_64-with-glibc2.2.5', 'py_exec_path': '/usr/bin/python3', 'py_version': '3.7.10 (default, Jun 3 2021, 00:02:01) \n[GCC 7.3.1 20180712 (Red Hat 7.3.1-13)]'}
Source (tableau) report: {'event_ids': [], 'events_produced': '0', 'events_produced_per_sec': '0', 'failures': {'tableau-login': ["Unable to LoginReason: HTTPSConnectionPool(host='tableautest', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa725943450>: Failed to establish a new connection: [Errno -2] Name or service not known'))"]}, 'read_rate': '0', 'running_time_in_seconds': '0', 'start_time': '2022-09-27 02:49:10.107750', 'warnings': {}}
Sink (datahub-rest) report: {'current_time': '2022-09-27 02:49:10.283078', 'failures': [], 'gms_version': 'v0.8.45', 'pending_requests': '0', 'records_written_per_second': '0', 'start_time': '2022-09-27 02:49:09.301465', 'total_duration_in_seconds': '0.98', 'total_records_written': '0', 'warnings': []}
Pipeline finished with at least 2 failures ; produced 0 events in 0 seconds.
m
The site name is the Tableau site name.
And you don't have a connection to the Tableau server from your machine.
c
got it.
Thanks so much @modern-artist-55754! I am going to talk to my Tableau admin tomorrow to figure out what site names are in place.
Hi Steve, I am just getting back to this. When you say I don't have a connection to Tableau from my machine, does that still apply if I am on the VPN?
m
From the machine where you run the ingestion, can you visit https://tableautest? It looks like an internal domain, which only your internal DNS can resolve... I suspect you need the VPN
c
Thanks Steve. I was not able to telnet to this endpoint (got the IP by pinging the domain). I think I need to talk to the network admins to get a firewall rule in place to allow me in.
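The two Tableau failures in this thread sit at different layers: `Name or service not known` is a DNS failure, while a telnet that hangs or is refused after the name already resolves points to a firewall or a closed port. A rough sketch that tells these cases apart, using only the standard library; `check_tcp` and its return strings are made up for illustration, and port 443 is assumed from the `https://` connect URI:

```python
import socket


def check_tcp(host, port, timeout=3.0):
    """Classify a TCP connection attempt: open, DNS failure, refused, timeout, or other error."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        return "dns-failure"      # hostname could not be resolved
    except ConnectionRefusedError:
        return "refused"          # host reachable, nothing listening on the port
    except socket.timeout:
        return "timeout"          # often a firewall silently dropping packets
    except OSError as exc:
        return f"error: {exc}"    # e.g. network unreachable


if __name__ == "__main__":
    print(check_tcp("tableautest", 443))
```

A `timeout` result from the ingestion machine would match the firewall theory; `dns-failure` would mean the VPN or internal DNS is still not in play.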