Hello everyone! I am trying to ingest metadata fr...
# troubleshoot
r
Hello everyone! I am trying to ingest metadata from our Tableau Server, which requires trusted CA certificates to be deployed on the client. I deployed them on the Linux machine, but they may also need to be in the keystore of the running DataHub containers, and I don't know how to do that.
'failures': {'tableau-login': ["Unable to LoginReason: HTTPSConnectionPool(host='172.22.5.19', port=443): Max retries exceeded with url: /api/2.4/serverInfo (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1125)')))"]},
Any experience with this?
e
@gray-shoe-75895 would you be able to help here?
g
@red-analyst-79902 thanks for flagging this. We don’t have a mechanism for that right now, but I’ve opened this PR https://github.com/datahub-project/datahub/pull/6172 to add support for it
r
Hi @gray-shoe-75895, I upgraded to the latest version and I am trying to use ssl_verify as an option in the recipe, but I still get certificate errors. I am not sure I am using it properly, though...
source:
  type: tableau
  config:
    # Coordinates
    connect_uri: https:// ...
    site: ""
    projects: ["DWH"]
    # Credentials
    username: tableau
    password: ...
    # Options
    ssl_verify: path_to_certificates
    ingest_tags: True
    ingest_owner: True
    default_schema_map:
      mydatabase: public
      anotherdatabase: anotherschema
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
[2022-11-18 09:23:17,581] DEBUG    {datahub.telemetry.telemetry:210} - Sending init Telemetry
[2022-11-18 09:23:18,076] DEBUG {datahub.telemetry.telemetry:221} - Error initializing telemetry: HTTPSConnectionPool(host='track.datahubproject.io', port=443): Max retries exceeded with url: /mp/engage (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1125)')))
[2022-11-18 09:23:18,077] DEBUG {datahub.telemetry.telemetry:243} - Sending Telemetry
[2022-11-18 09:23:18,573] DEBUG {datahub.telemetry.telemetry:248} - Error reporting telemetry: HTTPSConnectionPool(host='track.datahubproject.io', port=443): Max retries exceeded with url: /mp/track (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1125)')))
[2022-11-18 09:23:18,574] INFO {datahub.cli.ingest_cli:167} - DataHub CLI version: 0.9.2.2
[2022-11-18 09:23:18,582] DEBUG {datahub.telemetry.telemetry:243} - Sending Telemetry
[2022-11-18 09:23:19,076] DEBUG {datahub.telemetry.telemetry:248} - Error reporting telemetry: HTTPSConnectionPool(host='track.datahubproject.io', port=443): Max retries exceeded with url: /mp/track (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1125)')))
[2022-11-18 09:23:19,200] ERROR {datahub.entrypoints:185} - File "/home/mmmstz013/gmarin/.local/lib/python3.8/site-packages/datahub/entrypoints.py", line 164, in main
    161  def main(**kwargs):
    162      # This wrapper prevents click from suppressing errors.
    163      try:
--> 164          sys.exit(datahub(standalone_mode=False, **kwargs))
    165      except click.Abort:

File "/home/mmmstz013/gmarin/.local/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    1128  def __call__(self, *args: t.Any, **kwargs: t.Any) -> t.Any:
    (...)
--> 1130      return self.main(*args, **kwargs)
g
Could you paste a bit more of the bottom part of the stack trace?
Also, it looks like it’s failing to send telemetry because of your internal cert setup. You can disable that by setting DATAHUB_TELEMETRY_ENABLED=0
r
Full log
Where do I set TELEMETRY up?
g
Are you using datahub ingest or configuring ingestion through the UI? DATAHUB_TELEMETRY_ENABLED is an environment variable that you can set if you’re not using UI ingestion.
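For readers who launch the CLI from a script, here is a minimal sketch of passing that environment variable to a `datahub ingest` run. The helper names and the recipe file name `recipe.yml` are hypothetical; only the variable name and its value `0` come from this thread.

```python
import os
import subprocess


def telemetry_disabled_env(base=None):
    """Return a copy of the environment with DataHub CLI telemetry turned off."""
    env = dict(os.environ if base is None else base)
    # Variable name and value as suggested earlier in this thread.
    env["DATAHUB_TELEMETRY_ENABLED"] = "0"
    return env


def run_ingest(recipe_path):
    """Run `datahub ingest -c <recipe>` with telemetry disabled (hypothetical helper)."""
    return subprocess.run(
        ["datahub", "ingest", "-c", recipe_path],
        env=telemetry_disabled_env(),
        check=False,
    )


# Example call, assuming a recipe file named recipe.yml exists:
# run_ingest("recipe.yml")
```

In an interactive shell the same effect should be achievable by exporting the variable before running the command.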
r
I am using datahub ingest
g
The tableau source now supports an ssl_verify flag. If set to false, it disables SSL verification altogether. If set to a path, it will be passed to Python’s requests as the path of the system certs. I’d recommend using false for now, and then setting up proper verification if that works.
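As a rough illustration of the two modes described above, the sketch below normalises an `ssl_verify` value into something that could be handed to the `verify=` argument of `requests`, which accepts either a boolean or a path to a CA bundle. The function name `resolve_verify` is hypothetical, and this is not the DataHub source code, only a way to sanity-check a setting before using it.

```python
import os
from typing import Union


def resolve_verify(ssl_verify: Union[bool, str]) -> Union[bool, str]:
    """Normalise an ssl_verify setting into a value usable as requests' verify=."""
    if isinstance(ssl_verify, bool):
        # False disables certificate verification; True uses the default CA store.
        return ssl_verify
    path = os.path.expanduser(ssl_verify)
    if not os.path.exists(path):
        # Fail early instead of hitting a confusing SSL error later.
        raise FileNotFoundError(f"CA bundle not found: {path}")
    return path


# Example use with requests (third-party, not imported here); the URL and
# bundle path are placeholders:
# requests.get("https://tableau.example.com", verify=resolve_verify("/path/to/ca-bundle.pem"))
```

Note that in YAML an unquoted `false` is parsed as a boolean, while a quoted `"false"` is a string, so the unquoted form is the safer way to write it in a recipe.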
r
After setting it to False I get the same error, asking for a certificate.
g
What’s the exact error that you see when setting ssl_verify to false?
r
Here you are, logged in the txt file.
g
That file looks empty 😞
r
Sorry for that. Here you are.
g
That stack trace indicates an error in the formatting of your YAML file. Could you try running your YAML file through a YAML validator?
r
Yes, there was something. New output.
---
source:
  type: tableau
  config:
    connect_uri: https:///
    site: ""
    projects:
      -
    username:
    password:
    ssl_verify: false
    ingest_tags: true
    ingest_owner: true
    default_schema_map:
      mydatabase: public
      anotherdatabase: anotherschema
sink:
  type: datahub-rest
  config:
    server: http://localhost:8080
g
Looks like there’s a bad proxy config somewhere in your environment?
Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 502 Bad Gateway')))
r
Hmm is it possible this needs to be set inside the container?
Why does it need to go through the proxy?
g
It doesn’t need to go through a proxy, but it looks like some environment variable is causing it to use a proxy
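One way to hunt for such a variable is to list the proxy-related settings visible to the process that runs the ingestion. The sketch below is only an illustration; the function name `find_proxy_settings` is hypothetical, and the variable names are the conventional ones that `requests` and `urllib` honor in both upper and lower case.

```python
import os

# Conventional proxy variables, checked in upper and lower case.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY", "NO_PROXY")


def find_proxy_settings(environ=None):
    """Return the proxy-related variables that are set to a non-empty value."""
    environ = os.environ if environ is None else environ
    found = {}
    for name in PROXY_VARS:
        for key in (name, name.lower()):
            value = environ.get(key)
            if value:
                found[key] = value
    return found


# Print whatever the current process sees, one variable per line.
for key, value in find_proxy_settings().items():
    print(f"{key}={value}")
```

If a proxy variable does turn up, unsetting it for the ingestion run, or adding the Tableau host to `NO_PROXY`, would be the usual things to try. When ingestion runs inside a container, the check needs to run inside that container as well.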
h
@red-analyst-79902 -- How did you solve this? I am getting a similar error for Kafka.