Hi team, I am running datahub Kubernetes Services(...
# getting-started
s
Hi team, I am running DataHub on Azure Kubernetes Service (AKS) using the Helm chart. I can access the UI, but I am not able to ingest data. How do I find the REST endpoint for ingestion? My gms pod and service are up and running. According to the documentation, the DataHub REST endpoint is http://localhost:8080, but I am running it in a pod. Any help debugging this? Thanks.
e
Hey. Did you expose your frontend through ingress? You can try using <<domain-to-datahub>>/api
The frontend proxies /api requests to gms directly.
Otherwise, you will need to expose gms through ingress as well.
s
Thanks for the reply. Yes, I did expose gms through ingress, but when I try to hit the endpoint I get this error: `{"exceptionClass":"com.linkedin.restli.server.RestLiServiceException","stackTrace":"com.linkedin.restli.server.RestLiServiceException [HTTP Status:404]\n\tat com.linkedin.restli.server.RestLiServiceException.fromThrowable(RestLiServiceException.java:315)\n\tat com.linkedin.restli.server.BaseRestLiServer.buildPreRoutingError(BaseRestLiServer.java:202)\n\tat`
It is a 404 error.
e
Could you run `curl <<domain-gms>>/config`?
Also, do you mind posting the full error message and the logs you see in the gms pod?
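For anyone who prefers to script the `/config` check suggested above, here is a minimal Python sketch of the same request. The function name `check_gms_config` is made up for illustration, and the sketch assumes only what the thread says: that a reachable GMS answers `GET /config`. The assumption that the response body is JSON is mine, not something the thread confirms.

```python
import json
import urllib.error
import urllib.request


def check_gms_config(base_url, timeout=5.0):
    """Send GET <base_url>/config and return (ok, detail).

    ok is True only when the server answers with a body that parses as JSON
    (assumed response format); detail is the parsed body or an error summary.
    """
    url = base_url.rstrip("/") + "/config"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            try:
                return True, json.loads(body)
            except json.JSONDecodeError:
                return False, f"HTTP {resp.status}, non-JSON body: {body[:200]}"
    except urllib.error.HTTPError as exc:
        # The server was reached but rejected the path (for example a 404).
        return False, f"HTTP {exc.code} from {url}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, connection refused, timeout, and similar.
        return False, f"cannot reach {url}: {exc}"
```

Calling `check_gms_config("https://<<domain-gms>>")` and `check_gms_config("https://<<domain-to-datahub>>/api")` with the real hostnames filled in would show which of the two routes, if any, reaches GMS. An `HTTP 404` result means something answered but the path did not match, while `cannot reach` points at DNS or networking.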
s
Sorry, I am kind of new to this. I don't see anything in the gms logs, so it seems like the request is not even hitting the pod. How do I hit the gms pod directly to test it?
e
Try the above curl! Curl the `/config` endpoint.
s
Below is my pod. Do you mean `curl datahub-datahub-gms-7846d7cd95-wffrs/config`? I get 'Could not resolve host', which is expected.
@early-lamp-41924 I am still trying to debug. Please find the error message and gms logs attached.
e
When you expose gms through ingress, you should have a URL to point your queries at. The reason for having an ingress is to expose certain ports of the service to the internet, usually by putting a load balancer in front of it. Try to look for the URL associated with the ingress you set up. <<pod-name>> is definitely not the URL.
Doc on Kubernetes ingress, in case it helps give more context here: https://kubernetes.io/docs/concepts/services-networking/ingress/#what-is-ingress
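To illustrate why the pod name does not resolve and what can be used instead while debugging, here is a small Python sketch. It builds the in-cluster DNS name of a Service and the argument list for `kubectl port-forward`. The Service name `datahub-datahub-gms`, the namespace `default`, the port 8080, and the cluster domain `cluster.local` are all assumptions (the Service name is guessed from the pod name prefix in this thread), so check them with `kubectl get svc` first.

```python
def gms_cluster_url(service, namespace, port=8080, cluster_domain="cluster.local"):
    """Build the URL at which a Service is reachable from inside the cluster.

    Kubernetes DNS names for Services follow
    <service>.<namespace>.svc.<cluster-domain>; this name only resolves
    from pods inside the cluster, not from a laptop.
    """
    return f"http://{service}.{namespace}.svc.{cluster_domain}:{port}"


def port_forward_argv(service, namespace, local_port=8080, remote_port=8080):
    """Build the kubectl command that forwards a local port to the Service."""
    return [
        "kubectl", "port-forward",
        f"svc/{service}",
        f"{local_port}:{remote_port}",
        "--namespace", namespace,
    ]


# Assumed names, for illustration only.
print(gms_cluster_url("datahub-datahub-gms", "default"))
print(" ".join(port_forward_argv("datahub-datahub-gms", "default")))
```

While the port-forward command is running, http://localhost:8080/config on the same machine goes to the Service without passing through the ingress, which helps separate an ingress routing problem from a problem in gms itself.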
s
Thank you for helping, @early-lamp-41924. Yes, that is correct: I have the ingress and load balancer set up, and when I tried to hit the gms endpoint I got the above error.
For debugging purposes I also tried to set up DataHub locally using Docker, and I am having the same issue. It was working last week locally, running as a container, and I was able to ingest using PostgreSQL as the source. I am thinking the latest DataHub might have some issues, or maybe I am missing something.
Step1
$ datahub version
DataHub CLI version: 0.8.36
Python version: 3.8.10 (default, Mar 15 2022, 12:22:08) [GCC 9.4.0]

Step2
$ datahub docker quickstart
.......
✔️ DataHub is now running
Ingest some demo data using `datahub docker ingest-sample-data`, or head to http://localhost:9002 (username: datahub, password: datahub) to play around with the frontend.

Step3
Login http://localhost:9002/

Step4
Create new source for Ingestion in the UI

source:
  type: postgres
  config:
    host_port: 'test'
    database: test
    username: test
    password: test
    include_tables: true
    include_views: true
    profiling:
      enabled: false
sink:
  type: datahub-rest
  config:
    server: 'http://localhost:8080'
Error msg
"ConnectionError: HTTPConnectionPool(host='localhost', port=8080): Max retries exceeded with url: /config (Caused by "
"NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4df582b550>: Failed to establish a new connection: [Errno 111] "
Full error msg is attached
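On Linux, `[Errno 111]` is "Connection refused": nothing accepted the TCP connection on `localhost:8080` from wherever the ingestion process was running. A minimal Python sketch for checking this from that same environment follows. The helper name `can_connect` is hypothetical, and nothing here is specific to DataHub.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Try to open a TCP connection to host:port and return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except ConnectionRefusedError:
        # Same condition as [Errno 111] on Linux: host reachable, port closed.
        return False, "connection refused (nothing listening on that port)"
    except socket.gaierror as exc:
        return False, f"cannot resolve host: {exc}"
    except OSError as exc:
        # Timeouts, unreachable networks, and other socket errors.
        return False, f"connection failed: {exc}"
```

If the ingestion runs inside a container (an assumption, since the thread does not say where it runs), `localhost` refers to that container and not to the machine where gms listens. Running `can_connect("localhost", 8080)` from inside that container, and then against the gms container's name on the Docker network, would show which address the `server` field in the recipe should use.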