Hello, My datahub instance is running on GKE. I am...
# troubleshoot
p
Hello, My datahub instance is running on GKE. I am getting started with dbt (not dbt-cloud). It seems that when running the ingestion, I am getting an issue connecting to GMS. Upon doing some troubleshooting, it seems that my machine is not able to connect to the GMS server.
DataHub CLI version: 0.10.2.1
Failed to set up framework context: Failed to connect to DataHub
When I look at the GMS logs (kubectl logs for the GMS server), I get the error below:
[ThreadPoolTaskExecutor-1] WARN o.apache.kafka.clients.NetworkClient:969 - [Consumer clientId=consumer-mce-consumer-job-client-1, groupId=mce-consumer-job-client] Error connecting to node prerequisites-kafka-0.prerequisites-kafka-headless.default.svc.cluster.local:9092 (id: 0 rack: null)
What is the next course of action? It seems like GMS is not able to communicate with Kafka.
Also, do we need to explicitly build the metadata service (locally) for cli-ingestion?
---
# see https://datahubproject.io/docs/generated/ingestion/sources/dbt for complete documentation
source:
  type: "dbt"
  config:
    manifest_path: "./dbt_manifest.json"
    catalog_path: "./dbt_catalog.json"
    sources_path: "./dbt_sources.json"
    target_platform: "gbq"
# see https://datahubproject.io/docs/metadata-ingestion/sink_docs/datahub for complete documentation
sink:
  type: "datahub-rest"
  config:
    server: "http://xxxxxxx:8080"
g
For CLI ingestion, using the pip install acryl-datahub approach should be sufficient unless you're modifying the dbt source code.
The "Failed to connect to DataHub" error likely means that either your sink's server config isn't quite right or datahub-gms is down or not yet healthy.
What does the healthcheck on GMS say?
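One rough way to answer that from the machine running the CLI is to send a plain HTTP GET to GMS. The sketch below assumes GMS exposes a /health path on the same host and port as the sink's server value (the /config path is the one that appears in the CLI traceback later in this thread). The function name probe_gms and the localhost URL are placeholders, not part of DataHub.

```python
import urllib.error
import urllib.request
from typing import Tuple


def probe_gms(server: str, path: str = "/health", timeout: float = 5.0) -> Tuple[bool, str]:
    """Send one HTTP GET to a GMS endpoint and return (ok, detail)."""
    url = server.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            # Any successful response means something is answering at this URL.
            return True, f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (for example 401 or 503).
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Nothing answered: wrong host or port, blocked network path, or GMS is down.
        return False, f"no response: {exc}"


# Example (assumed URL; use the same value as the sink's "server" setting):
# print(probe_gms("http://localhost:8080", "/health"))
# print(probe_gms("http://localhost:8080", "/config"))
```

If this fails with "no response" from your laptop but the pod reports healthy inside the cluster, the problem is more likely the address or network path than GMS itself.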
p
How do I check the GMS healthcheck?
Also, it looks like the Kafka pod is now down.
Good morning @gray-shoe-75895, I rebuilt the pods and now I don't see the Kafka connection error anymore. GMS looks healthy and the Kubernetes GMS log looks good. Question: does the YAML earlier in the thread look OK? I am still seeing a connection issue:
requests.exceptions.ConnectionError: HTTPConnectionPool(host='35.222.51.134', port=8080): Max retries exceeded with url: /config (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb5be16ccd0>: Failed to establish a new connection: [Errno 60] Operation timed out'))
a
Looks like GMS isn’t healthy or the hostname isn’t correct
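To tell those two cases apart, a TCP-level check can help. This is a minimal sketch, not a DataHub tool: tcp_check is a hypothetical helper, and the host and port in the example comment are placeholders. As a general rule of thumb, a timeout (like the "Operation timed out" in the traceback) suggests packets are being dropped, for example a wrong IP, a firewall, or a service that is not exposed outside the cluster, while a refusal suggests the host is reachable but nothing is listening on that port.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify one TCP connection attempt as 'open', 'timeout', 'refused', or 'error: ...'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except (socket.timeout, TimeoutError):
        # No reply at all within the timeout.
        return "timeout"
    except ConnectionRefusedError:
        # The host replied, but no process accepts connections on this port.
        return "refused"
    except OSError as exc:
        # DNS failures, unreachable networks, and similar errors.
        return f"error: {exc}"


# Example (placeholder host; use the host and port from the sink's server URL):
# print(tcp_check("localhost", 8080))
```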
p
You are correct. I updated the hostname, and I am able to run dbt. Now having other issues and weeding through those. Thanks for your help.