# troubleshoot
  • prehistoric-room-17640
    02/14/2022, 2:10 PM
    (through search)
  • prehistoric-room-17640
    02/14/2022, 2:14 PM
    It must be related to the Elasticsearch index, but I don't see any exceptions in the GMS pod, just this warning.
    Copy code
    14:10:14.862 [Thread-3020] WARN  org.elasticsearch.client.RestClient:65 - request [POST http://elasticsearch-master:9200/*index_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true] returned 2 warnings: [299 Elasticsearch-7.16.2-2b937c44140b6559905130a8650c64dbd0879cfb "Elasticsearch built-in security features are not enabled. Without authentication, your cluster could be accessible to anyone. See https://www.elastic.co/guide/en/elasticsearch/reference/7.16/security-minimal-setup.html to enable security."],[299 Elasticsearch-7.16.2-2b937c44140b6559905130a8650c64dbd0879cfb "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
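    [Editor's note: both entries above are warnings, not errors, so they are unlikely to be the root cause of a search problem. If you want to silence the first one on a test cluster, security can be enabled via Elasticsearch config; a minimal sketch, assuming the Elastic Helm chart and its esConfig override (DataHub components would then also need matching credentials configured):]
    Copy code
    # values.yaml for the Elastic Helm chart (sketch)
    esConfig:
      elasticsearch.yml: |
        xpack.security.enabled: true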
  • brave-businessperson-3969
    02/14/2022, 8:37 PM
    Hi everyone, I have a question concerning ingestion: we deployed DataHub on an OpenShift cluster for testing purposes. Pods look fine as far as I can tell, and the frontend is accessible via web browser. However, it is somehow not possible to ingest data. From the pod which performs the ingestion (self-built), the GMS service should be reachable (wget http://datahub-gms:8080/config returns a JSON file), but when running datahub ingest, after 30 or 40 seconds I get the following warning a few times and then datahub ingest just exits: WARNING {urllib3.connectionpool:810} - Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ProxyError('Cannot connect to proxy.', NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc756d78400>: Failed to establish a new connection: [Errno 110] Connection timed out'))': http://datahub-gms:8080/config Any idea what could cause this error? (From the ingestion pod/container, currently only the GMS pod is reachable; as a sink we use datahub-rest.)
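    [Editor's note: a ProxyError from urllib3 usually means HTTP(S)_PROXY is set in the pod's environment and the request to datahub-gms is being routed through a proxy that cannot reach it; wget may behave differently if it ignores those variables. A sketch, assuming the standard proxy environment variables (recipe.yml is a placeholder):]
    Copy code
    # Exclude the in-cluster GMS host from proxying before ingesting
    export NO_PROXY=datahub-gms,localhost,127.0.0.1
    export no_proxy=$NO_PROXY
    datahub ingest -c recipe.yml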
  • rich-policeman-92383
    02/17/2022, 7:29 AM
    Hi, using the Rest.li API, how do I define ownership of a group? What would be the exact payload to define group ownership? https://datahubproject.io/docs/metadata-service/#get-a-corpgroup
    👀 1
    plus1 1
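    [Editor's note: ownership is attached as an Ownership aspect whose owner urn points at the group. A sketch based on the documented /entities?action=ingest examples; the dataset and group urns are placeholders:]
    Copy code
    curl 'http://localhost:8080/entities?action=ingest' -X POST \
      -H 'X-RestLi-Protocol-Version: 2.0.0' --data '{
      "entity": {
        "value": {
          "com.linkedin.metadata.snapshot.DatasetSnapshot": {
            "urn": "urn:li:dataset:(urn:li:dataPlatform:hive,fct_users_created,PROD)",
            "aspects": [{
              "com.linkedin.common.Ownership": {
                "owners": [{ "owner": "urn:li:corpGroup:bfoo", "type": "DATAOWNER" }],
                "lastModified": { "time": 0, "actor": "urn:li:corpuser:datahub" }
              }
            }]
          }
        }
      }
    }'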
  • damp-minister-31834
    02/18/2022, 3:40 AM
    What REST API should I call?
  • gifted-piano-21322
    02/21/2022, 10:13 AM
    Awesome, thanks!
  • broad-thailand-41358
    02/22/2022, 6:49 PM
    No ideas? This is getting quite frustrating.
  • able-rain-74449
    03/01/2022, 2:08 PM
    Hi all, I am getting an error when I deploy
    prerequisites-cp-schema-registry
    and I'm not sure if it's Kafka failing to connect.
    Copy code
    ➜  01pre-req kubectl logs datahub-prerequisites-cp-schema-registry-65d8777cc8-m88mn cp-schema-registry-server
    ===> User
    uid=1000(appuser) gid=1000(appuser) groups=1000(appuser)
    ===> Configuring ...
    ===> Running preflight checks ... 
    ===> Check if Kafka is healthy ...
    [main] INFO org.apache.kafka.clients.admin.AdminClientConfig - AdminClientConfig values: 
            bootstrap.servers = [z-1.datahub-demo-cluster-......................OMITTED:9092]
            client.dns.lookup = use_all_dns_ips
            client.id = 
            connections.max.idle.ms = 300000
            default.api.timeout.ms = 60000
            metadata.max.age.ms = 300000
            metric.reporters = []
            metrics.num.samples = 2
            metrics.recording.level = INFO
            metrics.sample.window.ms = 30000
            receive.buffer.bytes = 65536
            reconnect.backoff.max.ms = 1000
            reconnect.backoff.ms = 50
            request.timeout.ms = 30000
            retries = 2147483647
            retry.backoff.ms = 100
            sasl.client.callback.handler.class = null
            sasl.jaas.config = null
            sasl.kerberos.kinit.cmd = /usr/bin/kinit
            sasl.kerberos.min.time.before.relogin = 60000
            sasl.kerberos.service.name = null
            sasl.kerberos.ticket.renew.jitter = 0.05
            sasl.kerberos.ticket.renew.window.factor = 0.8
            sasl.login.callback.handler.class = null
            sasl.login.class = null
            sasl.login.refresh.buffer.seconds = 300
            sasl.login.refresh.min.period.seconds = 60
            sasl.login.refresh.window.factor = 0.8
            sasl.login.refresh.window.jitter = 0.05
            sasl.mechanism = GSSAPI
            security.protocol = PLAINTEXT
            security.providers = null
            send.buffer.bytes = 131072
            socket.connection.setup.timeout.max.ms = 127000
            socket.connection.setup.timeout.ms = 10000
            ssl.cipher.suites = null
            ssl.enabled.protocols = [TLSv1.2, TLSv1.3]
            ssl.endpoint.identification.algorithm = https
            ssl.engine.factory.class = null
            ssl.key.password = null
            ssl.keymanager.algorithm = SunX509
            ssl.keystore.certificate.chain = null
            ssl.keystore.key = null
            ssl.keystore.location = null
            ssl.keystore.password = null
            ssl.keystore.type = JKS
            ssl.protocol = TLSv1.3
            ssl.provider = null
            ssl.secure.random.implementation = null
            ssl.trustmanager.algorithm = PKIX
            ssl.truststore.certificates = null
            ssl.truststore.location = null
            ssl.truststore.password = null
            ssl.truststore.type = JKS
    
    [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version: 6.1.0-ccs
    [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId: 5496d92defc9bbe4
    [main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka startTimeMs: 1646143502378
    [kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.clients.admin.internals.AdminMetadataManager - [AdminClient clientId=adminclient-1] Metadata update failed
    org.apache.kafka.common.errors.TimeoutException: Call(callName=fetchMetadata, deadlineMs=1646143532389, tries=1, nextAllowedTryMs=1646143532490) timed out at 1646143532390 after 1 attempt(s)
    Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call. Call: fetchMetadata
    [main] ERROR io.confluent.admin.utils.ClusterStatus - Error while getting broker list.
    java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1646143542388, tries=1, nextAllowedTryMs=1646143542489) timed out at 1646143542389 after 1 attempt(s)
            at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
            at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
            at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
            at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
            at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:149)
            at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
    Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1646143542388, tries=1, nextAllowedTryMs=1646143542489) timed out at 1646143542389 after 1 attempt(s)
    Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: listNodes
    [main] INFO io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Trying to query Kafka for metadata again ...
    [main] ERROR io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Brokers found [].
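    [Editor's note: the preflight check is timing out before it can list brokers, which points at reachability or a protocol mismatch rather than a DataHub problem. Also worth checking: the z-1. hostname prefix usually denotes an MSK ZooKeeper node (broker endpoints start with b-), and ZooKeeper does not listen on 9092, which would produce exactly this timeout; the bootstrap setting needs the broker bootstrap string. A connectivity sketch from inside the namespace (<broker-bootstrap> is a placeholder):]
    Copy code
    kubectl -n datahub run kafka-net-test --rm -it --restart=Never \
      --image=confluentinc/cp-kafka:6.1.0 -- \
      kafka-broker-api-versions --bootstrap-server <broker-bootstrap>:9092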
  • able-rain-74449
    03/01/2022, 2:09 PM
    Any help would be great. BTW: I have converted the Helm charts into plain YAML.
  • miniature-account-72792
    03/01/2022, 2:39 PM
    Have you set the correct bootstrap server in the
    values.yaml
    of the prerequisites?
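    [Editor's note: for the Helm path, the setting lives in the cp-schema-registry subchart of cp-helm-charts; a sketch, with the key path as in that chart's values (verify against your chart version) and a placeholder endpoint:]
    Copy code
    # prerequisites values.yaml (sketch)
    cp-helm-charts:
      cp-schema-registry:
        kafka:
          bootstrapServers: "PLAINTEXT://b-1.example.kafka.amazonaws.com:9092"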
  • able-rain-74449
    03/01/2022, 2:42 PM
    I am not using Helm.
  • able-rain-74449
    03/01/2022, 2:43 PM
    So my deployment looks like:
    Copy code
    ---
    # Source: datahub-prerequisites/charts/cp-helm-charts/charts/cp-schema-registry/templates/deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: datahub-prerequisites-cp-schema-registry
      namespace: datahub
      labels:
        app: cp-schema-registry
        chart: cp-schema-registry-0.1.0
        release: datahub-prerequisites
        heritage: Helm
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: cp-schema-registry
          release: datahub-prerequisites
      template:
        metadata:
          labels:
            app: cp-schema-registry
            release: datahub-prerequisites
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/port: "5556"
        spec:
          containers:
            - name: prometheus-jmx-exporter
              image: "solsson/kafka-prometheus-jmx-exporter@sha256:6f82e2b0464f50da8104acd7363fb9b995001ddff77d248379f8788e78946143"
              imagePullPolicy: "IfNotPresent"
              command:
              - java
              - -XX:+UnlockExperimentalVMOptions
              - -XX:+UseCGroupMemoryLimitForHeap
              - -XX:MaxRAMFraction=1
              - -XshowSettings:vm
              - -jar
              - jmx_prometheus_httpserver.jar
              - "5556"
              - /etc/jmx-schema-registry/jmx-schema-registry-prometheus.yml
              ports:
              - containerPort: 5556
              resources:
                {}
              volumeMounts:
              - name: jmx-config
                mountPath: /etc/jmx-schema-registry
            - name: cp-schema-registry-server
              image: "confluentinc/cp-schema-registry:6.1.0"
              imagePullPolicy: "IfNotPresent"
              ports:
                - name: schema-registry
                  containerPort: 8081
                  protocol: TCP
                - containerPort: 5555
                  name: jmx
              resources:
                {}
              env:
              - name: SCHEMA_REGISTRY_HOST_NAME
                valueFrom:
                  fieldRef:
                    fieldPath: status.podIP
              - name: SCHEMA_REGISTRY_LISTENERS
                value: http://0.0.0.0:8081
              - name: SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS
                value: z-1.datahub-demo-cluster-1..............OMITTED..........:9092 #:9092
              - name: SCHEMA_REGISTRY_KAFKASTORE_GROUP_ID
                value: datahub-prerequisites
              - name: SCHEMA_REGISTRY_MASTER_ELIGIBILITY
                value: "true"
              - name: SCHEMA_REGISTRY_HEAP_OPTS
                value: "-Xms512M -Xmx512M"
              - name: JMX_PORT
                value: "5555"
          volumes:
          - name: jmx-config
            configMap:
              name: datahub-prerequisites-cp-schema-registry-jmx-configmap
  • able-rain-74449
    03/01/2022, 2:44 PM
    The configmap.yaml:
    Copy code
    ---
    # Source: datahub-prerequisites/charts/cp-helm-charts/charts/cp-schema-registry/templates/jmx-configmap.yaml
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: datahub-prerequisites-cp-schema-registry-jmx-configmap
      namespace: datahub
      labels:
        app: cp-schema-registry
        chart: cp-schema-registry-0.1.0
        release: datahub-prerequisites
        heritage: Helm
    data:
      jmx-schema-registry-prometheus.yml: |+
        jmxUrl: service:jmx:rmi:///jndi/rmi://localhost:5555/jmxrmi
        lowercaseOutputName: true
        lowercaseOutputLabelNames: true
        ssl: false
        whitelistObjectNames:
        - kafka.schema.registry:type=jetty-metrics
        - kafka.schema.registry:type=master-slave-role
        - kafka.schema.registry:type=jersey-metrics
        rules:
        - pattern : 'kafka.schema.registry<type=jetty-metrics>([^:]+):'
          name: "cp_kafka_schema_registry_jetty_metrics_$1"
        - pattern : 'kafka.schema.registry<type=master-slave-role>([^:]+):'
          name: "cp_kafka_schema_registry_master_slave_role"
        - pattern : 'kafka.schema.registry<type=jersey-metrics>([^:]+):'
          name: "cp_kafka_schema_registry_jersey_metrics_$1"
  • able-rain-74449
    03/01/2022, 2:45 PM
    And the service.yaml:
    Copy code
    ---
    # Source: datahub-prerequisites/charts/cp-helm-charts/charts/cp-schema-registry/templates/service.yaml
    apiVersion: v1
    kind: Service
    metadata:
      name: datahub-prerequisites-cp-schema-registry
      namespace: datahub
      labels:
        app: cp-schema-registry
        chart: cp-schema-registry-0.1.0
        release: datahub-prerequisites
        heritage: Helm
    spec:
      ports:
        - name: schema-registry
          port: 8081
        - name: metrics
          port: 5556
      selector:
        app: cp-schema-registry
        release: datahub-prerequisites
  • able-rain-74449
    03/01/2022, 2:49 PM
    Also,
    datahub-elasticsearch-master-2
    is not ready 🤔
  • red-napkin-59945
    03/01/2022, 9:30 PM
    Hey team, I would like to check the status of the "Long Term" items described here.
  • miniature-account-72792
    03/02/2022, 7:12 AM
    I also saw that my
    datahub-upgrade-job
    is failing with the following error
    Copy code
    Cannot connect to GMSat host datahub-datahub-gms port 8080. Make sure GMS is on the latest version and is running at that host before starting the migration.
    Is this also related to the fact that I use certificates?
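    [Editor's note: the upgrade job resolves GMS from its environment (host/port), so the first thing to rule out is plain reachability; if GMS is served over TLS, the job also needs the truststore. A sketch using a throwaway curl pod (the namespace is a placeholder):]
    Copy code
    kubectl -n <namespace> run gms-check --rm -it --restart=Never \
      --image=curlimages/curl -- curl -v http://datahub-datahub-gms:8080/config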
  • bland-orange-95847
    03/02/2022, 9:50 AM
    Just found this thread; I have the same issue as @numerous-application-54063 with BigQuery. The first run works and the checkpoint gets created, but the second run cannot read the checkpoint and fails with
    Message: "Failed to construct checkpoint's config from checkpoint aspect."
    Arguments: (ConfigurationError('BigQuery project ids are globally unique. You do not need to specify a platform instance.'),)
    I think something is off with platform instances, as they are not supported by the BigQuery source.
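    [Editor's note: the error suggests a platform_instance made it into the first run's checkpoint even though the BigQuery source rejects it. A recipe sketch (the project id is a placeholder; field names per the ingestion framework):]
    Copy code
    source:
      type: bigquery
      config:
        project_id: my-project      # placeholder
        # platform_instance: ...    # omit for bigquery; project ids are globally unique
        stateful_ingestion:
          enabled: true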
  • red-napkin-59945
    03/03/2022, 5:25 PM
    I would like to know what is
    FACET_FIELDS
  • rhythmic-bear-20384
    03/04/2022, 5:17 AM
    The DataHub actions container seems to get killed after a while when using the quickstart, leading to ingestion being non-responsive. The logs from the actions container show that the health check URL is unreachable. The datahub-gms container is up and running, and I verified that the actions container is part of the datahub-network. Any ideas on what is happening, and suggestions for fixes?
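    [Editor's note: one way to narrow this down is to probe the GMS health endpoint from inside the same Docker network the actions container uses; a sketch, with network and container names assumed from the quickstart setup described above:]
    Copy code
    docker run --rm --network datahub_network curlimages/curl -sS http://datahub-gms:8080/health
    docker logs --tail 100 datahub-actions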
  • gorgeous-dinner-4055
    03/16/2022, 6:00 AM
    Sorry to revive this old thread, but could you clarify 2, John? In the GraphQLEntityResolver I am seeing: https://github.com/datahub-project/datahub/blob/55357783f330950408e4624b3f1421594c[…]rc/main/java/com/linkedin/datahub/graphql/GmsGraphQLEngine.java Which is used for the autocomplete feature: https://github.com/datahub-project/datahub/blob/55357783f330950408e4624b3f1421594c[…]rc/main/java/com/linkedin/datahub/graphql/GmsGraphQLEngine.java Without that turned on, I'm unable to make a new entity show up in search. In the UI below, the GraphQL call for autocomplete invokes the
    getAutoCompleteMultipleResults
    function, and the searchable types are registered for autocomplete:
    Copy code
    .dataFetcher("autoCompleteForMultiple", new AuthenticatedResolver<>(
                        new AutoCompleteForMultipleResolver(searchableTypes)))
  • early-midnight-66457
    03/16/2022, 7:58 AM
    I am facing this error after running the app for almost an hour.
  • early-midnight-66457
    03/16/2022, 7:59 AM
    The app is trying to create a new thread and is unable to do so.
  • early-midnight-66457
    03/16/2022, 7:59 AM
    Any suggestions would be helpful.
  • fierce-author-36990
    03/16/2022, 10:10 AM
    [two image attachments]
  • high-family-71209
    03/18/2022, 12:15 PM
    This seems like an unsolved issue for quickstart: there appears to be a race condition where ZooKeeper doesn't come up in time for kafka-setup.
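    [Editor's note: if it is an ordering problem, one workaround is to gate kafka-setup on the broker being healthy rather than merely started; a compose-override sketch, assuming the quickstart service names and that the broker service defines a healthcheck:]
    Copy code
    # docker-compose.override.yml (sketch)
    services:
      kafka-setup:
        depends_on:
          broker:
            condition: service_healthy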
  • little-salesmen-55578
    03/23/2022, 4:59 PM
    Can anyone help debug this? I am out of ideas now 🙂
    👀 1
  • bulky-intern-2942
    03/30/2022, 7:40 PM
    Hi Pedro, okay, I've just deleted the message posted in the other channel. I'm gonna downgrade the cluster version and retry the installation process. Thanks.
  • sticky-dawn-95000
    04/01/2022, 7:22 AM
    I tried to run DataHub using the CLI command 'datahub docker quickstart', but I got an error like the one below:
  • brief-businessperson-12356
    04/04/2022, 11:12 AM
    Finally managed to get this working! I made two small changes which seemed to do the trick: 1. Created a new Java truststore that contained just the CA for mkcert. 2. Created a ConfigMap from that new truststore:
    Copy code
    kubectl create configmap truststore-configmap --from-file=newTruststore
    🎉 3
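    [Editor's note: for completeness, step 1 above can be done with keytool; a sketch assuming mkcert's root CA has been exported to rootCA.pem:]
    Copy code
    # Build a truststore containing only the mkcert root CA
    keytool -importcert -noprompt -alias mkcert-root \
      -file rootCA.pem -keystore newTruststore -storepass changeit
    # Then create the ConfigMap from it
    kubectl create configmap truststore-configmap --from-file=newTruststore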