Title
e

elegant-evening-28502

07/14/2022, 5:47 PM
Hi All, I'm trying to setup DataHub on AWS EKS using helm charts, using AWS managed services for the storage. Having issue with the
kafkaSetupJob
, getting below error
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.common.utils.AppInfoParser - App info kafka.admin.client for adminclient-1 unregistered
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.clients.admin.internals.AdminMetadataManager - [AdminClient clientId=adminclient-1] Metadata update failed
[main] ERROR io.confluent.admin.utils.ClusterStatus - Error while getting broker list.
org.apache.kafka.common.errors.TimeoutException: Call(callName=fetchMetadata, deadlineMs=1657820045949, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call. Call: fetchMetadata
kafkaSetupJob:
  enabled: true
  tolerations:
    - key: "app.lucid.cloud/tolerate-app"
      operator: "Equal"
      value: "generic"
      effect: "NoSchedule"
  image:
    repository: linkedin/datahub-kafka-setup
    tag: "v0.8.40"
  serviceAccountName: datahub
passing the serviceaccount name but looks like it's ignored, when looked at the pod settings it set to default
restartPolicy: Never
  terminationGracePeriodSeconds: 30
  dnsPolicy: ClusterFirst
  serviceAccountName: default
  serviceAccount: default
  nodeName: ip-*
i

incalculable-ocean-74010

07/14/2022, 7:16 PM
Hello Vamshi, KafkaSetupJob connects to AWS via the
global.kafka.bootstrap.server
properties. See the default values.yaml as an example: https://github.com/acryldata/datahub-helm/blob/master/charts/datahub/values.yaml
e

elegant-evening-28502

07/14/2022, 7:22 PM
Hi Pedro! ya I updated the values in the values.yaml file
kafka:
    enabled: true
    bootstrap:
      server: "${datahub_kafka_endpoint}"
    zookeeper:
      server: "${datahub_zookeeper_endpoint}"
    partitions: 3
    replicationFactor: 3
    schemaregistry:
      type: AWS_GLUE
      glue:
        region: us-east-1
        registry: "${datahub_glue_registry}"
Created MSK cluster with version
3.2.0
and enabled authentication via IAM
i

incalculable-ocean-74010

07/14/2022, 7:27 PM
Bear in mind that DataHub has not been tested with kafka v3 as far as I’m aware
👍 1
"${datahub_kafka_endpoint}"
Is this pointing to the non-ssl broker ips?
e

elegant-evening-28502

07/14/2022, 7:30 PM
yes it's pointing to a private endpoint
<http://b-2.devdatahub.555j12.c22.kafka.us-east-1.amazonaws.com:9098|b-2.devdatahub.555j12.c22.kafka.us-east-1.amazonaws.com:9098>
will try creating new MSK cluster with v 2.6.2
how to pass serviceaccount to the
kafkaSetupJob
?
at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
	at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:149)
	at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1657830804456, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: The AdminClient thread has exited. Call: listNodes
org.apache.kafka.common.errors.TimeoutException: Call(callName=fetchMetadata, deadlineMs=1657830774455, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call. Call: fetchMetadata
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.common.metrics.Metrics - Metrics scheduler closed
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.common.metrics.Metrics - Closing reporter org.apache.kafka.common.metrics.JmxReporter
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.common.metrics.Metrics - Metrics reporters closed
[kafka-admin-client-thread | adminclient-1] ERROR org.apache.kafka.common.utils.KafkaThread - Uncaught exception in thread 'kafka-admin-client-thread | adminclient-1':
java.lang.OutOfMemoryError: Java heap space
	at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
	at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
	at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
	at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:113)
	at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:447)
	at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:397)
	at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:674)
	at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:576)
	at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:563)
	at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.processRequests(KafkaAdminClient.java:1329)
	at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1260)
	at java.lang.Thread.run(Thread.java:750)
[main] INFO io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Trying to query Kafka for metadata again ...
Created MSK cluster with V2.6.2, got same error^
i

incalculable-ocean-74010

07/14/2022, 9:21 PM
I don’t think the helm chart is prepared for service accounts
And I don’t know how service accounts work in K8s
Is your private endpoint ssl enabled? Usually non-ssl brokers are on port 9092
e

elegant-evening-28502

07/14/2022, 9:41 PM
sorry I'm new to Kafka, please bare with me. I have the authentication type set to
IAM
, and encryption is enabled
when changed the endpoint to
"<http://b-2.devdatahub.kpyyyn.c22.kafka.us-east-1.amazonaws.com:9092|b-2.devdatahub.kpyyyn.c22.kafka.us-east-1.amazonaws.com:9092>"
getting below error
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka startTimeMs: 1657834223859
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.clients.admin.internals.AdminMetadataManager - [AdminClient clientId=adminclient-1] Metadata update failed
org.apache.kafka.common.errors.TimeoutException: Call(callName=fetchMetadata, deadlineMs=1657834253864, tries=1, nextAllowedTryMs=1657834253965) timed out at 1657834253865 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call. Call: fetchMetadata
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.clients.admin.internals.AdminMetadataManager - [AdminClient clientId=adminclient-1] Metadata update failed
[main] ERROR io.confluent.admin.utils.ClusterStatus - Error while getting broker list.
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1657834283865, tries=1, nextAllowedTryMs=1657834283966) timed out at 1657834283866 after 1 attempt(s)
	at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
	at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:149)
	at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1657834283865, tries=1, nextAllowedTryMs=1657834283966) timed out at 1657834283866 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: listNodes
org.apache.kafka.common.errors.TimeoutException: Call(callName=fetchMetadata, deadlineMs=1657834283865, tries=1, nextAllowedTryMs=1657834283966) timed out at 1657834283866 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call. Call: fetchMetadata
[main] INFO io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Trying to query Kafka for metadata again ...
[main] ERROR io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Brokers found [].
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Error while executing topic command : Call(callName=createTopics, deadlineMs=1657834349048, tries=1, nextAllowedTryMs=1657834349150) timed out at 1657834349050 after 1 attempt(s)
[2022-07-14 21:32:29,052] ERROR org.apache.kafka.common.errors.TimeoutException: Call(callName=createTopics, deadlineMs=1657834349048, tries=1, nextAllowedTryMs=1657834349150) timed out at 1657834349050 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: createTopics
 (kafka.admin.TopicCommand$)
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
i

incalculable-ocean-74010

07/14/2022, 9:51 PM
Is the kubernetes cluster in the same VPC as the kafka cluster? Can they communicate with one another?
If you launch a pod in kubernetes and try to ping kafka, are you able to?
Do you get a response?
e

elegant-evening-28502

07/14/2022, 10:03 PM
ya kuberentes cluster and kafak cluster are on same vpc
The following list provides the numbers of the ports that Amazon MSK uses to communicate with client machines.

To communicate with brokers in plaintext, use port 9092.

To communicate with brokers by using TLS encryption, use port 9094 for access from within AWS and port 9194 for public access.

To communicate with brokers by using SASL/SCRAM, use port is 9096 for access from within AWS and port 9196 for public access.

To communicate with brokers in a cluster that is set up to use IAM access control, use port 9098 for access from within AWS and port 9198 for public access.

Apache ZooKeeper nodes use port 2181 by default. To communicate with Apache ZooKeeper by using TLS encryption, use port 2182.
can't ping kafka from the pod
created MSK with no authentication, was able to connect to the cluster
Thank You for looking into it Pedro!