# all-things-deployment
g
Hi Team, we are trying to install DataHub on EKS. 1. The elasticsearch setup job ends successfully (pod status is Completed), but we are seeing the following error at the end. We are using AWS Elasticsearch. Is this error okay?
Copy code
2022/08/29 18:07:17 Waiting for: https://<redacted>:443
2022/08/29 18:07:17 Received 200 from https://<redacted>:443

datahub_usage_event_policy exists

creating datahub_usage_event_index_template
{
  "index_patterns": ["*datahub_usage_event*"],
  "data_stream": { },
  "priority": 500,
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "type": {
          "type": "keyword"
        },
        "timestamp": {
          "type": "date"
        },
        "userAgent": {
          "type": "keyword"
        },
        "browserId": {
          "type": "keyword"
        }
      }
    },
    "settings": {
      "index.lifecycle.name": "datahub_usage_event_policy"
    }
  }
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1167  100   647  100   520  27284  21928 --:--:-- --:--:-- --:--:-- 50739
2022/08/29 18:07:17 Command finished successfully.
}{"error":{"root_cause":[{"type":"invalid_index_template_exception","reason":"index_template [datahub_usage_event_index_template] invalid, cause [Validation Failed: 1: unknown setting [index.lifecycle.name] please check that any required plugins are installed, or check the breaking changes documentation for removed settings;]"}],"type":"invalid_index_template_exception","reason":"index_template [datahub_usage_event_index_template] invalid, cause [Validation Failed: 1: unknown setting [index.lifecycle.name] please check that any required plugins are installed, or check the breaking changes documentation for removed settings;]"},"status":400}%
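For context on error 1: `index.lifecycle.name` is an open-source Elasticsearch ILM setting, and the AWS managed service does not support it (it uses ISM instead), which is why the template creation returns a 400. The DataHub AWS deployment guide addresses this with an environment variable on the setup job; a hedged sketch of the values.yaml fragment (variable name and path should be confirmed against the docs for your chart version):

```yaml
elasticsearchSetupJob:
  extraEnvs:
    - name: USE_AWS_ELASTICSEARCH
      value: "true"
```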
2. The kafka setup job is failing with these errors. We are using TLS endpoints for the MSK bootstrap servers.
Copy code
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version: 6.1.4-ccs
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId: c9124241a6ff43bc
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka startTimeMs: 1661796471985
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.common.utils.AppInfoParser - App info kafka.admin.client for adminclient-1 unregistered
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.clients.admin.internals.AdminMetadataManager - [AdminClient clientId=adminclient-1] Metadata update failed
[main] ERROR io.confluent.admin.utils.ClusterStatus - Error while getting broker list.
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1661796532063, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
	at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
	at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:149)
	at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1661796532063, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: The AdminClient thread has exited. Call: listNodes
org.apache.kafka.common.errors.TimeoutException: Call(callName=fetchMetadata, deadlineMs=1661796502062, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send the call. Call: fetchMetadata
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.common.metrics.Metrics - Metrics scheduler closed
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.common.metrics.Metrics - Closing reporter org.apache.kafka.common.metrics.JmxReporter
[kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.common.metrics.Metrics - Metrics reporters closed
[kafka-admin-client-thread | adminclient-1] ERROR org.apache.kafka.common.utils.KafkaThread - Uncaught exception in thread 'kafka-admin-client-thread | adminclient-1':
java.lang.OutOfMemoryError: Java heap space
	at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
	at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
	at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
	at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:113)
	at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:447)
	at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:397)
	at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:674)
	at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:576)
	at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:563)
	at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.processRequests(KafkaAdminClient.java:1329)
	at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1260)
	at java.lang.Thread.run(Thread.java:750)
[main] INFO io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Trying to query Kafka for metadata again ...
[main] ERROR io.confluent.admin.utils.ClusterStatus - Error while getting broker list.
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1661796532062, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
	at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
	at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:149)
	at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1661796532062, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: The AdminClient thread has exited.
Can someone help with the above errors?
For 1, I found other people have also encountered it: https://github.com/datahub-project/datahub/issues/5746
For 2, I added the following to the global values of the Helm chart
Copy code
springKafkaConfigurationOverrides:
    security.protocol: SSL
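For reference, in the datahub chart these overrides typically sit under `global` in values.yaml; a minimal sketch, assuming the key path from the chart's default values:

```yaml
global:
  springKafkaConfigurationOverrides:
    security.protocol: SSL
```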
It worked, but now there is another error
Copy code
Error while executing config command with args '--command-config /tmp/connection.properties --bootstrap-server b-3.testdatahub.kbb3mi.c14.kafka.us-west-2.amazonaws.com:9094,b-2.testdatahub.kbb3mi.c14.kafka.us-west-2.amazonaws.com:9094,b-1.testdatahub.kbb3mi.c14.kafka.us-west-2.amazonaws.com:9094 --entity-type topics --entity-name _schemas --alter --add-config cleanup.policy=compact'
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.UnknownTopicOrPartitionException: 
	at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:104)
	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:272)
	at kafka.admin.ConfigCommand$.getResourceConfig(ConfigCommand.scala:552)
	at kafka.admin.ConfigCommand$.alterConfig(ConfigCommand.scala:322)
	at kafka.admin.ConfigCommand$.processCommand(ConfigCommand.scala:302)
	at kafka.admin.ConfigCommand$.main(ConfigCommand.scala:97)
	at kafka.admin.ConfigCommand.main(ConfigCommand.scala)
Caused by: org.apache.kafka.common.errors.UnknownTopicOrPartitionException:
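For what it's worth, the `_schemas` topic is normally created by the Schema Registry, so a setup job that tries to alter its config before Schema Registry has run can hit `UnknownTopicOrPartitionException`. One way to check whether the topic exists (broker address and client config file are placeholders, not values from this thread):

```
kafka-topics --bootstrap-server b-1.<redacted>.c14.kafka.us-west-2.amazonaws.com:9094 \
  --command-config /tmp/connection.properties --list
```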
b
Have you deployed schema-registry for Kafka?
If you are using the AWS OpenSearch managed service for Elasticsearch, please set up the environment variable described here - https://datahubproject.io/docs/deploy/aws
g
Yeah, looks like I missed installing prerequisites. I had set up AWS Elasticsearch, AWS managed Kafka, and AWS RDS. I thought only these services were needed for the prerequisites. Working on installing cp-schema-registry now
b
Make sure you disable elasticsearch, mysql, and kafka in the prerequisites values.yaml, since you are using AWS managed services for these components
g
Yes, I have done that, but I still have a doubt. My prerequisites values look like this now
Copy code
elasticsearch:
  enabled: false
neo4j:
  enabled: false

neo4j-community:
  enabled: false

mysql:
  enabled: false

cp-helm-charts:
  # Schema registry is under the community license
  cp-schema-registry:
    enabled: true
    kafka:
      bootstrapServers: "b-3.<redacted>.c14.kafka.us-west-2.amazonaws.com:9094,b-2.<redacted>.c14.kafka.us-west-2.amazonaws.com:9094,b-1.<redacted>.c14.kafka.us-west-2.amazonaws.com:9094"
  cp-kafka:
    enabled: false
  cp-zookeeper:
    enabled: false
  cp-kafka-rest:
    enabled: false
  cp-kafka-connect:
    enabled: false
  cp-ksql-server:
    enabled: false
  cp-control-center:
    enabled: false

kafka:
  enabled: false
My AWS MSK brokers only support SSL, and this configuration does not provide any variable to pass SSL settings under cp-schema-registry.kafka, so I have a doubt whether it will work. I am going to try it anyway
b
Rightly said, you need to add the SSL config under cp-schema-registry like below
Copy code
cp-schema-registry:
  enabled: true
  kafka:
    bootstrapServers: "b-3.<redacted>.c14.kafka.us-west-2.amazonaws.com:9094,b-2.<redacted>.c14.kafka.us-west-2.amazonaws.com:9094,b-1.<redacted>.c14.kafka.us-west-2.amazonaws.com:9094"
  sslSecrets:
    name: ssl-config
    secureEnv:
      kafkastore.ssl.keystore.password: keystore_password
      kafkastore.ssl.key.password: keystore_password
      kafkastore.ssl.truststore.password: truststore_password
  configurationOverrides:
    kafkastore.security.protocol: SSL
    kafkastore.ssl.keystore.location: /mnt/certs/keystore
    kafkastore.ssl.truststore.location: /mnt/certs/truststore
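The sslSecrets block above expects a Kubernetes secret holding the store passwords, with the keystore/truststore files mounted at /mnt/certs. Assuming the secret name and keys used in the snippet, one way to create it might be:

```
kubectl create secret generic ssl-config \
  --from-literal=keystore_password=<your-keystore-password> \
  --from-literal=truststore_password=<your-truststore-password>
```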
g
Okay, let me try. Thank you so much for helping me out with this
l
Hi @great-branch-515! Gentle reminder to please follow our Slack guidelines and post large blocks of code/stack traces in message threads - it’s a HUGE help for us to keep track of unanswered questions across our various support channels! :teamwork:
g
Sure Maggie, I will keep that in mind
f
@numerous-autumn-22862 this might be relevant for us
n
This looks promising
r
@great-branch-515 did the above configuration work for you?