# all-things-deployment
l
Hello, a question on configuring DataHub with Kafka.
• Do we need to enable the datahub-kafka-setup job when we are connecting to our own Kafka?
• What does datahub-kafka-setup do exactly? When I check the Helm templates, it seems that all components have their own Kafka configuration.
• I found that some of the deployment.yaml files in the subchart templates use KAFKA_PROPERTIES_{{ $configName | replace "." "_" | upper }} while others use SPRING_KAFKA_PROPERTIES_{{ $configName | replace "." "_" | upper }}. May I know the difference between these two configuration settings?
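For a concrete example of what I mean, an override like security.protocol: SASL_SSL appears to render under both prefixes depending on the component:
Copy code
# My reading of the subchart templates (not verified) - for an override
#   security.protocol: SASL_SSL
# the Spring-based services (gms, mae/mce consumers) appear to receive
SPRING_KAFKA_PROPERTIES_SECURITY_PROTOCOL=SASL_SSL
# while non-Spring containers (e.g. kafka-setup, actions) appear to receive
KAFKA_PROPERTIES_SECURITY_PROTOCOL=SASL_SSL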
I have deployed DataHub on our Kubernetes cluster and configured our own Kafka, but after I create a new ingestion source it cannot be executed. Here is the Kafka SSL setup for reference:
Copy code
credentialsAndCertsSecrets:
  name: datahub-certs
  path: /mnt/datahub-kafka/certs
  secureEnv:
    ssl.truststore.password: truststore.password
    kafkastore.ssl.truststore.password: truststore.password

springKafkaConfigurationOverrides:
  security.protocol: SASL_SSL
  sasl.mechanism: SCRAM-SHA-256
  ssl.ca.location: /mnt/datahub-kafka/certs/ca.crt
  ssl.truststore.location: /mnt/datahub-kafka/certs/truststore.jks
  kafkastore.ssl.truststore.location: /mnt/datahub-kafka/certs/truststore.jks
  sasl.jaas.config: "org.apache.kafka.common.security.scram.ScramLoginModule required username=\"testUser\" password=\"testPassword\";"
After checking the logs I found this error in the datahub-actions pod. Not sure if this is caused by not enabling the datahub-kafka-setup job:
Copy code
%6|1666283192.838|FAIL|rdkafka#consumer-1| [thrd:kafka-1.preprod.testing.im:19092/bootstrap]: kafka-1.preprod.testing.im:19092/bootstrap: Disconnected while requesting ApiVersion: might be caused by incorrect security.protocol configuration (connecting to a SSL listener?) or broker version is < 0.10 (see api.version.request) (after 4ms in state APIVERSION_QUERY, 3 identical error(s) suppressed)
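If it helps, I believe the listener itself can be sanity-checked with something like this (hostname taken from the log above):
Copy code
# Rough check, not DataHub-specific: if this prints a certificate chain,
# the listener speaks TLS; if it disconnects immediately, the listener is
# PLAINTEXT/SASL_PLAINTEXT and security.protocol=SASL_SSL would fail with
# exactly the rdkafka error above.
openssl s_client -connect kafka-1.preprod.testing.im:19092 </dev/null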
I also found this error in the mae-consumer:
Copy code
11:15:44.795 [R2 Nio Event Loop-1-1] WARN  c.l.r.t.h.c.c.ChannelPoolLifecycle - Failed to create channel, remote=datahub-datahub-gms/10.67.3.210:8080
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: datahub-datahub-gms/10.67.3.210:8080
Caused by: java.net.ConnectException: Connection refused
	at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:707)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at java.base/java.lang.Thread.run(Thread.java:829)
11:15:44.805 [pool-7-thread-1] ERROR c.d.m.ingestion.IngestionScheduler - Failed to retrieve ingestion sources! Skipping updating schedule cache until next refresh. start: 0, count: 30
com.linkedin.r2.RemoteInvocationException: com.linkedin.r2.RemoteInvocationException: Failed to get response from server for URI http://datahub-datahub-gms:8080/entities
	at com.linkedin.restli.internal.client.ExceptionUtil.wrapThrowable(ExceptionUtil.java:135)
	at com.linkedin.restli.internal.client.ResponseFutureImpl.getResponseImpl(ResponseFutureImpl.java:130)
	at com.linkedin.restli.internal.client.ResponseFutureImpl.getResponse(ResponseFutureImpl.java:94)
	at com.linkedin.common.client.BaseClient.sendClientRequest(BaseClient.java:36)
	at com.linkedin.entity.client.RestliEntityClient.list(RestliEntityClient.java:361)
	at com.datahub.metadata.ingestion.IngestionScheduler$BatchRefreshSchedulesRunnable.run(IngestionScheduler.java:216)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.linkedin.r2.RemoteInvocationException: Failed to get response from server for URI http://datahub-datahub-gms:8080/entities
	at com.linkedin.r2.transport.http.common.HttpBridge$1.onResponse(HttpBridge.java:67)
	at com.linkedin.r2.transport.http.client.rest.ExecutionCallback.lambda$onResponse$0(ExecutionCallback.java:64)
	... 3 common frames omitted
Caused by: com.linkedin.r2.RetriableRequestException: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: datahub-datahub-gms/10.67.3.210:8080
	at com.linkedin.r2.transport.http.client.common.ChannelPoolLifecycle.onError(ChannelPoolLifecycle.java:142)
	at com.linkedin.r2.transport.http.client.common.ChannelPoolLifecycle.lambda$create$0(ChannelPoolLifecycle.java:97)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:571)
	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:550)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:491)
	at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:616)
	at io.netty.util.concurrent.DefaultPromise.setFailure0(DefaultPromise.java:609)
	at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:117)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:321)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:337)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:707)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	... 1 common frames omitted
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: datahub-datahub-gms/10.67.3.210:8080
Caused by: java.net.ConnectException: Connection refused
	at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777)
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:707)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at java.base/java.lang.Thread.run(Thread.java:829)
I'm guessing this is caused by enabling the standalone consumers, and that the pods' networking needs to be configured?
b
For the last question, you can always start with GMS running the MCE and MAE consumers embedded; I feel it makes things easier.
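In values.yaml that is roughly the following — the exact key may differ by chart version, so treat it as a sketch:
Copy code
# Sketch only - check the values.yaml of your chart version for the exact
# flag. The idea: don't deploy standalone consumer pods, so gms runs the
# mae/mce consumers in-process.
global:
  datahub_standalone_consumers_enabled: false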
From my experience, the kafka-setup job container is not needed if you create the related topics, with their schemas, in your Kafka cluster beforehand.
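If you go that route, pre-creating the topics looks roughly like this — the topic names are the defaults I remember, so check the kafka-setup job's script for the authoritative list for your DataHub version:
Copy code
# Sketch: pre-create the default DataHub topics. Names, partition count and
# replication factor are illustrative; client.properties is assumed to hold
# your SASL_SSL client settings.
for t in MetadataChangeProposal_v1 FailedMetadataChangeProposal_v1 \
         MetadataChangeLog_Versioned_v1 MetadataChangeLog_Timeseries_v1 \
         MetadataChangeEvent_v4 MetadataAuditEvent_v4 FailedMetadataChangeEvent_v4 \
         DataHubUsageEvent_v1; do
  kafka-topics.sh --create --if-not-exists \
    --bootstrap-server kafka-1.preprod.test.im:19092 \
    --command-config client.properties \
    --partitions 1 --replication-factor 3 --topic "$t"
done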
For the Kafka SSL setup, you need to refer to the application.yml, then set up your Docker env accordingly.
l
Thank you for the responses.
I have deployed DataHub using the Helm chart provided here and followed the instructions here to set up the Kafka connection with SSL. Here are my YAML settings for Kafka:
Copy code
kafka:
  bootstrap:
    server: "kafka-1.preprod.test.im:19092,kafka-2.preprod.test.im:19092,kafka-3.preprod.test.im:19092"
  schemaregistry:
    url: "http://kafka-1.preprod.test.im:8081,http://kafka-2.preprod.test.im:8081,http://kafka-3.preprod.test.im:8081"

credentialsAndCertsSecrets:
  name: datahub-certs
  path: /mnt/datahub-kafka/certs
  secureEnv:
    ssl.truststore.password: truststore.password

springKafkaConfigurationOverrides:
  security.protocol: SASL_SSL
  sasl.mechanism: SCRAM-SHA-256
  ssl.truststore.location: /mnt/datahub-kafka/certs/truststore.jks
  ssl.ca.location: /mnt/datahub-kafka/certs/ca.crt
  sasl.jaas.config: "org.apache.kafka.common.security.scram.ScramLoginModule required username=\"testuser\" password=\"testPass\";"
But the datahub-kafka-setup-job is unable to run and is failing with the following errors:
Copy code
Caused by: org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Am I missing something in the SSL configuration? Thanks
b
Since you are using a self-signed SSL certificate, you will need to add your root cert into the Java keystore so the SSL cert can be trusted.
Something like this:
Copy code
sudo "$JAVA_HOME/bin/keytool" -import -keystore "$JAVA_HOME/jre/lib/security/cacerts" -storepass changeit -noprompt -file YOUR-ROOT.cer -alias ANYNAME-YOU-WANT
It also means you will need to add this command to GMS's Dockerfile.
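An alternative sketch, since your springKafkaConfigurationOverrides already point at a truststore.jks: build that truststore directly from the root CA instead of modifying the JVM-wide cacerts (alias and paths below are illustrative). Note also that on Java 9+ the default cacerts lives at $JAVA_HOME/lib/security/cacerts, without the jre/ segment.
Copy code
# Build the truststore.jks that ssl.truststore.location points at from the
# self-signed root CA, then list it to confirm the CA was imported.
keytool -importcert -trustcacerts -noprompt \
  -alias kafka-root-ca -file ca.crt \
  -keystore truststore.jks -storepass "$TRUSTSTORE_PASSWORD"
keytool -list -keystore truststore.jks -storepass "$TRUSTSTORE_PASSWORD"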
l
But I thought that by using the Helm chart I was already passing the SSL credentials by setting the environment variables for Spring Boot.
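My mental model was that the chart renders something like this for GMS — the exact names are my guess from reading the templates, not verified:
Copy code
# Rough, illustrative sketch of what I expect the subchart to render.
env:
  - name: SPRING_KAFKA_PROPERTIES_SSL_TRUSTSTORE_PASSWORD
    valueFrom:
      secretKeyRef:
        name: datahub-certs        # credentialsAndCertsSecrets.name
        key: truststore.password   # key listed under secureEnv
volumeMounts:
  - name: datahub-certs
    mountPath: /mnt/datahub-kafka/certs   # credentialsAndCertsSecrets.path
volumes:
  - name: datahub-certs
    secret:
      secretName: datahub-certs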