# all-things-deployment
b
Hello, it looks like a lot of people have deployed DataHub on EKS already, and I am in the process of doing the same. I am having an issue with the kafka-setup-job pod when configuring DataHub to work with SSL on AWS MSK. Is there an example of how to create the datahub-certs secret on K8s? I found one for creating the MySQL and Elasticsearch secrets in the DataHub docs, but not for SSL secrets for Kafka. A sample snippet showing how to create the datahub-certs secret would help a lot. Thanks in advance!
d
Also experiencing this right now, help would be much appreciated!
b
@early-lamp-41924 Can you point these kind folks in the right direction?
e
Hi folks. Here is a script we use to create the secret. See if this is applicable in your ecosystem as well!
TRUSTSTORE_PASSWORD=$(pwgen -s -1 14)
  KEYSTORE_PASSWORD=$(pwgen -s -1 14)
  cp ${JAVA_HOME}/lib/security/cacerts kafka.client.truststore.jks
  keytool -storepasswd -keystore kafka.client.truststore.jks -storepass changeit -new ${TRUSTSTORE_PASSWORD}
  keytool -genkey -keystore kafka.client.keystore.jks -validity 300 -storepass ${KEYSTORE_PASSWORD} -dname "CN=${namespace}" -alias ${namespace} -storetype pkcs12
  keytool -keystore kafka.client.keystore.jks -certreq -file client-cert-sign-request -alias ${namespace} -storepass ${KEYSTORE_PASSWORD} -keypass ${KEYSTORE_PASSWORD}
  sed -i -e "s/NEW CERTIFICATE REQUEST/CERTIFICATE REQUEST/g" client-cert-sign-request
  PRIVATE_CA_ARN=$(aws ssm get-parameters --region us-west-2 --name "/private-ca/root/arn" --with-decryption --no-cli-pager --query "Parameters[*].{Value:Value}" --output text | tr -d "\n")
  CERTIFICATE_ARN=$(aws acm-pca issue-certificate --region us-west-2 --certificate-authority-arn ${PRIVATE_CA_ARN} --csr fileb://client-cert-sign-request --signing-algorithm "SHA256WITHRSA" --validity Value=300,Type="DAYS" | jq -r '.CertificateArn')
  sleep 10 ## Sleep to make sure the issue certificate command has finished
  aws acm-pca get-certificate --region us-west-2 --certificate-authority-arn ${PRIVATE_CA_ARN} --certificate-arn ${CERTIFICATE_ARN} | jq -r '[.Certificate, .CertificateChain] | join("\n")' > signed-certificate-from-acm
  keytool -keystore kafka.client.keystore.jks -import -file signed-certificate-from-acm -alias ${namespace} -storepass ${KEYSTORE_PASSWORD} -keypass ${KEYSTORE_PASSWORD} -noprompt
  kubectl create secret generic ssl-config --from-file=keystore=./kafka.client.keystore.jks --from-file=truststore=./kafka.client.truststore.jks --from-literal=keystore_password=$KEYSTORE_PASSWORD --from-literal=truststore_password=$TRUSTSTORE_PASSWORD --namespace ${namespace}
After doing ^ you can set the following in the values.yaml
global:
  ...
  credentialsAndCertsSecrets:
    name: ssl-config
    secureEnv:
      ssl.keystore.password: keystore_password
      ssl.key.password: keystore_password
      ssl.truststore.password: truststore_password

  springKafkaConfigurationOverrides:
    security.protocol: SSL
    ssl.keystore.location: /mnt/certs/keystore
    ssl.truststore.location: /mnt/certs/truststore
b
Thanks for the above @early-lamp-41924! I am trying to use AWS MSK for Kafka and having issues with the kafka-setup-job pod. It doesn't seem to pick up these (SASL and IAM config) values as mentioned here - https://datahubproject.io/docs/how/kafka-config#kafka
e
Ah so for that
b
I was going through this - https://github.com/datahub-project/datahub/tree/master/docker/kafka-setup and I feel I need to build a custom Docker image for this kafka-setup-job to pick up these new env variables
e
Here is the script that is running: https://github.com/datahub-project/datahub/blob/master/docker/kafka-setup/kafka-setup.sh Feel free to create a PR to add in the parameters you need.
Ideally, the script should read all KAFKA_PROPERTIES_* env variables, but it does not right now. That is definitely in our backlog.
d
Throwing those variables into this should get them picked up, for example:
springKafkaConfigurationOverrides:
  security.protocol: SSL
b
Do we need to do this for all the components connecting to Kafka?
e
No. Just kafka-setup
d
it seems like the helm templates pick it up
e
For GMS / consumers, we use the Spring Kafka library, which already covers most of the Kafka env variables
b
Yep, I tried adding those, and the kafka-setup script is not picking them up, but kafka-pod picks them up from Helm
springKafkaConfigurationOverrides:
    ssl.keystore.location: /mnt/datahub/certs/datahub.linkedin.com.keystore.jks
    ssl.truststore.location: /mnt/datahub/certs/datahub.linkedin.com.truststore.jks
    kafkastore.ssl.truststore.location: /mnt/datahub/certs/datahub.linkedin.com.truststore.jks
    security.protocol: SASL_SSL
    sasl.mechanism: AWS_MSK_IAM
    sasl.jaas.config: software.amazon.msk.auth.iam.IAMLoginModule required;
    sasl.client.callback.handler.class: software.amazon.msk.auth.iam.IAMClientCallbackHandler
    kafkastore.security.protocol: SSL
    ssl.keystore.type: JKS
    ssl.truststore.type: JKS
    ssl.protocol: TLS
    ssl.endpoint.identification.algorithm:
b
😞 So what are the implications here? If you are using SSL you cannot properly create the topics?
e
I think we are getting mixed responses here, which is confusing the conversation
b
When you describe the pod -
Environment:
  KAFKA_ZOOKEEPER_CONNECT:                                 <redacted>
  KAFKA_BOOTSTRAP_SERVER:                                  <redacted>
  KAFKA_PROPERTIES_KAFKASTORE_SECURITY_PROTOCOL:           SSL
  KAFKA_PROPERTIES_KAFKASTORE_SSL_TRUSTSTORE_LOCATION:     /mnt/datahub/certs/datahub.linkedin.com.truststore.jks
  KAFKA_PROPERTIES_SASL_CLIENT_CALLBACK_HANDLER_CLASS:     software.amazon.msk.auth.iam.IAMClientCallbackHandler
  KAFKA_PROPERTIES_SASL_JAAS_CONFIG:                       software.amazon.msk.auth.iam.IAMLoginModule required;
  KAFKA_PROPERTIES_SASL_MECHANISM:                         AWS_MSK_IAM
  KAFKA_PROPERTIES_SECURITY_PROTOCOL:                      SASL_SSL
  KAFKA_PROPERTIES_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM:
  KAFKA_PROPERTIES_SSL_KEYSTORE_LOCATION:                  /mnt/datahub/certs/datahub.linkedin.com.keystore.jks
  KAFKA_PROPERTIES_SSL_KEYSTORE_TYPE:                      JKS
  KAFKA_PROPERTIES_SSL_PROTOCOL:                           TLS
  KAFKA_PROPERTIES_SSL_TRUSTSTORE_LOCATION:                /mnt/datahub/certs/datahub.linkedin.com.truststore.jks
  KAFKA_PROPERTIES_SSL_TRUSTSTORE_TYPE:                    JKS
  PARTITIONS:                                              2
  REPLICATION_FACTOR:                                      2
e
Yes, Helm sets KAFKA_PROPERTIES_* on the kafka-setup-job based on the values.yaml above, but a variable has no effect unless the script actually reads it
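For anyone who wants to patch this locally, here is a minimal sketch of what reading every KAFKA_PROPERTIES_* variable could look like. It is not the code in kafka-setup.sh. The function name kafka_env_to_properties is hypothetical, and the sketch assumes the naming convention visible in the pod description above (property name upper-cased, dots turned into underscores) and single-line values.

```shell
# Hypothetical helper, not part of kafka-setup.sh: print one "key=value"
# line per KAFKA_PROPERTIES_* environment variable.
kafka_env_to_properties() {
  env | grep '^KAFKA_PROPERTIES_' | sort | while IFS= read -r line; do
    name=${line%%=*}    # text before the first "="
    value=${line#*=}    # text after the first "=" (may be empty)
    # Drop the prefix, lower-case the rest, and turn "_" back into ".".
    key=$(printf '%s' "${name#KAFKA_PROPERTIES_}" | tr 'A-Z_' 'a-z.')
    printf '%s=%s\n' "$key" "$value"
  done
}
```

The output could be redirected to a properties file and handed to the Kafka CLI tools through their --command-config option. Whether that fits the way the setup script builds its connection config is an assumption to check against the script itself.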
b
Yep was just clarifying this theory. Thanks!
d
Continuing this thread: what would we do if we aren't using ACM with AWS MSK but still want to use TLS/SSL with Kafka? What would the settings/configuration be there?
b
I am curious to know about this too, for further troubleshooting SSL issues I might face. Please let us know once you get a chance. Thanks!
d
main reasoning behind that is creating a certificate in ACM is $400 a month
seems a little heavy for something that should be a simple task
e
Oh that was just an example
Assuming you are using SASL, it seems like you need to follow a different process?
d
still want to use SSL, just not have to provide a Cert
e
The above script I shared uses this to authenticate https://docs.aws.amazon.com/msk/latest/developerguide/msk-authentication.html
d
fwiw, when i just set
springKafkaConfigurationOverrides:   
  security.protocol: SSL
  kafkastore.security.protocol: SSL
  ssl.protocol: TLS
  ssl.endpoint.identification.algorithm: ""
the kafka-setup job complains that the keystore and truststore files can't be found, even though they are set to empty
e
Hmm. I can't seem to find any docs on creating client auth without ACM for SSL
seems like if it's set to TLS, it will use the AWS Public ACM
e
That is for rest
likely regarding rest proxy
As you can see in the kafka-setup script above
it simply uses the open source Kafka client to create the topics. If the keystore/truststore location is not set, the script does not pass those parameters at all, so the above error is the Kafka client itself complaining that you need those parameters set
d
gotcha, makes sense now
One thing that could be done on DataHub's end is to default to the JVM's keystore when SSL is configured
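A rough sketch of that fallback idea, for illustration only. The function names resolve_truststore and ssl_properties are hypothetical and nothing like this exists in the setup script today. It reuses the ${JAVA_HOME}/lib/security/cacerts path from the secret-creation script earlier in the thread, which may differ by JDK version. It would only help when the brokers do not require client certificates, since no keystore is involved.

```shell
# Hypothetical sketch: choose a truststore for the Kafka client, falling
# back to the JVM's bundled CA store when none is configured.
resolve_truststore() {
  if [ -n "${KAFKA_PROPERTIES_SSL_TRUSTSTORE_LOCATION:-}" ]; then
    # An explicitly configured truststore always wins.
    printf '%s\n' "$KAFKA_PROPERTIES_SSL_TRUSTSTORE_LOCATION"
  else
    # Assumed cacerts location; fails loudly if JAVA_HOME is unset.
    printf '%s\n' "${JAVA_HOME:?JAVA_HOME must be set}/lib/security/cacerts"
  fi
}

# Emit the client properties for encryption-only TLS (no client cert).
ssl_properties() {
  printf 'security.protocol=SSL\n'
  printf 'ssl.truststore.location=%s\n' "$(resolve_truststore)"
}
```

With no truststore configured and JAVA_HOME=/opt/java, ssl_properties would point ssl.truststore.location at /opt/java/lib/security/cacerts instead of leaving it unset.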