SASL/Kerberos authentication for the datahub-actions Kafka consumer
m

microscopic-mechanic-13766

07/22/2022, 9:00 AM
Good morning, I am having some trouble configuring the environment variables of the datahub-actions service. My problem is that, although I set the following variables in the docker-compose:
KAFKA_BOOTSTRAP_SERVER=broker1:9092 
KAFKA_SCHEMAREGISTRY_URL=http://schema-registry:8081
KAFKA_PROPERTIES_SASL_KERBEROS_SERVICE_NAME=kafka
SPRING_KAFKA_PROPERTIES_SECURITY_PROTOCOL=SASL_PLAINTEXT
KAFKA_PROPERTIES_SASL_JAAS_CONFIG=com.sun.security.auth.module.Krb5LoginModule required principal='datahub/<realm>@<realm>' useKeyTab=true storeKey=true keyTab='/keytab/datahub.keytab'
and they are successfully written as env variables (because when I execute the command
env
they appear), the actions service keeps printing the following error:
FAIL|rdkafka#consumer-1| [thrd:broker1:9092/bootstrap]: broker1:9092/bootstrap: Disconnected: verify that security.protocol is correctly configured, broker might require SASL authentication (after 340ms in state UP, 3 identical error(s) suppressed)
Why is this error printed? It might be related to the variables not being read correctly, but I don't understand why it happens only with that specific variable. I am using v0.8.41 for both the GMS and the frontend; for actions, release 0.0.4 of the
acryldata/datahub-actions
image and version 0.8.41 of the CLI.
i

incalculable-ocean-74010

07/23/2022, 7:17 AM
Hello Pablo, this is a known limitation: datahub-actions does not currently support SASL authentication with Kafka. We are aware of it and working to address it.
m

microscopic-mechanic-13766

07/25/2022, 7:24 AM
Hi Pedro, first of all, thank you for the response. Just to know, is it intended to land in the next version, or is it planned to be a bit more "long-term"? I am only asking because I haven't seen it as a task in the roadmap.
i

incalculable-ocean-74010

07/25/2022, 7:55 AM
I am not certain; hopefully it should be out within about a month. What docker image are you using for datahub actions?
m

microscopic-mechanic-13766

07/25/2022, 7:57 AM
I am currently using
acryldata/datahub-actions:v0.0.4
o

orange-night-91387

07/27/2022, 3:39 PM
The key issue here is that some properties are missing from the yaml file within the image. You need all of these properties in the actions.yaml/executor.yaml file:
connection:
  consumer_config:
    security.protocol: ${KAFKA_PROPERTIES_SECURITY_PROTOCOL:-PLAINTEXT}
    sasl.mechanism: ${KAFKA_PROPERTIES_SASL_MECHANISM:-PLAIN}
    sasl.username: ${KAFKA_PROPERTIES_SASL_USERNAME}
    sasl.password: ${KAFKA_PROPERTIES_SASL_PASSWORD}
JAAS config is specific to Java and won't work with the Actions pod, so you need to set sasl.username & sasl.password instead. You should be able to create a yaml file based on the base one with these properties.
NOTE: if the username & password are not set, you'll get a property binding exception since they have no default.
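For example, on the actions service in your docker-compose you could then export something like this (just a sketch; the username/password values are placeholders for whatever your broker expects):
KAFKA_PROPERTIES_SECURITY_PROTOCOL=SASL_PLAINTEXT
KAFKA_PROPERTIES_SASL_MECHANISM=PLAIN
KAFKA_PROPERTIES_SASL_USERNAME=<kafka-username>
KAFKA_PROPERTIES_SASL_PASSWORD=<kafka-password>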
m

microscopic-mechanic-13766

07/28/2022, 11:22 AM
That username and password would be the ones used to access Kafka, right? In my case, Kafka isn't configured to authenticate users via user/password but via TGTs. So, to make the connection between the actions framework and Kafka possible, is this currently the only way to do it?
o

orange-night-91387

07/28/2022, 3:41 PM
Ah sorry, missed that. You just need to use the Python semantics for the properties instead of JAAS. For Kerberos keytabs I think it's:
# Broker service name
sasl.kerberos.service.name=$SERVICENAME

# Client keytab location
sasl.kerberos.keytab=/etc/security/keytabs/${CLIENT_NAME}.keytab

# sasl.kerberos.principal
sasl.kerberos.principal=${CLIENT_NAME}/${CLIENT_HOST}
Reference: https://github.com/edenhill/librdkafka/wiki/Using-SASL-with-librdkafka#5-configure-kafka-client-on-client-host
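If it helps, these should presumably be wired into the executor.yaml under consumer_config the same way as the other properties, roughly like this (untested sketch; the environment variable names are just a suggestion):
consumer_config:
  security.protocol: ${KAFKA_PROPERTIES_SECURITY_PROTOCOL:-SASL_PLAINTEXT}
  sasl.mechanism: ${KAFKA_PROPERTIES_SASL_MECHANISM:-GSSAPI}
  sasl.kerberos.service.name: ${KAFKA_PROPERTIES_SASL_KERBEROS_SERVICE_NAME:-kafka}
  sasl.kerberos.keytab: ${KAFKA_PROPERTIES_SASL_KERBEROS_KEYTAB}
  sasl.kerberos.principal: ${KAFKA_PROPERTIES_SASL_KERBEROS_PRINCIPAL}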
m

microscopic-mechanic-13766

07/29/2022, 8:55 AM
I have set those variables in the JAVA_OPTS and I have removed the
sasl.username
and
sasl.password
from the
executor.yaml
but left the other two properties (because if they are not set, an error is printed saying that the attribute
security.protocol
is not configured). The thing is that if I set the default value of that attribute to
PLAINTEXT
this error is printed:
Configuration property `sasl.mechanism` set to `PLAIN` but `security.protocol` is not configured for SASL: recommend setting `security.protocol` to SASL_SSL or SASL_PLAINTEXT
But if its value is set to either
SASL_SSL
or
SASL_PLAINTEXT
I get the following error:
Exception: Failed to instantiate Actions Pipeline using config {'name': 'ingestion_executor', 'source': {'type': 'kafka', 'config': {'connection': {'bootstrap': 'broker1:9092', 'schema_registry_url': 'http://localhost:8081', 'consumer_config': {'security.protocol': 'SASL_SSL', 'sasl.mechanism': 'PLAIN'}}, 'topic_routes': {'mcl': 'MetadataChangeLog_Versioned_v1', 'pe': 'PlatformEvent_v1'}}}, 'filter': {'event_type': 'MetadataChangeLogEvent_v1', 'event': {'entityType': 'dataHubExecutionRequest', 'changeType': 'UPSERT', 'aspectName': ['dataHubExecutionRequestInput', 'dataHubExecutionRequestSignal'], 'aspect': {'value': {'executorId': 'default'}}}}, 'action': {'type': 'executor', 'config': {'executor_id': 'default'}}, 'datahub': {'server': 'http://datahub-gms:8080', 'extra_headers': {'Authorization': 'Basic __datahub_system:JohnSnowKnowsNothing'}}}
i

incalculable-ocean-74010

08/01/2022, 10:55 AM
Is there no stack trace or additional information in the error for
SASL_SSL
?
m

microscopic-mechanic-13766

08/01/2022, 11:08 AM
The
executor.yaml
I am using is the following.
# Copyright 2021 Acryl Data, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
name: "ingestion_executor"
source:
  type: "kafka"
  config:
    connection:
      bootstrap: ${KAFKA_BOOTSTRAP_SERVER:-localhost:9092}
      schema_registry_url: ${SCHEMA_REGISTRY_URL:-http://localhost:8081}
      consumer_config:
        security.protocol: ${KAFKA_PROPERTIES_SECURTIY_PROTOCOL:-SASL_PLAINTEXT}
        sasl.mechanisms: ${KAFKA_PROPERTIES_SASL_MECHANISM:-PLAIN}
    topic_routes:
      mcl: ${METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME:-MetadataChangeLog_Versioned_v1}
      pe: ${PLATFORM_EVENT_TOPIC_NAME:-PlatformEvent_v1}
filter:
  event_type: "MetadataChangeLogEvent_v1"
  event:
    entityType: "dataHubExecutionRequest"
    changeType: "UPSERT"
    aspectName:
      - "dataHubExecutionRequestInput"
      - "dataHubExecutionRequestSignal"
    aspect:
      value:
        executorId: "${EXECUTOR_ID:-default}"
action:
  type: "executor"
  config:
    executor_id: "${EXECUTOR_ID:-default}"
datahub:
  server: "http://${DATAHUB_GMS_HOST:-localhost}:${DATAHUB_GMS_PORT:-8080}"
  extra_headers:
    Authorization: "Basic ${DATAHUB_SYSTEM_CLIENT_ID:-__datahub_system}:${DATAHUB_SYSTEM_CLIENT_SECRET:-JohnSnowKnowsNothing}"
i

incalculable-ocean-74010

08/01/2022, 11:11 AM
This is the issue:
KafkaException: KafkaError{code=_INVALID_ARG,val=-186,str="Failed to create consumer: sasl.username and sasl.password must be set"}
m

microscopic-mechanic-13766

08/01/2022, 11:17 AM
The thing is that my Kafka isn't configured to authenticate users via user/password but via TGTs. I tried indicating GSSAPI as the mechanism for SASL to use, but I think it doesn't exist in Kafka. The last thing I did was to put these variables in the datahub-actions docker-compose, but it didn't succeed:
KAFKA_PROPERTIES_SASL_KERBEROS_SERVICE_NAME=kafka 
KAFKA_PROPERTIES_SASL_KERBEROS_KEYTAB=/etc/security/keytabs/datahubfront.keytab
KAFKA_PROPERTIES_SASL_KERBEROS_PRINCIPAL='datahubfront/<realm>'
i

incalculable-ocean-74010

08/01/2022, 11:35 AM
I tried indicating GSSAPI as the mechanism for SASL
You mean you set this?
sasl.mechanism=GSSAPI
Did you also set the JAAS config for the TGT? I'm looking at the example given here: https://help.mulesoft.com/s/article/Kafka-connector-with-Kerberos-Could-not-renew-TGT-and-TimeoutException
Also, what version of Kafka did you deploy?
m

microscopic-mechanic-13766

08/01/2022, 12:02 PM
Yes, that is the property I meant
I did not set the JAAS config because as Ryan said:
You just need to use the Python semantics for the properties instead of JAAS. For Kerberos keytabs I think it's:

# Broker service name
sasl.kerberos.service.name=$SERVICENAME

# Client keytab location
sasl.kerberos.keytab=/etc/security/keytabs/${CLIENT_NAME}.keytab

# sasl.kerberos.principal
sasl.kerberos.principal=${CLIENT_NAME}/${CLIENT_HOST}
I used those properties, as can be seen in my previous message.
The version of Kafka I am using is 2.8.4.
i

incalculable-ocean-74010

08/01/2022, 1:45 PM
Shouldn’t
sasl.kerberos.service.name=$SERVICENAME
be
sasl.kerberos.service.name=${SERVICENAME}
? Are all these environment variables set in your deployment?
o

orange-night-91387

08/01/2022, 2:30 PM
The last thing I did was to put these variables in the datahub-actions docker-compose, but it didn't succeed:
These also need to be in the executor yaml
Unfortunately GSSAPI won't work, as it's not a supported mechanism at this time. It looks like we need to add
sudo apt-get install libsasl2-modules-gssapi-mit
to our container setup to get this working.
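Something along these lines might work as a custom image until that lands upstream, assuming the base image is Debian-based (untested sketch):
FROM acryldata/datahub-actions:v0.0.4
# Install the SASL/GSSAPI module that librdkafka needs for Kerberos.
# The base image may run as a non-root user, so switch to root for the install
# (and switch back to the image's default user afterwards if needed).
USER root
RUN apt-get update && apt-get install -y libsasl2-modules-gssapi-mit && rm -rf /var/lib/apt/lists/*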
m

microscopic-mechanic-13766

08/01/2022, 3:54 PM
Correct Pedro, I'll fix it
Even with the addition of those variables to the executor.yaml, the
KafkaException: KafkaError{code=_INVALID_ARG,val=-186,str="Failed to create consumer: sasl.username and sasl.password must be set"}
arises
o

orange-night-91387

08/01/2022, 4:10 PM
Yeah, if it needs to use the GSSAPI SASL mechanism to utilize the keytabs, that's not supported at this time, sorry 😞 I think adding the above library to our actions container should work. Once that gets added and GSSAPI is configured, it should try to use the keytabs instead of user/pass, I believe.
m

microscopic-mechanic-13766

08/04/2022, 12:58 PM
I have updated Kafka so that it accepts user/password authentication (datahub/datahub). The thing is that I get the following error:
sasl_plaintext://broker1:9092/bootstrap: SASL authentication error: Authentication failed: Invalid username or password (after 148ms in state AUTH_REQ, 1 identical error(s) suppressed)
I don't understand why I get this error. It might be related to the following error in the Kafka container:
WARN unable to return groups for user datahub (org.apache.hadoop.security.ShellBasedUnixGroupsMapping)
 PartialGroupNameException The user name 'datahub' is not found. id: 'datahub': no such user
But the user datahub is created in the mentioned container and I still get it. Any previous experience with these errors?
o

orange-night-91387

08/04/2022, 3:44 PM
Hmm no haven't seen that one before. Not sure where Hadoop is coming into the mix with Kafka auth 🤔
m

microscopic-mechanic-13766

08/05/2022, 7:27 AM
Don't know either, to be honest. The problem disappeared once the user definition was done correctly. I was defining my user like:
username="datahub"
password="datahub"
But this definition is for connections between brokers. The real way to define users is like this:
user_datahub="datahub";
Maybe it raises a Hadoop error because Kafka uses Hadoop's configuration to obtain the principal, so maybe something similar happens with the users. I am not really sure about this, but I am almost completely sure that Kafka uses Hadoop's conf for something related to the principals.
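For reference, the KafkaServer entry in the broker JAAS ended up looking roughly like this (the inter-broker username/password here are placeholders; only the user_datahub line is the actual fix):
KafkaServer {
  org.apache.kafka.common.security.plain.PlainLoginModule required
  username="broker"
  password="broker-secret"
  user_datahub="datahub";
};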