# getting-started
s
I am trying to run the MAE consumer on K8s, and it fails when it checks whether a dataplatformindex_v2 index exists on Elasticsearch. I am using basic auth to connect to ES. I can see that the initial connection uses basic auth and returns a 200, but the credentials do not appear to be used when it tries to verify the index. Is there a setting I should check?
e
Hey! What env variables are you setting right now?
Does GMS spin up correctly?
s
It starts to spin up and then fails and restarts after trying to create the index
I will send the env variables in a moment
ELASTICSEARCH_USE_SSL, ELASTICSEARCH_USERNAME, and ELASTICSEARCH_PASSWORD
e
is this an admin user? what kind of permissions does the user have?
s
Not an admin user. Just has access to indexes prefixed with datahub_*
Here is what I see initially:
2021/06/09 22:40:02 Received 200 from <https://username:password@es-eks-dev.elastic.local.com:443>
(username and password are fake of course)
And then when it checks the index the username and password are not there:
22:42:48.712 [main] INFO  c.l.m.s.e.indexbuilder.IndexBuilder - Setting up index: datahub_dataplatformindex_v2
22:43:03.712 [main] WARN  o.s.b.w.s.c.AnnotationConfigServletWebServerApplicationContext - Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'metadataAuditEventsProcessor' defined in URL [jar:file:/datahub/datahub-mae-consumer/bin/mae-consumer-job.jar!/BOOT-INF/classes!/com/linkedin/metadata/kafka/MetadataAuditEventsProcessor.class]: Bean instantiation via constructor failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [com.linkedin.metadata.kafka.MetadataAuditEventsProcessor]: Constructor threw exception; nested exception is ElasticsearchStatusException[method [HEAD], host [<https://es-eks-dev.elastic.local.com:443>], URI [/datahub_dataplatformindex_v2?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=false], status line [HTTP/1.1 401 Unauthorized]]; nested: ResponseException[method [HEAD], host [<https://es-eks-dev.elastic.local.com:443>], URI [/datahub_dataplatformindex_v2?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=false], status line [HTTP/1.1 401 Unauthorized]];
e
I reproduced the issue as well. I’ll send out a fix asap!
@sticky-television-18623 After adding the manage and create_index privileges to the role, the MAE consumer started successfully. Can you confirm which privileges your role has?
s
Were you using basic auth in your test? If I execute the request in postman as it is shown in the logs I get a 401 response. Then when I execute the same request with basic authentication I get a 404 response (which is what I would expect since the index has not been created yet).
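The Postman comparison above can be sketched in Python roughly as follows. This is only an illustration of the two requests being compared, not what the MAE consumer does internally; the helper names, `example_user`, and `example_password` are hypothetical, and only the host and index name come from the logs.

```python
import base64
import urllib.error
import urllib.request

ES_URL = "https://es-eks-dev.elastic.local.com:443"  # host taken from the logs
INDEX = "datahub_dataplatformindex_v2"               # index taken from the logs


def build_index_check(base_url, index, username=None, password=None):
    """Build a HEAD request for an index, optionally with basic auth."""
    req = urllib.request.Request(f"{base_url}/{index}", method="HEAD")
    if username is not None:
        token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
        req.add_header("Authorization", "Basic " + token.decode("ascii"))
    return req


def status_of(req):
    """Send the request and return the HTTP status code (needs network access)."""
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still available.
        return err.code


# Same request twice: once without credentials, once with (hypothetical) ones.
anonymous = build_index_check(ES_URL, INDEX)
authed = build_index_check(ES_URL, INDEX, "example_user", "example_password")
```

Calling `status_of(anonymous)` and `status_of(authed)` against the cluster would be expected to mirror the 401 and 404 seen in Postman, assuming the credentials are valid and the index has not been created yet.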
e
I am using the same set of settings: SSL enabled, with username and password set. This is where the credentials are added: https://github.com/linkedin/datahub/blob/82791008c3e8d682cc39dd67eaa36325d0afdb0a/[…]com/linkedin/gms/factory/common/RestHighLevelClientFactory.java I don’t think the client config callback would show up in the logs, but the 401 is suspicious.
s
I was looking at that. This weekend I will try to debug through it to see if anything stands out. The initial call to Elasticsearch with the username and password in the URL threw me off. I found that it is actually made from Docker before Spring even starts to boot, not through the ES client.
Plus I am trying to confirm the permissions
e
Yeah. I am concerned that it is printing out the password into the logs. That should be fixed
s
Also I was able to confirm that I have admin rights for any index that starts with datahub
We have open distro on top of ES and that handles the security and authorization. Maybe the ES library you are using does not play nice with open distro.
e
Oh. We also use AWS ES, so I was actually worried that you might not be using it.
Can you print out the role that you are using?
s
e
Hmm, seems like it should work. Can you try attaching a debugger at RestHighLevelClientFactory line 93 to see if it's actually reaching there with the correct settings?
You do have to modify the helm chart a tiny bit.
Add
extraEnvs:
- name: JAVA_OPTS
  value: "-agentlib:jdwp=transport=dt_socket,address=5005,server=y,suspend=y"
under datahub-mce-consumer in values.yaml
s
I am away so I will try it this weekend
e
and in datahub-mae-consumer/templates/deployment.yaml
add
- name: debug
  containerPort: 5005
  protocol: TCP
then you should be able to port-forward and listen to the port!
Let me know how it goes! If nothing works over the weekend, let’s sync up on Monday!
s
Thank you. I will let you know.
Good morning. The problem is a combination of two things. The pre-flight check of Elasticsearch puts the username and password in the URL (very bad), so the password has to be URL-encoded for that check to work. Then, of course, when the Elasticsearch client credentials are created they are invalid, since the password is still URL-encoded. Maybe for the pre-flight check an additional env variable could be set for -wait-http-header?
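To make the mismatch concrete, here is a minimal Python sketch of why one password value cannot serve both the URL and the basic auth credentials when it contains reserved characters. The username and password are made-up examples, and how the resulting header value would be handed to -wait-http-header is an assumption not shown here.

```python
import base64
from urllib.parse import quote, unquote, urlsplit

username = "example_user"    # hypothetical
raw_password = "p@ss:w/rd"   # hypothetical password with URL-reserved characters
host = "es-eks-dev.elastic.local.com:443"

# Credentials embedded in a URL must be percent-encoded, otherwise the
# '@', ':' and '/' in the password would break URL parsing.
encoded_password = quote(raw_password, safe="")
preflight_url = f"https://{username}:{encoded_password}@{host}"

# The URL carries the encoded form; it must be decoded to get the real password.
password_in_url = urlsplit(preflight_url).password


def basic_auth_header(user, password):
    """Return the value of an HTTP Authorization header for basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


# Basic auth built from the raw password is what the server expects.
good_header = basic_auth_header(username, raw_password)
# Basic auth built from the URL-encoded password is a different credential,
# which the server rejects (the 401 seen in the logs).
bad_header = basic_auth_header(username, encoded_password)

print(encoded_password)
print(good_header != bad_header)
```

Sending the credentials as a header in the pre-flight check would let the env variable hold the raw password, so the same value stays valid for the Elasticsearch client.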
e
Oh, so the password you entered is the URL-encoded one. Very interesting. Do you mind putting up a quick PR for this? We will merge it right away!
s
PR submitted
e
Thanks!!
BTW, does the elasticsearch-setup container work correctly? It also uses this type of auth through curl.
s
It does work correctly. Maybe tomorrow I can look at applying a similar approach to it to help prevent clear-text passwords.
e
Sounds good! Thanks for contributing!