# all-things-deployment
n
A general question: has anyone in the community successfully used AWS OpenSearch as the index store?
m
@early-lamp-41924 can probably help here
thankyou 1
e
Yes!
Hmm, we use it with the exact same setting: SSL plus basic auth
and it has worked for us
Possibly some special characters in the pw that could cause this
n
hmm...the current password has a lot of weird special characters; I'll try another cluster with a simple password. Thanks! @early-lamp-41924 When I tried to connect with
```
curl 'https://username:password@xxxxxx.es.amazonaws.com:443'
```
this pattern of curl also doesn't work... That's why I was suspecting that OpenSearch only supports the -u parameter for username/password, whereas connecting this way:
```
curl -u 'username:password' 'https://xxxxxx.es.amazonaws.com:443'
```
works.
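[Editor's note] A plausible explanation for the difference between the two curl forms above: credentials embedded in the URL userinfo section (`https://user:pw@host`) must be percent-encoded, while `-u` passes them out of band. A minimal sketch, assuming a hypothetical password with reserved characters (the host placeholder is from the thread):

```shell
# Hypothetical password containing characters that are reserved in URLs.
RAW_PW='p@ss:w/rd'

# Percent-encode it so it is safe inside the userinfo part of a URL.
ENC_PW=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=""))' "$RAW_PW")
echo "$ENC_PW"   # -> p%40ss%3Aw%2Frd

# The URL-embedded form then becomes:
# curl "https://username:${ENC_PW}@xxxxxx.es.amazonaws.com:443"
```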
e
interesting
does it not work for the cluster with a simple password?
suspecting the former has some restriction on the password
we should move to the latter tho
n
I haven't tried that...will try with a simple password and get back
e
so just to clarify, it doesn’t pass that initial step
where it pings that link to make sure it's ready, right?
n
yes...in the
```
-wait $ELASTICSEARCH_PROTOCOL://$ELASTICSEARCH_HOST_URL:$ELASTICSEARCH_PORT -wait-http-header "$ELASTICSEARCH_AUTH_HEADER" \
```
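[Editor's note] For context, the `$ELASTICSEARCH_AUTH_HEADER` passed to dockerize's `-wait-http-header` is a standard Basic auth header; a sketch of how it could be assembled (credential values are placeholders, and `echo -ne` matches the command quoted later in this thread):

```shell
ELASTICSEARCH_USERNAME='username'
ELASTICSEARCH_PASSWORD='password'

# Basic auth header: "Authorization:Basic base64(user:password)".
ELASTICSEARCH_AUTH_HEADER="Authorization:Basic $(echo -ne "${ELASTICSEARCH_USERNAME}:${ELASTICSEARCH_PASSWORD}" | base64)"
echo "$ELASTICSEARCH_AUTH_HEADER"   # -> Authorization:Basic dXNlcm5hbWU6cGFzc3dvcmQ=
```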
e
so in that case, elasticsearch-setup-job also fails?
n
I provisioned an OpenSearch cluster manually in AWS and plan on creating the index manually once the connection issues are resolved. So to answer your question: yes, it would, but I haven't tried it.
e
FYI the indices are created on MAE consumer start-up (if running together in the gms pod, they will get created on gms start-up). The Elasticsearch setup job currently just sets up the weekly index for DataHub usage events! So you shouldn't need to create indices manually.
👍 1
Need to dig into dockerize a bit more. Checking if we can add options to the wait arguments
I’ll put out the PR by tmr! found a solution
👍 1
thankyou 1
Here is the PR: https://github.com/linkedin/datahub/pull/3596. Feel free to test it locally and see if it works in your setup.
I have also pushed the image to acryldata/datahub-gms:test if you would like to test in k8s
n
Awesome! Thanks for the quick response. let me test and confirm at the earliest.
Running with the above docker image throws a 403:
```
Received 403 from https://zzzzzzzzzzz.us-east-1.es.amazonaws.com:443. Sleeping 1s
```
Command used for testing...
```
sudo docker run -d \
    --name=datahub-gms \
    --env-file docker/datahub-gms/env/docker.test.env \
    -p 8080:8080 \
    acryldata/datahub-gms:test
```
Did it succeed with OpenSearch for you?
e
Do you have any RBAC set up for the account you are using?
I have seen that when I use an account that has reduced access, it is unable to plainly curl
and throws 403
n
I am able to plainly curl though...
```
curl -XGET -u 'username:password' 'https://zzzzzzzzz.us-east-1.es.amazonaws.com/_cat/health'
```
The above curl succeeds but fails through the datahub-gms app... Also created another OpenSearch cluster with simple creds; still the same issue persists...
e
can you try removing `_cat/health`?
So the only thing it's doing is getting the result of the below
```
echo -ne 'username:password' | base64
```
and then curling
```
curl https://<<es-domain>>:<<es-port>> --header "Authorization:Basic <<result-from-above>>"
```
n
The DataHub env variables do not include `_cat/health`:
```
ELASTICSEARCH_HOST=xxxxxx.es.amazonaws.com
ELASTICSEARCH_PORT=443
ELASTICSEARCH_USERNAME=username
ELASTICSEARCH_PASSWORD=password
USE_AWS_ELASTICSEARCH=true
ELASTICSEARCH_USE_SSL=true
```
e
Yup. So does the above curl work without `_cat/health`?
So basically, curling the URL without any added path requires the permission `cluster:monitor/main`
and `_cat/health` requires the permission `cluster:monitor/health`
another way we could solve this is to not wait at all if an env variable is set, since the wait arguments in dockerize are very restrictive
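[Editor's note] For domains using the OpenSearch security plugin's fine-grained access control, the two permissions mentioned above would appear in a role definition along these lines (a sketch only; the role format follows the security plugin's roles API, and whether your domain uses the plugin at all is an assumption):

```json
{
  "cluster_permissions": [
    "cluster:monitor/main",
    "cluster:monitor/health"
  ]
}
```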
n
I am trying the same by removing the wait...^^
e
Let us know. If that works for you, we will add an env var for skipping waits!
m
@early-lamp-41924: I see you have an open PR around this
is that good to review? or still under testing
e
Still testing
Want to make sure we have a working one for Arun before pushing it in!
m
sounds good
n
I am still testing it....installing all the dev tools on EC2 to build and debug....
@early-lamp-41924 After removing the wait step from the datahub-gms start script, I was able to successfully start `datahub-gms`
I can push a commit with an env variable to skip the wait step...let me know
e
That would be great! Otherwise, I can just add it to this PR
since there was a GitHub issue reported before where the password is printed in a log line, so we probably still need to ship this PR
👍 1
n
Okay! Go ahead. I can send another one just to skip the wait. Agree?
e
I’m adding it to the PR right now. I’ll ping you once ready
while i’m at it, adding skips to all other ones as well
for flexibility
n
Cool. I will test out once you push it.
@early-lamp-41924 I don't see the update to skip the wait in the PR yet - https://github.com/linkedin/datahub/pull/3596/files Let me know if I can help here.
e
Yeah. Doing some final testing right now! Thank you for helping out! I’ll ping on this thread once the working image is ready!
thankyou 1
n
Awesome! Looking forward to it @early-lamp-41924
e
Forgot to mention @nutritious-bird-77396 This PR has been merged. You can set SKIP_ELASTICSEARCH_CHECK to true to skip it!
thankyou 1
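[Editor's note] A sketch of how the flag could be passed to the earlier test command (the variable name `SKIP_ELASTICSEARCH_CHECK` comes from the message above; everything else mirrors the `docker run` invocation earlier in the thread):

```shell
sudo docker run -d \
    --name=datahub-gms \
    --env-file docker/datahub-gms/env/docker.test.env \
    -e SKIP_ELASTICSEARCH_CHECK=true \
    -p 8080:8080 \
    acryldata/datahub-gms:test
```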
n
Thanks @early-lamp-41924! Appreciate it.