# all-things-deployment
a
Hello! 👋 I'm experimenting with cloud deployment of DataHub. I see that the Helm charts are still in `contrib`, and I have a few questions:
1. I saw that these are on the roadmap to be in mainline DataHub very soon; is that still the case?
2. Looking at the README, there is a line that says "Also, these can be installed on-prem or can be leveraged as managed service on any cloud platform." (referring to Kafka, Elasticsearch, MySQL, and Neo4j). My team is a small one, so we're really interested in leveraging managed services wherever possible! I'd love to hear people's experiences doing this and more details on how that setup process works. Do you set up managed instances and then modify the charts' `values.yaml` to point at those instances?

Thanks in advance, I'm still new to Helm!
🙌 1
Addendum: We're open to either AWS or GCP, and are investigating both.
l
Hello Amanda - the Helm charts will be moved over to the mainline soon. We're working with the community to make sure they're tested well. We also have demo.datahubproject.io set up using these charts.
a
That's great to hear @loud-island-88694!
l
As for your question about a managed service, Acryl Data is providing hosted DataHub. We can discuss more if you are interested
a
We're going to want to host our own, most likely. We're likely to add features and customize DataHub (or whatever solution we choose, we're still investigating!)
l
Sounds good. Let us know how the deployment goes. @early-lamp-41924 can help with any issues with the helm charts
a
Any tips on how to mix in managed services, like I mentioned?
l
On AWS, you can use RDS/Aurora for MySQL, MSK for Kafka, and the managed Elasticsearch service. We don't have support for AWS Neptune yet, but we're working on it.
Not sure if that was your question
a
Not quite. I'm asking how one deploys with Helm pointing at those managed services.
l
ah I see. I'll let @early-lamp-41924 answer that as he is the expert
thankyou 1
h
It's going to depend on the services, but basically there are values like host, username, and password that you set, and the containers take care of the rest at startup. We operate with self-hosted Neo4j, hosted Kafka and Elasticsearch, and a hosted Postgres DB, so it definitely works to mix and match.
a
Ok, neat! Is it really as simple as changing those values in `values.yaml`?
h
We use kustomize instead of Helm, so don't take my word for it, but this is my understanding, yes. 😄
a
Experimentation time! 🙂
πŸ‘ 1
e
@acceptable-football-40437 Yes! We currently use helm to connect to these managed services. As long as you point the global values to the correct services, you should be good to go
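A minimal sketch of what such an override might look like, assuming the connection settings live under a `global` block (the key names and endpoints below are illustrative; check the chart's own `values.yaml` for the exact structure):
```yaml
# Illustrative values.yaml override pointing the chart at managed services.
# Key names and endpoints are placeholders; confirm them against the chart.
global:
  sql:
    datasource:
      host: "my-instance.abc123.us-east-1.rds.amazonaws.com:3306"   # RDS/Aurora endpoint
      username: "datahub"
  elasticsearch:
    host: "my-domain.us-east-1.es.amazonaws.com"
    port: "443"
  kafka:
    bootstrap:
      server: "b-1.my-msk.kafka.us-east-1.amazonaws.com:9092"       # MSK bootstrap broker
  neo4j:
    host: "neo4j.internal.example.com:7474"                         # self-hosted until Neptune support lands
```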
One difference we saw from the local dev setup was initializing MySQL. The MySQL Docker image can run an init script, but when we use a managed RDB, we need to run it ourselves. The Dockerfile is located in `/docker/mysql-setup`, but it is not published to Docker Hub, so you will need to publish this and use it! The Kubernetes file for running this setup is already there in the Helm charts.
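If you do publish your own copy of that image, a plausible way to wire it in, going by the `mysqlSetupJob.image.*` values that come up later in this thread (the registry path and the `enabled` flag are assumptions here), would be something like:
```yaml
# Illustrative override: point the MySQL setup job at an image you pushed yourself.
# The registry path is a placeholder; use wherever you published /docker/mysql-setup.
mysqlSetupJob:
  enabled: true        # assumes the chart exposes an enable flag for the job
  image:
    repository: gcr.io/my-project/datahub-mysql-setup
    tag: "latest"
```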
And we are planning on moving it out of contrib very soon. Planning on finishing some more tests today and sending a PR out.
a
@early-lamp-41924 Sorry for the delayed response, duty called! To clarify on the MySQL front....so I have to publish the MySQL setup container to a container registry in order to run the Helm file?
e
Yes. Let me push for adding this to the LinkedIn Docker Hub, but for now you do have to publish it yourself.
a
Got it, thanks! This might be a really obvious question, but--where do the `secretRef` callouts in `values.yaml` pull from?
e
So you have to add those secrets yourself! You have to save the passwords that you created while provisioning the managed instances, like `mysql-password`.
a
I figured 🙂 but do they live in a Kubernetes secrets layer? environment variables? Some Helm functionality I don't yet grasp?
i.e. where should I put them?
e
`kubectl create secret generic mysql-secrets --from-file=mysql-creds=./mysql-creds --namespace ${namespace}`
using ^ or
`kubectl create secret generic mysql-secrets --from-literal=mysql-creds=$PASSWORD --namespace ${namespace}`
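To tie that back to the `secretRef` callouts: the chart then reads the password out of the Secret you just created. A hedged fragment, assuming a `secretRef`/`secretKey` pair in `values.yaml` (the names below match the kubectl commands above; the nesting is illustrative):
```yaml
# Illustrative fragment: referencing the secret created above from values.yaml.
# secretRef is the Kubernetes Secret name; secretKey is the key inside that Secret.
global:
  sql:
    datasource:
      password:
        secretRef: mysql-secrets   # matches the secret name in the kubectl command
        secretKey: mysql-creds     # matches the --from-file / --from-literal key
```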
a
ooooooh
Perfect, thanks 🙂
e
Yeah, these are ad hoc, so you need to run kubectl directly!
a
Ad hoc is OK for now!
Thanks for helping this newbie out thankyou
e
any time!
a
@early-lamp-41924 You mentioned above needing to publish the MySQL Setup Job image in order to run the setup on a managed service. Is this also true if you're running everything in Kubernetes?
Context:
```
Error: template: datahub/templates/mysql-setup-job.yml:37:28: executing "datahub/templates/mysql-setup-job.yml" at <.Values.mysqlSetupJob.image.repository>: nil pointer evaluating interface {}.repository
helm.go:81: [debug] template: datahub/templates/mysql-setup-job.yml:37:28: executing "datahub/templates/mysql-setup-job.yml" at <.Values.mysqlSetupJob.image.repository>: nil pointer evaluating interface {}.repository
```
When the MySQL Setup Job is set to `true`.
e
Hey @acceptable-football-40437 We uploaded the image to https://hub.docker.com/repository/docker/acryldata/datahub-mysql-setup
a
Yay! Thanks! I was getting set up to push one to GCR 🙂
e
```yaml
image:
  repository: acryldata/datahub-mysql-setup
  tag: "latest"
```
would work!
awesome!!
a
🤔
```
Error
2021-04-21 16:27:41.997 EDT cub kafka-ready: error: too few arguments
```
e
Can we hop on a call real quick?
a
Hi @loud-island-88694! Circling back on this old thread, you mentioned that y'all are working on AWS Neptune support. Any updates?
l
We'll accelerate this if this is blocking your production deployment. When were you planning to deploy in production?
a
We're evaluating our graph db options. What's the current timetable?
l
within the next month