# getting-started
f
Hi, we need to point DataHub at external Kafka, MySQL, and Elasticsearch instances; currently we point to Dockerized versions of them. Please let us know what changes we need to make to achieve this.
r
If you're working on Kubernetes, I guess you can use the Helm chart https://datahubproject.io/docs/datahub-kubernetes and point to your Kafka broker and Elasticsearch in your values.yaml file.
That said, I'd love a more detailed walkthrough of an MVP setup in the docs.
b
Yes, that's correct. Agreed, we should have an example of what the configuration would look like here. If either of you has it up and running, a contribution would be greatly appreciated!
e
Check out https://datahubproject.io/docs/deploy/aws#use-aws-managed-services-for-the-storage-layer as well. Though it's AWS-related, the places that specify which parts of values.yaml need to be changed should be applicable.
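For reference, a rough sketch of what the relevant `global` section of the DataHub chart's values.yaml might look like when pointing at external services. This assumes the key layout shown in the AWS guide linked above; every hostname, the database name, and the secret name are placeholders, and key names may differ between chart versions, so check them against the values.yaml of the chart you actually deploy.

```yaml
# Sketch only: hostnames and secret names below are placeholders, not real endpoints.
global:
  elasticsearch:
    host: "my-external-elasticsearch.example.com"   # placeholder
    port: "9200"
  kafka:
    bootstrap:
      server: "my-external-kafka.example.com:9092"  # placeholder
    zookeeper:
      server: "my-external-zookeeper.example.com:2181"  # placeholder
    schemaregistry:
      url: "http://my-schema-registry.example.com:8081"  # placeholder
  sql:
    datasource:
      host: "my-external-mysql.example.com:3306"    # placeholder
      hostForMysqlClient: "my-external-mysql.example.com"
      port: "3306"
      url: "jdbc:mysql://my-external-mysql.example.com:3306/datahub?verifyServerCertificate=false&useSSL=true"
      driver: "com.mysql.jdbc.Driver"
      username: "root"
      password:
        # Assumes a Kubernetes secret holding the MySQL password already exists.
        secretRef: mysql-secrets
        secretKey: mysql-root-password
```

With the services externalized like this, the matching components in the prerequisites chart would presumably be disabled so they are not deployed alongside.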
s
@early-lamp-41924 I had a question about prerequisites-cp-schema-registry. This is part of the commercial version of Kafka, right? Is it a hard dependency of DataHub? That would mean a commercial license is required to run DataHub.
e
Hey! Schema Registry is not part of the commercial package (and neither is Kafka REST Proxy, I believe).
This is why we kept it there while moving Kafka and ZooKeeper to an open source version.
s
It also says so at https://github.com/linkedin/datahub/blob/master/datahub-kubernetes/prerequisites/Chart.yaml#L28 but this Helm chart will deploy the commercial parts as well.
@mammoth-bear-12532 any clarification on the commercial license requirement would be great
It turned out our company has a license for Confluent Kafka and we can get a cluster, so I'll use that. It would still be good to clarify this for others who want to run DataHub on open-source-only tech.
e
My bad, the documentation is wrong. I'll update it.
But my comment above still holds:
Schema Registry is under the community license.
f
@big-carpet-38439, @early-lamp-41924 This is what it would look like:
b
Yes!
Notice that the following containers are not strictly required:
• Confluent schema UI
• Confluent schema registry
• Kibana
• Kafka Rest Proxy
• Kafka Topics UI
Also note that Neo4j can be replaced with Elastic if you want.
r
I'm curious, what is the purpose of Schema Registry for DataHub then?
f
Basically, I wanted to understand which places need changes in order to achieve this.

https://datahubspace.slack.com/files/U026DFCLJHM/F027FF7KDTL/image.png