# all-things-deployment
f
Hey guys, can you please tell me whether we need to install Neo4j ourselves for an AWS setup? The AWS deployment guide says the graph DB will soon be available with AWS Neptune, but what about now? How does it work currently?
i
Hello Hrachya, by default DataHub uses Elasticsearch for the graph DB. You can use AWS OpenSearch as a managed version of that. As an alternative to Elasticsearch, DataHub lets you replace the underlying graph DB with Neo4j, BUT you will still have to use Elasticsearch for the search indices.
Regarding AWS Neptune, there is not yet support for it. We welcome any and all contributions from the community to add that support though!
f
@incalculable-ocean-74010 thanks for your answers. Did I get it right that the current setup uses Elasticsearch for the graph index as well, and that the Neo4j part in the documentation is something that can be used in the future? In other words, it doesn't matter whether it is an AWS deployment or a local Docker image: the current version of DataHub uses only Elasticsearch for both the search index and the graph index?
i
By default DataHub uses Elasticsearch for both the search index and the graph index, yes. You can choose to run the graph index in Neo4j, but the search index can only be run in Elasticsearch.
And yes, it does not matter whether you are running DataHub in AWS or in local Docker. DataHub’s source code does not support any search index technology other than Elasticsearch at this time.
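To illustrate the split described above, here is a minimal Python sketch of how the two backends relate. It is not DataHub code: `resolve_backends` and the constants are hypothetical names, and the `GRAPH_SERVICE_IMPL` setting name is an assumption based on DataHub's Docker environment files, so check the docs for your version before relying on it.

```python
import os

# Per the thread: the search index can only run in Elasticsearch
# (AWS OpenSearch being a managed version of it).
SEARCH_BACKEND = "elasticsearch"

# Per the thread: the graph index defaults to Elasticsearch and can
# optionally be moved to Neo4j. Neptune is not supported.
GRAPH_BACKENDS = {"elasticsearch", "neo4j"}


def resolve_backends(env=None):
    """Return which store serves each index for a given environment.

    `env` is any mapping of settings; it defaults to os.environ.
    GRAPH_SERVICE_IMPL is an assumed setting name (see the note above).
    """
    env = os.environ if env is None else env
    graph = env.get("GRAPH_SERVICE_IMPL", "elasticsearch").strip().lower()
    if graph not in GRAPH_BACKENDS:
        raise ValueError(
            f"unsupported graph backend {graph!r}; "
            f"expected one of {sorted(GRAPH_BACKENDS)}"
        )
    return {"search_index": SEARCH_BACKEND, "graph_index": graph}
```

For example, `resolve_backends({})` gives Elasticsearch for both indices, while `resolve_backends({"GRAPH_SERVICE_IMPL": "neo4j"})` moves only the graph index to Neo4j and leaves the search index on Elasticsearch.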
f
Thanks Pedro. If we choose Neo4j, can I assume it will stay inside Kubernetes in the AWS deployment? (At least as long as there is no support for Neptune.)
i
DataHub does not care where Neo4j is deployed, as long as DataHub can connect to it.
f
Got it, and my last one :) What is the benefit of going with Neo4j instead of using Elasticsearch as a permanent solution? Is it mostly related to efficiency, or to some DataHub architecture requirement? (Asking to understand whether it is worth going with Neo4j in our deployment, or whether ES is also fine.)
i
It is a more scalable, albeit more expensive, solution. We had some community members interested in using Neo4j because it was already part of their stack. I would say go with ES if you don’t run Neo4j already.
f
@incalculable-ocean-74010 got it, thanks a lot 🙏
w
Just to clarify … it sounds like if we wanted to use Neptune as our primary datastore, we’d need to fork the repo and potentially develop Neptune + OpenSearch integrations with DataHub? I’m on an Operational Excellence team and envision using both the UI, for quick questions and simpler graph queries, and a central queryable store (Neptune, unless we can get Neo4j greenlit given the licensing) for more complex graph queries. That store would likely hold more than what our implementation of DataHub would need to surface for non-tech users (ref GitHub issue).