# ingestion
r
Hello, how do we configure DataHub to pull metadata from an SSL-enabled ES cluster? Configuring host: "https://ip:9200" results in an assertion error: host contains bad character. If we omit the scheme, we get a connection error: caused by: ProtocolError. DataHub version: v0.8.26, ES version: 7.5.x
From the code, it appears that SSL is not yet supported. If I want to contribute, how do I test whether the code works as intended?
s
You can refer to this guide https://datahubproject.io/docs/metadata-ingestion/developing/ to see how to make changes and test them locally
d
Hey @rich-policeman-92383! We also need the same feature; our ES cluster does not expose plain HTTP. Do you plan to contribute this? Otherwise, maybe we can consider adding that support, but my knowledge of ES is very limited. Could you share your estimate of the complexity of this work, so we can form an initial view? 😄
r
I do not have much expertise in Python either. As far as I can tell, here's how it should be implemented:
• The official ES library for Python already supports HTTPS, and the DataHub client uses that library
• We need to extend this class to support HTTPS
• Some other changes might be required
I will try it this weekend, but do not consider this a guarantee.
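To make the idea in the list above concrete, here is a minimal sketch of how source config values could be turned into connection arguments. The helper name `build_es_client_kwargs` and its parameters are hypothetical and are not taken from the DataHub code. The keyword names (`hosts`, `use_ssl`, `verify_certs`, `ca_certs`, `http_auth`) are the ones the 7.x Python client is understood to accept, so please check them against the client docs for your version.

```python
from typing import Any, Dict, Optional


def build_es_client_kwargs(
    host: str,
    use_ssl: bool = False,
    verify_certs: bool = True,
    ca_certs: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: collect connection options for an ES client."""
    kwargs: Dict[str, Any] = {
        "hosts": [host],
        "use_ssl": use_ssl,
        "verify_certs": verify_certs,
    }
    # Only pass a CA bundle path when one was configured.
    if ca_certs:
        kwargs["ca_certs"] = ca_certs
    # Basic auth is optional; add it only when both parts are present.
    if username is not None and password is not None:
        kwargs["http_auth"] = (username, password)
    return kwargs


kwargs = build_es_client_kwargs("eshost:9200", use_ssl=True, ca_certs="/path/to/ca.pem")
# With the elasticsearch package installed, this would presumably be used as:
#   from elasticsearch import Elasticsearch
#   client = Elasticsearch(**kwargs)
```

Keeping the TLS options in one place like this means the plain HTTP path stays unchanged when `use_ssl` is left at its default.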
d
No expectations, but know that this would also be great for us 🚀 I'll wait to hear from you. Thanks for reporting and summarizing the work needed @rich-policeman-92383, appreciated.
r
Pull request raised: https://github.com/linkedin/datahub/pull/4191. I have tested it, and I was able to ingest metadata from our TLS-enabled ES cluster.
If the PR gets merged, I will update the DataHub docs. With this change, the host should be specified as http://eshost:9200 or https://eshost:9200
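For illustration only, here is a small sketch of how a host value with an optional scheme could be split into its parts. This is not the code from the PR. The function name `parse_es_host`, the choice to treat a bare `host:port` as plain HTTP, and the default port of 9200 are all assumptions of this sketch.

```python
from urllib.parse import urlsplit


def parse_es_host(host: str) -> dict:
    """Hypothetical helper: split an ES host string into scheme, hostname and port."""
    # Assumption: a bare "host:port" with no scheme means plain HTTP.
    if "://" not in host:
        host = "http://" + host
    parts = urlsplit(host)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("missing hostname")
    return {
        "scheme": parts.scheme,
        "hostname": parts.hostname,
        # Assumption: fall back to the usual ES port when none is given.
        "port": parts.port or 9200,
        "use_ssl": parts.scheme == "https",
    }


print(parse_es_host("https://eshost:9200"))
print(parse_es_host("eshost:9200"))
```

Deriving `use_ssl` from the scheme keeps the config to a single `host` field, which matches the format described above.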
d
You are the best @rich-policeman-92383, thank you so much. I'll also try my best to contribute to this project and play a part in its growth.