Namely, LDAP or Kerberos?
# ingestion
i
Namely, LDAP or Kerberos?
b
LDAP should just be username / password, right? I don't believe Kerberos is supported yet, though it is on the roadmap. @gray-shoe-75895 to confirm
g
I think Pedro is asking about using LDAP for authentication, not about ingesting from LDAP - we haven't tested either ourselves, but we use SQLAlchemy internally, which should support both
👍 1
i
That's correct @gray-shoe-75895
About Kerberos, I believe PyHive does not support everything. A colleague of mine opened a PR but it has not been merged yet: https://github.com/dropbox/PyHive/pull/325
I am under the impression that the ingestion framework may not work very well with Kerberos or LDAP logins for Hive, since the connection string has a fixed format: https://github.com/linkedin/datahub/blob/cda1ce458974dda1cdc59b2c0957369e9524ea5e/metadata-ingestion/src/datahub/ingestion/source/sql_common.py#L62
What are your thoughts @gray-shoe-75895 ?
g
Yep I think you're right - need to introduce more flexibility into that config model, or alternatively just accept an SQLAlchemy connection string
The main fixes required are that the password should be optional and the connect arg options need to be more visible - did I understand that correctly?
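For illustration, a minimal sketch of what those two fixes could look like. This is not the actual DataHub config model: `SketchSQLConfig` and its field names are hypothetical, and it uses a plain dataclass to stay self-contained. The idea is that the credentials part of the URL is only emitted when it is set, and any driver-specific options are carried separately so they can be handed to SQLAlchemy's `create_engine(..., connect_args=...)`.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional
from urllib.parse import quote_plus


@dataclass
class SketchSQLConfig:
    """Hypothetical stand-in for the ingestion source config (not the real class)."""

    scheme: str
    host_port: str
    username: Optional[str] = None
    password: Optional[str] = None
    database: Optional[str] = None
    # Driver-specific options passed through unchanged. The keys shown in the
    # usage below are an assumption about what a Kerberos setup might need.
    connect_args: Dict[str, Any] = field(default_factory=dict)

    def get_sql_alchemy_url(self) -> str:
        """Build the connection URL, skipping any part that is not configured."""
        url = f"{self.scheme}://"
        if self.username is not None:
            url += quote_plus(self.username)
            if self.password is not None:
                url += ":" + quote_plus(self.password)
            url += "@"
        url += self.host_port
        if self.database is not None:
            url += "/" + self.database
        return url


# Kerberos-style config: no username or password in the URL at all.
kerberos_cfg = SketchSQLConfig(
    scheme="hive",
    host_port="localhost:10000",
    connect_args={"auth": "KERBEROS", "kerberos_service_name": "hive"},
)

# Username / password config (e.g. LDAP-backed login).
password_cfg = SketchSQLConfig(
    scheme="hive",
    host_port="localhost:10000",
    username="pedro",
    password="p@ss",
    database="default",
)
```

With SQLAlchemy installed, the two pieces would then be combined roughly as `create_engine(cfg.get_sql_alchemy_url(), connect_args=cfg.connect_args)`.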
i
Something like that yes
I will try some hard-coded changes (🔨) on my side to confirm whether it works and let you know
✅ 1
Hey @gray-shoe-75895, sorry for taking so long to come back to this. I was able to connect with Kerberos after making the config model more flexible. Here are the changes if you are interested: https://github.com/pedro93/datahub/commit/0e556e8eb7fba3ca6fc698a37d408d973fe72ead
If you agree I can open a PR with these changes.
g
Yep looks pretty good - opening a PR would be great. A minor nit is that we should use Optional[str] = None instead of the empty string
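A small sketch of why `None` tends to be the safer default here (the function name is hypothetical, purely for illustration): with `Optional[str] = None`, "not configured" stays distinguishable from "configured as an empty string", which an empty-string default would blur.

```python
from typing import Optional


def describe_password(password: Optional[str] = None) -> str:
    """Hypothetical helper: report which of the three password states we are in."""
    if password is None:
        # Field was never set, e.g. Kerberos auth where no password applies.
        return "no password configured"
    if password == "":
        # Field was explicitly set to an empty value.
        return "empty password configured"
    return "password configured"
```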
b
this is awesome, thanks pedro!
✅ 1