# troubleshoot
i
Hi, after updating to datahub v0.8.40 in our production system (which is weird, as it works in the sandboxed dev environment), all users can authenticate via OIDC, but all content appears to be inaccessible with an “Unauthorized” message. I tried to check the policies at
<datahub-url>/policies
only to get a
Unauthorized to perform this action. Please contact your DataHub administrator. (code 403)
I wanted to log in as the datahub user, but logging out just redirects me to the homepage, and the logs of the
datahub-frontend
pod show the following error:
13:45:34 [application-akka.actor.default-dispatcher-47862] ERROR auth.sso.oidc.OidcCallbackLogic - Unable to renew the session. The session store may not support this feature
I also tried adding myself to
user.props
but it does not have any effect. Are there other ways to add policies? How would I go about debugging this?
b
Can you log in as the datahub user via <url to datahub>/logIn?
i
That URL redirects me to the homepage. I tried in incognito and it redirected me to the OAuth2 proxy instead of asking for a password
s
Try using
https://<url to datahub>/login
if using https
i
Yep, that’s redirecting to OAuth2, as well
But, the authentication somehow works, it’s the authorization that fails
b
Authorization is failing? We have a known issue in v0.8.40 where tokens can intermittently fail in authorization (however, this should be rare). But v0.8.41 should have addressed this issue. Can you try using that version? cc @incalculable-ocean-74010
i
As John mentioned, please upgrade to 0.8.41, and if errors still occur, please share the gms logs.
Also, it is generally good practice to have JAAS authentication enabled for the datahub user, with a securely kept password for that account. It bypasses OIDC in case you need to debug the system and its policies.
i
I upgraded to version 0.8.41, and the authorization issues persist. The gms logs are crowded with similar messages:
12:22:03.686 [qtp1873653341-136] WARN  c.d.a.a.AuthenticatorChain:70 - Authentication chain failed to resolve a valid authentication. Errors: [(com.datahub.authentication.authenticator.DataHubSystemAuthenticator,Failed to authenticate inbound request: Authorization header is missing 'Basic' prefix.), (com.datahub.authentication.authenticator.DataHubTokenAuthenticator,Failed to authenticate inbound request: Unable to verify the provided token.)]
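When the logs are crowded with this warning, it can help to split each line into one failure reason per authenticator. The following is a minimal sketch that assumes the `Errors: [(class,reason), ...]` layout seen in the line above; the function name `parse_auth_chain_errors` is hypothetical and not part of DataHub.

```python
import re

# Matches "(fully.qualified.ClassName,reason)" entries inside the "Errors: [...]" list.
# Assumption: entries are separated by ", (" and the list ends the log line.
_PAIR = re.compile(r"\((com\.[\w.]+),(.*?)\)(?=, \(|\]$)")


def parse_auth_chain_errors(line: str) -> dict:
    """Return {authenticator simple name: failure reason} for one gms log line."""
    _, sep, errors = line.partition("Errors: ")
    if not sep:
        # Not an AuthenticatorChain warning in the expected format.
        return {}
    return {
        fqcn.rsplit(".", 1)[-1]: reason.strip()
        for fqcn, reason in _PAIR.findall(errors.strip())
    }
```

Counting the reasons over a whole log file would show whether every request fails the same way (here: no `Basic` header for the system authenticator and an unverifiable token for the token authenticator).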
i
Is this a consistent issue on all operations you perform? If so this isn't the issue that was fixed on 0.8.41. Most likely the policies in this datahub installation are somehow corrupted/inaccessible.
i
ok, is there a way to renew them?
It is consistently returning unauthorized in the UI when trying to visit a specific page (e.g. profile page, datasets, glossary terms, pipelines, …). However, when I query my CorpUser urn on the Swagger UI, I do get data back
i
@big-carpet-38439 can you help here? This looks like a policy issue. Is there a way to reset policies to their defaults?
a
Setting
AUTH_POLICIES_ENABLED=false
for the gms deployment didn't have any effect on the errors. Currently users are able to browse the lists of datasets, flows etc., but trying to actually view any entity results in an
Unauthorized
view
Turns out that all the policies have been deleted from the production instance, most likely during a version update. So like you mentioned, the best way seems to be to reset to the default policies, which would give us back the default "allow all" platform policy described here: https://datahubproject.io/docs/authorization/policies#managing-policies Is there a way to define the policies outside of the UI or export them, other than a db backup?
b
I also want to know, actually. I have a hacky script that uses the Python API to query all the policies and save them to JSON, but I realised that the JSON file can't be ingested like a regular metadata file. In the end I read the JSON and emit the aspect to gms instead.
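The "read the JSON and emit the aspect" step might look roughly like the sketch below. It is not the script mentioned above. It assumes the saved file maps each policy urn to its `dataHubPolicyInfo` aspect, and that the payloads are later POSTed to the gms `/aspects?action=ingestProposal` endpoint; the function name `build_policy_proposals` is hypothetical. Only the payload construction is shown, so nothing is sent over the network.

```python
import json


def build_policy_proposals(policies: dict) -> list:
    """Turn {policy urn: dataHubPolicyInfo dict} into ingestProposal request bodies."""
    bodies = []
    for urn, info in policies.items():
        bodies.append({
            "proposal": {
                "entityType": "dataHubPolicy",
                "entityUrn": urn,
                "changeType": "UPSERT",
                "aspectName": "dataHubPolicyInfo",
                # The aspect travels as a JSON string inside a generic wrapper.
                "aspect": {
                    "contentType": "application/json",
                    "value": json.dumps(info),
                },
            }
        })
    return bodies
```

Each returned body would then be sent with whatever HTTP client and credentials the target environment requires. Keeping the payload construction separate from the request makes it easy to inspect what would be written before touching production.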
i
@big-carpet-38439, so there is no way to back up and restore the policies between different environments?
a
After some more digging, I found out that the policies still existed in the database, but not in the elastic index. Running the elastic index recreation job fixed the issues, and now the policies work again
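The mismatch described here (policies present in the database but absent from the search index) can be checked with a simple set difference once both urn lists are in hand. This is a minimal sketch: how the lists are obtained is left out, and the likely sources (the `metadata_aspect_v2` table filtered on the `dataHubPolicyInfo` aspect, and a policy index such as `datahubpolicyindex_v2`) are assumptions, not something confirmed in this thread. The function name is hypothetical.

```python
def policies_missing_from_index(db_urns, index_urns):
    """Return policy urns stored in the database but absent from the search index."""
    # Sets ignore duplicates (e.g. multiple aspect versions of the same policy).
    return sorted(set(db_urns) - set(index_urns))
```

A non-empty result would suggest the index is out of sync with the database, which is the situation the index recreation job resolved here.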
b
Is this about migrating instances or upgrading?
In terms of migrating, you can certainly back up and restore using a normal MySQL dump, which is how we recommend backing up ALL data on DataHub.