# troubleshoot
b
Hello everyone, I'm trying to create the first Personal Access Token programmatically but I get "401 Client Error: Unauthorized for url". This issue has already been mentioned in these threads, but even after following the steps described there it does not work:
• https://app.slack.com/client/TUMKD5EGJ/search/search-eyJkIjoicHlqd3QiLCJxIjoiVTA0UjdRMUdNRUciLCJyIjoicHlqd3QifQ==/thread/CV2UVAPPG-1678295093.481849
• https://app.slack.com/client/TUMKD5EGJ/search/search-eyJkIjoicHlqd3QiLCJxIjoiVTA0Uj[…]NRUciLCJyIjoicHlqd3QifQ==/thread/C029A3M079U-1668589539.838859
Here are my steps:
Configuration (Helm: app version v0.10.0, chart version v0.2.151)
• Set the METADATA_SERVICE_AUTH_ENABLED variable to true in the Helm values for datahub-gms & datahub-front
• Enable metadata_service_authentication with no other changes:
metadata_service_authentication:
      enabled: true
      systemClientId: "__datahub_system"
      systemClientSecret:
        secretRef: "datahub-auth-secrets"
        secretKey: "token_service_signing_key"
      tokenService:
        signingKey:
          secretRef: "datahub-auth-secrets"
          secretKey: "token_service_signing_key"
        salt:
          secretRef: "datahub-auth-secrets"
          secretKey: "token_service_salt"
      # Set to false if you'd like to provide your own auth secrets
      provisionSecrets:
        enabled: true
        autoGenerate: true
      # Only specify if autoGenerate set to false
      #  secretValues:
      #    secret: <secret value>
      #    signingKey: <signing key value>
      #    salt: <salt value>
=> I now have a secret with token_service_signing_key: f2E0BZoNKlr7CEu71kjZjAduRNCsePKS
Create the access token programmatically
• Decode an access token created in the UI and get the payload:
{
  "actorType": "USER",
  "actorId": "datahub",
  "type": "PERSONAL",
  "version": "2",
  "jti": "6ec82917-d39a-4c52-9a5e-5d4caacf6b7d",
  "sub": "datahub",
  "exp": 1680015431,
  "iss": "datahub-metadata-service"
}
• I validated the service key by recreating the token by my own means (I just used https://jwt.io/ with the payload, header and token signing key)
• Create a new token in Python:
import jwt
import time

# I noticed that you have to encode the service key in ASCII to get the same verified signature as the token created on the UI (anyway I tested with or without for the same result)
secret_signing_key = "f2E0BZoNKlr7CEu71kjZjAduRNCsePKS".encode('ascii')  
payload = {
  "actorType": "USER",
  "actorId": "datahub",
  "type": "PERSONAL",
  "version": "2",
  "jti": "6ec82917-d39a-4c52-9a5e-5d4caacf6b7d",
  "sub": "datahub",
  "exp": 1680015431,
  "iss": "datahub-metadata-service"
}
header = { "alg": "HS256" }
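# Sign with the key defined above (the original snippet referenced an undefined name, `secret`)
token = jwt.encode(payload, secret_signing_key, headers=header)
print(token)
# Output: eyJhbGciOiJIUzI1NiJ9…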
• Decode my new access token to check that it is well formed => all looks good
cURL (the curl command proposed when creating a token in the UI)
curl -X POST "http://datahub-front-url/api/graphql" --header 'Authorization: Bearer eyJhbGciOiJIUzI1NiJ9…' --header 'Content-Type: application/json' --data-raw '{"query": "{\n me {\n corpUser {\n username\n }\n }\n}","variables":{}}'
=> HTTP ERROR 401 Unauthorized to perform this action
Datahub API
datahub ingest -c /tmp/ch_recipe.yml
ch_recipe.yml:
source:
    type: clickhouse
    config:
        host_port: "clickhouse-install.clickhouse.svc.cluster.local:8123"
        username: ****
        password: ****
        platform_instance: DatabaseNameToBeIngested
        include_views: true
        include_tables: true
sink:
    type: "datahub-rest"
    config:
        server: "http://datahub-gms.datahub.svc.cluster.local:8080"
        token: "eyJhbGciOiJIUzI1NiJ9…."
=> 401 Client Error: Unauthorized for url
Everything works fine if I use a token created in the UI.
Questions
Has anyone managed to create a token programmatically and use it for queries? Is it really possible to do that now?
I also noticed (if I understood correctly) that if I create a token via the UI, retrieve it, and delete it immediately afterwards, it behaves as if I had created the token programmatically, and I get the same result. If we can really create our own token with the token signing key, we should be able to use this token (whether or not it is present in the UI) to query DataHub. On my side it doesn't work.
I remain available if you need more information! 🙂 Thanks for your time and I hope someone can help me out!
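The manual check described above (decoding the payload and recreating the signature from the signing key) can be sketched with the Python standard library alone. This is a minimal illustration of generic HS256 JWT signing, not DataHub's own validation logic; the helper names, the key, and the reduced claim set are placeholders chosen for the example.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(raw: bytes) -> str:
    """Base64url-encode without padding, as compact JWTs do."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the stripped padding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(header: dict, payload: dict, key: bytes) -> str:
    """Build a compact JWT signed with HMAC-SHA256."""
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url_encode(signature)}"


def decode_payload(token: str) -> dict:
    """Return the payload claims WITHOUT checking the signature."""
    return json.loads(b64url_decode(token.split(".")[1]))


def signature_matches(token: str, key: bytes) -> bool:
    """Recompute the HS256 signature and compare it to the token's."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(expected, b64url_decode(signature))


# Placeholder key and claims, for illustration only.
key = b"example-signing-key"
claims = {"actorType": "USER", "actorId": "datahub", "type": "PERSONAL", "sub": "datahub"}
token = sign_hs256({"alg": "HS256"}, claims, key)

print(decode_payload(token)["actorId"])
print(signature_matches(token, key))
print(signature_matches(token, b"wrong-key"))
```

A matching signature only shows that the key and the token agree. As the deleted-token observation above suggests, it does not by itself mean DataHub will accept the token.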
a
Hi @busy-mechanic-8014, have you seen this doc? You may be running into issues with your own personal access token when trying to call the GraphQL API: https://datahubproject.io/docs/api/tutorials/references/generate-access-token
CC: @echoing-airport-49548
e
Hey @busy-mechanic-8014 is there a reason why you aren’t using our GraphQL API to generate the access token?
b
@astonishing-answer-96712 Hi, sorry for my absence but your link isn't working (404 Not found) 😞
@echoing-airport-49548 Hi, no particular reason, but neither approach works. I followed this doc (https://datahubproject.io/docs/api/graphql/token-management#generating-access-tokens). It works perfectly if I use the GraphQL UI, but with the curl command I get "Failed to authenticate inbound request: Authorization header is missing Authorization header" on my gms pod. But how can I provide the Authorization header if I don't have a token yet? I know I can generate the first token in the UI, but I want to do it programmatically 😢 Here is my curl command:
curl --location --request POST 'datahub-gms.datahub.svc.cluster.local:8080/api/graphql' \
--header 'X-DataHub-Actor: urn:li:corpuser:datahub' \
--header 'Content-Type: application/json' \
--data-raw '{ "query":"mutation { createAccessToken(input: { type: PERSONAL, actorUrn: \"urn:li:corpuser:datahub\", duration: ONE_HOUR, name: \"my personal token\" } ) { accessToken metadata { id name description} } }", "variables":{}}'
I also tried the datahub-front endpoint, but nothing shows up in the pod logs (front & gms).
a
@bulky-soccer-26729 may be able to help here
b
ah yes I believe this is just a limitation with this feature - if you enable metadata authentication and require an auth token, then you can't make programmatic api requests without it. so i believe you're required to generate a token through the UI and then use that token in your api requests
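As a rough sketch of that workflow, the snippet below builds the same createAccessToken request as the curl command earlier in the thread, but with an Authorization header carrying a token generated in the UI. It uses only the standard library. The `http://` scheme, the helper name `build_graphql_request`, and the token placeholder are assumptions for illustration, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# Mutation copied from the curl command earlier in the thread.
CREATE_TOKEN_MUTATION = (
    'mutation { createAccessToken(input: { type: PERSONAL, '
    'actorUrn: "urn:li:corpuser:datahub", duration: ONE_HOUR, '
    'name: "my personal token" } ) '
    "{ accessToken metadata { id name description} } }"
)


def build_graphql_request(url: str, bearer_token: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a GraphQL POST request with a bearer token."""
    body = json.dumps({"query": query, "variables": {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_graphql_request(
    # Assumed scheme; the host and path come from the curl command in the thread.
    "http://datahub-gms.datahub.svc.cluster.local:8080/api/graphql",
    "<token generated in the UI>",  # placeholder, not a real token
    CREATE_TOKEN_MUTATION,
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))

# Sending requires a reachable DataHub instance, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```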
b
OK, thank you both for your help! I'll stop trying then 🙂
One last question: is it possible to add data sources (clickhouse, ...) directly during the deployment of DataHub with a ConfigMap / Secret? That would save me from needing a token and from adding them after the deployment is complete.
b
hm how are you adding data sources now? do you mean like ingesting data from clickhouse etc.?
b
So far I have been using either the CLI (pip install 'acryl-datahub[clickhouse]') or the Ingestion/Sources section of the frontend. And yes, that's exactly what I mean.