# troubleshoot
c
Hi everybody! I'm having an error when trying to use secrets for UI ingestion. I configured the recipe for BigQuery using the following structure:
Copy code
source:
    type: bigquery
    config:
        project_id: '${DATAPLATFORM_PROJECT_ID}'
        credential:
            project_id: '${DATAPLATFORM_PROJECT_ID}'
            private_key_id: '${BIGQUERY_PRIVATE_KEY_ID}'
            private_key: '${BIGQUERY_PRIVATE_KEY}'
            client_email: '${BIGQUERY_CLIENT_EMAIL}'
            client_id: '${BIGQUERY_CLIENT_ID}'
sink:
    type: datahub-rest
    config:
        server: 'http://30.222.164.39:8080'
And I'm getting the following error:
Copy code
"Failed to resolve secret with name DATAPLATFORM_PROJECT_ID. Aborting recipe execution."
I double-checked the secret names as suggested by the UI Ingestion Guide and they are correct. Have you guys gone through this, or could you give me any tips on how to proceed? Thanks in advance for your attention! 🙂
s
The actions container needs to be able to use that IP address to contact GMS. If this is running in k8s or docker, can you go inside the actions container and see if you can curl from that container?
Copy code
curl http://30.222.164.39:8080/gms/config
That will test the connectivity
plus1 1
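(For reference: on k8s, that check can also be run without shelling in interactively. A minimal sketch, assuming the actions deployment created by the helm chart is named datahub-acryl-datahub-actions:)
Copy code
# run the connectivity check from inside the actions pod
kubectl exec -n <namespace> deploy/datahub-acryl-datahub-actions -- \
    curl -s http://30.222.164.39:8080/gms/config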
Also, please put recipes/screenshots etc. in threads in the future. It helps the team look at all the messages if the main channels don't have really long messages and the details are in threads.
plus1 1
b
@calm-television-89033! Thanks for the question. It's possible that we are unable to access your backend datahub-gms server, which would result in an inability to resolve this secret. Like aseem mentioned, we need to ensure that the datahub-actions container has network access to datahub-gms!
plus1 1
c
@square-activity-64562 and @big-carpet-38439 I'm using k8s, and since I'm also using
METADATA_SERVICE_AUTH_ENABLED="true"
in the frontend and in datahub-gms, paired with Google Authentication, I realized I was making a mistake by not adding the token to the recipe. So, as instructed in the guide, I added the token (exposed in the recipe for testing), but the failure to resolve the secrets continued. Running the command you suggested,
curl http://30.222.164.39:8080/config
inside the datahub-actions container worked just fine, but when I try using
curl https://datacatalog.falconi.com/api/gms/config
I have to add the
-H 'Authorization: Bearer <access-token>'
header for it to work; otherwise nothing is returned. Recipe:
Copy code
source:
    type: bigquery
    config:
        project_id: '${DATAPLATFORM_PROJECT_ID}'
        credential:
            project_id: '${DATAPLATFORM_PROJECT_ID}'
            private_key_id: '${BIGQUERY_PRIVATE_KEY_ID}'
            private_key: '${BIGQUERY_PRIVATE_KEY}'
            client_email: '${BIGQUERY_CLIENT_EMAIL}'
            client_id: '${BIGQUERY_CLIENT_ID}'
sink:
    type: datahub-rest
    config:
        server: 'http://35.222.164.39:8080'
        token: eyJhbGciOiJIUzI1NiJ9.eyJhY3Rv...
Error:
Copy code
~~~~ Execution Summary ~~~~

RUN_INGEST - {'errors': [],
 'exec_id': 'adf6f957-ce18-423c-8ad7-e9228c214c69',
 'infos': ['2022-03-24 19:30:28.951974 [exec_id=adf6f957-ce18-423c-8ad7-e9228c214c69] INFO: Starting execution for task with name=RUN_INGEST',
           '2022-03-24 19:30:28.965176 [exec_id=adf6f957-ce18-423c-8ad7-e9228c214c69] INFO: Caught exception EXECUTING '
           'task_id=adf6f957-ce18-423c-8ad7-e9228c214c69, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\n'
           '  File "/usr/local/lib/python3.9/site-packages/acryl/executor/execution/default_executor.py", line 119, in execute_task\n'
           '    self.event_loop.run_until_complete(task_future)\n'
           '  File "/usr/local/lib/python3.9/site-packages/nest_asyncio.py", line 81, in run_until_complete\n'
           '    return f.result()\n'
           '  File "/usr/local/lib/python3.9/asyncio/futures.py", line 201, in result\n'
           '    raise self._exception\n'
           '  File "/usr/local/lib/python3.9/asyncio/tasks.py", line 256, in __step\n'
           '    result = coro.send(None)\n'
           '  File "/usr/local/lib/python3.9/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 74, in execute\n'
           '    recipe: dict = self._resolve_recipe(validated_args.recipe, ctx)\n'
           '  File "/usr/local/lib/python3.9/site-packages/acryl/executor/execution/sub_process_ingestion_task.py", line 147, in _resolve_recipe\n'
           '    raise TaskError(f"Failed to resolve secret with name {match}. Aborting recipe execution.")\n'
           'acryl.executor.execution.task.TaskError: Failed to resolve secret with name DATAPLATFORM_PROJECT_ID. Aborting recipe execution.\n']}
Execution finished with errors.
b
@calm-television-89033 Just to rule something out, can you try replacing the DATAPLATFORM_PROJECT_ID in the recipe with the actual project id?
Keep all the other secrets, then re-run and see if it works.
Based on what you're saying, things should work.
And one other thing I'd ask is that you check the
datahub-gms
pod to see if you find any related logs
plus1 1
c
The error now happens on the next secret to resolve: 'Failed to resolve secret with name BIGQUERY_PRIVATE_KEY_ID. Aborting recipe execution.' A test I was thinking about is trying to get the secret value from inside the datahub-actions container. Could you pass me the command for the GraphQL API call?
b
@calm-television-89033 Interesting... Do you see anything in the datahub-gms logs? Also, have you changed these env variables: DATAHUB_SYSTEM_CLIENT_SECRET, DATAHUB_SYSTEM_CLIENT_ID?
If you do a call from the container it won't normally work, mainly because you need some special permissions.
But if you use your access token it should work. Can you try issuing this query? https://github.com/datahub-project/datahub/blob/master/datahub-graphql-core/src/main/resources/ingestion.graphql#L13
Something like
Copy code
query getSecretValues {
    getSecretValues(input: { secrets: ["BIGQUERY_PRIVATE_KEY_ID"] }) { name value }
}
plus1 1
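(For reference, a way to issue that query with curl; a sketch assuming GMS serves GraphQL at /api/graphql, and that <access-token> is a personal access token generated in the UI:)
Copy code
# POST the getSecretValues query to the GMS GraphQL endpoint
curl -X POST http://30.222.164.39:8080/api/graphql \
    -H 'Authorization: Bearer <access-token>' \
    -H 'Content-Type: application/json' \
    --data '{"query": "query getSecretValues { getSecretValues(input: { secrets: [\"BIGQUERY_PRIVATE_KEY_ID\"] }) { name value } }"}'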
Maybe you can try to recreate the secret?
It's strange because the container can clearly talk to the backend if you are getting these logs from the UI
We may need to have a call 🙂
plus1 1
c
Hi again! I recreated the secrets and checked the datahub-gms logs after a new run, but I'm not sure what's going on (logs attached). I'm probably messing something up, but I couldn't run the GraphQL query either. I'd really like to have a call and maybe get some advice haha, I'm available anytime ❤️
Hi @big-carpet-38439, how are you? Sorry to bother you again, but I still couldn't solve the problem. Can you help me? 🙃
b
Hi Thales. Rebooting my context here!
Okay this is very useful information
It seems that there is something wrong with encrypting and decrypting your secrets. Can you do us a favor and ensure that you have a secret in your Kubernetes namespace called the following:
Copy code
datahub-encryption-secrets
Also, can you confirm that this value has not changed in your values.yaml:
Copy code
provisionSecret: true
If you do indeed have a datahub-encryption-secrets secret, we can view it using
Copy code
kubectl edit secret datahub-encryption-secrets -n <namespace>
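(A non-interactive alternative that prints just the decoded key; a sketch assuming the key lives under the encryption_key_secret field, as referenced later in this thread:)
Copy code
# print the decoded encryption key without opening an editor
kubectl get secret datahub-encryption-secrets -n <namespace> \
    -o jsonpath='{.data.encryption_key_secret}' | base64 -d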
Also want to ensure you haven't done anything custom with these values.yaml parameters:
Copy code
.Values.global.datahub.encryptionKey.secretRef, .Values.global.datahub.encryptionKey.secretKey
It seems that one problem may be that we are getting the wrong encryption key here. This could be a symptom of generating secrets, then generating a new encryption key, then restarting DataHub.
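(For reference, a sketch of how those parameters sit together in the chart's values.yaml. The nesting follows the .Values.global.datahub.encryptionKey paths above, but treat the exact layout as an assumption for your chart version:)
Copy code
global:
  datahub:
    encryptionKey:
      secretRef: ""          # set only if you bring your own k8s secret
      secretKey: ""          # key inside that secret holding the encryption key
      provisionSecret: true  # let the chart generate datahub-encryption-secrets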
c
@big-carpet-38439 I assumed I would only need to put in my custom values file what I need to override from the original. Do I need to add this
provisionSecret: true
setting to my custom values.yaml file? This is how it's configured right now.
values.yaml:
Copy code
datahub-gms:
  extraEnvs:
    - name: METADATA_SERVICE_AUTH_ENABLED
      value: "true"

datahub-frontend:
  image:
    repository: us-docker.pkg.dev/data-platform-327618/datacatalog-frontend/datacatalog-frontend
    tag: "0.1.e5ebasf39asdffd3e360cf4a5f0asdf65ceaf5"
  extraEnvs:
    - name: METADATA_SERVICE_AUTH_ENABLED
      value: "true"
    - name: AUTH_OIDC_ENABLED
      value: "true"
    - name: AUTH_OIDC_CLIENT_ID
      value: "<http://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com|xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com>"
    - name: AUTH_OIDC_CLIENT_SECRET
      value: "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
    - name: AUTH_OIDC_DISCOVERY_URI
      value: "<https://accounts.google.com/.well-known/openid-configuration>"
    - name: AUTH_OIDC_BASE_URL
      value: "<https://datacatalog.falconi.com>"
    - name: AUTH_OIDC_USER_NAME_CLAIM
      value: "email"
    - name: AUTH_OIDC_USER_NAME_CLAIM_REGEX
      value: "([^@]+)"
    - name: AUTH_OIDC_SCOPE
      value: "openid profile email"
To do the deployment, I'm using this command:
helm upgrade datahub datahub/datahub -f values.yaml
Also, I have run the command
kubectl edit secret datahub-encryption-secrets
and the value for
datahub-encryption-secrets
is present in the output. Thanks for your help and patience.
b
Going to try to reproduce this locally
Okay so I think we found a pretty fun little issue here 🙂
Still triaging... If you have time, can you try doing this:
1. kubectl edit secret datahub-encryption-secrets -n <namespace>
2. Update encryption_key_secret to be equal to the following:
a3FNTXp3SE5VeDdKUTNZdWxkaUU=
3. Restart the DataHub deployments (gms)
4. Remove and recreate your secrets
5. Run the recipe again
This is based on a suspicion I'm currently trying to validate.
I believe this has to do with a special character ending up in your encryption key 😞
If you use the new key secret above, we remove that special character
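(If editing the secret interactively is awkward, the same change can be applied in one step; a sketch. The value above is already base64-encoded, so it goes into data as-is, and the gms deployment name assumes a helm release called datahub, per the helm upgrade command earlier:)
Copy code
# overwrite the encryption key, then bounce gms so it picks the new key up
kubectl -n <namespace> patch secret datahub-encryption-secrets --type merge \
    -p '{"data":{"encryption_key_secret":"a3FNTXp3SE5VeDdKUTNZdWxkaUU="}}'
kubectl -n <namespace> rollout restart deploy/datahub-datahub-gms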
c
Hi @big-carpet-38439! It worked perfectly after following your step-by-step, apparently it was what you suspected! Thank you so much for the help man, you and the DataHub team are awesome!
teamwork 1
b
Thank you for the kind words, @calm-television-89033! And sorry for the trouble here; we need to ensure none of the generated secrets contain these strange characters going forward 🙂 cc @early-lamp-41924
❤️ 1
c
Hi @big-carpet-38439, I have the same issue reported here after upgrading to DataHub 0.8.38 (it was working fine before). Should I go and perform the suggested steps you mentioned above? My current
encryption_key_secret
value is as follows:
"encryption_key_secret":"dFk3VlFTelRTNjJOcTlGMFZOcjE="
Logs from
datahub-actions
Copy code
Exception: Failed to retrieve secrets from DataHub.
[2022-06-10 13:03:33,538] DEBUG    {datahub.emitter.rest_emitter:233} - Attempting to emit to DataHub GMS; using curl equivalent to:
curl -X POST -H 'User-Agent: python-requests/2.27.1' -H 'Accept-Encoding: gzip, deflate' -H 'Accept: */*' -H 'Connection: keep-alive' -H 'X-RestLi-Protocol-Version: 2.0.0' -H 'Content-Type: application/json' -H 'Authorization: Basic __datahub_system:MXc6v273fGAsoHLxxHNz5EGnhcf15a6b' --data '{"proposal": {"entityType": "dataHubExecutionRequest", "entityKeyAspect": {"value": "{\"id\": \"33d70b99-d93b-4f14-aff9-bcb8919ea6aa\"}", "contentType": "application/json"}, "changeType": "UPSERT", "aspectName": "dataHubExecutionRequestResult", "aspect": {"value": "{\"status\": \"FAILURE\", \"startTimeMs\": 1654866213455, \"durationMs\": 83, \"report\": \"~~~~ Execution Summary ~~~~\\n\\nRUN_INGEST - {'"'"'errors'"'"': [],\\n '"'"'exec_id'"'"': '"'"'33d70b99-d93b-4f14-aff9-bcb8919ea6aa'"'"',\\n '"'"'infos'"'"': ['"'"'2022-06-10 13:03:33.517274 [exec_id=33d70b99-d93b-4f14-aff9-bcb8919ea6aa] INFO: Starting execution for task with name=RUN_INGEST'"'"',\\n           '"'"'2022-06-10 13:03:33.537627 [exec_id=33d70b99-d93b-4f14-aff9-bcb8919ea6aa] INFO: Caught exception EXECUTING '"'"'\\n           '"'"'task_id=33d70b99-d93b-4f14-aff9-bcb8919ea6aa, name=RUN_INGEST, stacktrace=Traceback (most recent call last):\\\\n'"'"'\\n           '"'"'  File \\\"/usr/local/lib/python3.9/site-packages/acryl/executor/execution/default_executor.py\\\", line 119, in execute_task\\\\n'"'"'\\n           '"'"'    self.event_loop.run_until_complete(task_future)\\\\n'"'"'\\n           '"'"'  File \\\"/usr/local/lib/python3.9/site-packages/nest_asyncio.py\\\", line 81, in run_until_complete\\\\n'"'"'\\n           '"'"'    return f.result()\\\\n'"'"'\\n           '"'"'  File \\\"/usr/local/lib/python3.9/asyncio/futures.py\\\", line 201, in result\\\\n'"'"'\\n           '"'"'    raise self._exception\\\\n'"'"'\\n           '"'"'  File \\\"/usr/local/lib/python3.9/asyncio/tasks.py\\\", line 256, in __step\\\\n'"'"'\\n           '"'"'    result = coro.send(None)\\\\n'"'"'\\n           '"'"'  File \\\"/usr/local/lib/python3.9/site-packages/acryl/executor/execution/sub_process_ingestion_task.py\\\", line 74, in execute\\\\n'"'"'\\n           '"'"'    recipe: dict = self._resolve_recipe(validated_args.recipe, ctx)\\\\n'"'"'\\n           '"'"'  File \\\"/usr/local/lib/python3.9/site-packages/acryl/executor/execution/sub_process_ingestion_task.py\\\", line 147, in _resolve_recipe\\\\n'"'"'\\n           '"'"'    raise TaskError(f\\\"Failed to resolve secret with name {match}. Aborting recipe execution.\\\")\\\\n'"'"'\\n           '"'"'acryl.executor.execution.task.TaskError: Failed to resolve secret with name SNOWFLAKE_USERNAME. Aborting recipe execution.\\\\n'"'"']}\\nExecution finished with errors.\\n\"}", "contentType": "application/json"}}}' '<http://datahub-datahub-gms:8080/aspects?action=ingestProposal>'
cc: @most-plumber-32123
Went through the mentioned steps and the issue got resolved! Thanks @big-carpet-38439
b
Awesome! Thanks for letting me know, Muhamed!
p
@big-carpet-38439 I am also facing this issue but the steps didn’t work for me. Could you please help here?
s
@crooked-market-47728