# troubleshoot
h
Hi Everyone, we upgraded DataHub from 0.8.19 to 0.8.32. But when we try to access the Analytics tab, we get the following error -
Kubectl logs for the gms pod -
Copy code
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [<https://elasticsearchurl:443>], URI [/datahub_usage_event/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"datahub_usage_event","node":"2yv0XgYiShOra6hP8DB1DA","reason":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}}],"caused_by":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory.","caused_by":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}}},"status":400}
		at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:302)
		at org.elasticsearch.client.RestClient.performRequest(RestClient.java:272)
		at org.elasticsearch.client.RestClient.performRequest(RestClient.java:246)
		at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1613)
		... 21 common frames omitted
Caused by: org.elasticsearch.ElasticsearchException: Elasticsearch exception [type=illegal_argument_exception, reason=Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory.]
	at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:496)
	at org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:407)
	at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:437)
	at org.elasticsearch.ElasticsearchException.failureFromXContent(ElasticsearchException.java:603)
	at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:179)
	... 24 common frames omitted
e
Hmn you didn’t run into this error before?
Are you using AWS opensearch?
h
@early-lamp-41924 - Nope! This is the first time encountering it.
We are using Elasticsearch
e
So this means that the elasticsearch setup job did not run correctly
Please follow this process (see the command sketch below):
1. Stop gms from running (kill the container for docker; set numReplicas to 0 for kubernetes)
2. Delete the datahub_usage_event index by curling elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-delete-index.html
3. Rerun the elasticsearch-setup-job (with the correct parameters - i.e. USE_AWS_ELASTICSEARCH should not be set)
4. Then start gms back again
plus1 1
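(A rough command-line sketch of the steps above, assuming a Helm install; the namespace, deployment/job names and the Elasticsearch URL are placeholders and will differ per setup.)
Copy code
# NOTE: namespace, deployment/job names and the ES URL are assumptions - adjust to your install
# 1. Stop gms - scale the deployment down to 0 replicas
kubectl -n datahub scale deployment datahub-datahub-gms --replicas=0

# 2. Delete the datahub_usage_event index
curl -XDELETE "https://elasticsearchurl:443/datahub_usage_event"

# 3. Rerun the elasticsearch-setup-job, e.g. delete the completed job and let helm recreate it
#    (USE_AWS_ELASTICSEARCH should NOT be set for plain Elasticsearch)
kubectl -n datahub delete job datahub-elasticsearch-setup-job
helm upgrade datahub datahub/datahub --values values.yaml

# 4. Start gms back up
kubectl -n datahub scale deployment datahub-datahub-gms --replicas=1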
h
Oh, is USE_AWS_ELASTICSEARCH not supposed to be set?
will try out the above options and update
e
if you aren’t using AWS opensearch, you should either set that to false or not set it!
h
@early-lamp-41924 - for numReplicas, can you share the path in the yaml to set it? Also, are you referring to this (attached screenshot)? I see replicaCount in the deployment.yaml - is this the one you are referring to?
e
Yes!
h
Ok will update in a bit.
@early-lamp-41924 - Getting this error when deploying the pods - Warning Evicted 3m36s kubelet The node was low on resource: ephemeral-storage. Any suggestions on this?
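(A sketch of a few commands that can show what is putting ephemeral-storage pressure on the node; the namespace and node name are placeholders.)
Copy code
# NOTE: namespace and node name are assumptions - adjust to your cluster
# Recent eviction events in the namespace
kubectl -n datahub get events --sort-by='.lastTimestamp' | grep -i evict

# Node conditions - look for DiskPressure
kubectl describe node <node-name> | grep -A8 "Conditions"

# Evicted pods stay around in Failed state; remove them once inspected
kubectl -n datahub delete pods --field-selector=status.phase=Failed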
e
when starting elasticsearch setup job?
h
@early-lamp-41924 - yes. Killed all the pods and redeployed, but was still facing the same issue. So I switched to a different cluster and followed the steps you shared. The application got deployed without the previous ephemeral-storage issue, but I'm still getting the same error when accessing Analytics. Sharing the values.yaml file. (Also, a minor correction to the above: we are using AWS Elasticsearch, though not the OpenSearch engine.)
e
can you post the list of indices in your elasticsearch cluster?
ah
so you are using AWS elasticsearch
please follow the same process with USE_AWS_ELASTICSEARCH=true
on elasticsearch setup job
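(A sketch of rerunning the setup job against AWS Elasticsearch; the job name and the values.yaml keys shown in the comment are assumptions based on the DataHub Helm chart, so check your chart version.)
Copy code
# NOTE: job name and values keys are assumptions - verify against your chart
# Delete the completed job so helm can recreate it with the new env var
kubectl -n datahub delete job datahub-elasticsearch-setup-job

# In values.yaml the env var is typically passed to the setup job, e.g.:
#   elasticsearchSetupJob:
#     extraEnvs:
#       - name: USE_AWS_ELASTICSEARCH
#         value: "true"
helm upgrade datahub datahub/datahub --values values.yaml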
h
Yes, kept it as true.
e
remember to delete indices for datahub_usage_event
so two things
please share the list of indices in the elasticsearch cluster
and the elasticsearch-setup-job logs
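(For reference, both can usually be pulled like this; the host, namespace and job name are placeholders.)
Copy code
# NOTE: host, namespace and job name are assumptions - adjust to your setup
# List indices in a readable form
curl -s "https://elasticsearchurl:443/_cat/indices?v"

# Logs from the setup job's pod
kubectl -n datahub logs job/datahub-elasticsearch-setup-job
# or find the pod by label if the job object is gone
kubectl -n datahub get pods -l job-name=datahub-elasticsearch-setup-job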
h
Copy code
List of Indices -
{".opendistro-ism-managed-index-history-2022.03.23-000103":{"aliases":{}},"mlmodelgroupindex_v2_1650034138207":{"aliases":{"mlmodelgroupindex_v2":{}}},"dataset_datasetprofileaspect_v1_1650034205591":{"aliases":{"dataset_datasetprofileaspect_v1":{}}},".opendistro-ism-managed-index-history-2022.03.21-000101":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.05-000116":{"aliases":{}},"datajob_datahubingestionrunsummaryaspect_v1":{"aliases":{}},"dataplatforminstanceindex_v2":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.09-000120":{"aliases":{}},"assertion_assertionruneventaspect_v1":{"aliases":{}},"dataset_operationaspect_v1":{"aliases":{}},"dashboardindex_v2_1650034160616":{"aliases":{"dashboardindex_v2":{}}},".opendistro-ism-managed-index-history-2022.04.01-000112":{"aliases":{}},"dataplatformindex_v2_1650034166557":{"aliases":{"dataplatformindex_v2":{}}},"schemafieldindex_v2_1650034144381":{"aliases":{"schemafieldindex_v2":{}}},"datahubpolicyindex_v2_1650034126926":{"aliases":{"datahubpolicyindex_v2":{}}},"glossarytermindex_v2_1650034155043":{"aliases":{"glossarytermindex_v2":{}}},".opendistro-ism-managed-index-history-2022.03.30-000110":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.02-000113":{"aliases":{}},".opendistro-ism-managed-index-history-2022.03.24-000104":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.14-000125":{"aliases":{}},"mlmodeldeploymentindex_v2_1650034166010":{"aliases":{"mlmodeldeploymentindex_v2":{}}},".opendistro-ism-managed-index-history-2022.03.27-000107":{"aliases":{}},".opendistro-ism-managed-index-history-2022.03.22-000102":{"aliases":{}},".opendistro-ism-managed-index-history-2022.03.25-000105":{"aliases":{}},"mlprimarykeyindex_v2_1650034155317":{"aliases":{"mlprimarykeyindex_v2":{}}},".opendistro-ism-config":{"aliases":{}},"datajobindex_v2_1650034138637":{"aliases":{"datajobindex_v2":{}}},"datahubsecretindex_v2":{"aliases":{}},"domainindex_v2":{"aliases":{}},"datahubexecutionrequestindex_v2":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.06-000117":{"aliases":{}},"datajob_datahubingestioncheckpointaspect_v1":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.11-000122":{"aliases":{}},"corpuserindex_v2_1650034166883":{"aliases":{"corpuserindex_v2":{}}},"dataflowindex_v2_1650034183058":{"aliases":{"dataflowindex_v2":{}}},"dataprocessindex_v2_1650034137624":{"aliases":{"dataprocessindex_v2":{}}},".opendistro-ism-managed-index-history-2022.04.15-000126":{"aliases":{}},"datahubretentionindex_v2":{"aliases":{}},"system_metadata_service_v1_1650034199383":{"aliases":{"system_metadata_service_v1":{}}},"mlfeaturetableindex_v2_1650034137894":{"aliases":{"mlfeaturetableindex_v2":{}}},".opendistro-ism-managed-index-history-2022.03.31-000111":{"aliases":{}},"dataset_datasetusagestatisticsaspect_v1_1650034206075":{"aliases":{"dataset_datasetusagestatisticsaspect_v1":{}}},"datasetindex_v2_1650034188843":{"aliases":{"datasetindex_v2":{}}},"containerindex_v2":{"aliases":{}},".opendistro-ism-managed-index-history-2022.03.28-000108":{"aliases":{}},".opendistro-ism-managed-index-history-2022.03.26-000106":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.18-000129":{"aliases":{".opendistro-ism-managed-index-history-write":{}}},".opendistro-ism-managed-index-history-2022.04.03-000114":{"aliases":{}},"notebookindex_v2":{"aliases":{}},"glossarynodeindex_v2_1650034177479":{"aliases":{"glossarynodeindex_v2":{}}},"datahubingestionsourceindex_v2":{"aliases":{}},"corpgroupindex_v2_1650034132332"
:{"aliases":{"corpgroupindex_v2":{}}},"datahub_usage_event":{"aliases":{}},"graph_service_v1":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.04-000115":{"aliases":{}},"chartindex_v2_1650034194111":{"aliases":{"chartindex_v2":{}}},".opendistro-ism-managed-index-history-2022.04.10-000121":{"aliases":{}},".opendistro-ism-managed-index-history-2022.03.29-000109":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.12-000123":{"aliases":{}},".opendistro-ism-managed-index-history-2022.03.20-000100":{"aliases":{}},"mlfeatureindex_v2_1650034177781":{"aliases":{"mlfeatureindex_v2":{}}},"tagindex_v2_1650034149765":{"aliases":{"tagindex_v2":{}}},"mlmodelindex_v2_1650034172207":{"aliases":{"mlmodelindex_v2":{}}},".tasks":{"aliases":{}},".kibana":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.13-000124":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.17-000128":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.16-000127":{"aliases":{}},"assertionindex_v2":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.08-000119":{"aliases":{}},".opendistro-ism-managed-index-history-2022.04.07-000118":{"aliases":{}}}%
Logs for Elasticsearch - unable to get them; it shows: unable to retrieve container logs for docker:
Used this to delete the index - DELETE /datahub_usage_event
e
logs for elasticsearch-setup-job?
Did it actually run?
h
@early-lamp-41924 - It did run, but when I try to get the logs, that is the error I get.
e
You mean it says Completed, right? Without logs, it is hard to say whether the commands inside actually ran.
Are you building the container yourself?
h
Except for the frontend, everything is from the prebuilt images. Yes, the elasticsearch-setup job completed.
Tried to repeat the process (attached the elasticsearch-setup-job logs):
1. Killed all pods
2. Deployed prerequisites
3. Deleted the index using DELETE /datahub_usage_event
4. Deployed DataHub
e
do you see an index that looks like datahub_usage_event_00000?
Sorry, could you also DELETE /_template/datahub_usage_event_index_template
plus1 1
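(A sketch of both deletes plus a quick verification; the host is a placeholder, and the template name comes from the message above.)
Copy code
# NOTE: the Elasticsearch host is an assumption - use your cluster endpoint
# Remove the old index and the legacy index template
curl -XDELETE "https://elasticsearchurl:443/datahub_usage_event"
curl -XDELETE "https://elasticsearchurl:443/_template/datahub_usage_event_index_template"

# After rerunning the setup job, the template and index should be back,
# with keyword mappings for fields like browserId
curl -s "https://elasticsearchurl:443/_template/datahub_usage_event_index_template"
curl -s "https://elasticsearchurl:443/datahub_usage_event/_mapping"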
h
No, I don't see such an index. Also, now all the pods are getting evicted (after multiple deployments) - anything I should have taken care of?
Ok let me delete that as well
@early-lamp-41924 - Could this be the reason for the pods getting evicted? https://datahubspace.slack.com/archives/C029A3M079U/p1645175959985509 Currently stuck with the cluster not responding.
e
Does the cluster have enough resources?
h
This is something we had been using.
e
Hmn this is something hard for us to help with as it is likely not related to DataHub itself. Can you check with your cloud infra team to see why they are getting evicted?
h
Sure, I am already in a conversation with them.
But is there a recommended set of configurations to use for the Kubernetes deployment?
e
Seems like this cluster is used for other purposes as well? Or is DataHub running 58 pods??
h
No this cluster is not used for anything else except Datahub.
e
can you post
kubectl get pods -n <<namespace>>
3 of the above nodes should be enough to support DataHub
👍 1
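(A few commands that can help confirm whether the nodes have enough headroom; the namespace and node name are placeholders, and kubectl top needs metrics-server installed.)
Copy code
# NOTE: namespace and node name are assumptions - adjust to your cluster
kubectl get pods -n datahub -o wide

# Per-node CPU/memory usage (requires metrics-server)
kubectl top nodes

# Requests and limits already allocated on a node
kubectl describe node <node-name> | grep -A12 "Allocated resources"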
h
@early-lamp-41924 - Edited the cluster to have sufficient nodes and then redeployed the application. Everything looks good. Only this shows up. Need to fix this 🙂
e
Oh this only shows up when there is no metadata on the platform
Once you ingest, it should not show up
h
I see !
Let me ingest.
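(For reference, a minimal ingestion sketch with the datahub CLI; the Glue source, region and GMS address are assumptions, so adjust for your setup.)
Copy code
pip install 'acryl-datahub[glue]'

cat > recipe.yml <<'EOF'
source:
  type: glue
  config:
    aws_region: us-east-1   # assumption - use your region
sink:
  type: datahub-rest
  config:
    server: http://datahub-datahub-gms:8080   # assumption - your GMS address
EOF

datahub ingest -c recipe.yml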
@early-lamp-41924 Was able to ingest the data. But when I try to view the dataset I ingested, I am getting an Unauthorized exception.
e
Can you check the list of policies?
h
I don't see any policies 😞
e
There should be one that enables 'view entity page' for all users
Can you login with the admin account?
h
Even under admin account, I see only the settings page
e
Hmn the datahub account?
That should never be the case. We always have one account with admin privs
h
@early-lamp-41924 Looks like I messed up the DB. Fixed it.
@early-lamp-41924 - Is GraphQL visible by default? Would like to disable it for certain users.
e
You mean graphiql?
h
Yes GraphiQL
Copy code
- 15:41:18.241 [Thread-6133] ERROR c.l.d.g.e.DataHubDataFetcherExceptionHandler:21 - Failed to execute DataFetcher
java.util.concurrent.CompletionException: java.lang.IllegalArgumentException: Failed to update urn:li:tag:DataType=PatientInsuranceData on urn:li:dataset:(urn:li:dataPlatform:glue,da-intelligentmn_qa.270_qa_source_intelligentmn,PROD). urn:li:tag:DataType=PatientInsuranceData does not exist.
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Failed to update urn:li:tag:DataType=PatientInsuranceData on urn:li:dataset:(urn:li:dataPlatform:glue,da-intelligentmn_qa.270_qa_source_intelligentmn,PROD). urn:li:tag:DataType=PatientInsuranceData does not exist.
	at com.linkedin.datahub.graphql.resolvers.mutate.util.LabelUtils.validateInput(LabelUtils.java:287)
	at com.linkedin.datahub.graphql.resolvers.mutate.AddTagResolver.lambda$get$0(AddTagResolver.java:36)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
	... 1 common frames omitted
@early-lamp-41924 - Getting the above error when trying to add tags
e
Is this from the UI? Or are you using the API?
h
From UI
Trying to associate existing tags with the datasets. Able to add new tags.
I don't see these tags in the database though.
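(One way to double-check whether the tag entity actually exists in GMS - a sketch against the entities endpoint; the host/port are placeholders and the urn must be URL-encoded.)
Copy code
# NOTE: the GMS host/port is an assumption - use your GMS service address
# A missing/empty result means the tag was never created, which matches the error above
curl -s "http://datahub-datahub-gms:8080/entities/urn%3Ali%3Atag%3ADataType%3DPatientInsuranceData"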