# troubleshoot
famous-florist-7218:
Hi folks, has anyone run into issues with front-end port forwarding? It frequently drops the connection.
```
E0803 10:23:11.850679   89160 portforward.go:406] an error occurred forwarding 9002 -> 9002: error forwarding port 9002 to pod 39e9085d08f2eee680eec4bb5665835613d931532e5ca0688443fa60a75a9d7f, uid : failed to execute portforward in network namespace "/var/run/netns/cni-9494401e-109c-0e8f-c535-11ae11edfdce": read tcp4 127.0.0.1:56872->127.0.0.1:9002: read: connection reset by peer
E0803 10:23:11.852843   89160 portforward.go:234] lost connection to pod
```
better-orange-49102:
isn't this specific to whichever infra you're using to host the containers?
famous-florist-7218:
FYI, I deploy DataHub on EKS.
Steps to reproduce: after deploying, expose the frontend with kubectl port-forward, open the DataHub UI, and wait a few minutes… then the connection to the pod is lost.
`datahub-datahub-frontend` log:
```
Forwarding from 127.0.0.1:9002 -> 9002
Forwarding from [::1]:9002 -> 9002
Handling connection for 9002
Handling connection for 9002
Handling connection for 9002
Handling connection for 9002
Handling connection for 9002
Handling connection for 9002
Handling connection for 9002
E0803 11:25:30.463145    3717 portforward.go:406] an error occurred forwarding 9002 -> 9002: error forwarding port 9002 to pod e564df2c676121af49a8943d3d42e3fce5a91d19a73cf7f21029f6fbf94fce4f, uid : failed to execute portforward in network namespace "/var/run/netns/cni-9bca40bb-8d26-4026-3513-f86b5fc9cf54": read tcp4 127.0.0.1:56938->127.0.0.1:9002: read: connection reset by peer
E0803 11:25:30.463161    3717 portforward.go:406] an error occurred forwarding 9002 -> 9002: error forwarding port 9002 to pod e564df2c676121af49a8943d3d42e3fce5a91d19a73cf7f21029f6fbf94fce4f, uid : failed to execute portforward in network namespace "/var/run/netns/cni-9bca40bb-8d26-4026-3513-f86b5fc9cf54": read tcp4 127.0.0.1:56940->127.0.0.1:9002: read: connection reset by peer
E0803 11:25:30.463219    3717 portforward.go:406] an error occurred forwarding 9002 -> 9002: error forwarding port 9002 to pod e564df2c676121af49a8943d3d42e3fce5a91d19a73cf7f21029f6fbf94fce4f, uid : failed to execute portforward in network namespace "/var/run/netns/cni-9bca40bb-8d26-4026-3513-f86b5fc9cf54": read tcp4 127.0.0.1:56936->127.0.0.1:9002: read: connection reset by peer
E0803 11:25:30.466160    3717 portforward.go:234] lost connection to pod
```
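For context, the forwarding above corresponds to a plain kubectl port-forward against the frontend service; a minimal sketch (release name and namespace are assumptions, adjust to your deployment):
```
# assumed invocation; service name and namespace may differ in your release
kubectl -n datahub port-forward svc/datahub-datahub-frontend 9002:9002
```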
GMS log:
```
04:24:13.255 [Thread-269] WARN  org.elasticsearch.client.RestClient:65 - request [HEAD <http://datahub-elastic:9200/datahub_usage_event?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=false>] returned 1 warnings: [299 Elasticsearch-7.17.3-5ad023604c8d7416c9eb6c0eadb62b14e766caff "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
04:24:13.257 [Thread-269] WARN  org.elasticsearch.client.RestClient:65 - request [HEAD <http://datahub-elastic:9200/datahub_usage_event?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=false>] returned 1 warnings: [299 Elasticsearch-7.17.3-5ad023604c8d7416c9eb6c0eadb62b14e766caff "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
04:24:13.259 [Thread-269] WARN  org.elasticsearch.client.RestClient:65 - request [HEAD <http://datahub-elastic:9200/datahub_usage_event?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=false>] returned 1 warnings: [299 Elasticsearch-7.17.3-5ad023604c8d7416c9eb6c0eadb62b14e766caff "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
04:24:13.261 [Thread-270] WARN  org.elasticsearch.client.RestClient:65 - request [HEAD <http://datahub-elastic:9200/datahub_usage_event?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=false>] returned 1 warnings: [299 Elasticsearch-7.17.3-5ad023604c8d7416c9eb6c0eadb62b14e766caff "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
04:24:13.265 [Thread-270] WARN  org.elasticsearch.client.RestClient:65 - request [POST <http://datahub-elastic:9200/datahub_usage_event/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true>] returned 1 warnings: [299 Elasticsearch-7.17.3-5ad023604c8d7416c9eb6c0eadb62b14e766caff "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
04:24:55.132 [pool-4-thread-1] WARN  org.elasticsearch.client.RestClient:65 - request [POST <http://datahub-elastic:9200/datahubpolicyindex_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true>] returned 1 warnings: [299 Elasticsearch-7.17.3-5ad023604c8d7416c9eb6c0eadb62b14e766caff "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
```
incalculable-ocean-74010:
Hello @famous-florist-7218. Is the frontend pod crashing frequently, performing heavy operations, or blocked by downstream requests (i.e. a request to GMS that takes a long time to process)? If so, that might be why. It could also very well be that your EKS deployment is unstable, which is possible if you are using spot VMs. When port-forwarding, Kubernetes expects the underlying pods it connects to to keep connectivity alive using heartbeats. If the frontend pod is blocked for some reason and does not ping the process running the port-forward, the connection will be reset.
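A quick way to check the "is the pod crashing or getting rescheduled" part of this (the label selector below is an assumption; adjust to your chart's labels):
```
# watch for restarts or rescheduling of the frontend pod
kubectl get pods -l app.kubernetes.io/name=datahub-frontend -w
# inspect restart counts and recent events for a specific pod
kubectl describe pod <frontend-pod-name>
```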
famous-florist-7218:
@incalculable-ocean-74010 You’re right. Port-forwarding on Kubernetes is meant for debugging; it won’t stay alive for hours without heartbeats. In my case, I set up an ingress configuration to access the frontend service directly.
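A minimal sketch of such an ingress, assuming the AWS Load Balancer Controller and the service/port names from this thread (the annotations and scheme are assumptions, not the exact config used here):
```
cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: datahub-frontend
  annotations:
    alb.ingress.kubernetes.io/scheme: internal   # assumption: internal ALB inside the VPC
    alb.ingress.kubernetes.io/target-type: ip    # route to pod IPs, works with ClusterIP services
spec:
  ingressClassName: alb
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: datahub-datahub-frontend
                port:
                  number: 9002
EOF
```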
incalculable-ocean-74010:
What type of debugging are you doing? If it’s some sort of development, I would suggest doing it locally.
famous-florist-7218:
I’ve checked the pod’s statistics and service logs, and found that the service type of `datahub-datahub-frontend` was `LoadBalancer` (the default value). Because I use an ingress config from our devops-infra team that already has an ALB within our VPC, I had to change the service type to `ClusterIP`.
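For reference, with the DataHub helm chart this override can usually be set at upgrade time; the exact key path depends on your chart version, so treat this as a sketch:
```
# assumed key path; verify against your chart's values.yaml
helm upgrade datahub datahub/datahub \
  -f values.yaml \
  --set datahub-frontend.service.type=ClusterIP
```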
incalculable-ocean-74010:
So it’s fixed then?
famous-florist-7218:
Yup. It works now.
@incalculable-ocean-74010 Do you know why the DataHub UI doesn’t load anything? I’ve set up a BigQuery integration and its ingestion job runs successfully, but the UI shows nothing. I’ve checked the metadata store and found that the BigQuery metadata is loaded.
better-orange-49102:
You could query ES as well and see if the dataset index has indexed the data
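For example, a couple of quick checks against the dataset search index (datasetindex_v2, as it appears later in this thread), assuming ES is reachable on localhost:9200 via port-forward:
```
# document count in the dataset search index
curl -s 'http://localhost:9200/datasetindex_v2/_count?pretty'
# sample a few indexed documents
curl -s 'http://localhost:9200/datasetindex_v2/_search?size=3&pretty'
```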
famous-florist-7218:
Thanks @better-orange-49102 Please find the log below.
```
❯ curl -X GET 'http://localhost:9200/_cat/indices?v'
health status index                                                    uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   datajobindex_v2                                          CTKtCBXiQri0zJ4RyPz5MA   1   1          0            0       226b           226b
yellow open   dataset_datasetprofileaspect_v1                          CHCe3MrAS1iLmDR8T5ba7g   1   1          0            0       226b           226b
yellow open   datahubsecretindex_v2                                    ajqMSDukR9y7nS3yC1uoZw   1   1          0            0       226b           226b
yellow open   mlmodelindex_v2                                          qgsvDBIBS2Sapa9XogCQ7Q   1   1          0            0       226b           226b
yellow open   dataflowindex_v2                                         _ART1_MLRyOMZzZBCEuN2w   1   1          0            0       226b           226b
yellow open   mlmodelgroupindex_v2                                     mRUt9jfCSX6EDL3k2NVbuQ   1   1          0            0       226b           226b
yellow open   assertionindex_v2                                        HpcWN0NUQNqkGL43qITaHg   1   1          0            0       226b           226b
yellow open   datahubpolicyindex_v2                                    oLSKHafuREWt5F1fHjngoQ   1   1          5            0     10.8kb         10.8kb
yellow open   corpuserindex_v2                                         Fs6M5PP_T5KatRjc85B0mQ   1   1          0            0       226b           226b
yellow open   dataprocessindex_v2                                      ZIapXodMT2u5zqgDbppi2A   1   1          0            0       226b           226b
yellow open   chartindex_v2                                            owDAa5-jQYmwaayTUU6rFA   1   1          0            0       226b           226b
green  open   .geoip_databases                                         Ak23V0siScOyHyvgfNlaag   1   0         39            0     36.9mb         36.9mb
yellow open   tagindex_v2                                              PoVLS4iwT2qlVgEDHkNocQ   1   1          0            0       226b           226b
yellow open   mlmodeldeploymentindex_v2                                wfUIqOmUQhmrZbDX4TLOkA   1   1          0            0       226b           226b
yellow open   datahubexecutionrequestindex_v2_1659499608176            37-GXTaqRY6TxnVdxao8rw   1   1          2            0     25.5kb         25.5kb
yellow open   datajob_datahubingestioncheckpointaspect_v1              F-AHPM6YS7Co4cvFj_jbBA   1   1          0            0       226b           226b
yellow open   dataplatforminstanceindex_v2                             CuWdhxOzQPuM13F_eTN18Q   1   1          0            0       226b           226b
yellow open   dashboardindex_v2                                        lOtjY4QxTXmXmK5zJANBiQ   1   1          0            0       226b           226b
yellow open   assertion_assertionruneventaspect_v1                     eISPYtlxTta3XQFSlhXNZg   1   1          0            0       226b           226b
yellow open   datasetindex_v2                                          W0vLWGoKR5yxCOZa_VGxVA   1   1          0            0       226b           226b
yellow open   telemetryindex_v2                                        wMZWwzzbSPKsyybqIXh6zA   1   1          0            0       226b           226b
yellow open   mlfeatureindex_v2                                        xgBueZI4QZmjPN84nFVH1g   1   1          0            0       226b           226b
yellow open   dashboard_dashboardusagestatisticsaspect_v1              GhqSfGj-QCy7G6EnEcKApw   1   1          0            0       226b           226b
yellow open   datajob_datahubingestionrunsummaryaspect_v1              0w75SGLKS4S9LDQu-5CJhg   1   1          0            0       226b           226b
yellow open   dataplatformindex_v2                                     DwNhZVMVRACsQN1wCZCyyQ   1   1          0            0       226b           226b
yellow open   datahub_usage_event                                      qoGryfmxRZyGVbqqD3F2hA   1   1         12            0     50.4kb         50.4kb
yellow open   dataprocessinstanceindex_v2                              lSxxVmwvTOG0bcRh3xFD2Q   1   1          0            0       226b           226b
yellow open   glossarynodeindex_v2                                     -5BhKJ3aS9qOIlrOQmnQ9A   1   1          0            0       226b           226b
yellow open   datahubingestionsourceindex_v2                           1sQFE-lmTWW5I0pGsYhf4A   1   1          1            0      5.6kb          5.6kb
yellow open   system_metadata_service_v1_1659499616783                 Eini-RkDTiCLvRDjpGxE9w   1   1          7            1     21.8kb         21.8kb
yellow open   invitetokenindex_v2                                      A3nGt-i7SsOUXtobCw33KA   1   1          0            0       226b           226b
yellow open   datahubretentionindex_v2                                 c9eere35TEKwywhq_SCTOA   1   1          0            0       226b           226b
yellow open   graph_service_v1                                         n8sZTBA1SmiaUqmS7Ckrww   1   1          1            0      5.9kb          5.9kb
yellow open   dataprocessinstance_dataprocessinstanceruneventaspect_v1 Q2-r6lkRRLOW5w8uQpG1WQ   1   1          0            0       226b           226b
yellow open   dataset_operationaspect_v1                               AGbE6zsUQxidpbROqHdgcA   1   1          0            0       226b           226b
yellow open   datahubaccesstokenindex_v2                               1PEEZ_C8R-e2uoYT0uBqxQ   1   1          0            0       226b           226b
yellow open   containerindex_v2                                        dau_xUgpTGiDoYE8Qsovqw   1   1          0            0       226b           226b
green  open   .tasks                                                   R-JSxdgFQlCsqLicd5bjhg   1   0          2            0     13.8kb         13.8kb
yellow open   schemafieldindex_v2                                      7puKiGsgRSas00QwK42EXg   1   1          0            0       226b           226b
yellow open   domainindex_v2                                           rDAFvjgkSFSuaj9UamcYug   1   1          0            0       226b           226b
yellow open   testindex_v2                                             F8tDjqM1T6uvsU7BN4KW_w   1   1          0            0       226b           226b
yellow open   mlfeaturetableindex_v2                                   Gsy9nYIHS8-C5yEFfDyvnQ   1   1          0            0       226b           226b
yellow open   notebookindex_v2                                         iuMC5NTIQVuqr8dPuEo3sQ   1   1          0            0       226b           226b
yellow open   datahubupgradeindex_v2                                   qRpWWS9pQiC8PdHpYfLPXA   1   1          0            0       226b           226b
yellow open   glossarytermindex_v2                                     8ImqL4QbSdiUa8weR9GGGA   1   1          0            0       226b           226b
yellow open   mlprimarykeyindex_v2                                     JMciAnulToi2YlYSESRErQ   1   1          0            0       226b           226b
yellow open   corpgroupindex_v2                                        eniE4JsRQSisF8l0H-113w   1   1          0            0       226b           226b
yellow open   dataset_datasetusagestatisticsaspect_v1                  61fZT0rsTcSsKq7k9GxNNA   1   1          0            0       226b           226b
```
It seems like `metadata_aspect_v2` has not been indexed in ES.
better-orange-49102:
Yup, datasetindex_v2 is empty and needs to be populated. Do the GMS pod logs show anything interesting?
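Tailing GMS is usually something along these lines (the deployment name follows the release naming seen in this thread, so treat it as an assumption):
```
# follow the GMS logs while reloading the UI
kubectl logs -f deploy/datahub-datahub-gms
```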
incalculable-ocean-74010:
You should be able to run the restore-indices job from the datahub upgrade helm chart to force re-indexing in ES.
famous-florist-7218:
@better-orange-49102 it’s so weird 😕
```
11:39:56.432 [pool-11-thread-1] ERROR c.l.d.g.a.service.AnalyticsService:264 - Search query failed: Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]
11:39:56.436 [pool-11-thread-1] ERROR o.s.s.s.TaskUtils$LoggingErrorHandler:95 - Unexpected error occurred in scheduled task
java.lang.RuntimeException: Search query failed:
	at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.executeAndExtract(AnalyticsService.java:265)
	at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.getHighlights(AnalyticsService.java:236)
	at com.linkedin.gms.factory.telemetry.DailyReport.dailyReport(DailyReport.java:76)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
	at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]
	at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:187)
	at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1892)
	at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1869)
	at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1626)
	at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1583)
	at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1553)
	at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:1069)
	at com.linkedin.datahub.graphql.analytics.service.AnalyticsService.executeAndExtract(AnalyticsService.java:260)
	... 15 common frames omitted
	Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [<http://datahub-elastic:9200>], URI [/datahub_usage_event/_search?typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 400 Bad Request]
Warnings: [[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices.]
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"datahub_usage_event","node":"hjWpwRt7Tg-iDnfSq_SaCA","reason":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}}],"caused_by":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory.","caused_by":{"type":"illegal_argument_exception","reason":"Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory."}}},"status":400}
		at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:302)
		at org.elasticsearch.client.RestClient.performRequest(RestClient.java:272)
		at org.elasticsearch.client.RestClient.performRequest(RestClient.java:246)
		at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1613)
		... 19 common frames omitted
Caused by: org.elasticsearch.ElasticsearchException: Elasticsearch exception [type=illegal_argument_exception, reason=Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory.]
	at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:496)
	at org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:407)
	at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:437)
	at org.elasticsearch.ElasticsearchException.failureFromXContent(ElasticsearchException.java:603)
	at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:179)
	... 22 common frames omitted
Caused by: org.elasticsearch.ElasticsearchException: Elasticsearch exception [type=illegal_argument_exception, reason=Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [browserId] in order to load field data by uninverting the inverted index. Note that this can use significant memory.]
	at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:496)
	at org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:407)
	at org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:437)
	... 26 common frames omitted
```
Thanks @incalculable-ocean-74010, I’m trying 😊
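One way to narrow down the fielddata error above (a suggestion, not something from this thread) is to check how browserId is mapped in the usage-event index; with the index template applied correctly it should be a keyword field:
```
# inspect the mapping of the usage-event index and look at the browserId field type
curl -s 'http://localhost:9200/datahub_usage_event/_mapping?pretty'
```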
better-orange-49102:
Haven't seen logs like this before where ES complains about incorrect queries from GMS, @incalculable-ocean-74010
famous-florist-7218:
The restore indices job did the trick.
```
# kubectl create job --from=cronjob/<<release-name>>-datahub-restore-indices-job-template datahub-restore-indices-job
```
Thank you so much, @better-orange-49102 and @incalculable-ocean-74010! 😊
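For anyone following along, the progress of that restore job can be watched with kubectl logs (job name as created above):
```
kubectl logs -f job/datahub-restore-indices-job
```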