Hi, is there an easy way to “reset” my datahub ins...
# ingestion
b
Hi, is there an easy way to “reset” my DataHub instance? By reset I mean remove all the ingested metadata. I’ve (probably foolishly) deleted all the indices in ES and truncated the tables in MySQL, and now I get the following error when I try to browse datasets/dashboards/charts/pipelines. I’m running v0.8.6.
06:09:48 [pool-5-thread-1] ERROR c.l.common.callback.CallbackAdapter - Failed to convert callback error, original exception follows:
com.linkedin.r2.message.rest.RestException: Received error 500 from server for URI <http://data-platform-datahub-gms:8080/entities>
	at com.linkedin.r2.transport.http.common.HttpBridge$1.onResponse(HttpBridge.java:76)
	at com.linkedin.r2.transport.http.client.rest.ExecutionCallback.lambda$onResponse$0(ExecutionCallback.java:64)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
06:09:48 [Thread-1814] ERROR c.l.d.g.r.browse.BrowseResolver - Failed to execute browse: entity type: DATASET, path: [], filters: null, start: 0, count: 10 com.linkedin.r2.RemoteInvocationException cannot be cast to com.linkedin.restli.client.RestLiResponseException
06:09:48 [Thread-1814] WARN  n.g.e.SimpleDataFetcherExceptionHandler - Exception while fetching data (/browse) : java.lang.RuntimeException: Failed to execute browse: entity type: DATASET, path: [], filters: null, start: 0, count: 10
java.util.concurrent.CompletionException: java.lang.RuntimeException: Failed to execute browse: entity type: DATASET, path: [], filters: null, start: 0, count: 10
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Failed to execute browse: entity type: DATASET, path: [], filters: null, start: 0, count: 10
	at com.linkedin.datahub.graphql.resolvers.browse.BrowseResolver.lambda$get$1(BrowseResolver.java:68)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
	... 1 common frames omitted
Caused by: java.lang.ClassCastException: com.linkedin.r2.RemoteInvocationException cannot be cast to com.linkedin.restli.client.RestLiResponseException
	at com.linkedin.entity.client.EntityClient.sendClientRequest(EntityClient.java:69)
	at com.linkedin.entity.client.EntityClient.browse(EntityClient.java:188)
	at com.linkedin.datahub.graphql.types.dataset.DatasetType.browse(DatasetType.java:136)
	at com.linkedin.datahub.graphql.resolvers.browse.BrowseResolver.lambda$get$1(BrowseResolver.java:52)
	... 2 common frames omitted
06:09:48 [application-akka.actor.default-dispatcher-4290] ERROR react.controllers.GraphQLController - Errors while executing graphQL query: "query getBrowseResults($input: BrowseInput!) {\n  browse(input: $input) {\n    entities {\n      urn\n      type\n      ... on Dataset {\n        name\n        origin\n        description\n        platform {\n          name\n          info {\n            logoUrl\n            __typename\n          }\n          __typename\n        }\n        tags\n        ownership {\n          ...ownershipFields\n          __typename\n        }\n        globalTags {\n          ...globalTagsFields\n          __typename\n        }\n        __typename\n      }\n      ... on Dashboard {\n        urn\n        type\n        tool\n        dashboardId\n        info {\n          name\n          description\n          externalUrl\n          access\n          lastModified {\n            time\n            __typename\n          }\n          __typename\n        }\n        ownership {\n          ...ownershipFields\n          __typename\n        }\n        globalTags {\n          ...globalTagsFields\n          __typename\n        }\n        __typename\n      }\n      ... on GlossaryTerm {\n        name\n        ownership {\n          ...ownershipFields\n          __typename\n        }\n        glossaryTermInfo {\n          definition\n          termSource\n          sourceRef\n          sourceUrl\n          customProperties {\n            key\n            value\n            __typename\n          }\n          __typename\n        }\n        __typename\n      }\n      ... on Chart {\n        urn\n        type\n        tool\n        chartId\n        info {\n          name\n          description\n          externalUrl\n          type\n          access\n          lastModified {\n            time\n            __typename\n          }\n          __typename\n        }\n        ownership {\n          ...ownershipFields\n          __typename\n        }\n        globalTags {\n          ...globalTagsFields\n          __typename\n        }\n        __typename\n      }\n      ... on DataFlow {\n        urn\n        type\n        orchestrator\n        flowId\n        cluster\n        info {\n          name\n          description\n          project\n          __typename\n        }\n        ownership {\n          ...ownershipFields\n          __typename\n        }\n        globalTags {\n          ...globalTagsFields\n          __typename\n        }\n        __typename\n      }\n      __typename\n    }\n    start\n    count\n    total\n    metadata {\n      path\n      groups {\n        name\n        count\n        __typename\n      }\n      totalNumEntities\n      __typename\n    }\n    __typename\n  }\n}\n\nfragment ownershipFields on Ownership {\n  owners {\n    owner {\n      ... on CorpUser {\n        urn\n        type\n        username\n        info {\n          active\n          displayName\n          title\n          email\n          firstName\n          lastName\n          fullName\n          __typename\n        }\n        editableInfo {\n          pictureLink\n          __typename\n        }\n        __typename\n      }\n      ... 
on CorpGroup {\n        urn\n        type\n        name\n        info {\n          email\n          admins {\n            urn\n            username\n            info {\n              active\n              displayName\n              title\n              email\n              firstName\n              lastName\n              fullName\n              __typename\n            }\n            editableInfo {\n              pictureLink\n              teams\n              skills\n              __typename\n            }\n            __typename\n          }\n          members {\n            urn\n            username\n            info {\n              active\n              displayName\n              title\n              email\n              firstName\n              lastName\n              fullName\n              __typename\n            }\n            editableInfo {\n              pictureLink\n              teams\n              skills\n              __typename\n            }\n            __typename\n          }\n          groups\n          __typename\n        }\n        __typename\n      }\n      __typename\n    }\n    type\n    __typename\n  }\n  lastModified {\n    time\n    __typename\n  }\n  __typename\n}\n\nfragment globalTagsFields on GlobalTags {\n  tags {\n    tag {\n      urn\n      name\n      description\n      __typename\n    }\n    __typename\n  }\n  __typename\n}\n", result: {errors=[{message=Exception while fetching data (/browse) : java.lang.RuntimeException: Failed to execute browse: entity type: DATASET, path: [], filters: null, start: 0, count: 10, locations=[{line=2, column=3}], path=[browse], extensions={classification=DataFetchingException}}], data={browse=null}}, errors: [ExceptionWhileDataFetching{path=[browse], exception=java.util.concurrent.CompletionException: java.lang.RuntimeException: Failed to execute browse: entity type: DATASET, path: [], filters: null, start: 0, count: 10, locations=[SourceLocation{line=2, column=3}]}]
On the frontend I get:
l
Did you run this using quickstart on your laptop or is this in your production environment?
If it is quickstart, you can use ‘datahub docker nuke’.
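For the quickstart path that reset is a single CLI call. A minimal sketch, assuming the DataHub CLI (the acryl-datahub pip package) is installed locally:

# Tears down the quickstart containers and removes their volumes,
# i.e. all ingested metadata goes with them.
datahub docker nuke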
b
I’m running frontend/GMS on k8s, and MySQL and ES on separate servers.
It’s not a big deal to just kill the frontend/GMS instances; I’m just trying to understand how this works and whether that would be enough.
Never mind, I got it working eventually by deleting all the indices in ES, truncating the MySQL tables, and restarting GMS.
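For reference, a rough sketch of that kind of manual reset on a self-managed v0.8.x deployment. The host names, credentials, table name (metadata_aspect_v2) and deployment name below are assumptions; check them against your own MySQL schema, Elasticsearch cluster, and k8s release before running anything:

# 1. Truncate the primary metadata store in MySQL (table name assumed).
mysql -h mysql.example.internal -u datahub -p datahub \
  -e 'TRUNCATE TABLE metadata_aspect_v2;'

# 2. Delete the DataHub search and graph indices in Elasticsearch.
curl -X DELETE 'http://elasticsearch.example.internal:9200/_all'

# 3. Restart GMS (per the messages below, this may not strictly be required).
kubectl rollout restart deployment/data-platform-datahub-gms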
m
Thanks for letting us know @bumpy-activity-74405. Technically, restarting GMS should not be needed.
Also @bumpy-activity-74405: we will think about how to make this easier in the future.
👍 1
r
Hi, I just found this. Did it become easier? Is it "safe" to delete all ES indices and run the restore job? Or does the restore job do the deletion on its own?
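For anyone else finding this later: the restore job referred to here is typically the datahub-upgrade utility’s RestoreIndices step, which rebuilds the Elasticsearch indices from the contents of MySQL. A minimal sketch of running it outside Kubernetes; the image tag and the contents of docker.env (which must point at your MySQL, Elasticsearch, and Kafka) are assumptions:

# Rebuild the search/graph indices from the SQL metadata store.
docker run --env-file docker.env acryldata/datahub-upgrade:head -u RestoreIndices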