# troubleshoot
r
I tried deleting datasets in Datahub using
datahub delete --env DEV
but I got this error
HTTPError: 404 Client Error: Not Found for url: https://<datahub-endpoint>//entities?action=search
Could this be because I have SSO turned on?
m
@red-pizza-28006: I wonder if the multiple `/` is throwing us off. Can you set the datahub host to something without a trailing slash?
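A minimal sketch of how a client could join a host and a path so that a trailing slash on the host does not produce the `//` seen in the error URL. `join_url` and the example host are hypothetical, not part of the DataHub CLI.

```python
def join_url(host: str, path: str) -> str:
    """Join a base host and a path with exactly one slash between them."""
    # Drop trailing slashes from the host and leading slashes from the path,
    # then put a single slash back in between.
    return host.rstrip("/") + "/" + path.lstrip("/")


# Hypothetical host with a trailing slash, as in the reported setup.
print(join_url("https://datahub.example.com/", "/entities?action=search"))
```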
r
I tried that already, and still encountered the same error
I manually curled the URL and do get a response though
m
you’re running the latest datahub?
r
0.8.16.4
m
what about the server?
r
the server would be at `tag: "v0.8.16"`
m
ok..
one sec
try this with
curl -X POST "http://<datahub_endpoint>/entities?action=search" -d @search_dev.json
are you hitting the frontend API or the metadata-service API?
I believe the CLI assumes you are connecting to metadata-service (which typically runs on :8080)
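The curl check above can be roughly reproduced in Python. This is only a sketch: the host `http://localhost:8080` and the payload fields are assumptions, since the contents of `search_dev.json` are not shown in the thread, and `build_search_request` is a hypothetical helper. It builds the request without sending it.

```python
import json
import urllib.request


def build_search_request(gms_host: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to the search action used in the curl command."""
    # Strip any trailing slash so the URL does not end up with "//entities".
    url = gms_host.rstrip("/") + "/entities?action=search"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed payload shape; replace with the real contents of search_dev.json.
payload = {"input": "*", "entity": "dataset", "start": 0, "count": 10}
req = build_search_request("http://localhost:8080", payload)
print(req.get_method(), req.full_url)
# To actually send it against a reachable metadata-service:
# urllib.request.urlopen(req)
```

A 404 from this request against one host and a normal response from another would suggest the first host is not the metadata-service.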
r
I was hitting the frontend that also runs on port 8080
should I be using the GMS endpoint?
m
yeah use the gms endpoint
I think we recently added a proxy to gms at `/api/gms` but I don’t think it is in a release yet
r
sweet, when I configured `datahub init`, I gave the datahub host as the frontend, but when I give the GMS endpoint and try this, it works fine
m
cool! we’ll make sure to add some self-checks to the cli
btw was anything working before when you pointed it to the frontend?
or did you set this up just to test deletes
r
no, it wasn’t working. I used the CLI to do regular ingestion, but in that case the sink is already provided in the config file, so it probably didn’t matter
One thing that I want to point out here is that even after running delete, I still see some entities in the UI
For example, after running this command:
datahub delete --env DEV
m
yeah, there are a few stray server-side bugs we are aware of whose fixes haven’t made it into the latest release (but they are fixed on HEAD)
r
I see, cool, glad that they are being worked on. 🙂
m
can you refresh the page
stray should not mean 658 datasets are still sitting around 😄
r
😄 unfortunately they are. See this, for example: the previous pagination still exists even though there are no datasets
m
what happens when you go back to the home page and browse down
r
that’s what I did
m
hmm
are you familiar with getting into elastic and looking at indices?
r
yep
m
can you check if you find any of those entity ids still in the index (`datasetindex_v2_<timestamp>`)?
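One way to run that check is a term query against the index's `_search` endpoint. The sketch below only builds the request path and body. The field name `urn`, the example URN, and the helper `build_urn_lookup` are assumptions for illustration, and the index name is left as the placeholder from the message above.

```python
import json
from typing import Tuple


def build_urn_lookup(index: str, urn: str) -> Tuple[str, dict]:
    """Return the _search path and a term-query body for a single entity URN."""
    path = f"/{index}/_search"
    # Assumes the index stores the entity id in a keyword field named "urn".
    body = {"query": {"term": {"urn": urn}}, "size": 1}
    return path, body


# Placeholder index name from the thread and a hypothetical dataset URN.
path, body = build_urn_lookup(
    "datasetindex_v2_<timestamp>",
    "urn:li:dataset:(urn:li:dataPlatform:hive,example_db.example_table,DEV)",
)
print(path)
print(json.dumps(body, indent=2))
```

If the response's hit count is non-zero for a dataset that was supposedly deleted, the document is still in the search index.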
b
Somehow these browse paths are not being removed from the index. cc @early-lamp-41924 any ideas come to mind?
e
Just to clarify: browse paths are not being removed, but everything else is?
@green-football-43791 do we set the same runId to the default aspects we create?
though for deleting with @mammoth-bear-12532’s new --env flag, it should always delete the whole entity, not aspects, right?
r
yep, they do. @mammoth-bear-12532