many-guitar-67205
04/11/2022, 8:26 AM
datahub delete output is ambiguous, and the data is not gone:
❯ datahub delete --entity_type dataset --platform kafka --hard
This will permanently delete data from DataHub. Do you want to continue? [y/N]: y
[2022-04-11 10:17:22,059] INFO {datahub.cli.delete_cli:200} - datahub configured with <http://localhost:8080>
[2022-04-11 10:17:22,182] INFO {datahub.cli.delete_cli:212} - Filter matched 22 entities. Sample: ['urn:li:dataset:(urn:li:dataPlatform:kafka,
... (22 urns)
]
This will delete 22 entities. Are you sure? [y/N]: y
100% (22 of 22) |################################################################################################################################################################################################################| Elapsed Time: 0:00:01 Time: 0:00:01
Took 6.673 seconds to hard delete 0 rows for 22 entities
The GMS debug log shows 22 successful delete actions, but the command output reports 0 rows.
The data is not deleted.
What can I do to
a. troubleshoot this further
b. actually delete the data?
many-guitar-67205
04/11/2022, 8:30 AM
datahub ingest rollback
for that run, I see a similar effect:
This rollback deleted 0 entities and rolled back 0 aspects
showing first 100 of 0 aspects reverted by this run
followed by a list of 100 aspects.
early-lamp-41924
04/11/2022, 4:26 PM
many-guitar-67205
04/12/2022, 9:14 AM
Took 7.678 seconds to soft delete -1 rows for 22 entities
Notice the -1.
The main difference now is that they no longer appear in the UI, and in the GraphQL API I can see that status.removed=true.
Of course, now I cannot hard delete them anymore using --platform kafka, since the filter no longer matches the soft-deleted entities.
Hard deleting them one by one seems to work though.
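For anyone landing here with the same problem, the one-by-one workaround could be scripted roughly like this. It is only a sketch: it assumes the datahub CLI is on the PATH, that datahub delete accepts --urn together with --hard, and that --force skips the confirmation prompt in your CLI version (check datahub delete --help first). The URN below is a made-up placeholder, not one of the 22 from the output above, and the helper names are hypothetical.

```python
import shlex
import subprocess
from typing import List


def build_hard_delete_command(urn: str) -> List[str]:
    """Build the argv for hard deleting a single entity by URN.

    Assumption: `--force` suppresses the interactive prompt in this
    CLI version; drop it if your version does not support it.
    """
    return ["datahub", "delete", "--urn", urn, "--hard", "--force"]


def hard_delete_each(urns: List[str], dry_run: bool = True) -> List[List[str]]:
    """Hard delete entities one at a time.

    With dry_run=True the commands are only printed, so the list can be
    reviewed before anything is actually deleted.
    """
    commands = [build_hard_delete_command(urn) for urn in urns]
    for cmd in commands:
        if dry_run:
            # shlex.join quotes the URN, which contains parentheses and commas.
            print(shlex.join(cmd))
        else:
            # check=True stops at the first failing delete.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical placeholder URN; replace with the real URNs to delete.
    example_urns = [
        "urn:li:dataset:(urn:li:dataPlatform:kafka,example-topic,PROD)",
    ]
    hard_delete_each(example_urns, dry_run=True)
```

Running it with dry_run=True first prints the exact commands, which is a cheap way to confirm the URN list before switching to dry_run=False.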