Hello Guys,
Is there a way to clean the persistent volumes on the K8S deployment (similar to the 'datahub docker nuke' command)?
gentle-camera-33498
06/20/2022, 1:57 PM
Case:
I deleted some entities like corpGroup and domains directly from the MySQL database, but it seems that this data still exists in the persistent volumes of Elasticsearch/Neo4j. It would be great if I could clean all volumes and restore all indices.
Question: What is the best way to do this cleanup task in K8s? For newer versions, isn't it better to clean everything and force the index restore job?
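(For reference, the closest K8s equivalent of 'datahub docker nuke' that I know of is to scale the stateful stores down, delete their PVCs, and let fresh volumes be provisioned. A rough sketch, assuming the prerequisites chart runs Elasticsearch/Neo4j as StatefulSets in a 'datahub' namespace; the StatefulSet and PVC names below are only examples, check 'kubectl get pvc' for yours.)

    # list the PVCs to identify the Elasticsearch / Neo4j ones (names depend on your charts and release)
    kubectl get pvc -n datahub
    # scale the stateful stores down before touching their volumes
    kubectl scale statefulset elasticsearch-master --replicas=0 -n datahub
    kubectl scale statefulset neo4j --replicas=0 -n datahub    # name is an example, check yours
    # delete the backing PVCs (example names; use the ones listed by 'kubectl get pvc')
    kubectl delete pvc elasticsearch-master-elasticsearch-master-0 -n datahub
    kubectl delete pvc data-neo4j-0 -n datahub
    # scale back up; empty volumes are provisioned and the indices can then be rebuilt from MySQL
    kubectl scale statefulset elasticsearch-master --replicas=1 -n datahub
    kubectl scale statefulset neo4j --replicas=1 -n datahub

(After the stores are back up you would still need to re-run the setup and restore-indices jobs so mappings and documents are recreated from MySQL; see the reindex command later in the thread.)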
late-zoo-31017
06/20/2022, 9:52 PM
I am interested in this, too!!
better-orange-49102
06/22/2022, 3:18 AM
I would just run the reindexing job. Take note that the MySQL DB is not entirely the single source of truth for DataHub; there is some information in ES that cannot be found in MySQL (for instance, usage events, data profiles, etc.). Reindexing will make sure that the information in MySQL and ES is aligned.
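(On a helm deployment, that reindexing job can be triggered as a one-off Job from the CronJob template the chart ships. A minimal sketch, assuming the chart was installed with release name 'datahub', which is what prefixes the template name, and in a 'datahub' namespace; adjust both to your setup.)

    # create an ad-hoc Job from the restore-indices CronJob template and follow its logs
    kubectl create job --from=cronjob/datahub-datahub-restore-indices-job-template datahub-restore-indices-adhoc -n datahub
    kubectl logs -f job/datahub-restore-indices-adhoc -n datahub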
gentle-camera-33498
06/22/2022, 12:49 PM
In my case, when I deleted the groups and glossary terms directly in MySQL, they continued to exist even after executing the reindex job.
I don't have deep knowledge of Elasticsearch, but it seems to me that the ideal way to maintain consistency between MySQL and the indexes would be to roll the indices over and rewrite all the records, so that documents that were already deleted in MySQL don't linger.
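(One hedged way to get that "rewrite everything" behavior, given that the restore job appears to only upsert and not remove stale documents: drop the affected search indices and then re-run the setup and restore-indices jobs so they are rebuilt purely from what remains in MySQL. The service and index names below are illustrative assumptions; verify them against '_cat/indices' before deleting anything.)

    # port-forward to Elasticsearch (service name is an example; check 'kubectl get svc')
    kubectl port-forward svc/elasticsearch-master 9200:9200 -n datahub &
    # see which DataHub indices exist
    curl -s http://localhost:9200/_cat/indices
    # drop the index holding the stale documents (index name is illustrative)
    curl -X DELETE http://localhost:9200/corpgroupindex_v2
    # then recreate the mappings (e.g. by re-running the elasticsearch-setup job via 'helm upgrade')
    # and run the restore-indices job again as shown above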