# getting-started
f
Hi, I have been reading the code for the past few days, but I am still confused about how DataHub handles a table rename (e.g. a Hive table). Can anybody help me?
a
Table rename = delete old table + create new table
f
Hi, Nagarjuna. As I understand it, deleting an old table sets the "status" aspect to removed when the entities in Neo4j are updated, but I can't find the logic that binds the existing relationships to the new table (dataset).
a
You are correct. GMA follows the soft-delete model: when a dataset is removed, its status aspect is changed to "removed", so the old dataset's metadata is not wiped from the database. Renaming a dataset means marking the old entity as removed and creating a new dataset. If the old dataset's metadata can be carried forward to the new dataset, the appropriate MCEs need to be re-emitted.
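To make the soft-delete behavior concrete, here is a minimal sketch, assuming a toy in-memory store. `MetadataStore` and its methods are hypothetical names for illustration and are not part of DataHub or GMA; only the idea of a status aspect with a `removed` flag comes from the discussion above.

```python
from typing import Any, Dict


class MetadataStore:
    """Hypothetical in-memory store (not DataHub/GMA code)."""

    def __init__(self) -> None:
        # Maps entity URN -> {aspect name -> aspect value}.
        self.aspects: Dict[str, Dict[str, Any]] = {}

    def upsert_aspect(self, urn: str, aspect_name: str, value: Any) -> None:
        self.aspects.setdefault(urn, {})[aspect_name] = value

    def soft_delete(self, urn: str) -> None:
        # Soft delete only flips the status aspect; other aspects stay in place.
        self.upsert_aspect(urn, "status", {"removed": True})

    def is_removed(self, urn: str) -> bool:
        return self.aspects.get(urn, {}).get("status", {}).get("removed", False)


store = MetadataStore()
old_urn = "urn:li:dataset:(urn:li:dataPlatform:hive,db.old_table,PROD)"
store.upsert_aspect(old_urn, "ownership", {"owners": ["urn:li:corpuser:alice"]})
store.soft_delete(old_urn)
```

After `soft_delete`, the ownership aspect of the old dataset is still stored, which is why it could in principle be re-emitted against a new URN.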
b
This is interesting. So this means that all related aspects will also need to be recreated. If they were created via push from different systems this may not be trivial. Maybe we need some mechanism to create a full copy of the entity + aspects in such cases...
Does this suggest that "name" should not be embedded in the Dataset entity URN?
Instead, some unique UUID?
a
Another way to look at the renaming problem: create a rename API that (1) deletes the old entity and (2) creates the new one, copying all aspects from the old.
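The two steps of such a rename API could be sketched roughly as follows. This is only an illustration under stated assumptions: `rename_entity` is a hypothetical helper operating on a plain dictionary, not an existing DataHub endpoint, and "delete" is taken to mean the soft delete described earlier in the thread.

```python
import copy
from typing import Any, Dict

# Hypothetical store layout: entity URN -> {aspect name -> aspect value}.
Store = Dict[str, Dict[str, Any]]


def rename_entity(store: Store, old_urn: str, new_urn: str) -> None:
    """Soft-delete old_urn and create new_urn with a copy of its aspects."""
    if old_urn not in store:
        raise KeyError(f"unknown entity: {old_urn}")
    if new_urn in store:
        raise ValueError(f"target already exists: {new_urn}")

    # Step 1: delete old (soft delete, so its aspects remain readable).
    store[old_urn]["status"] = {"removed": True}

    # Step 2: create new, copying every aspect except the status flag.
    store[new_urn] = {
        name: copy.deepcopy(value)
        for name, value in store[old_urn].items()
        if name != "status"
    }
    store[new_urn]["status"] = {"removed": False}


store: Store = {
    "urn:li:dataset:(urn:li:dataPlatform:hive,db.old_table,PROD)": {
        "ownership": {"owners": ["urn:li:corpuser:alice"]},
        "schema": {"fields": ["id", "name"]},
    }
}
rename_entity(
    store,
    "urn:li:dataset:(urn:li:dataPlatform:hive,db.old_table,PROD)",
    "urn:li:dataset:(urn:li:dataPlatform:hive,db.new_table,PROD)",
)
```

The sketch copies aspects only. Relationships that point at the old URN from other entities would still need to be rewritten separately.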
b
Hm. You cannot do this via MCE though
This would be an API call
a
Yup, if we were to do this via MCE, then the dataset URN would need to be opaque, like a UUID.
f
Thanks all. The UUID approach is what Apache Atlas does. If I understand the logic correctly, the Atlas hook client sends at least two messages to the server instead of calling the API directly. The first one contains both the old and the new table info, so that the server backend can look up the GUID (UUID) of the old table (dataset) and assign it to the new table entity (dataset). (PS: I kicked off this topic because I built my own metadata system based on the main DataHub source code, and I wasn't sure whether there is a way to resolve this issue within DataHub.)
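The GUID-based flow described above might look something like this in outline. It is a sketch under assumptions, not Atlas's actual hook or server code: `GuidRegistry`, `handle_rename`, and the `old_name`/`new_name` message fields are all hypothetical names chosen for illustration.

```python
import uuid
from typing import Any, Dict


class GuidRegistry:
    """Hypothetical server-side registry keyed by opaque GUIDs."""

    def __init__(self) -> None:
        self.guid_by_name: Dict[str, str] = {}
        self.aspects_by_guid: Dict[str, Dict[str, Any]] = {}

    def create(self, qualified_name: str) -> str:
        # The identity is an opaque GUID; the name is just a lookup key.
        guid = str(uuid.uuid4())
        self.guid_by_name[qualified_name] = guid
        self.aspects_by_guid[guid] = {}
        return guid

    def handle_rename(self, message: Dict[str, str]) -> str:
        # The message carries both names, so the server can find the
        # existing GUID under the old name and rebind it to the new name.
        guid = self.guid_by_name.pop(message["old_name"])
        self.guid_by_name[message["new_name"]] = guid
        return guid


registry = GuidRegistry()
guid = registry.create("db.old_table")
registry.aspects_by_guid[guid]["ownership"] = {"owners": ["alice"]}
renamed_guid = registry.handle_rename(
    {"old_name": "db.old_table", "new_name": "db.new_table"}
)
```

Because aspects and relationships hang off the GUID rather than the name, nothing needs to be copied or re-emitted when the name changes.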
b
It's good to bring this up. I think table renames are something DataHub should account for natively
Without some workaround
a
Yup, this is a general problem when you have names in the identifiers instead of opaque IDs.
It applies to charts, dashboards, data processes, etc. as well.