# all-things-deployment
a
Hi, is there an official way to back up and restore data from one DataHub instance to another? The approach I'm thinking of is from this page:
• snapshot the `datahub.metadata_aspect_v2` table from the existing instance
• restore the table to the new instance
• run restore indices on the new instance
Would be nice if there were a high-level backup / restore command.
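A minimal sketch of that manual flow, assuming a MySQL-backed instance (host names, credentials, and the image tag are placeholders, not official values):

```shell
# 1. Snapshot the aspect table on the existing instance (MySQL assumed).
mysqldump -h old-datahub-db -u datahub -p \
  --single-transaction datahub metadata_aspect_v2 > aspects.sql

# 2. Restore the table into the new instance's database.
mysql -h new-datahub-db -u datahub -p datahub < aspects.sql

# 3. Rebuild the search/graph indices from the restored table
#    using the datahub-upgrade container.
docker run --env-file docker.env acryldata/datahub-upgrade:head -u RestoreIndices
```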
o
There is, it's just not well documented at this time and has limited support: https://github.com/datahub-project/datahub/blob/master/datahub-upgrade/src/main/java/com/linkedin/datahub/upgrade/restorebackup/RestoreBackup.java It's executed in a similar way to RestoreIndices (id: RestoreBackup instead).
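If I'm reading that right, the invocation mirrors RestoreIndices with the upgrade id swapped (env file and image tag below are placeholders):

```shell
# Same datahub-upgrade container, different upgrade id; any
# upgrade-specific arguments (e.g. which backup reader to use)
# are passed with -a key=value.
docker run --env-file docker.env acryldata/datahub-upgrade:head \
  -u RestoreBackup
```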
a
thanks @orange-night-91387! will check it out.
I see that the env file for the upgrade container says it requires access to Kafka, do you know if we also have to pass in overrides for topic names, in case we are using non-default values?
o
Are you deploying with docker-compose or k8s? If you're using the Helm charts, it should pull the topic names from the same place. If you're deploying the containers manually and setting their config individually, you'll need to override the same values for the upgrade container.
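For reference, these are the Kafka variables I'd expect to have to mirror into the upgrade container's env file. The exact names are an assumption from the default docker env files and should be verified against the DataHub version you run:

```shell
# Kafka connection + topic-name overrides for the datahub-upgrade container.
# Values shown are illustrative non-default topic names.
KAFKA_BOOTSTRAP_SERVER=broker:29092
METADATA_CHANGE_LOG_VERSIONED_TOPIC_NAME=MyMetadataChangeLog_Versioned_v1
METADATA_CHANGE_LOG_TIMESERIES_TOPIC_NAME=MyMetadataChangeLog_Timeseries_v1
METADATA_CHANGE_PROPOSAL_TOPIC_NAME=MyMetadataChangeProposal_v1
FAILED_METADATA_CHANGE_PROPOSAL_TOPIC_NAME=MyFailedMetadataChangeProposal_v1
```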
a
yeah, unfortunately i can’t use k8s, i’ve ported all this over to ECS via CloudFormation.
thanks for the heads up, i’ll feed those env vars for topic names in as well
So for this Parquet backup reader, i’d need to produce a parquet file of all the aspects?
o
Yeah, it's intended to target RDS backups, which can export Parquet for storage in S3.
r
I am also looking to just export all metadata (including users/domains) and re-import on a nuked datahub.