# ingestion
s
Hello everyone, the ingestion via YAML file adds all the metadata from our data warehouse into DataHub. However, it does not delete the tables that have been dropped from the data warehouse. Is there a way to ingest the new data and also delete the deprecated tables? Thank you so much!
h
Hi @salmon-angle-92685, some of the ingestion sources support deleting stale metadata via stateful ingestion.
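For reference, a minimal sketch of what a recipe with stateful ingestion might look like, assuming a Snowflake source and a `datahub-rest` sink. The pipeline name, account, credentials, and server URL are hypothetical placeholders, and whether stale-entity removal is available depends on the source and the DataHub version, so check the docs for the source you use:

```yaml
# Sketch only: all values below are placeholders, not taken from this thread.
# A stable pipeline_name lets DataHub compare the current run against the previous one.
pipeline_name: snowflake_weekly_ingestion  # hypothetical name
source:
  type: snowflake
  config:
    account_id: my_account                 # hypothetical
    username: "${SNOWFLAKE_USER}"          # read from environment variables
    password: "${SNOWFLAKE_PASSWORD}"
    stateful_ingestion:
      enabled: true
      # Soft-delete entities seen in the previous run but missing in this one
      remove_stale_metadata: true
sink:
  type: datahub-rest
  config:
    server: "http://localhost:8080"        # hypothetical GMS endpoint
```

With a setup like this, tables dropped from the warehouse between two runs of the same pipeline should be marked as removed in DataHub on the next run.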
s
Interesting! Thank you so much. I have another question. We have a lot of developers who keep adding more metadata, such as tags, Glossary Terms, etc. If I run an ingestion each week, for example, will the already tagged tables be reset? Will I lose the metadata that was already added? If the approach you sent me works, I would like to keep ingesting the new tables and dropping the old ones, without touching the tables that have already been annotated. Thank you in advance 🙂 @hundreds-photographer-13496
h
Hi @salmon-angle-92685, are your users adding tags and glossary terms via the UI? If yes, those are stored in separate aspects (EditableTags etc.) from the ones the ingestion populates, so you should not lose them when you re-ingest. Regarding stale entity removal, which ingestion sources are you using? Then we can check whether they support stateful ingestion. Thanks!
s
We are using S3, Redshift and Snowflake @helpful-optician-78938 🙂
h
You should be able to safely re-ingest via these sources, then, without losing the metadata added via the UI.
cc: @bulky-soccer-26729, @big-carpet-38439
b
Hey @salmon-angle-92685! It looks like the sources you mention are CLI-based sources. To execute them from the UI or set up schedules, you'll have to define the source through the UI.