Hi, another newbie question here :joy: Is there a ...
# ingestion
n
Hi, another newbie question here 😂 Is there a way to automatically upsert metadata, detecting only the changed part? I'm trying to ingest BigQuery metadata via datahub-rest. Since several people are using the same project, it is hard to know exactly which part of the dataset was modified and when. What I want is to update only the changed part, even though I don't define anything specific (like a particular table) in the recipe (using Airflow, etc.). Ideally, whenever a change occurs in the data source, I want DataHub to automatically upsert the change. Is there a way I can do this? 🙂
s
Hi @nice-planet-17111, I solved this with periodic ingestion: I have an Airflow DAG that runs the ingestion every day. Each run replaces all the old content with the new, so if nothing changed in the original tables, users won't notice any difference.
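For reference, a minimal sketch of what such a daily full ingestion could look like in Python. The project ID, the server URL, and the function names `build_recipe`, `run_ingestion`, and `daily_ingest` are placeholders made up for illustration, and the `project_id` config key is an assumption that may differ between DataHub versions. This is not the exact DAG described above.

```python
from typing import Any, Dict


def build_recipe(project_id: str, gms_server: str) -> Dict[str, Any]:
    """Build a full-ingestion recipe: BigQuery source, datahub-rest sink."""
    return {
        # Assumed config key; check the BigQuery source docs for your version.
        "source": {"type": "bigquery", "config": {"project_id": project_id}},
        "sink": {"type": "datahub-rest", "config": {"server": gms_server}},
    }


def run_ingestion(recipe: Dict[str, Any]) -> None:
    """Run one ingestion pipeline and fail loudly if it reports errors."""
    # Imported lazily so this module loads even where acryl-datahub is absent.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(recipe)
    pipeline.run()
    pipeline.raise_from_status()


def daily_ingest() -> None:
    """Entry point for a scheduler; both values below are placeholders."""
    run_ingestion(build_recipe("my-gcp-project", "http://localhost:8080"))
```

In Airflow, `daily_ingest` could be passed as the `python_callable` of a `PythonOperator` inside a DAG scheduled with `@daily`.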
n
@stale-jewelry-2440 Hi Vincenzo, thank you for the reply. I think my explanation was insufficient. I'm trying to reduce the time for every ingestion by 1) not ingesting unchanged content and 2) only ingesting changed content. If I understood correctly, your approach will also ingest (replace) the current, unchanged content.
It takes around 10 minutes for me to ingest the whole contents and I don't think this is efficient 😞
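To make the idea concrete, here is a rough sketch of change detection by fingerprinting. It is a generic technique, not a DataHub feature: `fingerprint` and `changed_tables` are hypothetical helpers, and the shape of the per-table metadata dict is assumed.

```python
import hashlib
import json
from typing import Any, Dict, List


def fingerprint(metadata: Dict[str, Any]) -> str:
    """Stable hash of one table's metadata; key order does not matter."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_tables(
    current: Dict[str, Dict[str, Any]],
    previous: Dict[str, str],
) -> List[str]:
    """Names of tables that are new or whose fingerprint differs from last run.

    `current` maps table name to its metadata; `previous` maps table name to
    the fingerprint stored after the previous run.
    """
    return sorted(
        name
        for name, meta in current.items()
        if previous.get(name) != fingerprint(meta)
    )
```

Note that this still has to read every table's metadata to compute the fingerprints, so it would only save the writes to DataHub, not the reads from the source.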
l
You are in luck! We recently added support for stateful ingestion and talked about it in the last townhall (cc @little-megabyte-1074 to ping this thread when we upload the townhall videos). Which sources are you ingesting? cc @helpful-optician-78938
👍 1
🤩 1
We will start with the usage and profiling sources and will roll out support for different sources over time
l
Hello, @nice-planet-17111! I've uploaded the full TownHall video - here's a link specifically to the Stateful Ingestion overview

https://www.youtube.com/watch?v=nQDiKPKnLLQ&t=1100s

n
@loud-island-88694 @little-megabyte-1074 Thank you for sharing the video, and that's great to hear! 🙂 I'm trying to ingest BigQuery & MySQL metadata. Is this currently supported? If it isn't, when will it be supported?
l
Not yet. It involves looking at the audit log, which is easier in BigQuery. We will add them to our backlog and keep you posted. Till then, I'm assuming you are OK doing full ingestion. cc @helpful-optician-78938
n
@loud-island-88694 Thank you for your support! 🙂