On Google Sheets Sources, how do you setup an Incr...
# support
s
On Google Sheets sources, how do you set up an incremental or semi-incremental sync into Google BigQuery? I can’t find the settings to modify this anywhere, and my data is simply piling up every day in BigQuery instead of updating.
m
Hey There! 👋 Your message has been received by the RudderStack team. Our standard customer support hours are 9-6 PM EST, but we will forward this request to your Technical Account Manager, and they will get back to you shortly. Please use the thread for any additional comments.
q
@salmon-plastic-31303 Are you on the free tier plan?
s
I am on the growth plan.
q
What is the email associated with your workspace?
s
Workspace is jackson-steve
q
Can you provide an example of the behavior that you see? Is the same data being added to BQ multiple times even if there is no change?
For Incremental sync, RudderStack will sync only the new or modified data starting from the date specified in the Start Date RudderStack dashboard setting. For Semi-Incremental sync, RudderStack reads all data from the source and filters it to sync only the new or modified data starting from the date specified in the Start Date RudderStack dashboard setting. More information on sync modes can be found in this guide.
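To make the difference between the sync modes described above concrete, here is a minimal Python sketch. It is not RudderStack code: the row layout, the `updated_at` column, and the `fetch_since` helper are all hypothetical and only illustrate where the filtering happens in each mode.

```python
from datetime import date

# Hypothetical source rows; "updated_at" is an assumed column name.
ROWS = [
    {"id": 1, "updated_at": date(2023, 1, 1)},
    {"id": 2, "updated_at": date(2023, 2, 1)},
    {"id": 3, "updated_at": date(2023, 3, 1)},
]


def full_sync(rows):
    """Full sync: every row is sent on every run."""
    return list(rows)


def semi_incremental_sync(rows, start_date):
    """Semi-incremental: read all rows, then keep only those on or after start_date."""
    return [r for r in rows if r["updated_at"] >= start_date]


def fetch_since(start_date):
    """Stand-in for a source API that can filter by date on its own side."""
    return [r for r in ROWS if r["updated_at"] >= start_date]


def incremental_sync(start_date):
    """Incremental: only ask the source for rows on or after start_date."""
    return fetch_since(start_date)


# Appending two full syncs to the same destination doubles the row count,
# which matches the "piling up" symptom described in this thread.
destination = []
destination.extend(full_sync(ROWS))
destination.extend(full_sync(ROWS))
print(len(destination))
```

In this sketch the semi-incremental and incremental modes return the same rows; the difference is only whether the source or the sync process does the filtering.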
s
Yes. I’ve only had a small test sample of 11 and many duplicates of the same users (265 in BQ). I cleared the data earlier today and synced again, and then I got 11. Now when I sync again I have 22, which are all duplicates. I’ve read about those settings you mention, but where do I adjust them so that it is correct? I think it must currently be set to full refresh.
q
I have been looking for the Start Date field mentioned in the doc, but it’s not available for Google Sheets. Let me check if Google Sheets supports this feature. Other extract sources have this option.
s
I’m going to try adding it again in case I got something wrong in the setup. I want to use the form being filled into Google Sheets as an Identify call.
q
All cloud extract sources support incremental sync as long as it is supported by the source API, but Google Sheets is an example where incremental sync is not supported. That’s why we don’t have the start date field.
s
So it’s not possible to update a Google Sheet and have that data fed incrementally into BigQuery?
q
I am checking with the team, will notify you soon
s
OK, there is a started at and start date etc. on setup, but it doesn’t allow me to specify “incremental” or “semi incremental”. I believe it just defaults to refresh all. This means every time it syncs I get all the data from the sheet imported rather than the incremental data.
q
Checked with the team, and Google Sheets supports only full sync. We are in the process of releasing a feature where you can override the data so you won’t see the duplicate data, but it will still be a full sync.
The feature will be available next week.
s
Ok, so how will that work? Is it checking BigQuery before importing?
Or is it importing all the data and de-duping on the BigQuery side?
It relates to the reason I want to use this as we’re thinking we’d use this sheet as an Identify setup. Does this make sense?
q
It will write the new data to BQ and delete all data which is not from the latest sync run (with a couple of checks to ensure we are not deleting data that we did not write). So you will always have a unique set of data in BQ, and it won’t change until you change something in the Google Sheet.
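The write-then-delete behaviour described above can be sketched roughly as follows. This is an illustration only, not RudderStack's implementation: the in-memory `table`, the `sync_run_id` and `written_by` fields, and the `replace_sync` function are hypothetical names chosen for the example.

```python
def replace_sync(table, sheet_rows, run_id, writer="sync-tool"):
    """Write the latest run's rows, then drop this writer's rows from older runs."""
    # Step 1: write the new data, tagged with the run that produced it.
    for row in sheet_rows:
        table.append({**row, "sync_run_id": run_id, "written_by": writer})

    # Step 2: delete rows from earlier runs, but only rows this writer wrote.
    # Rows without the writer tag (for example, manually inserted rows) are kept.
    table[:] = [
        r for r in table
        if r.get("written_by") != writer or r.get("sync_run_id") == run_id
    ]
    return table


# A row that was not written by the sync should survive every run.
table = [{"email": "manual@example.com"}]
sheet = [{"email": "a@example.com"}, {"email": "b@example.com"}]

replace_sync(table, sheet, run_id=1)
replace_sync(table, sheet, run_id=2)  # second full sync of the unchanged sheet
print(len(table))
```

Under these assumptions, repeating a full sync of an unchanged sheet leaves the table the same size, because each run removes what the previous run wrote.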