# ask-ai
h
@kapa.ai I am having an issue with the replication start date when connecting to and extracting data from Universal Analytics into BigQuery. When I chose 2020-01-01 as the start date in the Airbyte UI and started the extraction, everything worked fine; however, the data in my destination always starts at 2022-03-28.
k
It seems like you're experiencing an issue where the data in your BigQuery destination starts at a different date than the replication start date you set in the Airbyte UI. This issue has been reported by other users as well. Here are a few steps and considerations based on the information available:

1. Time Zone Differences: One possible reason for the discrepancy could be time zone differences. As mentioned in [Github Issue #28384](https://github.com/airbytehq/airbyte/issues/28384), a user experienced a similar issue where the earliest date in BigQuery was two days earlier than the replication start date set in Airbyte. This could be due to time zone differences between the source and destination.
2. Cursor Field Ambiguity: Another potential issue could be related to the cursor field used for incremental syncs. In [Forum Post #3552](https://discuss.airbyte.io/t/3552), a user mentioned that using the `ga_date` field as the cursor could lead to ambiguity and missing rows. If you are using `ga_date` as the cursor, you might want to consider this aspect.
3. API Endpoint for Dynamic Dates: If you need to dynamically adjust the replication start date, you can use the "Update Source" endpoint of the Airbyte API to programmatically pass new replication start dates. This was suggested in [Forum Post #2878](https://discuss.airbyte.io/t/2878). Here is the API documentation for updating the source: [Update Source API](https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/sources/update).

### Steps to Troubleshoot and Resolve
1. Check Time Zones: Verify the time zones of your Universal Analytics data and your BigQuery destination. Adjust the replication start date accordingly to account for any differences.
2. Review Cursor Field: If you are using `ga_date` as the cursor field, consider if this might be causing the issue. You might need to experiment with different cursor fields or sync modes.
3. Use API for Dynamic Dates: If you need to adjust the replication start date dynamically, use the Airbyte API to update the source configuration.

If these steps do not resolve your issue, you might want to open a new issue on the Airbyte GitHub repository or seek further assistance on the Airbyte community forums. For more detailed troubleshooting, you can refer to the following sources:
- [Github Issue #28384](https://github.com/airbytehq/airbyte/issues/28384)
- [Forum Post #3552](https://discuss.airbyte.io/t/3552)
- [Forum Post #2878](https://discuss.airbyte.io/t/2878)

If you need further assistance, please provide more details about your configuration and any relevant logs.
h
@kapa.ai give me step-by-step instructions for your recommendations: 1. Check Time Zones: Verify the time zones of your Universal Analytics data and your BigQuery destination. Adjust the replication start date accordingly to account for any differences. 2. Review Cursor Field: If you are using `ga_date` as the cursor field, consider if this might be causing the issue. You might need to experiment with different cursor fields or sync modes.
k
Sure, here are the step-by-step instructions to check and address the issues related to time zones and cursor fields:

### 1. Check Time Zones
1. Identify the time zone of your Universal Analytics data:
   - Log in to your Google Analytics account.
   - Navigate to the Admin section.
   - Under the "View" column, click on "View Settings."
   - Note the "Time Zone Country or Territory" and "Time Zone" settings.
2. Identify the time zone of your BigQuery dataset:
   - Log in to your Google Cloud Console.
   - Navigate to BigQuery and select the dataset where your data is being stored.
   - BigQuery typically stores timestamps in UTC by default.
3. Adjust the replication start date:
   - Calculate the time difference between the time zones of your Universal Analytics data and your BigQuery dataset.
   - Adjust the replication start date in the Airbyte UI to account for this time difference. For example, if your Universal Analytics data is in PST (UTC-8) and your BigQuery dataset is in UTC, you might need to set the replication start date 8 hours earlier. (A query for spotting such boundary shifts is sketched at the end of this message.)

### 2. Review Cursor Field
1. Check the current cursor field:
   - Log in to your Airbyte instance.
   - Navigate to the connection settings for your Universal Analytics to BigQuery sync.
   - Check the cursor field being used. If it is set to `ga_date`, note this setting.
2. Experiment with different cursor fields:
   - If `ga_date` is causing issues, consider using a different cursor field that might be more reliable.
   - The specific fields available depend on your Universal Analytics setup. Common alternatives might include session IDs or other timestamp fields.
3. Change the cursor field in Airbyte:
   - In the Airbyte UI, go to the connection settings.
   - Change the cursor field to the new field you want to experiment with.
   - Save the settings and run a new sync to see if the issue is resolved.
4. Test different sync modes:
   - If changing the cursor field does not resolve the issue, try switching between "Incremental | Append" and "Incremental | Deduped History" to see which mode works best for your data.

### Additional Resources
- [Airbyte Documentation on Google Analytics Setup](https://docs.airbyte.com/integrations/sources/google-analytics-v4#setup-guide)
- [Airbyte API Documentation for Updating Source](https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/sources/update)
- [Forum Post on Cursor Field Issues](https://discuss.airbyte.io/t/3552)

By following these steps, you should be able to identify and address the issues related to time zones and cursor fields in your Universal Analytics to BigQuery sync. If you continue to experience problems, consider reaching out to the Airbyte community for further assistance.
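To see whether a time-zone boundary could explain shifted dates, here is a minimal sketch comparing the UTC calendar date of each record with its date in a local zone; `DATE(timestamp_expression, time_zone)` is standard BigQuery SQL, and the table name is a hypothetical placeholder:

```sql
-- Rows extracted near midnight UTC can land on a different calendar date
-- once converted to a local time zone such as PST.
SELECT
  _airbyte_extracted_at,
  DATE(_airbyte_extracted_at) AS date_utc,
  DATE(_airbyte_extracted_at, 'America/Los_Angeles') AS date_pst
FROM `my-project.dataset.devices`
LIMIT 10;
```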
h
@kapa.ai give me the steps to view the time zone of a dataset in BigQuery; I cannot find it.
k
It appears that BigQuery datasets themselves do not have a specific time zone setting. Instead, BigQuery stores all timestamps in UTC by default. However, you can view and convert timestamps to your desired time zone using SQL queries. Here are the steps:

### Step-by-Step Instructions to View and Convert Timezone in BigQuery
1. Log in to Google Cloud Console:
   - Go to [Google Cloud Console](https://console.cloud.google.com/) and navigate to the BigQuery section.
2. Open your dataset:
   - In the BigQuery console, find and select your dataset from the left-hand navigation pane.
3. Write a query to view timestamps:
   - Open the query editor and select the timestamp fields from your table. For example:
```sql
SELECT timestamp_field
FROM `your_project.your_dataset.your_table`
LIMIT 10;
```
4. Convert timestamps to your desired time zone:
   - Use the `TIMESTAMP` functions to convert the timestamps to your desired time zone. For example, to convert to Pacific Time (PST):
```sql
SELECT
  timestamp_field,
  TIMESTAMP(timestamp_field, 'America/Los_Angeles') AS timestamp_pst
FROM `your_project.your_dataset.your_table`
LIMIT 10;
```
5. Run the query:
   - Click the "Run" button to execute the query and view the results.

### Additional Resources
- [BigQuery TIMESTAMP Functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions)
- [Google Cloud Console](https://console.cloud.google.com/)

By following these steps, you can view and convert timestamps in your BigQuery dataset to your desired time zone. If you need further assistance, please refer to the BigQuery documentation or reach out to the Google Cloud support team.
h
@kapa.ai my desired timezone is Vancouver, Canada.
k
To convert timestamps in your BigQuery dataset to the Vancouver, Canada time zone, you can follow these steps. Vancouver is in the Pacific Time Zone, so you can use the `America/Vancouver` time zone identifier in your SQL queries.

### Step-by-Step Instructions to View and Convert Timezone in BigQuery
1. Log in to Google Cloud Console:
   - Go to [Google Cloud Console](https://console.cloud.google.com/) and navigate to the BigQuery section.
2. Open your dataset:
   - In the BigQuery console, find and select your dataset from the left-hand navigation pane.
3. Write a query to view timestamps:
   - Open the query editor and select the timestamp fields from your table. For example:
```sql
SELECT timestamp_field
FROM `your_project.your_dataset.your_table`
LIMIT 10;
```
4. Convert timestamps to the Vancouver time zone:
   - Use the `TIMESTAMP` functions with the `America/Vancouver` time zone identifier. Here is an example query:
```sql
SELECT
  timestamp_field,
  TIMESTAMP(timestamp_field, 'America/Vancouver') AS timestamp_vancouver
FROM `your_project.your_dataset.your_table`
LIMIT 10;
```
5. Run the query:
   - Click the "Run" button to execute the query and view the results.

### Additional Resources
- [BigQuery TIMESTAMP Functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions)
- [Google Cloud Console](https://console.cloud.google.com/)

By following these steps, you can view and convert timestamps in your BigQuery dataset to the Vancouver, Canada time zone. If you need further assistance, please refer to the BigQuery documentation or reach out to the Google Cloud support team.
h
@kapa.ai I got this error when running your query: `Unrecognized name: timestamp_field`
k
It looks like the error is due to the fact that `timestamp_field` is a placeholder and not the actual name of your timestamp column. You need to replace `timestamp_field` with the actual name of the timestamp column in your table. Here's how you can modify the query:

### Step-by-Step Instructions to View and Convert Timezone in BigQuery
1. Log in to Google Cloud Console:
   - Go to [Google Cloud Console](https://console.cloud.google.com/) and navigate to the BigQuery section.
2. Open your dataset:
   - In the BigQuery console, find and select your dataset from the left-hand navigation pane.
3. Identify your timestamp column:
   - If you are unsure of the column names, you can run a simple query to list all columns:
```sql
SELECT *
FROM `your_project.your_dataset.your_table`
LIMIT 1;
```
   - This will give you an overview of the columns in your table.
4. Write a query to convert timestamps to the Vancouver time zone:
   - Replace `timestamp_field` with the actual name of your timestamp column. For example, if your timestamp column is named `created_at`, the query would be:
```sql
SELECT
  created_at,
  TIMESTAMP(created_at, 'America/Vancouver') AS timestamp_vancouver
FROM `your_project.your_dataset.your_table`
LIMIT 10;
```
5. Run the query:
   - Click the "Run" button to execute the query and view the results.

### Additional Resources
- [BigQuery TIMESTAMP Functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions)
- [Google Cloud Console](https://console.cloud.google.com/)

By following these steps and ensuring you use the correct column name, you should be able to view and convert timestamps in your BigQuery dataset to the Vancouver, Canada time zone. If you need further assistance, please refer to the BigQuery documentation or reach out to the Google Cloud support team.
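If `SELECT *` returns too many columns to scan comfortably, an alternative sketch is to list just the column names and types from BigQuery's `INFORMATION_SCHEMA` (the project, dataset, and table names are placeholders):

```sql
-- Lists every column with its data type, making it easy to spot the
-- TIMESTAMP or DATE column to use in the conversion query.
SELECT column_name, data_type
FROM `your_project.your_dataset.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name = 'your_table';
```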
h
@kapa.ai ok, I replaced the timestamp placeholder and ran this query: `SELECT _airbyte_extracted_at, TIMESTAMP(_airbyte_extracted_at, 'America/Vancouver') AS timestamp_vancouver FROM my-project.dataset.devices LIMIT 10` and I got this error: `No matching signature for function TIMESTAMP for argument types: TIMESTAMP, STRING. Supported signatures: TIMESTAMP(STRING, [STRING]); TIMESTAMP(DATE, [STRING]); TIMESTAMP(DATETIME, [STRING]); TIMESTAMP(TIMESTAMP) at [3:3]`
k
It looks like the error is due to the incorrect usage of the `TIMESTAMP` function. In BigQuery, to convert a timestamp to a different time zone, you should use the `TIMESTAMP` function in combination with `AT TIME ZONE`. Here's how you can modify your query to convert the timestamp to the Vancouver time zone:

### Step-by-Step Instructions to Convert Timezone in BigQuery
1. Log in to Google Cloud Console:
   - Go to [Google Cloud Console](https://console.cloud.google.com/) and navigate to the BigQuery section.
2. Open your dataset:
   - In the BigQuery console, find and select your dataset from the left-hand navigation pane.
3. Write a query to convert timestamps to the Vancouver time zone:
   - Use `AT TIME ZONE` to convert the timestamp. Here is the corrected query:
```sql
SELECT
  _airbyte_extracted_at,
  _airbyte_extracted_at AT TIME ZONE 'America/Vancouver' AS timestamp_vancouver
FROM `my-project.dataset.devices`
LIMIT 10;
```
4. Run the query:
   - Click the "Run" button to execute the query and view the results.

### Additional Resources
- [BigQuery TIMESTAMP Functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions)
- [Google Cloud Console](https://console.cloud.google.com/)

By following these steps and using the correct syntax, you should be able to convert timestamps in your BigQuery dataset to the Vancouver, Canada time zone. If you need further assistance, please refer to the BigQuery documentation or reach out to the Google Cloud support team.
h
@kapa.ai I got another error: `Syntax error: Expected end of input but got keyword AT at [11:25]` when running

```sql
SELECT
  _airbyte_extracted_at,
  _airbyte_extracted_at AT TIME ZONE 'America/Vancouver' AS timestamp_vancouver
FROM `my-project.dataset.devices`
LIMIT 10;
```
k
It looks like there was a mistake in the previous query. BigQuery does not support the `AT TIME ZONE` syntax directly. Instead, you should use the `FORMAT_TIMESTAMP` function to convert the timestamp to a specific time zone. Here's how you can modify your query to convert the timestamp to the Vancouver time zone using `FORMAT_TIMESTAMP`:

### Step-by-Step Instructions to Convert Timezone in BigQuery
1. Log in to Google Cloud Console:
   - Go to [Google Cloud Console](https://console.cloud.google.com/) and navigate to the BigQuery section.
2. Open your dataset:
   - In the BigQuery console, find and select your dataset from the left-hand navigation pane.
3. Write a query to convert timestamps to the Vancouver time zone:
   - Use the `FORMAT_TIMESTAMP` function to convert the timestamp. Here is the corrected query:
```sql
SELECT
  _airbyte_extracted_at,
  FORMAT_TIMESTAMP('%Y-%m-%d %H:%M:%S', _airbyte_extracted_at, 'America/Vancouver') AS timestamp_vancouver
FROM `my-project.dataset.devices`
LIMIT 10;
```
4. Run the query:
   - Click the "Run" button to execute the query and view the results.

### Additional Resources
- [BigQuery TIMESTAMP Functions](https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions)
- [Google Cloud Console](https://console.cloud.google.com/)

By following these steps and using the correct syntax, you should be able to convert timestamps in your BigQuery dataset to the Vancouver, Canada time zone. If you need further assistance, please refer to the BigQuery documentation or reach out to the Google Cloud support team.
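As a side note (an alternative not mentioned above): if you want the converted value to remain a date/time type instead of a formatted string, BigQuery's `DATETIME(timestamp_expression, time_zone)` performs the same conversion:

```sql
-- DATETIME(...) returns a DATETIME in the given zone rather than a STRING,
-- which is more convenient if you want to filter or sort on the result.
SELECT
  _airbyte_extracted_at,
  DATETIME(_airbyte_extracted_at, 'America/Vancouver') AS datetime_vancouver
FROM `my-project.dataset.devices`
LIMIT 10;
```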
h
@kapa.ai ok, it works now, and this is my query result:

| Row | _airbyte_extracted_at | timestamp_vancouver |
|-----|-----------------------|---------------------|
| 1 | 2024-05-27 07:28:48.546000 UTC | 2024-05-27 00:28:48 |
| 2 | 2024-05-27 07:28:48.546000 UTC | 2024-05-27 00:28:48 |
| 3 | 2024-05-27 07:28:48.868000 UTC | 2024-05-27 00:28:48 |
| 4 | 2024-05-27 07:28:48.868000 UTC | 2024-05-27 00:28:48 |
| 5 | 2024-05-27 07:28:49.216000 UTC | 2024-05-27 00:28:49 |
| 6 | 2024-05-27 07:28:49.216000 UTC | 2024-05-27 00:28:49 |
| 7 | 2024-05-27 07:28:49.513000 UTC | 2024-05-27 00:28:49 |
| 8 | 2024-05-27 07:28:49.513000 UTC | 2024-05-27 00:28:49 |
| 9 | 2024-05-27 07:28:49.840000 UTC | 2024-05-27 00:28:49 |
| 10 | 2024-05-27 07:28:49.840000 UTC | 2024-05-27 00:28:49 |

How can I fix my root issue?
k
To address your root issue of the replication start date discrepancy between Universal Analytics and BigQuery, follow these steps:

### Step-by-Step Instructions
1. Verify time zones:
   - You have already verified that the timestamps are correctly converted to the Vancouver time zone. This ensures that the time zone difference is not causing the issue.
2. Review cursor field:
   - Ensure that the cursor field used for incremental syncs is appropriate. If you are using `ga_date` or another field, consider if this might be causing the issue.
3. Adjust replication start date:
   - Given that your data in BigQuery starts at a different date than expected, you might need to adjust the replication start date in the Airbyte UI to account for any discrepancies.

### Adjusting the Replication Start Date
1. Log in to Airbyte:
   - Open your Airbyte instance and log in.
2. Navigate to the connection settings:
   - Go to the connection settings for your Universal Analytics to BigQuery sync.
3. Set the replication start date:
   - Adjust the replication start date to a date earlier than the one you initially set. For example, if you initially set it to `2020-01-01` and the data starts at `2022-03-28`, try setting it to `2019-01-01` or earlier to ensure all data is captured.
4. Save and run the sync:
   - Save the settings and run a new sync to see if the issue is resolved.

### Example Query to Verify Data
After adjusting the replication start date, you can run a query to verify that the data in BigQuery now starts from the correct date (a related check on the data's own date column follows at the end of this message):
```sql
SELECT
  _airbyte_extracted_at,
  FORMAT_TIMESTAMP('%Y-%m-%d %H:%M:%S', _airbyte_extracted_at, 'America/Vancouver') AS timestamp_vancouver
FROM `my-project.dataset.devices`
ORDER BY _airbyte_extracted_at
LIMIT 10;
```

### Additional Resources
- [Airbyte Documentation on Google Analytics Setup](https://docs.airbyte.com/integrations/sources/google-analytics-v4#setup-guide)
- [Airbyte API Documentation for Updating Source](https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/sources/update)
- [Forum Post on Cursor Field Issues](https://discuss.airbyte.io/t/3552)

By following these steps, you should be able to address the root issue of the replication start date discrepancy. If you continue to experience problems, consider reaching out to the Airbyte community for further assistance.
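One caveat worth noting: `_airbyte_extracted_at` records when Airbyte pulled the row, not the date of the underlying analytics data. To check the replication window itself, it may be more telling to aggregate the data's own date column (a sketch, assuming the stream exposes a `ga_date` column):

```sql
-- MIN/MAX of the data's own date column show the actual replicated range,
-- independent of when the sync job happened to run.
SELECT
  MIN(ga_date) AS earliest_data_date,
  MAX(ga_date) AS latest_data_date,
  MIN(_airbyte_extracted_at) AS earliest_extraction_time
FROM `my-project.dataset.devices`;
```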
h
@kapa.ai for 2. Review Cursor Field: - Ensure that the cursor field used for incremental syncs is appropriate. If you are using `ga_date` or another field, consider if this might be causing the issue. Suggest me a solution if I don't use `ga_date`.
k
If you are experiencing issues with using `ga_date` as the cursor field for incremental syncs, you can consider using alternative fields that might be more reliable. Here are some steps and suggestions to help you choose and configure a different cursor field:

### Step-by-Step Instructions to Review and Change Cursor Field
1. Identify alternative cursor fields:
   - Review the schema of your Universal Analytics data to identify other potential fields that can be used as a cursor. Common alternatives might include:
     - `session_id`
     - `event_timestamp`
     - `created_at`
     - Any other timestamp or unique identifier field that increments over time.
2. Check the data consistency:
   - Ensure that the alternative field you choose is consistent and reliable for incremental syncs. It should ideally be a field that increases monotonically (i.e., it always moves forward in time). A quick check for this is sketched at the end of this message.
3. Change the cursor field in Airbyte:
   - Log in to your Airbyte instance.
   - Navigate to the connection settings for your Universal Analytics to BigQuery sync.
   - Change the cursor field to the new field you have identified. For example, if you choose `event_timestamp`, update the cursor field setting to use this field.
4. Save and run the sync:
   - Save the settings and run a new sync to see if the issue is resolved.

### Additional Resources
- [Airbyte Documentation on Google Analytics Setup](https://docs.airbyte.com/integrations/sources/google-analytics-v4#setup-guide)
- [Airbyte API Documentation for Updating Source](https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/sources/update)
- [Forum Post on Cursor Field Issues](https://discuss.airbyte.io/t/3552)

By following these steps and choosing a reliable alternative cursor field, you should be able to address the issues related to the replication start date and ensure that your incremental syncs work correctly. If you continue to experience problems, consider reaching out to the Airbyte community for further assistance.
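Before committing to a candidate, it can help to verify in the destination that the field is populated and spans a plausible range (a sketch; `event_timestamp` is a hypothetical candidate column):

```sql
-- A usable cursor should have few or no NULLs and a sensible min/max range.
SELECT
  COUNTIF(event_timestamp IS NULL) AS null_rows,
  MIN(event_timestamp) AS earliest,
  MAX(event_timestamp) AS latest
FROM `my-project.dataset.devices`;
```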
h
@kapa.ai If I don't have an ideal cursor value such as `session_id`, `event_timestamp`, or `created_at`, can I use the default `_airbyte_raw_id`?
k
Using `_airbyte_raw_id` as a cursor field is generally not recommended because it is a UUID assigned by Airbyte to each event and does not represent a sequential or time-based order. This means it won't help in determining the order of records for incremental syncs. However, if you don't have an ideal cursor field like `session_id`, `event_timestamp`, or `created_at`, you can consider the following alternatives:

### Alternatives to Using `_airbyte_raw_id`
1. Combine multiple fields:
   - If no single field is ideal, you might be able to combine multiple fields to create a composite cursor. For example, you could concatenate `event_date` and `event_time` if they exist separately.
2. Use a synthetic timestamp:
   - If your data source allows, you could create a synthetic timestamp field that combines date and time information from other fields.
3. Fall back to full refresh:
   - If incremental syncs are not feasible due to the lack of a suitable cursor field, you might need to use the "Full Refresh" sync mode. This mode re-syncs all data each time, which can be resource-intensive but ensures data completeness.

### Example of Combining Multiple Fields
If you have separate `event_date` and `event_time` fields, you can create a composite cursor:
```sql
SELECT
  CONCAT(event_date, ' ', event_time) AS composite_cursor
FROM `your_project.your_dataset.your_table`
LIMIT 10;
```

### Steps to Configure Composite Cursor in Airbyte
1. Log in to Airbyte:
   - Open your Airbyte instance and log in.
2. Navigate to the connection settings:
   - Go to the connection settings for your Universal Analytics to BigQuery sync.
3. Update the cursor field:
   - Change the cursor field to the composite cursor you have created. For example, if you created a `composite_cursor` field, use this as the cursor.
4. Save and run the sync:
   - Save the settings and run a new sync.

### Additional Resources
- [Airbyte Documentation on Google Analytics Setup](https://docs.airbyte.com/integrations/sources/google-analytics-v4#setup-guide)
- [Airbyte API Documentation for Updating Source](https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/sources/update)
- [Forum Post on Cursor Field Issues](https://discuss.airbyte.io/t/3552)

By following these steps and considering alternatives to `_airbyte_raw_id`, you should be able to find a more suitable cursor field for your incremental syncs. If you continue to experience problems, consider reaching out to the Airbyte community for further assistance.
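If you do build such a composite in the warehouse, it may be worth parsing it into a real timestamp rather than comparing strings. A sketch, assuming hypothetical `event_date` and `event_time` columns whose casts yield `YYYY-MM-DD` and `HH:MM:SS` text:

```sql
-- String comparison on CONCAT(...) only orders correctly when every part is
-- zero-padded; parsing to a TIMESTAMP sidesteps that pitfall.
SELECT
  PARSE_TIMESTAMP(
    '%Y-%m-%d %H:%M:%S',
    CONCAT(CAST(event_date AS STRING), ' ', CAST(event_time AS STRING))
  ) AS composite_cursor
FROM `your_project.your_dataset.your_table`
LIMIT 10;
```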
h
@kapa.ai this is my table schema (field names omitted):

| Type | Mode |
|------|------|
| STRING | REQUIRED |
| TIMESTAMP | REQUIRED |
| JSON | REQUIRED |
| DATE | NULLABLE |
| STRING | NULLABLE |
| INTEGER | NULLABLE |
| NUMERIC | NULLABLE |
| STRING | NULLABLE |
| STRING | NULLABLE |
| INTEGER | NULLABLE |
| INTEGER | NULLABLE |
| BOOLEAN | NULLABLE |
| NUMERIC | NULLABLE |
| NUMERIC | NULLABLE |
| NUMERIC | NULLABLE |
| INTEGER | NULLABLE |

Help me find the best value for the cursor field.
k
Based on the schema you provided, here are some suggestions for selecting the best cursor field for incremental syncs:

### Schema Fields

| Type | Mode |
|------|------|
| STRING | REQUIRED |
| TIMESTAMP | REQUIRED |
| JSON | REQUIRED |
| DATE | NULLABLE |
| STRING | NULLABLE |
| INTEGER | NULLABLE |
| NUMERIC | NULLABLE |
| STRING | NULLABLE |
| STRING | NULLABLE |
| INTEGER | NULLABLE |
| INTEGER | NULLABLE |
| BOOLEAN | NULLABLE |
| NUMERIC | NULLABLE |
| NUMERIC | NULLABLE |
| NUMERIC | NULLABLE |
| INTEGER | NULLABLE |

### Recommended Cursor Field
Given the available fields, the `TIMESTAMP` field is the most suitable candidate for a cursor field. This is because it is a time-based field that can help in determining the order of records for incremental syncs.

### Steps to Configure the Cursor Field in Airbyte
1. Log in to Airbyte:
   - Open your Airbyte instance and log in.
2. Navigate to the connection settings:
   - Go to the connection settings for your Universal Analytics to BigQuery sync.
3. Update the cursor field:
   - Change the cursor field to the `TIMESTAMP` field. For example, if your `TIMESTAMP` field is named `event_timestamp`, set the cursor field to `event_timestamp`.
4. Save and run the sync:
   - Save the settings and run a new sync.

### Additional Resources
- [Airbyte Documentation on Google Analytics Setup](https://docs.airbyte.com/integrations/sources/google-analytics-v4#setup-guide)
- [Airbyte API Documentation for Updating Source](https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/sources/update)
- [Forum Post on Cursor Field Issues](https://discuss.airbyte.io/t/3552)

By following these steps and using the `TIMESTAMP` field as the cursor, you should be able to set up a reliable incremental sync. If you continue to experience problems, consider reaching out to the Airbyte community for further assistance.
h
@kapa.ai no, that is the type of `_airbyte_extracted_at`
k
Given that `_airbyte_extracted_at` is of type `TIMESTAMP`, it is indeed a suitable candidate for a cursor field. This field represents the time when the event was pulled from the data source, which can be used to track incremental changes.

### Steps to Configure `_airbyte_extracted_at` as the Cursor Field in Airbyte
1. Log in to Airbyte:
   - Open your Airbyte instance and log in.
2. Navigate to the connection settings:
   - Go to the connection settings for your Universal Analytics to BigQuery sync.
3. Update the cursor field:
   - Change the cursor field to `_airbyte_extracted_at`.
4. Save and run the sync:
   - Save the settings and run a new sync.

### Additional Resources
- [Airbyte Documentation on Google Analytics Setup](https://docs.airbyte.com/integrations/sources/google-analytics-v4#setup-guide)
- [Airbyte API Documentation for Updating Source](https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/sources/update)
- [Forum Post on Cursor Field Issues](https://discuss.airbyte.io/t/3552)

By following these steps and using `_airbyte_extracted_at` as the cursor, you should be able to set up a reliable incremental sync. If you continue to experience problems, consider reaching out to the Airbyte community for further assistance.
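Keep in mind that `_airbyte_extracted_at` is stamped per sync run, so many rows share (nearly) the same value. A quick way to see this clustering, as a sketch against the hypothetical `devices` table:

```sql
-- Distinct extraction timestamps correspond to sync batches rather than
-- individual source events, which is why this cursor tracks sync time only.
SELECT
  _airbyte_extracted_at,
  COUNT(*) AS rows_in_batch
FROM `my-project.dataset.devices`
GROUP BY _airbyte_extracted_at
ORDER BY _airbyte_extracted_at
LIMIT 20;
```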
h
@kapa.ai explain for me the connection state in the connection Settings:

```json
[
  { "streamDescriptor": { "name": "devices" }, "streamState": { "ga_date": "2023-10-04" } },
  { "streamDescriptor": { "name": "weekly_active_users" }, "streamState": { "ga_date": "2023-10-04" } }
]
```
k
### Understanding Connection State in Airbyte
The connection state in Airbyte provides additional information about incremental syncs. It includes the most recent values for the global or stream-level cursors, which can aid in debugging or determining which data will be included in the next sync.

#### Example Connection State
Here is an example of a connection state configuration:
```json
[
  {
    "streamDescriptor": { "name": "devices" },
    "streamState": { "ga_date": "2023-10-04" }
  },
  {
    "streamDescriptor": { "name": "weekly_active_users" },
    "streamState": { "ga_date": "2023-10-04" }
  }
]
```

#### Explanation
- streamDescriptor: This object contains the name of the stream. In this example, there are two streams: `devices` and `weekly_active_users`.
- streamState: This object contains the state information for the stream. In this example, the state is represented by the `ga_date` field, which indicates the last date for which data was successfully synced.

#### How It Works
1. Incremental syncs: The connection state helps Airbyte determine where to resume the next sync. For example, if the last synced date for the `devices` stream is `2023-10-04`, the next sync will start from `2023-10-05` to avoid reprocessing data.
2. Debugging: The connection state can be useful for debugging issues with data syncs. If there is a problem, you can check the state to see the last successful sync point.
3. Manual updates: You can manually update the connection state if needed. This can be helpful if you want to re-sync data from a specific point in time without performing a full resync.

#### Editing Connection State
To edit the connection state in the Airbyte UI:
1. In the Airbyte UI, click Connections and then click the connection you want to display.
2. Click the Settings tab on the Connection page.
3. Click the Advanced dropdown arrow.
4. Click anywhere in the Connection state to start editing.
5. Confirm changes by clicking "Update state". Discard any changes by clicking "Revert changes".

Note: Updates to connection state should be handled with extreme care. Updates may break your syncs, requiring a full historical sync of your data to fix. Make changes only as directed by the Airbyte team.

For more detailed information, you can refer to the [Airbyte documentation on managing connection state](https://docs.airbyte.com/cloud/managing-airbyte-cloud/manage-connection-state). By understanding and managing the connection state, you can ensure more efficient and reliable data syncs in Airbyte.
h
@kapa.ai if I modify the `ga_date` to 2020-01-01 to make replication start at 2020-01-01, will it solve the root issue?
k
Modifying the `ga_date` in the connection state to `2020-01-01` can help in re-syncing data from that specific start date. However, this approach should be handled with caution, as it might lead to a full re-sync of data, which can be resource-intensive.

### Steps to Modify the Connection State
1. Log in to Airbyte:
   - Open your Airbyte instance and log in.
2. Navigate to the connection settings:
   - Go to the connection settings for your Universal Analytics to BigQuery sync.
3. Edit the connection state:
   - Click on the Settings tab on the Connection page.
   - Click the Advanced dropdown arrow.
   - Click anywhere in the Connection state to start editing.
4. Modify the `ga_date`:
   - Change the `ga_date` value to `2020-01-01` for the relevant streams. For example:
```json
[
  {
    "streamDescriptor": { "name": "devices" },
    "streamState": { "ga_date": "2020-01-01" }
  },
  {
    "streamDescriptor": { "name": "weekly_active_users" },
    "streamState": { "ga_date": "2020-01-01" }
  }
]
```
5. Save the changes:
   - Confirm changes by clicking "Update state". Discard any changes by clicking "Revert changes".
6. Run the sync:
   - Run a new sync to start replicating data from `2020-01-01` (a verification query is sketched at the end of this message).

### Considerations
- Resource-intensive: Modifying the `ga_date` to an earlier date will trigger a re-sync of all data from that date, which can be resource-intensive and time-consuming.
- Data duplication: Ensure that your sync mode is set to handle potential data duplication. Using "Incremental | Deduped History" can help manage duplicates.
- Testing: It is advisable to test this change on a smaller dataset or in a staging environment before applying it to your production data.

### Additional Resources
- [Airbyte Documentation on Google Analytics Setup](https://docs.airbyte.com/integrations/sources/google-analytics-v4#setup-guide)
- [Airbyte API Documentation for Updating Source](https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/sources/update)
- [Forum Post on Cursor Field Issues](https://discuss.airbyte.io/t/3552)
- [Airbyte Documentation on Managing Connection State](https://docs.airbyte.com/cloud/managing-airbyte-cloud/manage-connection-state)

By following these steps, you can modify the `ga_date` to re-sync data from `2020-01-01`. This should help address the root issue of missing data from the desired start date. If you continue to experience problems, consider reaching out to the Airbyte community for further assistance.
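After the re-sync completes, one way to confirm the backfill reached the new start date is to count rows per year (a sketch against the hypothetical `devices` table):

```sql
-- If the state edit worked, years 2020 and 2021 should now contain rows.
SELECT
  EXTRACT(YEAR FROM ga_date) AS year,
  COUNT(*) AS row_count
FROM `my-project.dataset.devices`
GROUP BY year
ORDER BY year;
```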
h
@kapa.ai I want to retrieve historical data using the Universal Analytics source. However, even though I set the start_date to 2016, it only displays data from 2019 onwards. How can I resolve this issue?
k
To retrieve historical data using the Universal Analytics source and ensure that data from your desired start date (e.g., 2016) is included, you can follow these steps:

### Steps to Retrieve Historical Data
1. Verify API limitations:
   - The Google Analytics API has limitations on how far back you can retrieve data. Ensure that the API allows fetching data from the year 2016. If there are limitations, you might need to adjust your expectations or use alternative methods to retrieve older data.
2. Set the replication start date:
   - In the Airbyte UI, set the replication start date to `2016-01-01`. This ensures that the sync process starts from the desired date.
3. Modify the connection state:
   - If the data still starts from 2019, you might need to manually adjust the connection state to force a re-sync from the desired start date.

### Example of Modifying the Connection State
1. Log in to Airbyte:
   - Open your Airbyte instance and log in.
2. Navigate to the connection settings:
   - Go to the connection settings for your Universal Analytics to BigQuery sync.
3. Edit the connection state:
   - Click on the Settings tab on the Connection page.
   - Click the Advanced dropdown arrow.
   - Click anywhere in the Connection state to start editing.
4. Modify the `ga_date`:
   - Change the `ga_date` value to `2016-01-01` for the relevant streams. For example:
```json
[
  {
    "streamDescriptor": {
      "name": "devices"
    },
    "streamState": {
      "ga_date": "2016-01-01"
    }
  },
  {
    "streamDescriptor": {
      "name": "weekly_active_users"
    },
    "streamState": {
      "ga_date": "2016-01-01"
    }
  }
]
```
5. Save the changes:
   - Confirm changes by clicking "Update state". Discard any changes by clicking "Revert changes".
6. Run the sync:
   - Run a new sync to start replicating data from `2016-01-01`.

### Additional Considerations
- API rate limits: Be aware of the API rate limits and quotas. Fetching a large amount of historical data might hit these limits, so plan your syncs accordingly.
- Data completeness: Ensure that the data from 2016 onwards is complete and accurate. You might need to perform additional checks or validations to confirm this.

### Additional Resources
- [Airbyte Documentation on Google Analytics Setup](https://docs.airbyte.com/integrations/sources/google-analytics-v4#setup-guide)
- [Airbyte API Documentation for Updating Source](https://airbyte-public-api-docs.s3.us-east-2.amazonaws.com/rapidoc-api-docs.html#post-/v1/sources/update)
- [Forum Post on Cursor Field Issues](https://discuss.airbyte.io/t/3552)
- [Airbyte Documentation on Managing Connection State](https://docs.airbyte.com/cloud/managing-airbyte-cloud/manage-connection-state)
By following these steps, you should be able to retrieve historical data from 2016 using the Universal Analytics source. If you continue to experience issues, consider reaching out to the Airbyte community for further assistance.
h
@kapa.ai "Google Analytics API has limitations on how far back you can retrieve data. Ensure that the API allows fetching data from the year 2016. If there are limitations, you might need to adjust your expectations or use alternative methods to retrieve older data." How can I retrieve that old data?
k
To retrieve historical data from Google Analytics, especially if you are facing limitations with the API, you can consider the following approaches:

### 1. Verify API Limitations
First, ensure that the Google Analytics API allows fetching data from the year 2016. Google Analytics APIs have certain limitations and quotas, and it's essential to verify these before proceeding.
- Google Analytics Reporting API: Check the [Google Analytics Reporting API documentation](https://developers.google.com/analytics/devguides/reporting/core/v4) for any limitations on historical data retrieval.

### 2. Use Google Analytics 360
If you are using the standard version of Google Analytics, you might face limitations on the amount of historical data you can retrieve. Google Analytics 360 offers higher quotas and might allow you to fetch older data.
- Upgrade to Google Analytics 360: Consider upgrading to Google Analytics 360 if you need to retrieve a large amount of historical data.

### 3. Use the Google Analytics User Interface
If the API does not allow you to fetch data from 2016, you can manually export the data using the Google Analytics user interface.

#### Steps to Export Data Manually
1. Log in to Google Analytics:
   - Go to [Google Analytics](https://analytics.google.com/analytics/web/).
2. Navigate to the reporting view:
   - Select the view that contains the data you want to export.
3. Set the date range:
   - Set the date range to include the period from 2016.
4. Export the data:
   - Use the export options available in the Google Analytics interface to export the data in CSV, Excel, or other formats.

### 4. Use Google BigQuery
If you have linked your Google Analytics account to BigQuery, you can query historical data directly from BigQuery.

#### Steps to Query Data from BigQuery
1. Log in to Google Cloud Console:
   - Go to [Google Cloud Console](https://console.cloud.google.com/).
2. Navigate to BigQuery:
   - Select BigQuery from the left-hand navigation pane.
3. Write a query to retrieve historical data:
   - Use SQL queries to fetch data from the desired date range. For example:
```sql
SELECT *
FROM `your_project.your_dataset.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20160101' AND '20161231'
```

### 5. Use Airbyte for Historical Data Sync
If you are using Airbyte, ensure that the replication start date is set correctly and that the connection state is adjusted to fetch data from the desired start date.

#### Steps to Adjust Replication Start Date in Airbyte
1. Log in to Airbyte:
   - Open your Airbyte instance and log in.
2. Navigate to the connection settings:
   - Go to the connection settings for your Universal Analytics to BigQuery sync.
3. Set the replication start date:
   - Set the replication start date to `2016-01-01`.
4. Modify the connection state:
   - Edit the connection state to ensure it starts from `2016-01-01`. For example:
```json
[
  {
    "streamDescriptor": {
      "name": "devices"
    },
    "streamState": {
      "ga_date": "2016-01-01"
    }
  },
  {
    "streamDescriptor": {
      "name": "weekly_active_users"
    },
    "streamState": {
      "ga_date": "2016-01-01"
    }
  }
]
```
5. Run the sync:
   - Save the settings and run a new sync to start replicating data from `2016-01-01`.
h
@kapa.ai how can I use Google Analytics 360 to get historical data from Universal Analytics into BigQuery?
k
To use Google Analytics 360 to get historical data from Universal Analytics to BigQuery, follow these steps:

### Step-by-Step Guide

#### 1. Upgrade to Google Analytics 360
If you are not already using Google Analytics 360, you will need to upgrade. Google Analytics 360 offers higher data limits and additional features, including the ability to export data to BigQuery.
- Contact Google Sales: Reach out to Google Sales to upgrade your account to Google Analytics 360.

#### 2. Link Google Analytics 360 to BigQuery
Once you have Google Analytics 360, you can link it to BigQuery to export your historical data.
1. Log in to Google Analytics:
   - Go to [Google Analytics](https://analytics.google.com/analytics/web/).
2. Navigate to Admin settings:
   - Click on the Admin gear icon at the bottom left of the page.
3. Link BigQuery:
   - Under the Property column, click on All Products.
   - Find BigQuery and click on Link BigQuery.
   - Follow the prompts to link your Google Analytics 360 property to your BigQuery project.

#### 3. Configure BigQuery Export
1. Set up export:
   - In the BigQuery linking process, you will be asked to configure the export settings. Ensure that you select the correct BigQuery project and dataset where you want the data to be exported.
2. Historical data export:
   - Google Analytics 360 allows you to backfill historical data. During the setup, you can specify the date range for the historical data you want to export.

#### 4. Verify Data in BigQuery
1. Log in to Google Cloud Console:
   - Go to [Google Cloud Console](https://console.cloud.google.com/).
2. Navigate to BigQuery:
   - Select BigQuery from the left-hand navigation pane.
3. Check the exported data:
   - Verify that the data from the specified date range has been exported to your BigQuery dataset. You can run SQL queries to check the data:
```sql
SELECT *
FROM `your_project.your_dataset.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20160101' AND '20161231'
```

### Additional Resources
- [Google Analytics 360 Overview](https://marketingplatform.google.com/about/analytics/360/)
- [BigQuery Export for Google Analytics 360](https://support.google.com/analytics/answer/3437618?hl=en)
- [Google Cloud Console](https://console.cloud.google.com/)

By following these steps, you can use Google Analytics 360 to export historical data from Universal Analytics to BigQuery. This will allow you to analyze your historical data using BigQuery's powerful querying capabilities.
h
@kapa.ai what if the Google Analytics Reporting API has data retention that keeps data for only 14 months?
k
Yes, the Google Analytics Reporting API has data retention policies that can limit the amount of historical data you can access. Specifically, for standard Google Analytics accounts, the data retention period is typically 14 months. This means that you may not be able to retrieve data older than 14 months using the API.

### Data Retention in Google Analytics
- Standard Google Analytics: Data retention is typically limited to 14 months.
- Google Analytics 360: Offers extended data retention options, allowing you to access data beyond the standard 14-month period.

### Documentation and Resources
- [Google Analytics Data Retention](https://support.google.com/analytics/answer/7667196?hl=en): This article provides detailed information on data retention settings in Google Analytics.
- [Google Analytics Reporting API](https://developers.google.com/analytics/devguides/reporting/core/v4): Official documentation for the Google Analytics Reporting API.

### How to Retrieve Historical Data Beyond 14 Months
If you need to access data older than 14 months, consider the following options:
1. Upgrade to Google Analytics 360: This premium version offers extended data retention and additional features, including the ability to export data to BigQuery.
2. Use BigQuery Export: If you have Google Analytics 360, you can link your Google Analytics account to BigQuery and export historical data. This allows you to store and query data beyond the 14-month retention period.

#### Steps to Link Google Analytics 360 to BigQuery
1. Log in to Google Analytics:
   - Go to [Google Analytics](https://analytics.google.com/analytics/web/).
2. Navigate to Admin settings:
   - Click on the Admin gear icon at the bottom left of the page.
3. Link BigQuery:
   - Under the Property column, click on All Products.
   - Find BigQuery and click on Link BigQuery.
   - Follow the prompts to link your Google Analytics 360 property to your BigQuery project.
4. Set up export:
   - Configure the export settings to specify the date range for the historical data you want to export.
5. Verify data in BigQuery:
   - Log in to [Google Cloud Console](https://console.cloud.google.com/).
   - Navigate to BigQuery and verify that the data from the specified date range has been exported to your dataset.

By following these steps, you can access and analyze historical data beyond the standard 14-month retention period using Google Analytics 360 and BigQuery.
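To see how your already-synced data lines up against a 14-month retention window, a minimal sketch (again using the hypothetical `devices` table) compares the earliest replicated date with the cutoff:

```sql
-- If earliest_replicated_date sits right at the cutoff, the retention
-- window, rather than your configured start date, is likely the binding limit.
SELECT
  MIN(ga_date) AS earliest_replicated_date,
  DATE_SUB(CURRENT_DATE(), INTERVAL 14 MONTH) AS retention_cutoff
FROM `my-project.dataset.devices`;
```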