# ask-community-for-troubleshooting
User:
In our Airbyte deployment, the MongoDB source connector is experiencing a critical issue where, based on the logs, saved resume tokens have timestamps earlier than the initial resume tokens, resulting in missing data during CDC synchronization. Any idea as to the cause? Has anyone encountered this before?
kapa.ai:
Yes, this issue has been encountered and discussed by multiple users of the Airbyte MongoDB source connector. The core of the problem is that the saved resume token (used to continue Change Data Capture, or CDC, from the last processed event) sometimes has a timestamp earlier than the initial resume token generated at the start of a sync. This can result in missing data or sync failures, as the connector may attempt to resume from a point that is no longer available in the MongoDB oplog, or it may skip over new changes.

**Causes and observations:**

1. **State management bug (dual resume tokens):** There have been cases where Airbyte's saved state contains two resume tokens (e.g., one keyed by the MongoDB URL and another by a server_id). Sometimes one of these tokens stops updating and becomes stale while the other advances as expected. If the connector resumes from the stale token, it may reference a timestamp earlier than the current oplog window, causing sync failures or missing data. Users have reported that manually updating the stale token to match the up-to-date one can temporarily resolve the issue, but it recurs until the underlying bug is fixed. This was a known issue in earlier versions of the connector and has been addressed in version 2.0.4 and later, which consolidates the state into a single, up-to-date resume token and fixes the dual-state bug [Failed syncs due to [config_error] Saved offset is not valid](https://github.com/airbytehq/airbyte/issues/62833).
2. **Resume token update logic:** In some connector versions, the resume token is only updated when new records are found in the collections being synced. If a collection is rarely updated, the resume token may lag behind, especially if other collections in the database are frequently updated. This can cause the connector to scan through a large number of irrelevant oplog entries, and if the oplog window is exceeded, the required resume point may be lost [Update resume token to latest oplog position even when no new records exist](https://github.com/airbytehq/airbyte/issues/48435).
3. **Oplog retention and sync frequency:** If the oplog retention period is too short or syncs are not frequent enough, the resume token may point to a timestamp that is no longer present in the oplog, resulting in errors like "Saved offset is not valid" or "Resume token was not found." The recommended practice is to keep the oplog retention longer than the sync interval and to ensure that at least one sync within the retention window advances the resume token [MongoDB-v2 Troubleshooting](https://docs.airbyte.com/integrations/sources/mongodb-v2/mongodb-v2-troubleshooting#connector-limitations).
4. **Connector version and fixes:** The dual-state bug and related resume token issues were specifically addressed in connector version 2.0.4 and later. Users running older versions are strongly encouraged to upgrade. The fix ensures that only a single, up-to-date resume token is saved, preventing the connector from resuming from an outdated point [Failed syncs due to [config_error] Saved offset is not valid](https://github.com/airbytehq/airbyte/issues/62833).

**Summary:** This issue is typically caused by bugs in state management (especially dual or stale resume tokens), insufficient oplog retention, or infrequent syncs. Upgrading to the latest connector version (2.0.4 or later) and ensuring proper oplog and sync configuration are the recommended solutions. If you still see this after upgrading, it may be a new or related bug, and you should open a support ticket with detailed logs and state information. If you need more details or troubleshooting steps, let me know!
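When diagnosing this, it can help to read the cluster-time seconds embedded in a resume token's `_data` hex string, to confirm that the saved token really is older than the initial one. Below is a minimal sketch assuming the common layout where the token starts with a `0x82` type byte followed by a 4-byte big-endian seconds field; this is an internal, undocumented MongoDB encoding that may differ across server versions, and the token values here are fabricated for illustration:

```python
from datetime import datetime, timezone

def resume_token_timestamp(data_hex: str) -> datetime:
    """Extract the cluster-time seconds from a resume token's `_data` hex
    string. Assumes the common layout: a 0x82 type byte followed by a
    4-byte big-endian seconds field (undocumented internal format)."""
    if not data_hex.lower().startswith("82"):
        raise ValueError("unexpected resume token prefix")
    seconds = int(data_hex[2:10], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

# Fabricated saved vs. initial tokens: compare their embedded timestamps.
saved = "82" + format(1714000000, "08x") + "00" * 10
initial = "82" + format(1714600000, "08x") + "00" * 10
print(resume_token_timestamp(saved) < resume_token_timestamp(initial))  # True -> saved token lags behind
```

If the saved token's timestamp is older than the oldest entry still in the oplog, the connector has nothing valid to resume from, which matches the symptom above.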
User:
I am using version 2.0.4 @kapa.ai
kapa.ai:
If you are using version 2.0.4 of the Airbyte MongoDB source connector and are still experiencing issues where the saved resume token has a timestamp earlier than the initial resume token (leading to missing data during CDC sync), here's what is known from the knowledge sources:

1. **2.0.4 fixes the dual-state bug:** Version 2.0.4 was specifically released to address the issue where multiple resume tokens (states) were saved and one could become stale, causing sync failures and data loss. Users have confirmed that after upgrading to 2.0.4, the state contains only a single, up-to-date resume token, and the issue with outdated tokens was resolved for them. For example:
   > "I also confirm that the connection's saved state now contains a single result token... the resume token is a legit Debezium token (and the one used in the next sync)."
   > "We've been running version 2.0.4 of the connector in a DEV environment... Things have been working fine for 6 days straight so far. Great progress."
   [Failed syncs due to [config_error] Saved offset is not valid](https://github.com/airbytehq/airbyte/issues/62833)
2. **Remaining issues after 2.0.4:** Some users have reported that even after upgrading to 2.0.4, they still occasionally encounter the "Saved offset is not valid" error, especially in production environments with large oplog windows and frequent syncs. In these cases the state appears correct (a single, up-to-date token), yet the error persists. Contributors have suggested that these remaining issues may be unrelated to the dual-state bug and could stem from oplog retention, sync frequency, or other environmental factors:
   > "I reviewed the logs, and your state looks correct, which suggests it's not related to the issue fixed in v2.0.4... What is your current oplog retention? Did the previous sync load any data? When using CDC, it's important to include a table with data movement so the LSN continues to advance and doesn't get lost."
   [Failed syncs due to [config_error] Saved offset is not valid](https://github.com/airbytehq/airbyte/issues/62833)
3. **Recommendations:**
   - Ensure your oplog retention is sufficiently longer than your sync interval.
   - Make sure at least one sync occurs within the oplog window and that it processes new data, so the resume token advances.
   - If you still see the issue with a single, up-to-date resume token in state, it may be a new or different bug; in that case, open a new ticket with detailed logs and configuration information.

**Summary:** Version 2.0.4 should resolve the issue of stale/dual resume tokens. If you are still seeing resume tokens with earlier timestamps than the initial token, and your state contains only a single token, the knowledge sources suggest this is not the same bug and may instead be related to oplog retention or sync frequency. If the problem persists, further investigation with Airbyte support is recommended, as no explicit fix or root cause is documented for this scenario in the provided sources. If you can share more details (e.g., your oplog retention, sync frequency, or logs), that may help narrow down the cause.
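To sanity-check the retention advice above, you can compare the current oplog window (the span between the oldest and newest oplog entries, as reported for example by `rs.printReplicationInfo()` in mongosh) against the connection's sync interval. A small sketch of that arithmetic follows; the `safety` factor is an illustrative margin of my own, not an Airbyte or MongoDB setting:

```python
from datetime import timedelta

def oplog_window_ok(window: timedelta, sync_interval: timedelta, safety: float = 2.0) -> bool:
    """Return True if the oplog window comfortably exceeds the sync interval.

    `window` is the span between the first and last oplog entries; `safety`
    is an illustrative margin so that one slow or skipped sync does not push
    the saved resume token out of the oplog.
    """
    return window >= sync_interval * safety

# Example: a 24h oplog window with syncs every 6 hours leaves ample margin.
print(oplog_window_ok(timedelta(hours=24), timedelta(hours=6)))  # True
print(oplog_window_ok(timedelta(hours=4), timedelta(hours=6)))   # False
```

If this check fails, either increase oplog retention (e.g., via `oplogMinRetentionHours`) or sync more frequently.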
User:
@kapa.ai any places to investigate? e.g. the PostgreSQL database?
kapa.ai:
Based on the knowledge sources provided, there is no explicit mention of investigating Airbyte's internal PostgreSQL database (where Airbyte stores its state) as a root cause or troubleshooting step for the MongoDB resume token issue. The discussions and recommendations focus on:

- The MongoDB oplog retention and sync frequency.
- The state management within the Airbyte connector itself (i.e., how resume tokens are saved and updated).
- Upgrading to connector version 2.0.4 to resolve known state/resume token bugs.
- Ensuring that the connector saves and uses a single, up-to-date resume token.

There is no documentation or user report in the sources suggesting that issues with Airbyte's internal PostgreSQL database (such as state corruption or failed writes) are a common or known cause of the resume token problems you described. All troubleshooting and root cause analysis in the sources focuses on the connector logic, the MongoDB oplog, and sync configuration.

**Summary:** If you suspect a problem with Airbyte's internal state storage (e.g., PostgreSQL), it is not covered in the provided knowledge sources. The recommended areas to investigate are the connector version, oplog retention, sync frequency, and the actual state saved by the connector itself [Failed syncs due to [config_error] Saved offset is not valid](https://github.com/airbytehq/airbyte/issues/62833) [MongoDB-v2 Troubleshooting](https://docs.airbyte.com/integrations/sources/mongodb-v2/mongodb-v2-troubleshooting#connector-limitations). If you need to go beyond these, you may need to open a support ticket or consult Airbyte's engineering team for a deeper investigation.
User:
Which table contains the connection state? @kapa.ai
kapa.ai:
The connection state in Airbyte is stored in the `state` table of the internal configuration database. This table contains a `state` JSONB column that holds the state information for each connection, along with a `connection_id` column that links it to the corresponding connection. Other relevant columns include `id`, `created_at`, `updated_at`, `stream_name`, `namespace`, and `type` (which can be `STREAM`, `GLOBAL`, or `LEGACY`) [Airbyte Databases Data Catalog - state](https://docs.airbyte.com/platform/understanding-airbyte/database-data-catalog#state). You can query the state for a specific connection using:
```sql
SELECT * FROM state WHERE connection_id = '<your-connection-id>';
```
This will return the state records associated with that connection.
User:
@kapa.ai are there historical states recorded?
kapa.ai:
Yes, historical states are recorded in the Airbyte `state` table. The table includes `created_at` and `updated_at` timestamp columns, which indicate when each state record was created and last modified. This means you can see previous state records for a connection, not just the latest one. The presence of these timestamps and the structure of the table suggest that Airbyte keeps a history of state changes, rather than only storing the most recent state [Airbyte Databases Data Catalog - state](https://docs.airbyte.com/platform/understanding-airbyte/database-data-catalog#state). You can query the table to view historical states for a connection, for example:
```sql
SELECT * FROM state WHERE connection_id = '<your-connection-id>' ORDER BY created_at DESC;
```
This will show you the sequence of state records over time for that connection.
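When reviewing those historical rows, the symptom from the original question (a saved resume token whose timestamp moves backwards) can be spotted mechanically. A small sketch, assuming you have exported the rows as `(updated_at, token_seconds)` pairs, where `token_seconds` is the cluster-time seconds embedded in each saved resume token; both names are hypothetical, not columns of the `state` table:

```python
def find_regressions(rows):
    """Given (updated_at, token_seconds) pairs in chronological order,
    return the consecutive pairs where the saved resume token's embedded
    timestamp moved backwards -- the symptom described in this thread."""
    regressions = []
    for prev, cur in zip(rows, rows[1:]):
        if cur[1] < prev[1]:
            regressions.append((prev, cur))
    return regressions

# Fabricated example: the third state row regressed behind the second.
rows = [("2024-05-01", 1714500000), ("2024-05-02", 1714580000), ("2024-05-03", 1714400000)]
print(find_regressions(rows))  # flags the 05-02 -> 05-03 step
```

A non-empty result would pinpoint exactly which sync wrote the stale token, which is useful context for a bug report.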
User:
@kapa.ai where are the logs stored?
kapa.ai:
Airbyte stores logs in different locations depending on your deployment and configuration:

- **By default (Open Source, local/Docker):** Logs are stored in Docker volumes, specifically the `airbyte_workspace` volume. You can browse these logs by mounting the volume in a container and navigating to the relevant job attempt directory (e.g., `/data/9/2/` inside the container corresponds to `/tmp/workspace/9/2/` on the host) [Exploring Local Logs](https://docs.airbyte.com/platform/operator-guides/browsing-output-logs#exploring-local-logs).
- **Cloud storage (S3, GCS, Minio):** You can configure Airbyte to store logs in an S3 bucket, GCS bucket, or Minio by setting the appropriate environment variables (`S3_BUCKET`, `GCS_LOG_BUCKET`, etc.) in your deployment [Configuring Airbyte - Logging](https://docs.airbyte.com/platform/operator-guides/configuring-airbyte#logging).
- **In the UI:** Logs for each sync job are accessible in the Airbyte UI under the connection's "Job History" tab. You can view, download, or copy links to these logs directly from the UI [Browsing logs](https://docs.airbyte.com/platform/operator-guides/browsing-output-logs).

If you need to access logs outside the UI, you can browse the Docker volume or the configured cloud storage location.