Hi team, 1. In the new mongodb-v2 source connector...
# ask-community-for-troubleshooting
Hi team,

1. In the new mongodb-v2 source connector, if I have specified only one database, does Debezium open the change stream only for that database, or for the whole cluster?
2. According to the source code it shouldn't open one for the whole cluster, but I strongly doubt that is what actually happens.
3. The reason is that in the Debezium code, whenever a stream is opened at either the database or the collection level, there is a log:
   `Change stream is restricted to <> database/collection`
4. I can't see this log anywhere in my job, even though other Debezium logs are clearly shown in the job logs.

My actual problem:

1. I sync a few collections from one database in my Mongo cluster to BigQuery every 2 hours.
2. The pipeline has been breaking a lot over the last few weeks. It just ends with this log:
   `No records were returned by Debezium in the timeout seconds 150, closing the engine and iterator`
3. There is another collection in the same cluster that receives very large documents, around 700-800 KB per document on average, but that collection is in a different database.
4. So I wonder whether that is the root cause of this issue.
Based on the available knowledge sources:

1. Does Debezium open a change stream only for the specified database, or for the whole cluster, in the MongoDB V2 source connector?

The MongoDB V2 source connector (from version 2.0.0) supports syncing from multiple databases, and the documentation and changelog indicate that it can monitor either the entire cluster or only specific databases, depending on your configuration. The connector requires the `readAnyDatabase` privilege for cluster-wide monitoring, but if you specify only one database in the `databases` array, it should sync collections only from that database. However, the documentation does not explicitly state whether the underlying Debezium engine opens a change stream at the cluster, database, or collection level when only one database is specified. The log message you referenced (`Change stream is restricted to <> database/collection`) is not mentioned in the provided documentation or issues, so I cannot confirm the exact Debezium behavior from these sources alone. As of v2.0.2 the connector is designed to minimize the required permissions when only a single database is configured, but the granularity of the change stream is not explicitly documented in the sources provided [MongoDB V2 Changelog](https://docs.airbyte.com/integrations/sources/mongodb-v2#changelog).

2. About the log message and your suspicion

Your observation about the missing log line is valid, but the knowledge sources do not provide details about Debezium's internal logging or how Airbyte surfaces those logs. No documentation confirms or denies the presence of that specific log message in Airbyte's output.

3. Regarding your pipeline breaking with the timeout log

The log message `No records were returned by Debezium in the timeout seconds 150, closing the engine and iterator` is documented in Airbyte issues. It typically occurs when the connector does not find any new records to sync within the timeout window. This can be a problem when you sync rarely updated collections, because the connector may have to scan through a large number of irrelevant oplog entries, especially if other collections in the cluster are updated frequently. This is a known operational issue that has been discussed in the context of resume token management and oplog scanning. The recommended solution is to advance the resume token even when no new records are found, so that old oplog entries are not scanned repeatedly. This has been raised and discussed in the Airbyte community, and improvements have been suggested and implemented in newer connector versions [Update resume token to latest oplog position even when no new records exist](https://github.com/airbytehq/airbyte/issues/48435).

Additionally, you should ensure that:

- Your oplog retention is sufficient for your sync frequency (ideally at least a week).
- You are running a recent version of the connector, as several oplog and resume token handling bugs have been addressed in recent releases [MongoDB V2 Changelog](https://docs.airbyte.com/integrations/sources/mongodb-v2#changelog).
- Your Airbyte MongoDB user has the necessary permissions to read the oplog and, if needed, the `readAnyDatabase` privilege [Quick Start](https://docs.airbyte.com/integrations/sources/mongodb-v2#quick-start).

Summary:

- The connector should sync only from the specified database(s), but the exact scope of the Debezium change stream (cluster vs. database) is not explicitly documented in the provided sources.
- The timeout log is a known issue, especially for rarely updated collections, and is related to how resume tokens and oplog scanning are handled.
- Upgrading to the latest connector version and ensuring proper oplog retention and permissions are the recommended steps.

If you need more precise details about Debezium's internal change stream scope, you may need to consult Debezium's own documentation or enable more verbose logging, as the Airbyte documentation does not provide this level of detail.
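
For reference, MongoDB itself lets a client open a change stream at three scopes: the whole cluster, one database, or one collection. The sketch below is not Debezium's or Airbyte's code. It only illustrates how those scopes differ, using the `watch()` methods that PyMongo exposes on a client, a database, and a collection. The helper names `change_stream_scope` and `open_change_stream`, the connection string placeholder, and the database name `other_db` are hypothetical and chosen for illustration.

```python
from typing import Any, Optional


def change_stream_scope(database: Optional[str] = None,
                        collection: Optional[str] = None) -> str:
    """Name the narrowest change stream scope the arguments allow."""
    if collection is not None and database is None:
        raise ValueError("a collection scope also needs a database name")
    if database is None:
        return "cluster"
    if collection is None:
        return "database"
    return "collection"


def open_change_stream(client: Any,
                       database: Optional[str] = None,
                       collection: Optional[str] = None) -> Any:
    """Open a change stream at the scope implied by the arguments.

    `client` is expected to behave like a pymongo.MongoClient:
    client[db] is a database, client[db][coll] is a collection,
    and all three objects offer a watch() method.
    """
    scope = change_stream_scope(database, collection)
    if scope == "cluster":
        target = client                        # events from every database
    elif scope == "database":
        target = client[database]              # events from one database only
    else:
        target = client[database][collection]  # events from one collection only
    return target.watch()


# Usage sketch (assumes pymongo is installed and a replica set is reachable):
#
#   from pymongo import MongoClient
#   client = MongoClient("<connection string>")
#   with open_change_stream(client) as stream:   # cluster-wide scope
#       for event in stream:
#           print(event["ns"])                   # e.g. {'db': 'other_db', 'coll': ...}
```

With a cluster-wide stream, events from every database (including one that receives very large documents) pass through the same stream, whereas a database-level stream only delivers events whose `ns.db` matches that database. Running the cluster-wide variant next to a database-level one against a test cluster is one way to see the difference, though it does not by itself show which scope the connector uses.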