The issue you're experiencing, where some fields are not detected during schema discovery, is a known limitation of the Airbyte MongoDB connector. Airbyte discovers the schema by sampling: it inspects a configurable number of documents (10,000 by default) to determine which fields exist. A field that is rare, or that appears only in a small subset of documents, is left out of the discovered schema whenever none of the sampled documents contain it ([MongoDB v2 Troubleshooting](https://docs.airbyte.com/integrations/sources/mongodb-v2/mongodb-v2-troubleshooting#connector-limitations)).
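
To get a feel for how likely a rare field is to be missed, here is a rough back-of-the-envelope sketch. It assumes documents are sampled uniformly and independently, which may not match how the connector actually picks its sample, and `miss_probability` is a hypothetical helper, not part of Airbyte.

```python
def miss_probability(field_fraction: float, sample_size: int) -> float:
    """Chance that a field is absent from every sampled document.

    Assumption: each sampled document independently contains the field
    with probability `field_fraction` (uniform random sampling). The
    connector's real sampling strategy may differ.
    """
    if not 0.0 <= field_fraction <= 1.0:
        raise ValueError("field_fraction must be between 0 and 1")
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    # The field is missed only if all sample_size draws lack it.
    return (1.0 - field_fraction) ** sample_size
```

Under these assumptions, a field present in 1 out of every 100,000 documents is missed about 90% of the time with a 10,000-document sample (`miss_probability(1e-5, 10_000)`), and still about 37% of the time with a 100,000-document sample (`miss_probability(1e-5, 100_000)`).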

How to improve field detection:
- Increase the sample size: In the MongoDB connector settings, you can increase the "Document discovery sample size (Advanced)" to a higher value (up to 100,000). This increases the likelihood that rare fields will be detected, but will also make schema discovery take longer ([source](https://github.com/airbytehq/airbyte/issues/42862)).
- Schema enforcement: If your collection has highly variable documents, Airbyte may still miss some fields. There is currently no built-in way to manually define or edit the schema in the UI. Disabling schema enforcement will sync all data as a blob, but this may not be suitable if you need structured fields in your destination ([MongoDB v2 Replication Methods](https://docs.airbyte.com/integrations/sources/mongodb-v2#replication-methods)).
- Manual schema editing: Some advanced users have edited the catalog in the Airbyte database directly, but this is not officially supported or recommended ([source](https://github.com/airbytehq/airbyte/issues/42862)).
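
Before raising the sample size, it can help to check how rare the missing fields actually are. The sketch below is one possible way to do that outside Airbyte: it counts how often each top-level field appears in a batch of documents. The function names `field_frequencies` and `rare_fields` are hypothetical, and the code works on plain Python dictionaries, so nested fields are not inspected.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Mapping


def field_frequencies(docs: Iterable[Mapping[str, Any]]) -> Dict[str, float]:
    """Return the fraction of documents that contain each top-level field."""
    counts: Counter = Counter()
    total = 0
    for doc in docs:
        total += 1
        counts.update(doc.keys())  # each key counts once per document
    if total == 0:
        return {}
    return {field: n / total for field, n in counts.items()}


def rare_fields(docs: Iterable[Mapping[str, Any]], threshold: float) -> List[str]:
    """List fields that appear in fewer than `threshold` of the documents."""
    freqs = field_frequencies(docs)
    return sorted(field for field, frac in freqs.items() if frac < threshold)
```

Any iterable of dictionaries works as input, for example documents exported from the collection or read through a MongoDB driver. Fields that show up only far below one in every sample-size documents are the ones most likely to be skipped during discovery.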

Summary:
- Increase the sample size in the connector settings to improve field detection.
- If you need all fields reliably, consider restructuring your data so important fields are always present, or advocate for user-defined schema support (which is being discussed but not yet available).

If you have further questions or need help adjusting the sample size, let me know!