Eric Van Benschoten
07/21/2025, 8:25 PM
kapa.ai
07/21/2025, 8:25 PM
To use a `StateDelegatingStream` declarative stream in Airbyte, you need to define a stream that can switch between a full refresh and an incremental sync mode based on the presence or absence of state. This is useful when you want to optimize data extraction by using incremental syncs when possible, but fall back to full refresh if no state is available. Here’s how you can define a `StateDelegatingStream` in your connector’s YAML configuration:
```yaml
streams:
  - type: StateDelegatingStream
    name: my_state_delegating_stream
    full_refresh_stream:
      # Define your full refresh DeclarativeStream here
      type: DeclarativeStream
      name: my_full_refresh_stream
      retriever:
        # retriever configuration for full refresh
    incremental_stream:
      # Define your incremental DeclarativeStream here
      type: DeclarativeStream
      name: my_incremental_stream
      retriever:
        # retriever configuration for incremental sync
      incremental_sync:
        # incremental sync configuration (e.g., DatetimeBasedCursor)
```
Key properties:
- `name`: The name of your state delegating stream.
- `full_refresh_stream`: A `DeclarativeStream` definition used when no state is present.
- `incremental_stream`: A `DeclarativeStream` definition used when state is present (i.e., for incremental syncs).

Note:
- This component is marked as experimental, so use it with caution.
- You can also use the `$parameters` property to pass parameters to child components if needed.

For more details, see the official documentation for the [StateDelegatingStream component](https://docs.airbyte.com/platform/connector-development/config-based/understanding-the-yaml-file/reference#/definitions/StateDelegatingStream) and its usage in declarative connectors. If you need a more concrete example or have a specific use case, please provide more details!
Eric Van Benschoten
07/21/2025, 8:27 PM
resolve: process "python /airbyte/integration_code/main.py spec" did not complete successfully: exit code: 1
Stdout:
{"type":"TRACE","trace":{"type":"ERROR","emitted_at":1753129267672,"error":{"message":"Error starting the sync. This could be due to an invalid configuration or catalog. Please contact Support for assistance. Error: Validation against json schema defined in declarative_component_schema.yaml schema failed","stack_trace":"Traceback (most recent call last):\n File \"/usr/local/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py\", line 394, in _validate_source\n validate(self._source_config, self._declarative_component_schema)\n File \"/usr/local/lib/python3.11/site-packages/jsonschema/validators.py\", line 1121, in validate\n raise error\njsonschema.exceptions.ValidationError: 'StateDelegatingStream' is not one of ['DeclarativeStream']\n\nFailed validating 'enum' in schema[0]['properties']['type']:\n {'enum': ['DeclarativeStream'], 'type': 'string'}\n\nOn instance['type']:\n 'StateDelegatingStream'\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n File \"/airbyte/integration_code/source_declarative_manifest/run.py\", line 95, in _get_local_yaml_source\n return SourceLocalYaml(config=config, catalog=catalog, state=state)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File \"/airbyte/integration_code/source_declarative_manifest/run.py\", line 72, in __init__\n super().__init__(\n File \"/usr/local/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/yaml_declarative_source.py\", line 34, in __init__\n super().__init__(\n File \"/usr/local/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/concurrent_declarative_source.py\", line 93, in __init__\n super().__init__(\n File \"/usr/local/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py\", line 138, in __init__\n self._validate_source()\n File 
\"/usr/local/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py\", line 396, in _validate_source\n raise ValidationError(\njsonschema.exceptions.ValidationError: Validation against json schema defined in declarative_component_schema.yaml schema failed\n"}}}
Stderr:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py", line 394, in _validate_source
validate(self._source_config, self._declarative_component_schema)
File "/usr/local/lib/python3.11/site-packages/jsonschema/validators.py", line 1121, in validate
raise error
jsonschema.exceptions.ValidationError: 'StateDelegatingStream' is not one of ['DeclarativeStream']
Failed validating 'enum' in schema[0]['properties']['type']:
{'enum': ['DeclarativeStream'], 'type': 'string'}
On instance['type']:
'StateDelegatingStream'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/airbyte/integration_code/main.py", line 4, in <module>
run()
File "/airbyte/integration_code/source_declarative_manifest/run.py", line 236, in run
handle_command(args)
File "/airbyte/integration_code/source_declarative_manifest/run.py", line 87, in handle_command
handle_local_manifest_command(args)
File "/airbyte/integration_code/source_declarative_manifest/run.py", line 118, in handle_local_manifest_command
source = _get_local_yaml_source(args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/airbyte/integration_code/source_declarative_manifest/run.py", line 114, in _get_local_yaml_source
raise error
File "/airbyte/integration_code/source_declarative_manifest/run.py", line 95, in _get_local_yaml_source
return SourceLocalYaml(config=config, catalog=catalog, state=state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/airbyte/integration_code/source_declarative_manifest/run.py", line 72, in __init__
super().__init__(
File "/usr/local/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/yaml_declarative_source.py", line 34, in __init__
super().__init__(
File "/usr/local/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/concurrent_declarative_source.py", line 93, in __init__
super().__init__(
File "/usr/local/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py", line 138, in __init__
self._validate_source()
File "/usr/local/lib/python3.11/site-packages/airbyte_cdk/sources/declarative/manifest_declarative_source.py", line 396, in _validate_source
raise ValidationError(
jsonschema.exceptions.ValidationError: Validation against json schema defined in declarative_component_schema.yaml schema failed
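When reading a failure like the one above, the most specific line is the first `ValidationError` inside the TRACE message printed on stdout. The following is a minimal sketch for pulling that line out, assuming the message is a single JSON line shaped like the one in this log. `first_validation_error` and the abbreviated `sample` are illustrative names, not part of the Airbyte CDK.

```python
import json
from typing import Optional


def first_validation_error(trace_line: str) -> Optional[str]:
    """Return the text after the first 'ValidationError:' in a TRACE message, if any."""
    message = json.loads(trace_line)
    if message.get("type") != "TRACE":
        return None
    # The stack trace is stored as one string with embedded newlines.
    stack_trace = message.get("trace", {}).get("error", {}).get("stack_trace", "")
    for line in stack_trace.splitlines():
        if "ValidationError:" in line:
            return line.split("ValidationError:", 1)[1].strip()
    return None


# Abbreviated stand-in for the TRACE message in the log (most frames omitted).
sample = json.dumps({
    "type": "TRACE",
    "trace": {
        "type": "ERROR",
        "error": {
            "message": "Validation against json schema defined in "
                       "declarative_component_schema.yaml schema failed",
            "stack_trace": (
                "Traceback (most recent call last):\n"
                "jsonschema.exceptions.ValidationError: "
                "'StateDelegatingStream' is not one of ['DeclarativeStream']\n"
            ),
        },
    },
})

print(first_validation_error(sample))
```

The extracted line only says which check failed (an `enum` check on `type`); it does not by itself say why the schema rejected the value.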
kapa.ai
07/21/2025, 8:27 PM
kapa.ai
07/21/2025, 8:27 PM
The error indicates that `'StateDelegatingStream'` is not accepted for the `type` property in your stream definition. The schema expects only `'DeclarativeStream'` as a valid value for `type`:
```
'StateDelegatingStream' is not one of ['DeclarativeStream']

Failed validating 'enum' in schema[0]['properties']['type']:
    {'enum': ['DeclarativeStream'], 'type': 'string'}

On instance['type']:
    'StateDelegatingStream'
```
According to the Airbyte documentation, the `StateDelegatingStream` is a valid component, but it is not meant to be used as the top-level stream type in your YAML. Instead, your top-level stream should always have `type: DeclarativeStream`, and you can use `StateDelegatingStream` as a component within the stream definition, such as in the `retriever` or other nested properties. The schema for streams only allows `DeclarativeStream` as the type at the top level, not `StateDelegatingStream` [Understanding the YAML file](https://docs.airbyte.com/platform/connector-development/config-based/understanding-the-yaml-file/yaml-overview).

How to fix:
- Change your stream definition so that the top-level `type` is `DeclarativeStream`.
- Use `StateDelegatingStream` as a component within the stream, for example as the `retriever`.

Example:
```yaml
streams:
  - type: DeclarativeStream
    name: my_stream
    retriever:
      type: StateDelegatingStream
      name: my_state_delegating_stream
      full_refresh_stream:
        # your full refresh DeclarativeStream config
      incremental_stream:
        # your incremental DeclarativeStream config
```
This structure matches the schema and should resolve your validation error. If you need more details on the schema, you can review the [declarative_component_schema.yaml](https://github.com/airbytehq/airbyte-python-cdk/blob/main/airbyte_cdk/sources/declarative/declarative_component_schema.yaml) as referenced in the documentation [Component Schema Reference](https://docs.airbyte.com/platform/connector-development/config-based/advanced-topics/component-schema-reference). If you continue to have issues, please share your YAML snippet for more targeted help.
Eric Van Benschoten
07/22/2025, 6:38 PM
kapa.ai
07/22/2025, 6:38 PM
kapa.ai
07/22/2025, 6:38 PM
Set the `cursor_field` property to the path of your chosen field.

Note: The field must be present in every record, and for best results, it should be unique or at least not have many duplicate values. If you need a YAML example or more details for a specific connector, let me know!
Eric Van Benschoten
07/22/2025, 6:39 PM
kapa.ai
07/22/2025, 6:39 PM
kapa.ai
07/22/2025, 6:39 PM
For a non-datetime cursor, you do not use the `DatetimeBasedCursor` component. Instead, you configure the `stream_cursor_field` property in your stream definition to point to your chosen field (such as an integer or string), and Airbyte will use that field for incremental syncs. Here’s a reference snippet from the YAML schema for streams:
```yaml
DeclarativeStream:
  type: DeclarativeStream
  name: my_stream
  retriever:
    # retriever configuration here
  stream_cursor_field: my_id_field  # This can be a string or an array of strings for nested fields
```
- `stream_cursor_field`: The field of the records being read that will be used during checkpointing. This can be a string (for a top-level field) or an array of strings (for nested fields).

Note:
- For non-datetime cursors, you do not specify an `incremental_sync` block with `DatetimeBasedCursor`.
- The incremental sync logic will use the value of `stream_cursor_field` to track progress.

You can find this in the YAML reference documentation for the declarative low-code framework [Understanding the YAML file](https://docs.airbyte.com/platform/connector-development/config-based/understanding-the-yaml-file/yaml-overview):
> The stream object is represented in the YAML file as:
>
> ```yaml
> DeclarativeStream:
>   ...
>   stream_cursor_field:
>     definition: The field of the records being read that will be used during checkpointing
>     anyOf:
>       - type: string
>       - type: array
>         items:
>           - type: string
> ```

If you need to see a full example or have a specific use case, let me know!
Eric Van Benschoten
07/22/2025, 6:44 PM
kapa.ai
07/22/2025, 6:44 PM
kapa.ai
07/22/2025, 6:44 PM
Some connectors, such as BigQuery, let you use a non-datetime column, such as `INT64` (integer) or `STRING`, as the cursor field for incremental syncs. The connector tracks the maximum cursor value after each sync and resumes from there in subsequent syncs. The only requirements are that the cursor field must be monotonically increasing and non-null for all records [BigQuery Supported Sync Modes](https://docs.airbyte.com/integrations/sources/bigquery#supported-sync-modes).

However, in the context of Airbyte's declarative YAML low-code framework, the official documentation and YAML reference primarily describe incremental syncs using the `DatetimeBasedCursor` component, which is specifically for datetime fields [Understanding the YAML file](https://docs.airbyte.com/platform/connector-development/config-based/understanding-the-yaml-file/yaml-overview). There is no explicit documentation in the provided sources about configuring incremental syncs with non-datetime cursor fields (like integers or strings) using the declarative YAML approach.

Summary:
- Yes, incremental syncs with non-datetime cursors are supported in some connectors (like BigQuery) via the UI and connector code.
- The declarative YAML framework's documented incremental sync support is focused on datetime-based cursors. There is no explicit YAML reference for non-datetime incremental cursors in the provided documentation.

If you need to implement incremental syncs with a non-datetime cursor in a declarative YAML connector, the knowledge sources do not provide enough information to confirm this is currently supported.
Eric Van Benschoten
07/22/2025, 10:10 PM
kapa.ai
07/22/2025, 10:10 PM
kapa.ai
07/22/2025, 10:10 PM
The available documentation describes `DatetimeBasedCursor` for datetime-based incremental syncs, but does not mention a `DeclarativeCursor` or provide details about its usage or when to use it. If you meant another cursor type (such as `DatetimeBasedCursor` or using a cursor field for incremental syncs), please clarify, and I can provide information based on the available documentation.
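Two ideas from this thread can be sketched in plain Python: choosing between a full-refresh and an incremental definition depending on whether state exists, and tracking the maximum value of a non-datetime cursor so the next sync resumes from it. This is an illustration under stated assumptions, not Airbyte's implementation. `pick_stream`, `sync`, the `id` cursor field, and the in-memory `table` are all hypothetical.

```python
from typing import Any, Dict, List, Mapping, Optional, Tuple

Record = Dict[str, Any]


def pick_stream(state: Optional[Mapping[str, Any]]) -> str:
    """Name the definition that would run: incremental only when state exists."""
    return "incremental_stream" if state else "full_refresh_stream"


def sync(
    records: List[Record],
    state: Optional[Mapping[str, Any]],
    cursor_field: str = "id",
) -> Tuple[List[Record], Dict[str, Any]]:
    """Emit records past the saved cursor value and return the updated state.

    Assumes the cursor is non-null and monotonically increasing; a None
    cursor value would make the comparison below fail.
    """
    last_seen = state.get(cursor_field) if state else None
    emitted = [r for r in records if last_seen is None or r[cursor_field] > last_seen]
    # Keep the previous cursor value when nothing new was emitted.
    new_max = max((r[cursor_field] for r in emitted), default=last_seen)
    new_state = {} if new_max is None else {cursor_field: new_max}
    return emitted, new_state


table = [{"id": 1}, {"id": 2}, {"id": 3}]

print(pick_stream(None))                   # no state yet
first_batch, state = sync(table, None)     # reads everything

table.append({"id": 4})
print(pick_stream(state))                  # state exists now
second_batch, state = sync(table, state)   # reads only the new record
print(second_batch, state)
```

The same comparison works for string cursors as long as their ordering matches the order in which records are created.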