Mauricio Pérez
05/16/2025, 4:22 PM
I'm setting up the source-airtable connector using PyAirbyte, but I'm running into an issue. Here are the details:
• Airbyte version: 0.24.2
• Python version: 3.10.17
• Error Message:
ERROR: Error starting the sync. This could be due to an invalid configuration or catalog.
Please contact Support for assistance.
Error: Validation against json schema defined in declarative_component_schema.yaml schema failed
AirbyteConnectorMissingSpecError: Connector did not return a spec.
Please review the log file for more information.
Connector Name: 'source-airtable'
This is the snippet that's throwing the issue:
import airbyte as ab

airtable = ab.get_source(
    "source-airtable",
)
credentials = {
    "credentials": {
        "auth_method": "api_key",
        "api_key": "pat"
    }
}
airtable.set_config(config=credentials)
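Not a fix for the missing-spec error itself (that often points at the connector install rather than the config), but a quick local sanity check of the nested credentials shape can rule out config problems before involving PyAirbyte at all. `check_airtable_config` below is a hypothetical helper, not part of PyAirbyte:

```python
# Hypothetical helper (not part of PyAirbyte): sanity-check the nested
# Airtable credentials shape before handing the dict to set_config().
def check_airtable_config(cfg: dict) -> list:
    problems = []
    creds = cfg.get("credentials")
    if not isinstance(creds, dict):
        return ["top-level 'credentials' object is missing"]
    if creds.get("auth_method") != "api_key":
        problems.append("credentials.auth_method should be 'api_key'")
    if not str(creds.get("api_key", "")).startswith("pat"):
        problems.append("credentials.api_key should be an Airtable personal access token ('pat...')")
    return problems

credentials = {"credentials": {"auth_method": "api_key", "api_key": "patXXXXXXXX"}}
print(check_airtable_config(credentials))  # → []
```

An empty list means the shape matches what the snippet above sends; any strings returned describe what to fix first.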
AJ Steers (Airbyte)
05/16/2025, 7:41 PM
Source objects now support the following methods:
• set_cursor_key()
- Overrides the cursor key for one stream.
• set_cursor_keys()
- Overrides the cursor key for any number of streams.
• set_primary_key()
- Overrides the primary key for one stream.
• set_primary_keys()
- Overrides the primary key for any number of streams.
See the updated API docs for more information.
Thanks to @Krishna for his help testing the new feature, and thanks also to @Mateusz Czarkowski for "upvoting" the issue here in the channel for our prioritization. 🙏
Rad Extrem
05/19/2025, 2:40 AM
get_source or get_destination?
Also, would using a local_executable be a better approach for this use case? If so, are there any established steps or best practices for building such executables for connectors?
Ben Wilen
05/19/2025, 5:14 PM
I'm using a SnowflakeCache and am currently running into this auth error on a 5-hour sync:
sqlalchemy.exc.ProgrammingError: (snowflake.connector.errors.ProgrammingError) 390114 (08001): None: Authentication token has expired. The user must authenticate again.
(Background on this error at: <https://sqlalche.me/e/20/f405>)
Assuming it's because Snowflake has a default timeout of 4 hours, does anyone have a fix for this? I don't see a way via PyAirbyte to specify "client_session_keep_alive": True.
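For reference, client_session_keep_alive is a standard Snowflake connection parameter that keeps the session token refreshed. Whether SnowflakeCache can pass it through is the open question above, but if you build your own SQLAlchemy engine against the same warehouse, the flag can ride on the connection URL. A sketch with placeholder account/database names:

```python
from urllib.parse import quote, urlencode

# Sketch: build a Snowflake SQLAlchemy URL carrying the
# client_session_keep_alive flag, which tells the connector to keep
# refreshing the session token so it doesn't expire at Snowflake's
# 4-hour default.
def snowflake_url(user: str, password: str, account: str,
                  database: str, schema: str, warehouse: str) -> str:
    query = urlencode({
        "warehouse": warehouse,
        "client_session_keep_alive": "true",
    })
    return (f"snowflake://{quote(user)}:{quote(password)}"
            f"@{account}/{database}/{schema}?{query}")

url = snowflake_url("USER", "SECRET", "my-account", "MY_DB", "PUBLIC", "MY_WH")
# → snowflake://USER:SECRET@my-account/MY_DB/PUBLIC?warehouse=MY_WH&client_session_keep_alive=true
```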
Ben Wilen
05/27/2025, 7:57 PM
It looks like we use table_name=table_prefix + stream_name as the table name here. But SqlProcessorBase uses a normalizer in get_sql_table_name(). As a result, if the stream name is not normalized in the state message (which I believe it isn't), the table_name we are actually writing into the state table is not always the same as the actual table name. Is that intended (or am I misunderstanding the code)?
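A toy illustration of the mismatch being described. The normalizer here is an approximation (lowercase, non-alphanumerics to underscores), not PyAirbyte's actual implementation:

```python
import re

# Approximate SQL-name normalizer (NOT PyAirbyte's actual code):
# lowercase and collapse runs of non-alphanumerics to underscores.
def normalize(name: str) -> str:
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")

table_prefix = "raw_"
stream_name = "My-Stream"  # a stream name that isn't SQL-safe

state_table_name = table_prefix + stream_name               # what the state record stores
actual_table_name = normalize(table_prefix + stream_name)   # what get_sql_table_name-style code yields

print(state_table_name)   # raw_My-Stream
print(actual_table_name)  # raw_my_stream
```

Whenever the raw stream name survives into the state record, any lookup that compares it against the normalized physical table name will miss, which is exactly the discrepancy the question raises.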
Jay Stevens
06/05/2025, 6:39 PM
I'm using a MotherDuckCache, but I don't think the way incremental sync is set up will work unless I use a different cache for each account. Does that sound right?
Nick Clarke
06/12/2025, 11:52 PM
I'm making changes to source-appsflyer. I am trying to test my changes as part of the service we have that uses pyairbyte in a Docker container to read from this source and write back to BigQuery as a destination. I've checked out https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-appsflyer and made my relevant changes there. I am now trying to test the changes in my service that uses pyairbyte with the following test code:
import airbyte as ab
import json

CONFIG_PATH = "configs/appsflyer_android.json"
with open(CONFIG_PATH, "r") as f:
    source_config = json.load(f)

## /app/source_appsflyer/ is my local clone of airbytehq/airbyte/, mounting only the relevant appsflyer connector folder in my docker container.
source = ab.get_source("source-appsflyer", config=source_config, local_executable="/app/source_appsflyer/source_appsflyer/run.py")
source.select_streams(["retargeting_geo_report"])
all_streams = source.get_selected_streams()
read_result = source.read()
This fails, because if I run pip install -r requirements.txt from /app/source_appsflyer, I get package collisions for import airbyte between airbyte and pyairbyte.
I then tried poetry install --with dev, which places an executable in /root/.cache/pypoetry/virtualenvs/source-appsflyer-OcVLBknA-py3.10/bin/source-appsflyer, which I then point to with:
source = ab.get_source("source-appsflyer", config=source_config, local_executable="/root/.cache/pypoetry/virtualenvs/source-appsflyer-OcVLBknA-py3.10/bin/source-appsflyer")
But this appears to install the version from PyPI. When I make local changes to the package and do a fresh install, those changes do not appear in the lib code under /root/.cache/pypoetry/virtualenvs/source-appsflyer-OcVLBknA-py3.10
I also tried poetry build from /app/source_appsflyer/ and then pointing:
source = ab.get_source("source-appsflyer", config=source_config, local_executable="/app/source_appsflyer/dist/name_of_the_whl_here")
This fails as well.
What is the proper way to do this? How can I point to a local executable or path that pyairbyte understands?
Andrew Lytle
06/13/2025, 2:48 PM
Slackbot
06/20/2025, 7:14 PM
Slackbot
06/20/2025, 7:15 PM
Slackbot
06/20/2025, 7:17 PM
AJ Steers
06/20/2025, 7:20 PM
Nick Clarke
06/23/2025, 9:56 PM
Idan Moradov
06/25/2025, 9:49 AM
Alioune Amoussou
06/25/2025, 3:38 PM
I was looking into key-pair authentication to Snowflake when I came across this one PR. It seems to handle authentication when the key is in a file, but it does not allow passing the key directly as a string to SnowflakeConfig in an attribute.
I was wondering what you think of this feature, and whether, to implement my functionality, I should start from the branch of the existing PR or from master.
There are several approaches I could take:
- Add a private_key attribute in SnowflakeConfig here
- Add a private_key attribute and a validation function in SnowflakeConfig (ex: password can't be filled if private_key is...) here
- Abstract this logic into a Credential class, which would contain all authentication attributes, handle validation, and generate part of the configuration passed here.
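On the "key as a string" idea: whichever of those approaches is chosen, the underlying mechanics are straightforward, since snowflake-connector-python accepts its private_key connection argument as DER bytes, and the cryptography package (already a dependency of the connector) can convert a PEM string to that form. A sketch, with pem_string_to_der as a hypothetical helper name:

```python
from typing import Optional

from cryptography.hazmat.primitives import serialization

# Hypothetical helper: convert a PEM-encoded private key held in a plain
# string into the DER (PKCS#8) bytes that snowflake-connector-python's
# `private_key` connection argument expects.
def pem_string_to_der(pem: str, passphrase: Optional[str] = None) -> bytes:
    key = serialization.load_pem_private_key(
        pem.encode(),
        password=passphrase.encode() if passphrase else None,
    )
    return key.private_bytes(
        encoding=serialization.Encoding.DER,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    )
```

A private_key (or Credential) attribute on SnowflakeConfig could apply this conversion internally, with validation that password and private_key are mutually exclusive.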
Yohann Jardin
06/27/2025, 1:01 PM
We hit an edge case with _airbyte_state, where the state of a stream is missing from the table. I shared the details on GitHub.
It has minimal impact for us, and we will soon not face it anymore. I'm sharing it here in case other people face this in the future; we're not planning to tackle it. The fix looks easy, but testing against the different caches and their support of transactions or upserts doesn't seem trivial 😕
Ben Wilen
06/27/2025, 5:18 PM
AJ Steers (Airbyte)
06/28/2025, 7:36 PM
Nick Clarke
06/30/2025, 10:11 PM
I'm getting a PyAirbyteNameNormalizationError when I attempt to run a very simple example with the mixpanel connector, which seems like it may be an issue with the source? Please see https://gist.github.com/nickolasclarke/dd858ea3b4464e472f5a02ffbd4ce586 for more details.
Nick Clarke
07/03/2025, 10:12 PM
Yohann Jardin
07/11/2025, 8:09 PM
Yohann Jardin
07/11/2025, 8:28 PM
Mauricio Pérez
07/18/2025, 6:45 PM
I'm using source-pipedrive with Python 3.11 and running into this error:
Failure Reason: Encountered an error while discovering streams. Error: mutable default <class 'airbyte_cdk.sources.declarative.decoders.json_decoder.JsonDecoder'> for field decoder is not allowed: use default_factory
Here's the snippet triggering the error:
import airbyte as ab

pipedrive_config = {
    "api_token": "api_token",
    "replication_start_date": "2017-01-25 00:00:00Z"
}
pipedrive = ab.get_source("source-pipedrive", pip_url="airbyte-source-pipedrive==2.3.7")
pipedrive.set_config(pipedrive_config)
pipedrive.check()
I suspect this might be related to a CDK version incompatibility with Python 3.11. Has anyone found a workaround or a compatible version that resolves this?
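That suspicion matches a known Python 3.11 behavior change: dataclasses now reject any unhashable field default, not just list/dict/set, and a dataclass with eq=True and no __hash__ (which appears to describe the CDK's JsonDecoder) produces unhashable instances. A minimal, self-contained reproduction, where Decoder is a stand-in for the CDK class:

```python
import sys
from dataclasses import dataclass, field

@dataclass
class Decoder:   # stand-in for JsonDecoder; @dataclass with eq=True (the
    pass         # default) sets __hash__ = None, so instances are unhashable

raised = False
try:
    @dataclass
    class Component:
        decoder: Decoder = Decoder()   # rejected at class creation on 3.11+
except ValueError as exc:
    raised = True
    print(exc)  # mutable default <class '...Decoder'> for field decoder is not allowed: use default_factory

print(f"raised on Python {sys.version_info[:2]}: {raised}")

# The library-side fix is a default_factory, which 3.11 accepts:
@dataclass
class FixedComponent:
    decoder: Decoder = field(default_factory=Decoder)
```

So the practical workarounds are either running the connector under Python 3.10 or using a CDK/connector version that already switched such defaults to default_factory.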
AJ Steers (Airbyte)
08/05/2025, 12:33 AM
AJ Steers (Airbyte)
08/05/2025, 12:53 AM
PyAirbyte v0.29 introduces powerful new features, including MCP tools targeted at LLM use cases, the ability to "preview" data from multiple streams simultaneously, faster connector installs by leveraging the powerful uv tool, and the ability to mix-and-match connectors' Python versions within the same environment.
>
> ✨ New Features (PyAirbyte Core)
> • feat: add stream previews for sources via Source.print_stream_previews() and Source.get_stream_previews() (#725)
> • feat: replace pip with uv for connector installations, resulting in dramatically faster connector installation (can be disabled with the AIRBYTE_NO_UV environment variable) (#730)
> • feat: ability to override which Python versions will be used for installing new connectors with the use_python arg for get_source() and get_destination() (#730)
> • feat: support uv-managed Python versions for installing new connectors with the use_python arg, even for Python versions not yet installed on the system (#730)
> • feat: add new Cache.run_sql_query() method to run SQL queries directly against cache objects (#734)
>
> 🤖 New Built-in PyAirbyte MCP Server (🧪 Experimental)
> • feat: add new MCP tools which allow LLMs to call PyAirbyte directly (#734, #738, #736):
> ◦ Connector Management
> ▪︎ list_connectors - List available Airbyte connectors with optional filtering by type (source/destination), install types (python, yaml, java, docker), or keywords
> ▪︎ get_connector_info - Get documentation URL and information for a specific connector
> ▪︎ list_connector_config_secrets - List available config secret names for a given connector
> ▪︎ validate_connector_config - Validate a connector configuration
> ◦ Source Operations
> ▪︎ list_source_streams - List all streams available in a source connector
> ▪︎ get_source_stream_json_schema - Get the JSON schema for a specific stream in a source connector
> ▪︎ get_stream_previews - Get sample records (previews) from streams in a source connector
> ▪︎ read_source_stream_records - Read records from a specific source stream
> ◦ Cache Operations
> ▪︎ describe_default_cache - Describe the currently configured default cache (typically DuckDB)
> ▪︎ list_cached_streams - List all streams available in the default cache
> ▪︎ sync_source_to_cache - Run a sync from a source connector to the default DuckDB cache
> ▪︎ run_sql_query - Run SQL queries against the default cache
Please let us know what you think, here or in the GitHub Discussion - and join the MCP Webinar for more information on the latest with Airbyte AI and MCP.
AJ Steers (Airbyte)
08/05/2025, 1:10 AM
_pip._
⬆️ Scroll up for more detail on each of these big updates.
AJ Steers (Airbyte)
08/05/2025, 4:28 AM
aditya kumar
08/06/2025, 3:01 PM
Nick Clarke
08/08/2025, 1:01 AM
Colin
08/24/2025, 7:40 PM
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 992, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1004, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'source_declarative_manifest'