Alex Johnson
07/26/2025, 12:55 AM

Alexei Kozhushkov
07/29/2025, 9:41 AM
DpathExtractor yaml config:
• Given the following array as input
[{
"name": {
"common": "Comoros",
"official": "Union of the Comoros",
"nativeName": {
"ara": {
"official": "الاتحاد القمري",
"common": "القمر"
},
"fra": {
"official": "Union des Comores",
"common": "Comores"
},
"zdj": {
"official": "Udzima wa Komori",
"common": "Komori"
}
}
},
"cca2": "KM",
"cca3": "COM"
}]
• how would one extract
[{
"name": { "common": "Comoros" },
"common_name": "Comoros",
"cca2": "KM",
"cca3": "COM"
}]
Please advise 🙏
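A minimal Python sketch of the mapping being asked for, using plain dict access on the sample record; the extract_country helper and its hard-coded paths are illustrative only, and in a declarative manifest the same reshaping would be expressed with record transformations rather than Python:

def extract_country(record: dict) -> dict:
    # Keep only name.common, promote it to a top-level common_name, and copy the codes.
    common = record["name"]["common"]  # dpath-style pointer: name/common
    return {
        "name": {"common": common},
        "common_name": common,
        "cca2": record["cca2"],
        "cca3": record["cca3"],
    }

sample = [{
    "name": {"common": "Comoros", "official": "Union of the Comoros"},
    "cca2": "KM",
    "cca3": "COM",
}]
print([extract_country(r) for r in sample])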
Olivia Natasha
07/29/2025, 3:51 PM

Mohith
07/31/2025, 9:46 AM
Columns with the ARRAY data type in our PostgreSQL source are being read as strings when synced to our Snowflake destination using the PostgreSQL connector.
Is this expected behavior? If not, could anyone familiar with the PostgreSQL connector share how they’ve handled array-type columns in similar setups?
Any insights or suggestions would be greatly appreciated.
Thanks in advance!

Prajjval Mishra
07/31/2025, 6:11 PM

Alexei Kozhushkov
08/01/2025, 7:21 AM

Morgan Kerle
08/05/2025, 12:27 AM

Alexei Kozhushkov
08/05/2025, 9:17 AM
• https://allegro.pl/auth/oauth/authorize?response_type=code&client_id=a21...6be&redirect_uri=http://exemplary.redirect.uri
• Token
curl -X POST \
  https://allegro.pl/auth/oauth/token \
  -H 'Authorization: Basic base64(clientId:secret)' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  -d 'grant_type=authorization_code&code=pOPEy9Tq94aEss540azzC7xL6nCJDWto&redirect_uri=http://exemplary.redirect.uri'
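For reference, the same token exchange as a small Python sketch; requests builds the Basic base64(clientId:secret) header from the auth tuple, and the client id, secret, and code below are placeholders taken from the curl example:

import requests

resp = requests.post(
    "https://allegro.pl/auth/oauth/token",
    auth=("clientId", "secret"),  # becomes Authorization: Basic base64(clientId:secret)
    data={
        "grant_type": "authorization_code",
        "code": "pOPEy9Tq94aEss540azzC7xL6nCJDWto",
        "redirect_uri": "http://exemplary.redirect.uri",
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # access_token, refresh_token, expires_in, ...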
Thank you!

Grivine Ochieng'
08/05/2025, 2:54 PM
{
"url": "<https://backstage.taboola.com/backstage/api/1.0/sinoinc-nal-plaudus-sc/reports/campaign-summary/dimensions/day?start_date=2025-01-01&end_date=2025-08-05>",
"headers": {
"User-Agent": "python-requests/2.32.4",
"Accept-Encoding": "gzip, deflate",
"Accept": "*/*",
"Connection": "keep-alive",
"Authorization": "Bearer ****"
},
"http_method": "GET",
"body": ""
}
Response payload:
{"status": 200,
"body": {
"last-used-rawdata-update-time": "2025-08-05 05:00:00.0",
"last-used-rawdata-update-time-gmt-millisec": 1754395200000,
"timezone": "PDT",
"results": [
{
"date": "2025-08-05 00:00:00.0",
"date_end_period": "2025-08-05 00:00:00.0",
"clicks": 628,
"impressions": 271757,
"visible_impressions": 147123,
"spent": 105.26,
"conversions_value": 0,
"roas": 0,
"roas_clicks": 0,
"roas_views": 0,
"ctr": 0.2310888036002752,
"vctr": 0.4268537210361398,
"cpm": 0.39,
"vcpm": 0.72,
"cpc": 0.168,
"campaigns_num": 20,
"cpa": 6.192,
"cpa_clicks": 6.579,
"cpa_views": 105.264,
"cpa_actions_num": 17,
"cpa_actions_num_from_clicks": 16,
"cpa_actions_num_from_views": 1,
"cpa_conversion_rate": 2.7070063694267517,
"cpa_conversion_rate_clicks": 2.5477707006369426,
"cpa_conversion_rate_views": 0.1592356687898089,
"currency": "USD"
},
Carmela Beiro
08/05/2025, 3:54 PM
source-declarative-manifest as the base and copying the manifest.yaml generated with the Custom Builder UI? I can't find documentation about it.
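Not sure about the documented image workflow either, but for reference, a rough sketch of the Python-packaging alternative: wrapping the Builder-exported manifest.yaml in a YamlDeclarativeSource and running it through the CDK entrypoint. File names and layout here are assumptions, not the behaviour of the source-declarative-manifest image itself:

import sys

from airbyte_cdk.entrypoint import launch
from airbyte_cdk.sources.declarative.yaml_declarative_source import YamlDeclarativeSource


def run() -> None:
    # manifest.yaml is the file exported from the Connector Builder UI
    source = YamlDeclarativeSource(path_to_yaml="manifest.yaml")
    launch(source, sys.argv[1:])


if __name__ == "__main__":
    run()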
Mateo Colina
08/05/2025, 6:27 PM

Sebastian Miranda
08/07/2025, 8:25 PM

Patrick McCoy
08/11/2025, 4:12 PM

Ofek Eliahu
08/12/2025, 2:46 PM
{
"access_token": "c97d1fe52119f38c7f67f0a14db68d60caa35ddc86fd12401718b649dcfa9c68",
"token_type": "bearer",
"expires_in": 7200,
"refresh_token": "803c1fd487fec35562c205dac93e9d8e08f9d3652a24079d704df3039df1158f",
"created_at": 1628711391
}
To re-authenticate, I need to use the new refresh token from the response. I’m using the SingleUseRefreshTokenOauth2Authenticator to handle this, which saves the new refresh_token, access_token, and expire_time in memory for the next authentication.
The issue is that while these config values are correctly saved and used in memory, they are not being persisted in storage for future runs. When I create a new source, a validation check is performed, which passes and creates the source. However, after this check, the OAuth refresh token becomes invalid, and the new one isn’t saved to storage. As a result, I can’t create a new connection based on this source since the refresh token isn’t being updated in storage.
Has anyone faced this issue before and figured out how to solve it?
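For context, a rough sketch of how this authenticator is typically wired up; the endpoint URL is a placeholder and the exact constructor arguments and default config paths vary by CDK version. As far as I understand, the refreshed tokens are written back into the in-memory config and emitted as a connector-config control message, so whatever runs the connector (including a one-off check) has to capture and persist that message for the new refresh token to survive beyond memory:

from airbyte_cdk.sources.streams.http.requests_native_auth import (
    SingleUseRefreshTokenOauth2Authenticator,
)


def build_authenticator(config: dict) -> SingleUseRefreshTokenOauth2Authenticator:
    # Sketch only: by default the authenticator reads and updates the refresh token,
    # access token, and token expiry under the credentials block of `config`
    # (the paths are configurable), and emits the updated config as a control
    # message so the platform can persist it for the next run.
    return SingleUseRefreshTokenOauth2Authenticator(
        config,
        token_refresh_endpoint="https://api.example.com/oauth/token",  # placeholder URL
    )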
Will Skelton
08/12/2025, 3:30 PM

Colin
08/16/2025, 1:15 PM

Aidan Lister
08/17/2025, 9:02 PM
docker run --rm \
-v $(pwd)/connector.yaml:/airbyte/connector.yaml \
-v $(pwd)/config.json:/config.json \
airbyte/source-yaml-connector:latest \
check --config /config.json
Unable to find image 'airbyte/source-yaml-connector:latest' locally
docker: Error response from daemon: pull access denied for airbyte/source-yaml-connector, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
Anthony
08/18/2025, 10:16 AM

Rafal Fronczyk
08/18/2025, 4:28 PM
AirbyteStateMessage that the destination emits. With a regular RabbitMQ queue (where messages are removed on ack), that makes it hard to ack only after the destination has safely persisted the record. If the source acks early and something fails downstream, messages could be lost.
• RabbitMQ Streams might be a better fit because they don’t delete messages on read. The source could emit state with an incremental offset (see the sketch below), and the destination could emit state once writes are durable. On recoveries, the source would resume from the last committed offset without relying on RabbitMQ acks.
◦ Open question: should the destination validate monotonic offsets / detect gaps (“holes”) to guard against missed messages, or does Airbyte’s internal delivery/ordering guarantee make that unnecessary in practice?
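If it helps, a rough sketch of what offset-based per-stream state could look like from the source side; this assumes the Python CDK protocol models (exact names vary by CDK version), and the stream name and offset key are made up:

from airbyte_cdk.models import (
    AirbyteMessage,
    AirbyteStateBlob,
    AirbyteStateMessage,
    AirbyteStateType,
    AirbyteStreamState,
    StreamDescriptor,
    Type,
)


def offset_state_message(stream_name: str, offset: int) -> AirbyteMessage:
    # STATE message recording the last RabbitMQ Streams offset that was read;
    # the platform checkpoints it and hands it back to the source on the next run.
    return AirbyteMessage(
        type=Type.STATE,
        state=AirbyteStateMessage(
            type=AirbyteStateType.STREAM,
            stream=AirbyteStreamState(
                stream_descriptor=StreamDescriptor(name=stream_name),
                stream_state=AirbyteStateBlob(offset=offset),
            ),
        ),
    )

# e.g. yield offset_state_message("rabbitmq_stream", 12345) after each durably written batch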
Replay / backfill:
Is it supported (and recommended) to “replay” from a specific timestamp/offset by editing the connection state in the UI/API? My understanding is that editing state is possible; is this the right mechanism for time/offset-based reprocessing with Streams?
Does this line of thinking make sense? In your experience, is Airbyte a good fit for this use case, and are there any challenges you’d foresee with this approach?
Jacob Dunning
08/18/2025, 11:25 PM

Carolina Buckler
08/19/2025, 7:32 PM

Carmela Beiro
08/20/2025, 2:41 PM
lookback_window of 32243 into the state that was not specified. Can this be changed?

Dennis Zwier
08/21/2025, 6:53 AM
Internal Server Error: java.net.SocketTimeoutException: timeout
What’s confusing is that sometimes the exact same flow works (even 10+ times in a row), and then suddenly the error pops up again during the polling phase. It looks like Airbyte is not even making the polling request when this happens, as if it times out internally before sending it.
We already tried increasing the polling timeout significantly (up to 2000 minutes), but the error still appears, especially on overnight runs.
Has anyone experienced similar issues with the Amazon Ads connector (or other async-reporting APIs) where Airbyte gives a SocketTimeoutException even though the API itself is responsive? Could this be related to Airbyte’s internal HTTP client or how long it keeps connections alive?
Any insights, workarounds, or configuration tips would be really appreciated.
Thanks!

yingting
08/22/2025, 4:35 PM
io.airbyte.cdk.integrations.source.relationaldb.state.FailedRecordIteratorException: java.lang.RuntimeException: java.lang.RuntimeException: org.postgresql.util.PSQLException: ERROR: feature not supported on beam relations
This does not happen when I am extracting from a normal table.
If I were to develop a fork of the Postgres connector that works with this beam engine, can anyone point me to where to start?
Thomas Niederberger
08/23/2025, 11:42 PM

Chinthana Jayasekara
08/26/2025, 10:21 PM

lenin
08/27/2025, 3:56 AM
Source: Custom Report API (custom connector)
Sync Requirements:
First sync: Full Refresh | Overwrite (pull all historical data)
Subsequent syncs: Append only (pull only previous fiscal year data)
Using Basic Normalization
Tracking sync state in external database (sync_completed_date)
Sample Code
from datetime import date  # needed for date.today()

class Report(ReportStream):
    report_name = "Report"
    primary_key = ""

    def path(self, **kwargs) -> str:
        if not self.sync_completed_date:
            # First sync - get all historical data
            start_year = date.today().year - 30
            start_date = f"{start_year}-01-01"
            end_date = date.today().isoformat()  # assumed: the historical pull runs up to today
        else:
            # Subsequent syncs - get only previous fiscal year data
            start_date, end_date = self.get_fiscal_year_dates(self.fiscal_month)
        return (
            f"/reports/{self.report_name}&"
            f"start_date={start_date}&"
            f"end_date={end_date}&"
            f"columns={self.output_columns}"
        )
Issue
When I set the stream to "Full Refresh | Overwrite" in the UI:
First sync works correctly (pulls all historical data)
Second sync:
Connector correctly pulls only previous fiscal year data
But normalization still performs overwrite, clearing all historical data
Results in losing data from before the previous fiscal year
Question
How can I control normalization behavior to append data after the first sync, even when the UI is set to "Full Refresh | Overwrite"?
Is there a recommended pattern for implementing this kind of mixed sync behaviour (full refresh first, then always append)?
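One pattern that might fit, sketched below under assumptions: ReportStream, get_fiscal_year_dates, fiscal_month, and output_columns come from the snippet above, and the cursor handling is deliberately simplified. The idea is to declare the stream incremental so the connection can run as Incremental | Append; with empty state the first sync pulls the full history, later syncs pull only the previous fiscal year, and normalization appends instead of overwriting:

from datetime import date
from typing import Any, Mapping


class Report(ReportStream):
    report_name = "Report"
    primary_key = ""
    cursor_field = "sync_completed_date"  # assumed cursor name exposed to the platform

    def path(self, stream_state: Mapping[str, Any] = None, **kwargs) -> str:
        state = stream_state or {}
        if not state.get(self.cursor_field):
            # No saved state yet: behave like the historical full pull
            start_date = f"{date.today().year - 30}-01-01"
            end_date = date.today().isoformat()
        else:
            # Saved state exists: only the previous fiscal year
            start_date, end_date = self.get_fiscal_year_dates(self.fiscal_month)
        return (
            f"/reports/{self.report_name}&"
            f"start_date={start_date}&"
            f"end_date={end_date}&"
            f"columns={self.output_columns}"
        )

    def get_updated_state(self, current_stream_state, latest_record) -> Mapping[str, Any]:
        # Mark that a sync has completed; the value is just a marker date here.
        return {self.cursor_field: date.today().isoformat()}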
Aymen NEGUEZ
09/01/2025, 10:02 AM

Ander Aranburu
09/01/2025, 3:12 PM

sumit raina
09/02/2025, 8:28 AM
import airbyte as ab
from airbyte.cloud import CloudWorkspace
from airbyte.cloud.connectors import CloudSource

# Local source definition (HubSpot, pinned connector version)
source: ab.Source = ab.get_source(
    "source-hubspot", version="5.8.20", docker_image=True, install_if_missing=True
)
source.set_config(
    config={
        "credentials": {
            "credentials_title": "Private App Credentials",
            "access_token": "[token]",
        }
    }
)
source.select_streams(streams=["contacts"])
source.read()
# Optionally, set a custom cursor field
source.set_cursor_key("contacts", "updatedAt")
# read_result = source.read()

# Local destination definition (S3, JSONL output)
destination: ab.Destination = ab.get_destination("destination-s3")
destination.set_config(
    config={
        "s3_bucket_region": "us-east-1",
        "format": {"format_type": "JSONL", "flattening": "No flattening"},
        "access_key_id": "ACCESS KEY",
        "secret_access_key": "SECRET KEY",
        "s3_bucket_name": "sumit-digi",
        "s3_bucket_path": "data_sync/hubspot_contacts/data",
        "file_name_pattern": "{date:yyyy_MM}_{timestamp}_{part_number}.{sync_id}",
    }
)

# Deploy everything to an Airbyte Cloud workspace and run the sync
workspace = CloudWorkspace(
    client_secret="secret",
    client_id="client id",
    workspace_id="workspace id",
    api_root="https://api.airbyte.com/v1",
)
cloudsource: CloudSource = workspace.deploy_source(source=source, name="sumit-final")
clouddest = workspace.deploy_destination(destination=destination, name="raina-final")
conn = workspace.deploy_connection(
    connection_name="sumit-raina-vastika",
    source=cloudsource,
    selected_streams=["contacts"],
    destination=clouddest,
)
conn.run_sync()
print("connection created")
Here we can select streams and set them on the source:
source.select_streams(streams=["contacts"])