https://linen.dev logo
Join Slack
Powered by
# help-connector-development
  • y

    Yokesh RS

    06/15/2023, 10:06 AM
    hi all , how can i get the specification of the particular definition id of the source or a destination , in order to create thhem
    k
    j
    • 3
    • 3
  • p

    Patrick Elsen

    06/15/2023, 12:23 PM
    Hey team, We are planning to use Airbyte for our data ingestion pipeline. We are currently evaluating if it is a good fit. To make it useful for our customers, we need to add a "Field33" destination. The plan is to use it to ingest data into our Ontology-powered graph database. Two quick questions: • I'm trying to figure out how best to build our Destination Connector. Since all of our internal tooling is written in Rust, and we like small Docker containers, we would like to write a custom destination in Rust+Docker and create a PR to add it to Airbyte. Is it allowed to use Rust for this? Is there some policy around it? I'm on the hook if I build it and it gets rejected, so this would be useful to know 😛 • I saw that you guys have a connector builder. Am I right in my understanding that this builder is only for custom sources, and not custom destinations? If building a connector is not an option, I could add some routes to our backend to support an existing protocol. Cheers!
    k
    • 2
    • 2
  • p

    Patrick Elsen

    06/15/2023, 12:49 PM
    Am I correct to assume that it is not currently possible to add a private connector, but the only way to hook one up is by a PR to the airbyte repo?
    k
    b
    • 3
    • 4
  • a

    Andy Smith

    06/15/2023, 1:00 PM
    Hi, we are building a custom connector that runs once a day and pulls in data for yesterday from an API. We are sinking to postgres into a single table, so
    Full Refresh - Append
    would seem appropriate. Sometimes, however, data for a given date (e.g. 7 days ago) is updated by the remote server, and we need to re-ingest for that date. which would yield records with duplicate primary keys in the destination table, for that date. The only other sync method that seems close is
    Incremental - Dedupe with History
    , However, this requires a cursor, and because the re-ingested date could be well before the value of the current cursor, it is not clear to me whether the dedupe would work, as it looks like the dedupe just works for records with the last cursor value and above. How can we achieve this behaviour? What we kinda need is the dedupe to work only for the records with the given date (i.e. the newly ingested date).
    k
    • 2
    • 2
  • o

    Octavia Squidington III

    06/15/2023, 1:45 PM
    🔥 Office Hours starts in 15 minutes 🔥 Topic and schedule posted in #C045VK5AF54 octavia loves At 16:00 CEST / 10am EDT click here to join us on Zoom octavia loves
  • t

    Tom Anderson

    06/15/2023, 2:51 PM
    👋 I'm trying to figure out the record selector to get all of the "values" records (in the response below, 2 records). I've tried the following but they are either return nothing, or empty lists: • values • values,* • *,values • ,values, Here is my response example:
    Copy code
    [
      {
        "headers": [
          "trackingId",
          "timeStamp",
          "userId",
          "email",
          "userRole",
          "emailDomain",
          "userGroups",
          "dashboardTitle",
          "dashboardId",
          "dashboardPath",
          "action",
          "loadTime",
          "category",
          "str1",
          "str2",
          "int1",
          "int2"
        ],
        "values": [
          [
            "8abeffd5-af24-4d6f-b929-ac56d7319220",
            "2022-06-28T20:40:17",
            "obfuscated",
            "obfuscated",
            "Sys. Admin",
            "obfuscated",
            "Admins;",
            "N\\A",
            "N\\A",
            "N\\A",
            "page.navigate.analytics",
            "N\\A",
            "General",
            "N\\A",
            "N\\A",
            "N\\A",
            "N\\A"
          ],
          [
            "8abeffd5-af24-4d6f-b929-ac56d7319220",
            "2022-06-28T20:40:17",
            "obfuscated",
            "obfuscated",
            "Sys. Admin",
            "obfuscated",
            "Admins;",
            "N\\A",
            "N\\A",
            "N\\A",
            "page.navigate.analytics",
            "N\\A",
            "General",
            "N\\A",
            "N\\A",
            "N\\A",
            "N\\A"
          ]
        ]
      }
    ]
    CDK Version
    0.44.5
    k
    a
    • 3
    • 4
  • d

    David Anderson

    06/15/2023, 2:55 PM
    if im building a stream in the CDK UI with two sub-streams to build a resource path, how do i denote which substream is which using the
    {{ stream_partition.id }}
    notation logic? i want to end up with something like:
    /path/stream_partition1.id/path/stream_partition2.id
    k
    • 2
    • 2
  • a

    Aazam Thakur

    06/15/2023, 10:21 PM
    For my python connector, I have this class. I want to create a new class
    ListMembers
    which gets the
    list_id
    from the Lists class to use it in the url
    lists/{list_id}/members
    Copy code
    class Lists(IncrementalMailChimpStream):
        cursor_field = "date_created"
        data_field = "lists"
    
        def path(self, **kwargs) -> str:
            return "lists"
    k
    • 2
    • 17
  • h

    Hoang Ho

    06/16/2023, 8:42 AM
    Hello, Our team is currently working on an application, that utilizes Airbyte for seamless data transfer from MongoDB to PostgreSQL. However, we've run into an issue concerning the transfer of DateTime data. During the process, Airbyte automatically converts it into text format, which poses a problem for us. Is there a possible solution to configure the mapping data or resolve this issue? We eagerly await your response and appreciate your assistance. Thank you kindly.
    • 1
    • 1
  • j

    Janis Karimovs

    06/16/2023, 12:14 PM
    Hello everyone, I'm working on a custom source connector for Podio and using BigQuery as destination. I am successfully able to get the data from most of the podio apps that I am testing to Bigquery, but at least one stream is failing with some kind of "pickling" (serialization?) error As far as I can tell the relevant error info from sync logs is here:
    Copy code
    2023-06-16 11:35:10 [42mnormalization[0m > [31mUnhandled error while executing model.airbyte_utils.App_Name_2_0[0m
    Pickling client objects is explicitly not supported.
    Clients have non-trivial state that is local and unpickleable.
    From some of the research that I've done it seems the issue lies within BigQuery connector, but I'm confused about why the other streams (podio apps) which seem more or less the same as the one that's failing are working just fine. Anyone else encounter something similar? Any info on this would be highly appreciated... thanks 🙏
    k
    • 2
    • 2
  • a

    Anthony Smart

    06/16/2023, 12:51 PM
    I also have a question in relation to the Snowflake destination connector. Are there plans to add Azure blob storage as an external staging area? The only alternative is to use internal staging but this will require files to be stored locally before they can be uploaded to the internal stage and copied into a Snowflake table. This will be of course be less performant than using an external stage which can be read directly into Snowflake via COPY INTO. The Web UI is also stating that the internal stage option is recommended for performance and scalability. Clearly this is the less optimal approach compared to an external stage. Any feedback on this would be appreciated. https://docs.airbyte.com/integrations/destinations/snowflake/ Slack Conversation
    k
    • 2
    • 2
  • o

    Octavia Squidington III

    06/16/2023, 7:45 PM
    🔥 Community Office Hours starts in 15 minutes 🔥 At 1pm PDT click here to join us on Zoom!
  • a

    Aazam Thakur

    06/16/2023, 11:32 PM
    Explain me this error
    GetMemberInfo(authenticator=authenticator)\nTypeError: Can't instantiate abstract class GetMemberInfo with abstract method data_field\n", "failure_type": "system_error"}}}
    k
    • 2
    • 2
  • c

    Cody Scott

    06/17/2023, 12:19 AM
    Hi! 👋 Question surrounding incremental sync and substreams. I have an API that I want to incrementally request data from. The parent stream outputs the ID to request, and I would like to incrementally request the data for each child. There is around 1000 known parent ids. Basically can I apply a state to each substream and have it store its state (so 1000 internal states). Other side is do I need to accept that there is going to be duplication and request smaller slices (hourly for example) the dedup early in the transform layers. Output data is ID + date time as the state. If it fails I need it to get the data on the next run, so worst case I duplicate. Ideal case is each stream can run on its own little world happily keeping its own state, but I also don’t want 1000+ tables created if possible… Thanks!
    k
    • 2
    • 3
  • b

    Biondi Septian S

    06/17/2023, 1:46 PM
    Hello Airbyte team, please take a look at this serious issue: https://github.com/airbytehq/airbyte/issues/24097
    k
    • 2
    • 2
  • b

    Biondi Septian S

    06/17/2023, 1:46 PM
    this is a serious timeout issue on typesense destination connector
    k
    • 2
    • 2
  • b

    Biondi Septian S

    06/17/2023, 1:46 PM
    I think this is the solution, by merging this pull request below: https://github.com/airbytehq/airbyte/pull/18806/commits
    k
    • 2
    • 2
  • c

    Chính Bùi Quang

    06/19/2023, 6:41 AM
    Hi AirByte team, I am setting up Builder, there is 1 record returned with the following structure: [{ "item": "deal", "id": 123450, "data": { "id": 123450, "update_time": "2023-06-01 012008", "user_id": 18488038, "person_id": 514665, "org_id": 200989} }, { "item": "deal", "id": 54981, "data": { "id": 54981, "update_time": "2023-06-01 052008", "user_id": 18488038, "person_id": 514665, "org_id": 200989} }] Now I want to get the maximum value of update_time in the returned record and assign it to the Cursor Field in Incremental Sync, what should I do?
    k
    j
    f
    • 4
    • 8
  • c

    Chidambara Ganapathy

    06/19/2023, 9:05 AM
    Hi Airbyte Team, Is AWS cost explorer source connector available out of the box? Thanks
    k
    l
    • 3
    • 3
  • l

    Luke Whittaker

    06/19/2023, 10:37 AM
    I'm building out a connector using the no-code builder. Is it possible to handle UNIX timestamps instead of the stander
    %Y-%m-%d....
    k
    • 2
    • 2
  • l

    laila ribke

    06/19/2023, 11:32 AM
    Hi, does someone has built a source connector for Optimizely and can share the image?
    k
    • 2
    • 2
  • a

    Anthony Smart

    06/19/2023, 12:06 PM
    I also have a question in relation to the Snowflake destination connector. Are there plans to add Azure blob storage as an external staging area? The only alternative is to use internal staging but this will require files to be stored locally before they can be uploaded to the internal stage and copied into a Snowflake table. This will be of course be less performant than using an external stage which can be read directly into Snowflake via COPY INTO. The Web UI is also stating that the internal stage option is recommended for performance and scalability. Clearly this is the less optimal approach compared to an external stage. Any feedback on this would be appreciated. https://docs.airbyte.com/integrations/destinations/snowflake/ Slack Conversation
    k
    • 2
    • 2
  • s

    Slackbot

    06/19/2023, 5:47 PM
    This message was deleted.
    k
    • 2
    • 2
  • o

    Octavia Squidington III

    06/19/2023, 7:45 PM
    🔥 Community Office Hours starts in 15 minutes 🔥 Topic and schedule posted in #C045VK5AF54 octavia loves At 1pm PDT click here to join us on Zoom!
  • t

    Thomas van Latum

    06/19/2023, 7:53 PM
    Is there a Guide to setup a development environment for building a Java connector?
    k
    • 2
    • 2
  • m

    Mahesh Thirunavukarasu

    06/19/2023, 8:08 PM
    Hi, can we declare multiple requester in airbyte low code cdk? If so, will it take a list like structure or we can declare it in different name and refer it ?
    k
    m
    • 3
    • 3
  • m

    Mahesh Thirunavukarasu

    06/19/2023, 9:00 PM
    How to integrate a CDC REST api ? I am trying to implement it in the current Quickbooks connector which is in alpha. Since all the other objects in quickbooks follows Query pattern, I have to declare a separate requester to capture cdc and load the raw tables. Unfortunately, I am always getting 0 records in the loaded cdc table. Also please suggest a way to monitor api requests and responses in Self deployed Airbyte using docker.
    k
    • 2
    • 2
  • m

    Micky

    06/19/2023, 9:22 PM
    Hi, if I drop the replication slot and recreate it and reconfigure Airbyte to use new slot for CDC, does it mean it will have full fresh (initial sync)?
    k
    • 2
    • 2
  • l

    Luis Peña

    06/20/2023, 12:28 AM
    Hello, I'm currently trying my hands on the Low-code CDK. But I'm having some issues understanding two topics: 1. How to implement payloads/body on the request of each stream. Is there any additional information or examples besides the one on: https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/request-options#request-options-1 2.- How to implement nested streams. So far I get that is by using "SubstreamPartitionRouter" but I'm having some issues on how to implement it. Is there any source code I could take a look as an example? I really appreaciate any help on the topic.
    k
    • 2
    • 2
  • q

    Quang Dang Vu Nhat

    06/20/2023, 3:00 AM
    Hello, currently I am developing a custom connector that includes some streams which have fields with multiple datatype I have seen this pattern in some other connectors like in Gitlab
    merge_request_commits
    schema
    Copy code
    "approvals_before_merge": {
          "type": ["null", "boolean", "string", "object"]
    },
    or in Pipedrive
    product_fields
    schema
    Copy code
    "options": {
          "type": ["null", "array"],
          "items": {
            "type": "object",
            "properties": {
              "id": {
                "type": ["null", "boolean", "integer"]
              },
              "label": {
                "type": ["null", "string"]
              }
            }
         }
    }
    I don’t see how they handle multiple datatype in their code, apart from these declaration, Can anyone support me with this 🙇
    k
    • 2
    • 2
1...141516...21Latest