# ask-community-for-troubleshooting

    Thomas Pedot

    01/23/2023, 1:07 PM
    Hello, I am testing a K8s deployment with k3s. I am facing this issue with the worker:
    ```
    Message: Failed to inject value for parameter [localDockerMount] of method [checkDockerProcessFactory] of class: io.airbyte.workers.process.ProcessFactory
    Message: Error resolving property value [${airbyte.local.docker-mount}]. Property doesn't exist
    ```
    I feel it is related to my k3s setup, but I don't see where localDockerMount can be set. For context: I am trying to create a custom connector from a GitLab private registry, and I have set up my secret to connect to it. I don't know yet if that part is correct (and I'm not sure it is related to this error either).
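    (Not an answer from the thread — a hedged pointer: in the Docker Compose distribution the airbyte.local.docker-mount property is populated from environment variables such as LOCAL_DOCKER_MOUNT / LOCAL_ROOT in .env, so on k3s the worker pod usually needs the equivalent variables set. The variable names, the extraEnv key, and the path below are assumptions to verify against your Airbyte version and chart.)
    ```yaml
    # Hypothetical worker override for a k3s/Helm deployment -- names are assumptions:
    # LOCAL_ROOT / LOCAL_DOCKER_MOUNT mirror the docker-compose .env defaults.
    worker:
      extraEnv:
        - name: LOCAL_ROOT
          value: /tmp/airbyte_local
        - name: LOCAL_DOCKER_MOUNT
          value: /tmp/airbyte_local
    ```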

    Rishabh Jain

    01/23/2023, 2:08 PM
    I am trying to set up a replication slot using the wal2json plugin. I have created the slot and publication in Postgres and provided the values to Airbyte, but when I test the connection I get an error saying "Expected exactly one replication slot but found 0". I do have a replication slot created in Postgres, so I'm not sure why Airbyte is unable to find it. The pgoutput plugin works perfectly fine with Airbyte. Below is a screenshot for wal2json.

    Thomas Pedot

    01/23/2023, 2:11 PM
    Set up an incremental sync and inject the date as a request parameter: I am confused about where to implement a date-based incremental sync with parameters. This is the OpenAPI endpoint: https://axonaut.com/api/v2/doc (/api/v2/invoices). I was able to get a full sync with:
    ```yaml
    version: "0.1.0"
    
    definitions:
      selector:
        extractor:
          field_pointer: []
      requester:
        url_base: "https://axonaut.com/api/v2"
        http_method: "GET"
        authenticator:
          type: ApiKeyAuthenticator
          header: "userApiKey"
          api_token: "{{ config['userApiKey'] }}"    
        request_options_provider:
          request_parameters:
            date_after: "{{ config['date_after'] }}"      
      stream_slicer:
        type: "DatetimeStreamSlicer"
        start_datetime:
          datetime: "{{ config['date_after'] }}"
          datetime_format: "%d/%m/%Y"
        end_datetime:
          datetime: "{{ now_utc() }}"
          datetime_format: "%Y-%m-%d %H:%M:%S.%f+00:00"
        step: "P1D"
        datetime_format: "%d/%m/%Y"
        cursor_field: "{{ options['stream_cursor_field'] }}"   
        cursor_granularity: "P1D"
      retriever:
        record_selector:
          $ref: "*ref(definitions.selector)"
        paginator:
          type: NoPagination
        requester:
          $ref: "*ref(definitions.requester)"
        stream_slicer:
          $ref: "*ref(definitions.stream_slicer)"      
      base_stream:
        retriever:
          $ref: "*ref(definitions.retriever)"
      invoices_stream:
        $ref: "*ref(definitions.base_stream)"
        $options:
          name: "invoices"
          primary_key: "id"
          path: "/invoices"
          request: ""
          stream_cursor_field: "date"
    
    streams:
      - "*ref(definitions.invoices_stream)"
    
    check:
      stream_names:
        - "invoices"
    
    spec:
      documentation_url: https://docs.airbyte.com/integrations/sources/axonaut
      connection_specification:
        title: Axonaut Spec
        type: object
        required:
          - userApiKey
        additionalProperties: true
        properties:
          # 'TODO: This schema defines the configuration required for the source. This usually involves metadata such as database and/or authentication information.':
          userApiKey:
            type: string
            description: API access key used to retrieve data from the Exchange Rates API.
            airbyte_secret: true
          date_after:
            type: string
            description: Start getting data from that date.
            pattern: ^(0[1-9]|[12][0-9]|3[01])[- /.](0[1-9]|1[012])[- /.](19|20)\d\d$
            examples:
              - "%d/%m/%Y"
    ```
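    (Not from the thread — one possible direction for the incremental question above: in the low-code CDK the stream slicer can inject each slice's start date into the request itself, instead of reading date_after from config in request_options_provider. A minimal sketch is below; the start_time_option field and its exact semantics are assumptions to check against the low-code CDK / connector builder docs for the CDK version in use.)
    ```yaml
    # Hedged sketch: let the DatetimeStreamSlicer supply date_after for every slice.
    # start_time_option / inject_into / field_name are assumptions to verify.
    stream_slicer:
      type: "DatetimeStreamSlicer"
      cursor_field: "date"
      datetime_format: "%d/%m/%Y"
      cursor_granularity: "P1D"
      step: "P1D"
      start_datetime:
        datetime: "{{ config['date_after'] }}"
        datetime_format: "%d/%m/%Y"
      end_datetime:
        datetime: "{{ now_utc() }}"
        datetime_format: "%Y-%m-%d %H:%M:%S.%f+00:00"
      start_time_option:
        inject_into: "request_parameter"
        field_name: "date_after"
    ```
    If something like this works, the static date_after entry under request_options_provider.request_parameters would normally be dropped so the two don't conflict.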

    Roy Ben Dov

    01/23/2023, 3:12 PM
    Hey! Doing a PoC. I deployed a local Airbyte and tested the connection with Postgres as a source through SSH, and it works. When I try to establish a replication connection (source: Postgres -> destination: local CSV folder) I get:
    The connection tests failed.
    Could not connect with provided configuration. Error: Expected exactly one replication slot but found 0. Please read the docs and add a replication slot to your database.
    I followed the documentation and a replication slot exists in the DB.

    Alexandre Keo

    01/23/2023, 3:48 PM
    Hi everyone, does anyone know if the Octavia CLI is, or will be, available on Airbyte Cloud? If not, is there another way to configure Airbyte Cloud sources and destinations as code (so they can be versioned and stored in a GitHub repository)? I'm currently assessing cloud ingestion tools for my data stack and Airbyte caught my eye, but it seems the cloud version doesn't have this feature (which is really important for me).

    Omer Kolodny

    01/23/2023, 4:18 PM
    Hi guys, I'm running a PoC on my local machine, reading data from Chargebee as a source into Databricks as a destination. I ran 1 stream and it worked fine, then I ran the other 11 available Chargebee streams and that worked too. But now, whenever I run a sync it keeps failing with this error:
    ```
    2023-01-23 12:39:23 INFO i.a.w.g.DefaultReplicationWorker(getReplicationOutput):482 - failures: [ {
      "failureOrigin" : "destination",
      "internalMessage" : "io.airbyte.workers.general.DefaultReplicationWorker$DestinationException: Destination process message delivery failed",
      "externalMessage" : "Something went wrong within the destination connector",
    ```
    Any idea what this error is?

    Peter Kong

    01/23/2023, 4:56 PM
    Bug report. Environment: Airbyte hosted, standard container, Postgres <> BigQuery connection. Steps to reproduce:
    1. Starting state: "Normalized tabular data" is toggled; tables are fully synced.
    2. I edit the Replication settings and remove a stream from my connection.
    3. A few minutes later, observe that "Raw data (JSON)" is toggled in the Transformation tab.
    This is concerning because I don't know the state of my data (whether it's normalized or not), or why the transformation setting has reverted to "Raw data (JSON)". Thank you for your help.

    Hassan Shahid

    01/23/2023, 5:28 PM
    Hi - I am trying to customize the destination-redshift connector with a trivial change (adding TRUNCATECOLUMNS to the list of Redshift COPY options). I've got the custom image pushed up to our private registry and added as a connector to our Airbyte instance. When I use it, it fails with this error:
    ```
    Caused by: java.lang.IllegalStateException: Requested normalization for xxx.dkr.ecr.xxx.amazonaws.com/xxx/destination-redshift:0.1.0, but it is not included in the normalization mappings.
    	at io.airbyte.workers.normalization.NormalizationRunnerFactory.getNormalizationInfoForConnector(NormalizationRunnerFactory.java:57) ~[io.airbyte-airbyte-workers-0.39.28-alpha.jar:?]
    	at io.airbyte.workers.normalization.NormalizationRunnerFactory.create(NormalizationRunnerFactory.java:43) ~[io.airbyte-airbyte-workers-0.39.28-alpha.jar:?]
    	at io.airbyte.workers.temporal.sync.NormalizationActivityImpl.lambda$getLegacyWorkerFactory$5(NormalizationActivityImpl.java:115) ~[io.airbyte-airbyte-workers-0.39.28-alpha.jar:?]
    	at io.airbyte.workers.temporal.TemporalAttemptExecution.get(TemporalAttemptExecution.java:118) ~[io.airbyte-airbyte-workers-0.39.28-alpha.jar:?]
    	at io.airbyte.workers.temporal.sync.NormalizationActivityImpl.lambda$normalize$3(NormalizationActivityImpl.java:103) ~[io.airbyte-airbyte-workers-0.39.28-alpha.jar:?]
    	at io.airbyte.workers.temporal.TemporalUtils.withBackgroundHeartbeat(TemporalUtils.java:284) ~[io.airbyte-airbyte-workers-0.39.28-alpha.jar:?]
    ```
    I've looked through the forum, Stack Overflow, and Google, and searched this Slack, but I can't find any mention of this failure. Can anyone point me in the right direction?

    Massy Bourennani

    01/23/2023, 6:32 PM
    👋 Hello. In my org we have a classic Airbyte/dbt architecture. When it comes to materializing incremental models, we are discussing how to fetch new rows from the Airbyte raw tables to build fact/dim tables. We are considering two definitions: 1. new rows according to the airbyte_emitted_at column; 2. new rows according to a cursor column, most of the time an updated_at (a column that Airbyte would typically use to sync streams incrementally). I wanted to know how everyone is tackling this. Example in thread 🧵

    Andres Gutierrez

    01/23/2023, 7:56 PM
    Hi, quick question. We're using the Google Sheets connector, and we want to cast columns in the Google sheet as integers. I think the answer is no, but I wanted to ask: is it possible in some way to cast the sheet's columns with this connector? https://docs.airbyte.com/integrations/sources/google-sheets/#data-type-mapping

    Walker Philips

    01/23/2023, 8:05 PM
    For source connectors, if an exception occurs during reading records, is it normal for the state to not be stored? I am seeing "Source did not output any state messages" and "State capture: No state retained." in the logs and wanted to confirm this is expected behavior.

    Chen Lin

    01/23/2023, 9:35 PM
    Hi everyone, I'm having some trouble pulling display_keyword_performance_report stream data. I don't see any obvious errors in the log, but the file written to the S3 bucket is empty. I don't have any issue with other streams like ad_group_ad_report or campaigns. Attached is the log file; can you suggest where I should look? Thanks!
    logs-239873.txt

    Adrian Bakula

    01/23/2023, 10:22 PM
    Hey hey! It looks like the Snowflake destination connector always sets the table retention policy to 0 days on table creation. Any reason for this? Would it make sense to make this configurable, or at least to omit that setting so the retention policy simply follows the schema-level retention? Thanks 😄

    Amit Khanna

    01/23/2023, 10:47 PM
    Hi all. I am new to Airbyte and need your help. I am working on a data warehouse initiative at my company. In short, we are creating a data warehouse (batch-based, 1-day latency) for our product customers (10,000), where we will get data from 5 different products (10,000 customer on-premise sources in total). We have evaluated Snowflake and BigQuery, and they are the right fit for our scale and use case. We are evaluating Airbyte to get data from the 10,000 on-premise customers (5 unique products). The source DBs are SQL Server and Sybase. We want to go with a push mechanism so that the 10,000 customers can push data from on-premise SQL Server or Sybase DBs to Snowflake; this lets us distribute the compute. Does Airbyte have any agent for on-premise DB to cloud DW connectivity? Is there any other method to achieve a push mechanism? As we understand it, an ELT tool pulling from 10,000 sources wouldn't be feasible.

    Jason Maddern

    01/23/2023, 11:16 PM
    I have an API-based source connector which pulls JSON, and basic dbt normalisation automatically normalises the nested JSON structures into separate tables. That's great, however I cannot for the life of me figure out how to join these tables back together - there is some sort of hash, _airbyte_ab_id, but I cannot see how to join on that key. Can anyone advise how to join the normalised JSON tables after an Airbyte run? I must be missing something obvious. I'm using the latest Airbyte, connected to Snowflake.

    Abhinav Kapur

    01/23/2023, 11:25 PM
    I am exploring Airbyte for creating data pipelines in my project. I would really appreciate answers to the following questions.
    • Cost?
      ◦ What is the product's potential cost range? License? Open Source is free, cloud starts at $2.50 per credit, and enterprise is custom pricing that depends on customer requirements.
    • Extensibility
      ◦ Integration with our Azure Active Directory identities and roles?
      ◦ Integration with Azure Key Vault?
      ◦ Integration with our custom Haligi webservices (and Themis/OAuth)?
    • Deployment
      ◦ Are Docker images available, and how can we secure or rebuild them?
      ◦ Is it a cloud-only SaaS product?
      ◦ Azure Marketplace support?
      ◦ Is the product available in an Azure-deployable form?
    • Operationalizing
      ◦ Integration with DataDog / Enterprise Splunk?
      ◦ Alerting/Monitoring?
    • Timelines on acquisition
      ◦ Estimate of how long it takes to get a license?

    Nathan Chan

    01/24/2023, 1:11 AM
    Hi Airbyte team, just wondering whether this issue is being looked into: https://github.com/airbytehq/airbyte/issues/21637. Even though we have upgraded to the latest version of Airbyte and the connectors, the issue still exists.

    Joey Taleño

    01/24/2023, 4:17 AM
    Hello Airbyte Team, is adding a custom connector still not supported in Airbyte Cloud?

    Shreshth Arora

    01/24/2023, 6:11 AM
    Hello Airbyte Team, we are having scaling issues while syncing BigQuery to ClickHouse. We are running Airbyte in Docker containers on one c2d-standard-32 GCP machine (8 vCPU, 32 GB memory) and have made multiple attempts to tune the parameters for faster BigQuery-to-ClickHouse loads, but even after increasing the config variables there seems to be no change in the system.
    ```
    JOB_MAIN_CONTAINER_MEMORY_REQUEST=12g
    JOB_MAIN_CONTAINER_MEMORY_LIMIT=12g
    JOB_MAIN_CONTAINER_CPU_REQUEST=0.75
    JOB_MAIN_CONTAINER_CPU_LIMIT=0.80
    
    MAX_SYNC_WORKERS=20
    MAX_SPEC_WORKERS=20
    MAX_CHECK_WORKERS=20
    MAX_DISCOVER_WORKERS=20
    MAX_NOTIFY_WORKERS=20
    ```
    We use similar configs for the normalisation container variables. We have run multiple syncs of multiple sizes but haven't been able to improve the sync time; it stays constant. Airbyte's usage of the machine is no more than 40% of CPU and 4 GB of RAM (out of 32 GB). At current test sync speeds, Airbyte would take days to sync the data we need (BigQuery to ClickHouse). 1) Is there any way to configure the system to fully utilize the machine? 2) What are the best practices for scaling an Airbyte Docker deployment to its full potential?

    Suprakash Nandy

    01/24/2023, 6:35 AM
    Hi everyone, we are evaluating Airbyte for CDC from Postgres to S3. We realized that the synced data in S3 does not have any column identifying whether a record was deleted at the source. I was expecting a 'deleted_at' column to be present, but I did not find any option to enable it, nor anything about it in the documentation. Is this expected behavior? Can we identify deletes when the destination is S3?

    Roy Ben Dov

    01/24/2023, 8:54 AM
    Hey, I deployed a local Airbyte for a PoC. I had a running connection from a Postgres DB, over an SSH tunnel, to a local CSV. I started getting:
    ```
    Could not connect with provided SSH configuration. Error: org.apache.sshd.common.SshException: No more authentication methods available
    ```
    Can someone elaborate?

    Grember Yohan

    01/24/2023, 8:57 AM
    Resharing an unanswered message from last week: Hello Airbyte community 👋 We tried upgrading Airbyte from 0.40.27 to 0.40.28, and since then we can't access sync logs from the UI 😞 We can see the sync history properly, but when we click on a specific sync to see its logs, we hit the infamous 'Oops! Something went wrong' page, and the following error appears in the console:
    ```
    react-dom.production.min.js:216 Error: Internal Server Error: Cannot invoke "io.airbyte.config.storage.CloudStorageConfigs.getType()" because the return value of "io.airbyte.config.helpers.LogConfigs.getStorageConfigs()" is null
        at apiOverride.ts:107:9
        at f (regeneratorRuntime.js:86:17)
        at Generator._invoke (regeneratorRuntime.js:66:24)
        at Generator.next (regeneratorRuntime.js:117:21)
        at r (asyncToGenerator.js:3:20)
        at u (asyncToGenerator.js:25:9)
    ```
    Downgrading from 0.40.28 to 0.40.27 fixes the issue. Should I share this somewhere specific to document this regression and improve its chances of being fixed?

    Keurcien Luu

    01/24/2023, 10:23 AM
    Hi everyone! Is anyone here using the Braintree connector and seeing the same issue? We would like to use it to store our transactions since 2019 in our data warehouse, but due to the search limit (50k on the transaction endpoint: https://developer.paypal.com/braintree/docs/reference/general/searching/search-results/python#search-limit), we only get the last 50k transactions. I could try to contribute on this one, but I would love some guidance about what to do in the case of a limited search API. Today the search parameter filters records based on their created time (https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-braintree/source_braintree/streams.py):
    ```python
    def get_items(self, start_date: datetime):
        return self._gateway.transaction.search(braintree.TransactionSearch.created_at >= start_date)
    ```
    Maybe we can iterate over the dates since start_date to get more records (in our case 50k per day would be largely enough), but I guess we could run into rate-limiting issues. Any advice on that?

    Danilo Drobac

    01/24/2023, 10:49 AM
    Hey folks - generic question here. I'm working on a project where the client doesn't currently use Airbyte. They need to connect to a (very niche) third-party API and call some custom endpoints. What are the pros and cons of building a custom Airbyte connector for this rather than simply building it separately as a standalone function, for example? I can see the potential long-term benefit that, if we introduce Airbyte, they have the option to add more sources as they evolve, but in the short term, is it overkill?

    Andrzej Lewandowski

    01/24/2023, 10:49 AM
    Hi there, is it better to set up a replica and point Airbyte at it, or to use the master MySQL database directly? I'm worried about the impact on the production database during the initial sync. On the other hand, I saw that the initial sync only runs simple selects. Do you have any recommendations for this case?

    Shraddha Borkar

    01/24/2023, 10:50 AM
    Hello Team, I am using Airbyte Open Source hosted on EKS and trying to set up Confluent Kafka as a destination. I am getting the error below:
    Could not connect to the Kafka brokers with provided configuration. Failed to construct kafka producer
    Could someone please help here?

    Sharath Chandra

    01/24/2023, 11:03 AM
    Has anyone faced this issue? Database Error in model <table_name>:
    ```
    (models/generated/airbyte_tables/airbyte_test/<table_name>.sql)
      Invalid input
      DETAIL:  
        -----------------------------------------------
        error:  Invalid input
        code:      8001
        context:   CONCAT() result too long for type varchar(65535)
        query:     19684211
        location:  string_ops.cpp:108
        process:   query0_121_19684211 [pid=17065]
    ```

    Omer Kolodny

    01/24/2023, 1:32 PM
    Can someone please assist here? Whenever I run the process after a Reset it succeeds, but when I run a 'Sync now' it fails. This is very weird behavior. Any ideas?

    Igor Safonov

    01/24/2023, 1:44 PM
    Hi 👋 How can I specify extra parameters for the connection to an external database when using the Helm charts?
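    (Not a verified answer — one possible shape: if the chart version in use exposes extraEnv on the server/worker pods, extra connection parameters can be appended to the JDBC URL passed through the DATABASE_URL environment variable. Both the extraEnv key and the variable name are assumptions to check against the chart's values.yaml.)
    ```yaml
    # Hypothetical values.yaml fragment -- key names are assumptions, verify against the chart:
    server:
      extraEnv:
        - name: DATABASE_URL
          value: "jdbc:postgresql://db.internal:5432/airbyte?ssl=true&sslmode=require"
    worker:
      extraEnv:
        - name: DATABASE_URL
          value: "jdbc:postgresql://db.internal:5432/airbyte?ssl=true&sslmode=require"
    ```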