# ask-ai

    Luis Espinoza

    11/04/2024, 4:12 PM
    Hello, is there any documentation that specifies how to configure Airbyte with EC2, an Application Load Balancer, and AWS Certificate Manager?

    Nivedita Baliga

    11/04/2024, 4:12 PM
    @kapa.ai After much back and forth in this Slack channel, following various GitHub discussions, and changing parameters at the MSSQL connector level, the Airbyte platform level, and the Postgres DB level, we are still facing a source > Terminating due to java.lang.OutOfMemoryError: Java heap space error in one of our connections. Is anyone able to help us out here? Failure of this ingest pipeline is affecting critical data delivery for end-user reports.

    Patrick Henry

    11/04/2024, 4:14 PM
    I want to transform the data before it gets to my database. How can I do that?

    Vineeth ER

    11/04/2024, 4:15 PM
    How do I fix this with the Airbyte Helm chart in an AKS cluster? "Failed to pull image "airbyte/worker:0.50.35": failed to pull and unpack image "docker.io/airbyte/worker:0.50.35": failed to copy: httpReadSeeker: failed open: unexpected status code https://registry-1.docker.io/v2/airbyte/worker/manifests/sha256:fcfa783c2d02029ba85a7e79b23a38f9fbafdd04907bf00cc474ba5936adf51e: 429 Too Many Requests - Server message: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit"

    Poorna Premachandra

    11/04/2024, 4:19 PM
    Recently I've been trying to deploy Airbyte on EKS. When running
    ```
    helm install airbyte airbyte/airbyte --namespace airbyte --values ./values.yaml
    ```
    I'm getting the following error:
    ```
    coalesce.go:286: warning: cannot overwrite table with non table for airbyte.keycloak-setup.initContainers.keycloakReadinessCheck.image (map[])
    coalesce.go:286: warning: cannot overwrite table with non table for airbyte.keycloak.initContainers.initDb.image (map[])
    coalesce.go:286: warning: cannot overwrite table with non table for airbyte.keycloak-setup.initContainers.keycloakReadinessCheck.image (map[])
    coalesce.go:286: warning: cannot overwrite table with non table for airbyte.keycloak.initContainers.initDb.image (map[])
    Error: INSTALLATION FAILED: failed pre-install: 1 error occurred:
    	* timed out waiting for the condition
    ```
    During the process, airbyte-airbyte-bootloader starts, but after the error it gets stopped. I double-checked the YAML and it seems OK, and I didn't change any properties relating to keycloak or keycloak-setup. I'm using Helm version "v3.16.2". Any idea what is causing this? Thanks.

    Anna Bogo

    11/04/2024, 4:43 PM
    Are permissions on a table lost after a full refresh?

    Gideon Stowell

    11/04/2024, 6:18 PM
    @kapa.ai What version of Postgres is required for Airbyte?

    Thomas Shin

    11/04/2024, 6:31 PM
    When is the next expected release of Airbyte, and how often are new releases published? I am trying to estimate when the new Google Calendar source will be released.

    Leo Salayog

    11/04/2024, 6:31 PM
    ```
    2024-11-04 18:28:29 platform > failures: [ {
      "failureOrigin" : "source",
      "failureType" : "system_error",
      "internalMessage" : "The request failed due to an unknown error.",
      "externalMessage" : "Something went wrong in the connector. See the logs for more details.",
      "metadata" : {
        "attemptNumber" : 0,
        "jobId" : 327,
        "from_trace_message" : true,
        "connector_command" : "read"
      },
    ```
    How do I set up a response filter so that I can ignore these errors? I need info for: Error Message Substring, Predicate, HTTP Codes, Action (ignore), and Error Message.

    Leo Salayog

    11/04/2024, 6:54 PM
    Is this YAML correct?
    ```yaml
    type: CompositeErrorHandler
    error_handlers:
      - type: DefaultErrorHandler
        max_retries: 4
        response_filters:
          - type: HttpResponseFilter
            action: IGNORE
            http_codes:
              - 500
            error_message: Something went wrong in the connector. See the logs for more details.
            error_message_contains: The request failed due to an unknown error
            predicate: >-
              {{ 'failure_type' in response and response['failure_type'] == 'system_error' }}
    ```

    Ravi Nathwani

    11/04/2024, 6:57 PM
    @kapa.ai I am getting an HTTP 504 error while selecting streams / discovering schemas in Airbyte. It is able to connect to and fetch data from the same source, but creating a new connection throws the error.
    1. Airbyte deployed on Kubernetes, Airbyte version 1.1.0, Helm chart version 1.1.1
    2. MS SQL connector version 4.1.15
    I have tried the parameters from https://docs.airbyte.com/enterprise-setup/scaling-airbyte#schema-discovery-timeouts:
    ```yaml
    server:
      extraEnvs:
        - name: HTTP_IDLE_TIMEOUT
          value: 10m
        - name: READ_TIMEOUT
          value: 10m
    ```
    I still get the HTTP 504 error at almost exactly 60 seconds. In the same database, a different schema with fewer tables works fine; this schema has around 900 tables and throws the HTTP 504 while discovering the schema after around 60 seconds. Can anyone please help? It's pretty urgent, as I am at the production stage; in dev it was working fine with fewer tables. Logs of the pod created by Airbyte while discovering the schema:
    ```
    2024-11-04 18:46:04 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):308 - INFO main c.z.h.HikariDataSource(close):349 HikariPool-1 - Shutdown initiated...
    2024-11-04 18:46:04 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):308 - INFO main c.z.h.HikariDataSource(close):351 HikariPool-1 - Shutdown completed.
    2024-11-04 18:46:04 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):308 - INFO main c.z.h.HikariDataSource(<init>):79 HikariPool-2 - Starting...
    2024-11-04 18:46:04 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):308 - INFO main c.z.h.HikariDataSource(<init>):81 HikariPool-2 - Start completed.
    2024-11-04 18:46:04 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):308 - INFO main i.a.c.i.b.IntegrationRunner(runInternal):224 Completed integration: io.airbyte.cdk.integrations.base.ssh.SshWrappedSource
    2024-11-04 18:46:04 INFO i.a.w.i.VersionedAirbyteStreamFactory(internalLog):308 - INFO main i.a.i.s.m.MssqlSource(main):577 completed source: class io.airbyte.integrations.source.mssql.MssqlSource
    2024-11-04 18:46:04 INFO i.a.c.ConnectorMessageProcessor(updateConfigFromControlMessage):231 - Checking for optional control message...
    2024-11-04 18:46:04 INFO i.a.c.ConnectorMessageProcessor(setOutput):176 - Writing catalog result to API...
    2024-11-04 18:46:05 INFO i.a.c.ConnectorMessageProcessor(setOutput):180 - Finished writing catalog result to API.
    2024-11-04 18:46:05 INFO i.a.c.ConnectorWatcher(saveConnectorOutput):162 - Writing output of b5ea17b1-f170-46dc-bc31-cc744ca984c1_4a1e274f-a661-441b-a3ce-8b08c0912ecd_0_discover to the doc store
    2024-11-04 18:46:06 INFO i.a.c.ConnectorWatcher(markWorkloadSuccess):167 - Marking workload b5ea17b1-f170-46dc-bc31-cc744ca984c1_4a1e274f-a661-441b-a3ce-8b08c0912ecd_0_discover as successful
    2024-11-04 18:46:06 INFO i.a.c.ConnectorWatcher(exitProperly):215 - Deliberately exiting process with code 0.
    2024-11-04 18:46:06 INFO i.a.c.i.LineGobbler(voidCall):166 -
    2024-11-04 18:46:06 INFO i.a.c.i.LineGobbler(voidCall):166 - ----- END DISCOVER -----
    2024-11-04 18:46:06 INFO i.a.c.i.LineGobbler(voidCall):166 -
    ```

    Diako

    11/04/2024, 6:58 PM
    @kapa.ai I get the error: Cannot read properties of undefined (reading 'mode')

    david balli

    11/04/2024, 8:08 PM
    Hi, can I import Excel files over SFTP?

    david balli

    11/04/2024, 8:49 PM
    When I'm importing an Excel file from SFTP, I need to skip the first 4 rows. What should my reader options be?
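    A minimal sketch of the intended effect, assuming the file-based source forwards its reader_options JSON (e.g. {"skiprows": 4}) straight through to pandas.read_excel; the file name below is a hypothetical placeholder:
    ```python
    import pandas as pd

    # Hypothetical file; mirrors reader_options = {"skiprows": 4}, assuming the
    # source passes these options through to pandas.read_excel unchanged.
    # (Reading .xlsx also requires the openpyxl package.)
    df = pd.read_excel("report.xlsx", skiprows=4)  # drop the first 4 rows before parsing the header
    print(df.head())
    ```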

    Ethan Brown

    11/04/2024, 10:28 PM
    How can I connect to the API of an OSS self-managed instance of Airbyte?
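    A minimal sketch of one way to call a self-managed instance, assuming the public API is exposed under /api/public/v1 and that basic auth is enabled in front of it; the host, port, and credentials below are placeholders, not defaults you can rely on:
    ```python
    import requests

    # Placeholder values: substitute your own instance's host and credentials.
    BASE_URL = "http://localhost:8000/api/public/v1"  # assumed public API root
    AUTH = ("airbyte", "password")                    # hypothetical basic-auth credentials

    # List workspaces as a simple connectivity check.
    resp = requests.get(f"{BASE_URL}/workspaces", auth=AUTH, timeout=30)
    resp.raise_for_status()
    print(resp.json())
    ```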

    vijay vijay

    11/04/2024, 11:15 PM
    Hi #C01AHCD885S, I'm trying to install Airbyte on Kubernetes on our internal platform. Can someone please share where I can get the latest Airbyte image? I tried to pull it from the Docker registry but got access denied. Can someone please share the link from which I can pull it and deploy it to Kubernetes using Helm? I greatly appreciate your help.

    Jordi Crespo Guzman

    11/05/2024, 12:04 AM
    If I want to create an internal Airbyte source for my company in Python, which has to be custom, can I use the python-connector-base image to create a Dockerfile, use the `cd airbyte-integrations/connector-templates/generator && ./generate.sh` command to generate the necessary files, and from there package everything into that Dockerfile? Is this still a correct way to create new internal sources?

    Huy Đỗ Văn

    11/05/2024, 3:56 AM
    ```python
    class BigQueryTableStream(IncrementalMixin, Stream):
        source_defined_cursor = False
        cursor_field = "updated_at"
        _cursor_value = ""
    ```
    I got an error. How do I get a cursor that has been defined by the user in the web app?
    ```
    Caused by: io.temporal.failure.ApplicationFailure: message='Source defined cursor validation failed for stream: table_50mb1. Error: key: 'updated_at' of path: '[updated_at]' not found in schema,Source defined cursor validation failed for stream: table_data_20gb. Error: key: 'updated_at' of path: '[updated_at]' not found in schema,Source defined cursor validation failed for stream: table_test2. Error: key: 'updated_at' of path: '[updated_at]' not found in schema,Source defined cursor validation failed for stream: assignment. Error: key: 'updated_at' of path: '[updated_at]' not found in schema,Source defined cursor validation failed for stream: user_magazine_count. Error: key: 'updated_at' of path: '[updated_at]' not found in schema', type='io.airbyte.workers.exception.WorkerException', nonRetryable=false
    ```

    Huy Đỗ Văn

    11/05/2024, 4:00 AM
    Please give a Python code example of defining cursor_field from the UI.
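    A minimal sketch, assuming a Python CDK Stream with IncrementalMixin: setting source_defined_cursor = False (and not hard-coding cursor_field, as in the snippet above) lets the user pick the cursor in the UI, and the platform then passes the chosen field into read_records via the cursor_field argument. The chosen field must also exist in the stream's JSON schema, which is what the validation error above complains about. The class name and the _query_rows placeholder are illustrative, not actual connector code:
    ```python
    from typing import Any, Iterable, List, Mapping, Optional

    from airbyte_cdk.models import SyncMode
    from airbyte_cdk.sources.streams import IncrementalMixin, Stream


    class BigQueryTableStream(IncrementalMixin, Stream):
        """Illustrative stream whose cursor field is chosen by the user in the UI."""

        primary_key = None
        source_defined_cursor = False  # the user picks the cursor field in the UI

        def __init__(self) -> None:
            super().__init__()
            self._cursor_field: Optional[str] = None
            self._cursor_value: Optional[Any] = None

        @property
        def cursor_field(self) -> List[str]:
            # Empty until the platform supplies the user's choice from the configured catalog.
            return [self._cursor_field] if self._cursor_field else []

        @cursor_field.setter
        def cursor_field(self, value: Any) -> None:
            self._cursor_field = value[0] if isinstance(value, list) else value

        @property
        def state(self) -> Mapping[str, Any]:
            return {self._cursor_field: self._cursor_value} if self._cursor_field else {}

        @state.setter
        def state(self, value: Mapping[str, Any]) -> None:
            if self._cursor_field:
                self._cursor_value = value.get(self._cursor_field)

        def read_records(
            self,
            sync_mode: SyncMode,
            cursor_field: Optional[List[str]] = None,
            stream_slice: Optional[Mapping[str, Any]] = None,
            stream_state: Optional[Mapping[str, Any]] = None,
        ) -> Iterable[Mapping[str, Any]]:
            # On incremental syncs the platform passes the UI-selected cursor here.
            if cursor_field:
                self._cursor_field = cursor_field[0]
            for record in self._query_rows():  # placeholder for the actual table read
                if self._cursor_field and record.get(self._cursor_field) is not None:
                    current = record[self._cursor_field]
                    self._cursor_value = current if self._cursor_value is None else max(self._cursor_value, current)
                yield record

        def _query_rows(self) -> Iterable[Mapping[str, Any]]:
            return []  # placeholder: query the table here
    ```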

    Ishan Anilbhai Koradiya

    11/05/2024, 5:11 AM
    Hi @kapa.ai, how do I set up autoscaling for Airbyte on Kubernetes with EKS?

    Tom Montgomery

    11/05/2024, 6:43 AM
    @kapa.ai I have added a new column to a schema. I have only subsequently checked the option to "automatically backfill new columns". As such, the value of this column is missing for much of the data in the destination (as I am using incremental append & dedupe). What is the most efficient way to backfill the new column? Can I do it without refreshing the stream?

    Pramod Bhat

    11/05/2024, 7:32 AM
    Hi everyone, I have deployed Airbyte on an Azure VM and am using the Salesforce connector. With the same client ID, secret, and token, the row count I get from Postman matches the source Salesforce data, but in Airbyte I have tried all methods (bulk API, etc.) and the row count is still lower than the source; there is a mismatch in the number of rows.

    Pramod Bhat

    11/05/2024, 7:43 AM
    How can I downgrade the Salesforce connector to a lower version in my Airbyte app deployed on Azure, e.g. from 2.6.1 to 2.3.1?

    Katalina Villamil

    11/05/2024, 9:02 AM
    @kapa.ai In Shopify, I am able to download my client’s product sales report, and it gives a column for “product_option” that specifies whether the specific product in an order (which can contain multiple products) was bought as a one-time purchase or a subscription. I am now loading this client’s Shopify data into BigQuery via Airbyte and can’t seem to find that “product_option” line item for the life of me. In which table can I find this?

    Mor Iluz

    11/05/2024, 9:46 AM
    I created a custom source using the CDK, and after the image was built using airbyte-ci, it fails when I try to deploy it in Kubernetes. The spec pod log is:
    ```
    Defaulted container "connector-sidecar" out of: connector-sidecar, main, init (init)
    Unsetting empty environment variable 'DATA_PLANE_SERVICE_ACCOUNT_CREDENTIALS_PATH'
    Unsetting empty environment variable 'KEYCLOAK_INTERNAL_REALM_ISSUER'
    Unsetting empty environment variable 'DATA_PLANE_SERVICE_ACCOUNT_EMAIL'
    Unsetting empty environment variable 'KEYCLOAK_CLIENT_ID'
    Unsetting empty environment variable 'AIRBYTE_API_AUTH_HEADER_NAME'
    Unsetting empty environment variable 'CONTROL_PLANE_AUTH_ENDPOINT'
    Unsetting empty environment variable 'AIRBYTE_API_AUTH_HEADER_VALUE'
    2024-11-05T09:38:32.636882165Z main INFO Loading mask data from '/seed/specs_secrets_mask.yaml
    2024-11-05 09:38:33 INFO i.m.c.e.DefaultEnvironment(<init>):168 - Established active environments: [k8s, cloud, worker-v2, control-plane, oss, local-secrets]
    2024-11-05 09:38:33 INFO i.a.c.ApplicationKt(main):20 - Context started
    2024-11-05 09:38:33 INFO i.a.c.ApplicationKt(main):21 - 1031750241.00 ns/exec (total: 1.03s, 1 executions)
    2024-11-05 09:38:34 INFO i.a.c.ApplicationKt(main):28 - Sidecar created
    2024-11-05 09:38:34 INFO i.a.c.ApplicationKt(main):29 - 767132358.00 ns/exec (total: 1.53s, 2 executions)
    2024-11-05 09:38:34 INFO i.a.c.i.LineGobbler(voidCall):166 -
    2024-11-05T09:38:34.885102616Z pool-5-thread-1 ERROR Recursive call to appender SecretMaskRewrite
    2024-11-05 09:38:35 INFO i.a.c.i.LineGobbler(voidCall):166 - ----- START SPEC -----
    2024-11-05 09:38:35 INFO i.a.c.i.LineGobbler(voidCall):166 -
    2024-11-05 09:38:35 WARN c.a.l.CommonsLog(warn):113 - JAXB is unavailable. Will fallback to SDK implementation which may be less performant. If you are using Java 9+, you will need to include javax.xml.bind:jaxb-api as a dependency.
    ```

    Duong Le

    11/05/2024, 9:48 AM
    I built a connector in the Airbyte UI. The response returns a list of items in the data field. I would like to get the max time (end_time) from a field of each item, and use the transform feature to add this max time to the output.
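    A minimal sketch of the transformation being described, written as plain Python under the assumption that the response nests items under a "data" key and each item carries an "end_time" field; the helper name is hypothetical, not a UI-builder feature:
    ```python
    from typing import Any, Dict, List

    def add_max_end_time(response_json: Dict[str, Any]) -> List[Dict[str, Any]]:
        """Attach the max end_time across all items to every item (hypothetical helper)."""
        items = response_json.get("data", [])
        end_times = [item["end_time"] for item in items if item.get("end_time") is not None]
        max_end = max(end_times, default=None)
        for item in items:
            item["max_end_time"] = max_end
        return items

    # Example:
    print(add_max_end_time({"data": [{"end_time": "2024-11-01"}, {"end_time": "2024-11-03"}]}))
    ```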

    Vitalik Badion

    11/05/2024, 10:09 AM
    Hi community, we recently started having an issue with the Chargebee connector:
    https://{domain}.chargebee.com/api/v2/plans?sort_by%5Basc%5D=updated_at&include_deleted=true&updated_at%5Bbetween%5D=%5B1641027600%2C+1643705999%5D failed with status code 400 and error message: "The API endpoint is incompatible with the product catalog version. You are calling product catalog 1.0 API endpoint but you are using product catalog 2.0"
    I guess it’s related to the latest commit https://github.com/airbytehq/airbyte/commit/7c6b528e1f057921d24b48faee4ec31a5782a145 and its PR. We are still observing this issue for the plans and addons endpoints. Could you please help us understand whether this is a known issue?

    Tom Montgomery

    11/05/2024, 11:01 AM
    @kapa.ai We currently manage our Airbyte OSS app using `abctl` on an EC2 instance. I would like to migrate over to using Terraform to manage the instance. Are there any steps to do this?

    user

    11/05/2024, 11:33 AM
    To set up Airbyte on GKE for production on GCP using Terraform, with GCS instead of MinIO and Cloud SQL instead of the default Postgres, you'll need to follow these steps:
    1. Configure GCS for storage: In your Terraform configuration, you'll need to set up GCS for log and state storage. Here's an example of how to configure this in your `values.yaml` file:
    ```yaml
    global:
      storage:
        type: "GCS"
        storageSecretName: airbyte-config-secrets
        bucket:
          log: airbyte-bucket
          state: airbyte-bucket
          workloadOutput: airbyte-bucket
        gcs:
          projectId: <project-id>
    ```
    You'll also need to create a Kubernetes secret containing the GCP credentials for accessing GCS [Implementation Guide](https://docs.airbyte.com/enterprise-setup/implementation-guide#configuring-external-logging).
    2. Configure Cloud SQL for the database: To use Cloud SQL instead of the default Postgres, you'll need to disable the internal Postgres and configure the external database. Here's an example configuration for your `values.yaml`:
    ```yaml
    postgresql:
      enabled: false

    global:
      database:
        secretName: "airbyte-config-secrets"
        host: ""
        port: ""
        database: ""
        user: ""
        passwordSecretKey: ""
    ```
    Make sure to create and configure your Cloud SQL instance separately and provide the necessary credentials in the Kubernetes secret [Implementation Guide](https://docs.airbyte.com/enterprise-setup/implementation-guide#configuring-the-airbyte-database).
    3. Set up Terraform configuration: Create the necessary Terraform files in your project directory:
    - `provider.tf`: Define the Airbyte provider
    - `main.tf`: Configure Airbyte resources
    - `variables.tf`: Define variables, including credentials
    You may also want to create separate modules for connections, destinations, and sources [API to Warehouse Tutorial](https://airbyte.com/tutorials/api-to-warehouse-basic-stack-with-airbyte).
    4. Initialize and apply Terraform: Run the following commands to set up your Airbyte resources:
    ```
    terraform init
    terraform plan
    terraform apply
    ```
    [API to Warehouse Tutorial](https://airbyte.com/tutorials/api-to-warehouse-basic-stack-with-airbyte) Remember to adjust the configurations to match your specific GCP project settings and requirements. Also, ensure that you've created the necessary GCP resources (GKE cluster, Cloud SQL instance, GCS buckets) before applying the Terraform configuration. For a more detailed example of setting up Airbyte on GKE with external storage and database, you can refer to this [GitHub discussion](https://github.com/airbytehq/airbyte/discussions/47256), which provides additional Terraform configuration snippets.

    Rafael Anizio

    11/05/2024, 11:34 AM
    @kapa.ai How can I set up Airbyte on GKE for production on GCP using Terraform, with GCS instead of MinIO and Cloud SQL instead of the default Postgres?