# ask-community-for-troubleshooting
  • Anton Svetlov

    11/29/2021, 11:23 AM
    When we are connecting Amplitude we get the following error:
    ERROR i.a.i.d.b.BufferedStreamConsumer(close):219 - {} - Close failed.
    destination - 2021-11-29 08:30:33 INFO DefaultAirbyteStreamFactory(lambda$create$0):61 - org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
    destination - 2021-11-29 08:30:33 INFO DefaultAirbyteStreamFactory(lambda$create$0):61 - 	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:350) ~[postgresql-42.2.18.jar:42.2.18]
    destination - 2021-11-29 08:30:33 INFO DefaultAirbyteStreamFactory(lambda$create$0):61 - 	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:473) ~[postgresql-42.2.18.jar:42.2.18]
    destination - 2021-11-29 08:30:33 INFO DefaultAirbyteStreamFactory(lambda$create$0):61 - 	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:393) ~[postgresql-42.2.18.jar:42.2.18]
    destination - 2021-11-29 08:30:33 INFO DefaultAirbyteStreamFactory(lambda$create$0):61 - 	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:322) ~[postgresql-42.2.18.jar:42.2.18]
    destination - 2021-11-29 08:30:33 INFO DefaultAirbyteStreamFactory(lambda$create$0):61 - 	at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:308) ~[postgresql-42.2.18.jar:42.2.18]
    destination - 2021-11-29 08:30:33 INFO DefaultAirbyteStreamFactory(lambda$create$0):61 - 	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:284) ~[postgresql-42.2.18.jar:42.2.18]
    destination - 2021-11-29 08:30:33 INFO DefaultAirbyteStreamFactory(lambda$create$0):61 - 	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:279) ~[postgresql-42.2.18.jar:42.2.18]
    destination - 2021-11-29 08:30:33 INFO DefaultAirbyteStreamFactory(lambda$create$0):61 - 	at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:194) ~[commons-dbcp2-2.7.0.jar:2.7.0]
    destination - 2021-11-29 08:30:33 INFO DefaultAirbyteStreamFactory(lambda$create$0):61 - 	at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:194) ~[commons-dbcp2-2.7.0.jar:2.7.0]
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - Exception in thread "main" org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:350)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:473)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:393)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:322)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:308)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:284)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:279)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:194)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at org.apache.commons.dbcp2.DelegatingStatement.execute(DelegatingStatement.java:194)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at io.airbyte.db.jdbc.JdbcDatabase.lambda$execute$0(JdbcDatabase.java:40)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at io.airbyte.db.jdbc.DefaultJdbcDatabase.execute(DefaultJdbcDatabase.java:52)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at io.airbyte.db.jdbc.JdbcDatabase.execute(JdbcDatabase.java:40)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at io.airbyte.integrations.destination.jdbc.JdbcSqlOperations.executeTransaction(JdbcSqlOperations.java:95)
    destination - 2021-11-29 08:30:33 ERROR LineGobbler(voidCall):82 - 	at io.airbyte.integrations.destination.jdbc.JdbcBufferedConsumerFactory.lambda$onCloseFunction$3(JdbcBufferedConsumerFactory.java:179)
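    In this stack trace, "An I/O error occurred while sending to the backend" typically means the TCP connection to the destination Postgres was dropped mid-sync (idle timeouts, NAT/firewall rules, or an overloaded server). A minimal diagnostic sketch in Python, assuming psycopg2 is available and using placeholder connection details, to check whether the destination drops long-idle connections:
    # Diagnostic sketch: does the destination Postgres drop long-idle
    # connections? Host/credentials below are placeholders.
    import time
    import psycopg2

    conn = psycopg2.connect(
        host="destination-host",  # placeholder
        port=5432,
        dbname="warehouse",       # placeholder
        user="airbyte",           # placeholder
        password="...",
        # libpq TCP keepalive settings; may mitigate drops by NAT/firewalls
        keepalives=1,
        keepalives_idle=30,
        keepalives_interval=10,
        keepalives_count=5,
    )
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        print("initial query ok:", cur.fetchone())
        time.sleep(600)  # idle roughly as long as a slow sync batch might
        cur.execute("SELECT 1")  # raises OperationalError if the link was dropped
        print("connection survived the idle period")
    conn.close()
    If the second query fails, enabling TCP keepalives or raising idle timeouts on the network path between Airbyte and the database may help.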
  • Steve West

    11/29/2021, 5:48 PM
    Hey! Cool to be part of the community. We want to evaluate Airbyte for some of our SaaS integrations. Question: is the Embed functionality part of Airbyte open-source? It's really difficult to tell whether it is or isn't from the overview page. Thanks!
  • Kamil Baldyga

    11/29/2021, 7:24 PM
    Hey there, we're thinking of trying out Airbyte Cloud for some of our workloads. We use AWS Redshift, and looking at the destination connector documentation, the first point in the setup guide says:
    Make sure your cluster is active and accessible from the machine running Airbyte
    Does this mean Airbyte Cloud doesn't support Redshift? If that's not correct and the documentation needs updating, does it support an SSH tunnel, or does the instance need to have a public IP address?
    ✅ 1
  • Guus van Heijningen

    11/30/2021, 8:25 AM
    Hi all, we are ready to start working with Airbyte within the data warehouse deployments of our clientele. As we are servicing multiple clients on different cloud projects (mainly GCP), I was wondering what the best setup would be to use Airbyte in a scalable but secure way. For some - but not all - projects we are also looking at using Airbyte in combination with Airflow in order to schedule successive tasks. We run Airflow using Kubernetes (similar to https://airflow.apache.org/docs/apache-airflow/stable/executor/kubernetes.html#fault-tolerance). We are currently considering multiple options, and I wanted to reach out to hear your opinion on this case, as we could not find any blogs/documents about a similar setup. The options under consideration:
    1. An Airbyte deployment on a separate VM for each client project (hard to scale, but secure, as only people with the right permissions on the project can access the VM)
    2. One general Airbyte VM where all the connections are set up (scalable, but prone to reaching limitations, and not very secure, as the whole team needs access to this VM to set up connections)
    3. Running Airbyte on Kubernetes with a separate server on the cluster for each client project (scalable, but I am not sure this will be secure, as I don't know how we would access those servers without serving them to the internet, given that RBAC is not yet supported)
    4. Other option(s)
    I would really value your opinion on this, as I am not a cloud engineer by heart and would like to make a well-considered decision. Thanks!
    ✅ 1
  • Niclas Grahm

    11/30/2021, 10:23 AM
    Hi! 🙂 I'm having problems setting up Airbyte + Airflow with docker-compose: I can't get communication between Airflow and Airbyte working. I have followed the guide at https://github.com/airbytehq/airbyte/tree/master/resources/examples/airflow, including installing `apache-airflow-providers-airbyte` and setting up a connection in Airflow pointing at `airbyte://host.docker.internal:8000`. However, when trying to start my DAG in Airflow, I get the following error:
    requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http://host.docker.internal:8000/api/v1/connections/sync
    Host OS is Ubuntu 18.04. Does anyone have any ideas on what I should try? Thanks in advance!
    ✅ 1
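    For reference, a minimal DAG sketch using the `apache-airflow-providers-airbyte` package mentioned above; the `airbyte_conn` connection ID and the connection UUID are placeholders, and a 404 on that endpoint is worth investigating by checking that the Airflow connection's host and port actually point at the Airbyte server API:
    # Sketch: trigger an Airbyte sync from Airflow. "airbyte_conn" must be
    # an Airflow connection pointing at the Airbyte server API; the UUID
    # below is a placeholder for the Airbyte connection to sync.
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.airbyte.operators.airbyte import AirbyteTriggerSyncOperator

    with DAG(
        dag_id="trigger_airbyte_sync",
        start_date=datetime(2021, 11, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        sync = AirbyteTriggerSyncOperator(
            task_id="airbyte_sync",
            airbyte_conn_id="airbyte_conn",  # defined in the Airflow UI
            connection_id="00000000-0000-0000-0000-000000000000",  # placeholder UUID
            asynchronous=False,  # block until the sync finishes
            timeout=3600,
        )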
  • Hein Hofman

    11/30/2021, 12:11 PM
    Would it be possible to deploy Airbyte on Google Cloud Run? Or is there another serverless option? I'm looking for a cost- and maintenance-effective option for small daily syncs of advertising data to a destination.
    ✅ 2
  • Bean L

    11/30/2021, 12:19 PM
    Hey everyone, I am new to Airbyte and just deployed a local dev instance to load data from Postgres to Snowflake, transform it there, and then load the transformed data from Snowflake into another Postgres database for use. I plan to use Airbyte for the Postgres -> Snowflake and Snowflake -> Postgres loads. Currently I am using the basic normalisation option, but it creates an _airbyte_raw table and then creates the expected table I wanted. I want to make it just a 1-1 mapping. Can I do that in Airbyte?
    ✅ 1
  • Emin Can Oğuz (Hepsiburada)

    11/30/2021, 12:36 PM
    Hi. We have data in Hive and we want to load it into BigQuery using Airbyte. Is that possible? I searched but couldn't find a definite answer.
  • Emma-Louise Upton

    11/30/2021, 1:30 PM
    hey 👋 Our team has been evaluating ELT options for integrating SF into our product. We were wondering if Airbyte's cloud offering would allow us to dynamically create Salesforce -> Kinesis connections per user without us having to facilitate the oauth exchange via our own SF app? Is there anywhere we can see a demo of the additional oauth support for airbyte cloud? Looking through the API documentation it doesn't seem like this is currently supported.
  • Emmanuel Orrego

    11/30/2021, 1:58 PM
    Hey! Our team is trying to create an MSSQL connection and it's not possible due to the user, because I haven't been able to set up the domain in the username. How do I add the domain to the connection string?
  • Dave Lyons

    12/01/2021, 3:07 PM
    I spun up Airbyte locally and built a Postgres -> Snowflake connection. It works great for about a minute or two, writes a few thousand rows, and then… just hangs? The sync logs stop populating, the target table ceases to grow, but the sync claims to still be running.
    ✅ 1
  • Scott Pedersen

    12/02/2021, 12:35 AM
    Hi there, I am interested in learning more about Airbyte, specifically in the following areas:
    • Airbyte CLI
      ◦ release date
      ◦ documentation
    • Pipelines as Code
      ◦ Is there documentation on how to write Airbyte pipelines, or is this a feature that will be coming in the future?
    Thanks
    ✅ 1
  • han so

    12/02/2021, 10:47 PM
    If anyone has any info on how to set up Airbyte using Docker images: I have access to a bunch of different ones (init, db, source-jira, webapp, a few source and destination Docker images...) but I'm not sure what they are all for, and I don't have easy access to the git repo to download the docker-compose YAML.
    ✅ 1
  • Oscar Gonzalez

    12/03/2021, 2:25 PM
    Hello, does anyone know why the checkbox is not working for the table, and why the field checkboxes are missing?
    ✅ 1
  • Tom Griffin

    12/04/2021, 2:14 AM
    Hi :) I'm a volunteer working on vaccine surveillance and modeling for one of the counties in New York. Each day we have to retrieve an extract from the New York State Immunization Information System (NYSIIS) of the previous day's vaccinations. The files are named with the date they were cut (this was yesterday's file): 20211202_nysiis.csv.zip. The plan is to dump the CSVs into Postgres, do some transforms, then shift the data into Elasticsearch for analysis. I'm hoping someone would be willing to help me understand how best to use the Files source, given that it does not yet support multiple files. I'm looking for guidance on how to handle the initial load plus the ongoing daily incrementals. I could do preprocessing before the Airbyte process, but I was hoping to use this as an opportunity to learn best practices inside the Airbyte ecosystem. Thanks! 🙂
    ✅ 1
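    One common preprocessing workaround, sketched here under the assumption that the daily extract can be copied to a fixed path that the single-file Files source is configured to read (all paths below are hypothetical):
    # Sketch: stage yesterday's dated NYSIIS extract at a stable path so a
    # single-file source can be configured once. Paths are hypothetical.
    import shutil
    from datetime import date, timedelta
    from pathlib import Path

    incoming = Path("/data/incoming")             # where dated files land
    staged = Path("/data/staged/nysiis.csv.zip")  # path the Files source reads

    yesterday = (date.today() - timedelta(days=1)).strftime("%Y%m%d")
    src = incoming / f"{yesterday}_nysiis.csv.zip"

    if src.exists():
        shutil.copyfile(src, staged)  # overwrite the staged file each day
    else:
        raise FileNotFoundError(f"expected daily extract missing: {src}")
    The initial load could then be a one-off bulk import, with the staged file handling the ongoing daily incrementals.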
  • M Andriansyah Putra

    12/06/2021, 5:11 AM
    Hello, I have a question about the Airbyte Docker worker: it creates a new container, right? How do I automate adding hosts every time the worker creates a new container? Thanks 😁
    🤔 1
    👀 1
  • Stefan Otte

    12/06/2021, 9:41 AM
    Hey there, TL;DR: what is the best way to run multiple `docker-compose` instances of Airbyte in parallel without sharing any data/config? Locally I have to run multiple Airbyte instances for completely separate projects, and I want to use `docker-compose` for that. No data or configuration should be shared between the instances. The `docker-compose.yaml` contains a few volumes (`workspace`, `data`, `db`) and I can change the mount points, but the `.env` file contains many more paths that would be the same between instances. Is there maybe a global prefix to change all these variables? Do I have to change the variables one by one? Or is there a better way?
    🤔 1
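    One possible approach, sketched under the assumption that each instance gets its own Compose project name and its own `.env` copy (with per-instance ports and paths): Compose prefixes containers, networks, and named volumes with the project name, so instances started this way should not share state. Project names and file paths here are hypothetical:
    # Sketch: run isolated Airbyte instances by giving each docker-compose
    # invocation its own project name and env file.
    import subprocess

    PROJECTS = {
        "client-a": ".env.client-a",  # copies of .env with per-client ports/paths
        "client-b": ".env.client-b",
    }

    for name, env_file in PROJECTS.items():
        subprocess.run(
            [
                "docker-compose",
                "--project-name", name,   # isolates containers/networks/volumes
                "--env-file", env_file,
                "up", "-d",
            ],
            check=True,
        )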
  • Yuriy Markiv

    12/06/2021, 3:45 PM
    Hi team, I'm trying to use both Airbyte Cloud and Airbyte on-premise, and my goal right now is to set up Hubspot as a source. Just to clarify: we already use Hubspot as a source in another ETL tool and also in custom Python scripts, which means our API key is valid. Still, I can't make it work in Airbyte, and I'm wondering: is it ready to use in prod?
    ✅ 1
  • Dave Lyons

    12/06/2021, 5:40 PM
    For SSL connections to PostgreSQL, where do I put the client-key.pem? Do I reference it in `docker-compose.yaml` and build it into the container, or do I need to bash in and copy it somewhere? I can't find any instructions.
    ✅ 1
  • Danny Duncan

    12/06/2021, 6:43 PM
    Hello all, TL;DR - With regard to the MySQL connector: if I use CDC to capture incremental changes, will I have an issue with tables that don't have changes within a certain period of time (e.g. table A doesn't have a change within 7 days, but the binary logs get removed after 7 days)? Put another way, does CDC update the binary log reference when I sync even if the table does not have any updates?

    I'm new to Airbyte and looking to sync MySQL data to Snowflake. I'm actually coming from the Meltano framework, so I'm familiar with some concepts of data integration. One thing that I discovered in the underlying code for that framework is that binary log replication will only update the state when the table has a change. That is to say, if I have a table that updates very infrequently, the binary log of the last update to that table may get removed before a binary log has a new transaction. I thought I was keeping things fresh by updating every day, but I found out the hard way that many of my tables can no longer sync because the referenced binary log is now missing. I'm not sure if this is a requirement of binary log replication OR if this is based on the underlying Singer tap that I was using.

    So before I truly get started with Airbyte, I'd like to understand if CDC incremental updates will refresh the referenced log even if the table doesn't have any changes. Thanks all.
    ✅ 1
  • Boggdan Barrientos

    12/06/2021, 7:26 PM
    Hi! 😄 I'm trying to update Airbyte, but my version is < 0.32, so I'm doing the mandatory intermediate upgrade. When I run `git checkout v0.32.0-alpha` I get: `error: pathspec 'v0.32.0-alpha' did not match any file(s) known to git.`
    👀 1
  • Oscar Gonzalez

    12/06/2021, 9:36 PM
    Hi all, another question: I am getting a timeout in Postgres while reading a big table. Is that an Airbyte parameter I need to change, or is it on the Postgres side?
    ✅ 1
  • Max Krog

    12/07/2021, 9:35 AM
    What's this concept called? Our data stack has a lot of data being loaded in an EL fashion by Airbyte. This causes the source tables to contain a lot of duplicated information, as "incremental replication" is not supported for some sources. For example, a customer record from Intercom could be synced every day, regardless of whether the customer's data has updated or not. To keep the source table from growing at a very steep curve, I have a script running in a dbt hook that deletes all duplicate records from the source table. It goes something like this:
    create or replace table `project-name.airbyte_things._airbyte_raw_things` as (
    
      with base as (
        select
          *,
          row_number() over(partition by sha512(_airbyte_data) order by _airbyte_emitted_at) as hashed_data_rn -- hash the data and find the row_number
        from
          `project-name.airbyte_things._airbyte_raw_things`
      )
      select * except(hashed_data_rn) from base where hashed_data_rn = 1
    )
    Basically it hashes the raw data in the source table, orders the entries with the same hash, and deletes every record except the first for each hash. What is this process of "condensing"/"cleaning" a source table called? 🙂
    👏 1
    ✅ 1
  • Parthasarathy Balasubramanian

    12/07/2021, 2:40 PM
    👋 Hello, team! I am trying to send incremental data for Salesforce contacts to a Kafka destination, but I get the full data (full refresh) in the Kafka pipeline every time I run a sync. Any help on this, please?
    👀 1
  • Maurice

    12/07/2021, 7:50 PM
    Hi all, I'm new here. Is it just me, or does only Airbyte understand what these mean on the signup page:
    • 5 most important sources*
    • 5 most important destinations*
    Is this normal speak? 🙂 How do I find someone to talk to about using Airbyte?
    ✅ 1
  • BERKIN

    12/08/2021, 4:56 AM
    How do I normalize a nested JSON object from MongoDB?
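    If you end up flattening documents outside Airbyte's basic normalization, a minimal pandas sketch of one way to do it (the document shape below is invented for illustration):
    # Sketch: flatten a nested MongoDB-style document with pandas.
    import pandas as pd

    docs = [
        {
            "_id": "61a7...",  # invented sample document
            "name": "Ada",
            "address": {"city": "Istanbul", "zip": "34000"},
            "orders": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}],
        }
    ]

    # Nested objects become dotted columns: address.city, address.zip
    flat = pd.json_normalize(docs)
    print(flat.columns.tolist())

    # Arrays need explicit unnesting: one row per order, parent fields repeated
    orders = pd.json_normalize(docs, record_path="orders", meta=["_id", "name"])
    print(orders)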
  • Divya (Proximity)

    12/08/2021, 5:21 AM
    Hi all, I’m new here. How do I find someone to talk to about using Airbyte? I need to know if the sync frequency can be changed.
    ✅ 1
  • xiaxp

    12/08/2021, 6:47 AM
    Hi all, can we use CRON syntax to trigger a sync?
  • xiaxp

    12/08/2021, 6:48 AM
    Hi all, I'm new here. Can we use CRON syntax to trigger a sync?
    ✅ 1
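    One workaround is calling the Airbyte API from an ordinary crontab entry. A minimal sketch, assuming the API is reachable on localhost:8000 and using a placeholder connection UUID (the /api/v1/connections/sync endpoint is the same one visible in the Airflow thread above):
    # Sketch: trigger an Airbyte sync from cron by calling the server API.
    # The host/port and connection UUID are placeholders.
    # Example crontab entry:
    #   0 2 * * * /usr/bin/python3 /opt/scripts/trigger_sync.py
    import requests

    AIRBYTE_API = "http://localhost:8000/api/v1"
    CONNECTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder

    resp = requests.post(
        f"{AIRBYTE_API}/connections/sync",
        json={"connectionId": CONNECTION_ID},
        timeout=30,
    )
    resp.raise_for_status()
    print("sync job started:", resp.json()["job"]["id"])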
  • Matt Wright

    12/08/2021, 7:39 PM
    I set up my source and destination connectors, and the initial sync/refresh fails. Anyone have any insight?
    logs-2-1.txt
    ✅ 1