# contributing-to-airbyte
  • Boggdan Barrientos

    12/28/2021, 11:45 PM
    Hi all, I want to change my cursor value. I accessed the Airbyte database and found my job in the jobs table, identifying it by id or by scope. The cursor value is set in a JSON document inside the config column; the config JSON looks like the one attached in the thread. I was reading about how to change JSON values using jsonb_set, and I'm writing the following, but nothing gets updated:
    ```sql
    UPDATE jobs SET config = jsonb_set(config, '{"sync":{"state"}}'::jsonb, '{"state": {"cdc": false, "streams": [{"cursor": "2021-10-29T00:00:00Z", "stream_name": "PROMOCIONESGENERADAS", "cursor_field": ["Fecha"], "stream_namespace": "CENTRALIZADOR"}]}', false) WHERE id = 1611;
    ```
    I think I'm writing the path wrong; what's the correct way to do it? This is probably more of a Postgres question, but I was hoping to find some guidance here. Thanks.
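    For what it's worth, the path argument does look like the problem: jsonb_set expects the path as a Postgres text[] (one array element per key), not as a JSON object, so '{"sync":{"state"}}'::jsonb won't match any function signature. A minimal sketch, assuming the state object really sits at config -> 'sync' -> 'state' (the exact nesting depends on the attached config JSON); note the new value is the state object itself, without an extra {"state": ...} wrapper, because the path already ends at that key:
    ```sql
    UPDATE jobs
    SET config = jsonb_set(
            config,
            '{sync,state}',  -- path as text[], one element per key
            '{"cdc": false, "streams": [{"cursor": "2021-10-29T00:00:00Z", "stream_name": "PROMOCIONESGENERADAS", "cursor_field": ["Fecha"], "stream_namespace": "CENTRALIZADOR"}]}'::jsonb,
            false            -- don't create the key if it is missing
        )
    WHERE id = 1611;
    ```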
  • Raj

    12/29/2021, 9:13 AM
    Hi all, we are running Airbyte (0.29.22) on an Ubuntu machine with 16 GB of RAM and a 4-core CPU. We found that Airbyte runs at most 4 jobs even when resources are available, so we changed SUBMITTER_NUM_THREADS to 100 just to check whether that helps. Can you please suggest what the problem might be? How can we increase the number of parallel jobs? After changing SUBMITTER_NUM_THREADS to 100, all the jobs (fewer than 100) change status to running, but only 4 of them are actually running.
  • Kashif Vikaas

    12/30/2021, 12:44 AM
    Where do I start?
  • Javier Llorente Mañas

    12/30/2021, 4:09 PM
    Hello, I am looking into the Airbyte provider for Airflow and I can't find this feature: is it possible to pass the Airflow DAG execution date to an Airbyte source before triggering a job? I think this feature would be highly beneficial for incremental Airbyte connectors; it would reduce pipeline time and help enable a functional data engineering approach.
  • Gopinath Jaganmohan

    01/03/2022, 1:13 AM
    Hello team, I was evaluating Airbyte integration for our product. Setup and the first run were easy and successful. One requirement we have is to push down SQL queries (run custom queries) on SQL-based DBs. Is there a way to do that now?
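    A common workaround worth noting, sketched here under the assumption of a Postgres source (the schema, view, and column names are hypothetical): wrap the custom query in a database view, which Airbyte can then discover and sync like any other table.
    ```sql
    -- Hypothetical view encapsulating a custom query on the source database
    CREATE VIEW analytics.active_customers AS
    SELECT id, name, last_seen
    FROM public.customers
    WHERE last_seen > now() - interval '30 days';
    ```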
  • Johan Wärlander

    01/03/2022, 11:50 AM
    Hey there, quick deployment question: is there any update on the status of ECS support? We're probably looking to migrate to Kubernetes as a long-term strategy, but right now we have a working ECS setup that we're comfortable with, so it would be nice to be able to leverage that -- especially since, with Kubernetes, we're not yet fully familiar with production concerns like security, observability, etc.
  • Mukul Gopinath

    01/04/2022, 9:11 AM
    Hey, I'm trying to connect to Google Search Console using Airbyte and I'm facing an issue with the connection tests; screenshot provided below. I'm also looking for the API documentation; the documentation provided is obscure. Please help.
  • Omri Cohen

    01/05/2022, 6:09 PM
    Hey there, is there any way to use Airbyte's dbt integration WITHOUT a git repo?
  • tharaka prabath

    01/06/2022, 4:34 AM
    How can I secure Airbyte? Does it have a login screen? Right now anyone can access it without permission.
  • Serhii Chvaliuk [GL]

    01/06/2022, 1:38 PM
    Hello, does anyone know how to fix this?
    ```
    user@hrk1-ldl-A02942:~/airbyte$ SUB_BUILD=PLATFORM ./gradlew :airbyte-tests:automaticMigrationAcceptanceTest
    > Task :airbyte-tests:automaticMigrationAcceptanceTest

    MigrationAcceptanceTest > testAutomaticMigration() FAILED
        java.lang.NullPointerException: Cannot invoke "java.lang.Boolean.booleanValue()" because the return value of "io.airbyte.api.client.model.HealthCheckRead.getAvailable()" is null
            at io.airbyte.test.automaticMigrationAcceptance.MigrationAcceptanceTest.healthCheck(MigrationAcceptanceTest.java:327)
            at io.airbyte.test.automaticMigrationAcceptance.MigrationAcceptanceTest.lambda$testAutomaticMigration$0(MigrationAcceptanceTest.java:87)
            at io.airbyte.commons.concurrency.VoidCallable.call(VoidCallable.java:15)
            at io.airbyte.test.automaticMigrationAcceptance.MigrationAcceptanceTest.runAirbyte(MigrationAcceptanceTest.java:138)
            at io.airbyte.test.automaticMigrationAcceptance.MigrationAcceptanceTest.runAirbyte(MigrationAcceptanceTest.java:124)
            at io.airbyte.test.automaticMigrationAcceptance.MigrationAcceptanceTest.testAutomaticMigration(MigrationAcceptanceTest.java:85)

    1 test completed, 1 failed
    ```
  • Raj

    01/06/2022, 7:17 PM
    Hello, we are using Airbyte to sync data from various sources to various destinations. One of our destinations is Snowflake; we had a trial version of Snowflake and it expired today. I observed the following issues: • Even though there is an error, the source Docker container does not exit. • The Docker container is stuck, and I can't even kill it with the docker kill command. • New jobs are not scheduled while one of these containers is stuck. Airbyte version 0.29.22, running with Docker Compose. Please see the screenshot logs here:
  • Sergi van den Berg

    01/07/2022, 3:19 PM
    Hi everyone, I just started using Airbyte to sync data from various sources to my Amazon Redshift warehouse. When setting up the connection with Pipedrive, I keep encountering an error for one specific source stream when I use 'Basic normalization'. This only happens with the 'activities' source stream; if I select Raw data instead of Basic normalization, the sync succeeds. Does anybody recognize the error?
  • Omri Cohen

    01/08/2022, 5:10 PM
    Hi again, two important things that I really need and can't find much about in the documentation/GitHub: 1) Is the minimum sync interval 5 minutes? Is there no way to go below 1 minute? 2) Is there any chance that one connection can have multiple sources? Without those, unfortunately, I won't be able to use Airbyte.
  • Naveen Sai Patnana

    01/10/2022, 8:06 AM
    We triggered 50 jobs in parallel (to stress-test our server) after changing MAX_SYNC_WORKERS to 50. Initially all the jobs were in the running state and a few completed successfully. At some point we hit the system's resource limit and all the jobs went into the pending state. After resources became available again, none of the pending jobs started. https://github.com/airbytehq/airbyte/issues/9375 Could you please help us out here? :-)
  • Álvaro Queiroz

    01/10/2022, 2:09 PM
    Hello, I want to test some changes to a connector. I have built the image for destination-s3 locally with ./gradlew :airbyte-integrations:connectors:destination-s3:build and the image shows up in my docker image list, but when I try to change the connector version to dev in the webapp UI, I get the error in the image below. Can anybody help me change the image Airbyte uses for this connector?
  • Darian

    01/11/2022, 4:26 PM
    Hi team, I'm using incremental syncs (append_dedup) on a destination-postgres. One of my primary keys is of type timestamp. Is it somehow possible to use only the date part of the timestamp as the primary key and ignore the time?
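    The primary-key picker in the UI selects whole fields, so an expression like a date cast can't be chosen there; one option is to deduplicate downstream with a custom transformation instead. A minimal SQL sketch of the idea, with hypothetical table and column names (_airbyte_emitted_at is the column Airbyte adds to record when a row arrived):
    ```sql
    -- Keep one row per (id, calendar day), preferring the most recently synced record
    SELECT DISTINCT ON (id, created_at::date) *
    FROM my_schema.my_table
    ORDER BY id, created_at::date, _airbyte_emitted_at DESC;
    ```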
  • user

    01/12/2022, 12:15 PM
    This message was deleted.
  • Aakash Kumar

    01/12/2022, 12:25 PM
    Hi community, another issue we are facing is with adding a transformation. We created a transformation, and locally it runs fine:
    ```
    dbt run --vars '{"prefix": ""}' --select Products Orders Customers
    ```
    but when executing the same command in Airbyte, the single quotation marks get stripped, and hence it fails.
  • Mukul Gopinath

    01/12/2022, 4:31 PM
    Hey there, I'm trying to connect a Mailchimp source to a Snowflake destination. It has now failed 3 attempts with the following error. Please help.
  • Davin Chia (Airbyte)

    01/13/2022, 4:34 AM
    Hey all, we are seeing build failures across the board, likely from the GitHub outage earlier today. We are monitoring.
  • Thomas

    01/13/2022, 7:43 AM
    Hey all, I'm running Airbyte in a K8s cluster on GCP, but I'm not really happy that I need to use a GCP service account to write the logs to the log bucket; I would rather create a K8s service account and give that access to the bucket. Can someone help me pinpoint where in Airbyte the logs are actually written to the GCS bucket?
  • Ruslan

    01/13/2022, 1:38 PM
    Hey team, cool product! I'm trying to use it to read data from MongoDB (Atlas) and I receive the error
    io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 69.997454900s
    after 100k rows are read. Is there any way to configure the timeouts?
  • Ariyo Kabir

    01/13/2022, 4:31 PM
    Hi everyone, I'm using Airbyte to load data from my Postgres database into the same Postgres database but a different schema. After setting up all the connections, nothing is showing in my destination schema. What could have gone wrong?
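    One way to narrow this down, assuming default settings: Airbyte writes raw tables named _airbyte_raw_<stream> into the destination schema before normalization, so checking what actually landed there tells you whether the sync wrote anything at all. The schema name below is a hypothetical placeholder:
    ```sql
    -- List everything Airbyte created in the destination schema
    SELECT table_name
    FROM information_schema.tables
    WHERE table_schema = 'my_destination_schema';
    ```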
  • Johan Wärlander

    01/13/2022, 5:01 PM
    Hey 👋 I'm testing Airbyte as a potential replacement for AWS DMS and have a question about performance. When syncing one of our larger tables, at 38 million rows (not huge by any means, just on the larger side for our current environment), from Postgres to Snowflake, it takes 3 full hours to finish. This is at least 2x as long as I was expecting (the DMS jobs finish in 1:30 - 2:30 depending on other workloads). What would be the best starting point for tweaking the settings to improve this? This is on an EC2 m5.2xlarge instance (8 cores, 32 GB RAM), but I get similar performance on our rather small Kubernetes test cluster.
  • Khai Quang Nguyen

    01/13/2022, 5:36 PM
    Hi, I'm testing Airbyte right now to connect Postgres to Snowflake. An issue I have seen repeatedly with the incremental-dedupe setting is that Airbyte fails on subsequent syncs with the following error:
    ```
    2022-01-13 04:07:36 source > 2022-01-13 04:07:36 ERROR i.d.r.TableSchemaBuilder(lambda$createValueGenerator$5):269 - Failed to properly convert data value for 'public.investment_listing.updated_at' of type timestamp for row [null, null, null, null, 213, null, null, null, null, null, null, null, null, null, null, 917bb142-db72-4ae4-8a5c-7398db43f527, null]:
    2022-01-13 04:07:36 source > org.apache.kafka.connect.errors.DataException: Invalid value: null used for required field: "updated_at", schema type: STRING
    ```
    This seems like an issue with Postgres DELETEs, since the WAL only stores the primary key in cases like this. A potential solution I have seen in a PR is to change the replica identity to FULL, but that has performance implications for very large tables. Is there any workaround? I would imagine this is a very common use case.
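    For reference, the change discussed above is applied per table on the source database (the table name here is taken from the error log). As noted, it makes Postgres write the entire old row to the WAL on UPDATE/DELETE, which is where the cost on very large tables comes from:
    ```sql
    ALTER TABLE public.investment_listing REPLICA IDENTITY FULL;
    ```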
  • Serhii Chvaliuk [GL]

    01/15/2022, 11:47 AM
    @Daniel Luftspring can you please fix this https://github.com/airbytehq/airbyte/blob/103c224eccc47685cece315fde8ce890108e9144[…]ntegrations/destination/snowflake/SnowflakeDestinationTest.java
  • Jarrod Parkes

    01/17/2022, 10:11 PM
    It has been a while since I've been active on this Slack. Where would be the best place to look for transforming data from a source connector before it is sent to a destination? Can Airbyte do data transformations like this?
  • Jarrod Parkes

    01/17/2022, 10:32 PM
    Airbyte is focused on the EL of ELT. If you need a really featureful tool for transformations, we suggest trying out dbt.
    https://docs.airbyte.com/understanding-airbyte/basic-normalization (looking into custom dbt transformations now)
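    For context, a custom dbt transformation is just a SQL model run against the destination. A hedged sketch that types columns out of the JSON Airbyte lands in its raw tables (_airbyte_raw_<stream> with an _airbyte_data payload column is Airbyte's raw-table convention; the schema, stream, and field names here are hypothetical):
    ```sql
    -- models/orders.sql: flatten the raw JSON into typed columns
    SELECT
        _airbyte_data ->> 'id'                AS order_id,
        (_airbyte_data ->> 'amount')::numeric AS amount,
        _airbyte_emitted_at
    FROM my_schema._airbyte_raw_orders
    ```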
  • Kabilan Ravi

    01/18/2022, 1:32 PM
    Hi team, I am trying to use a CSV as a source and got an error saying one of the columns is expected to be a string: "externalid" - Should be 'string', but found 'integer'. How can I typecast this particular column from int to string? I found a way through "Additional Reader Options", but sample code for this typecast would be appreciated. Thanks in advance.
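    A hedged guess at the sample being asked for: if the CSV source's "Additional Reader Options" are passed straight through to pandas read_csv, then something like {"dtype": {"externalid": "string"}} should force that column to be read as a string. The column name is taken from the error message above; treat the exact option spelling as an assumption to verify against the connector docs.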
  • Raj

    01/18/2022, 5:24 PM
    Hi all, in one of the latest releases a few env variables such as MAX_*_WORKERS were added. I was wondering about the best way to set these variables to use the server as efficiently as possible in a single-server environment. Also, what is the role of SUBMITTER_NUM_THREADS in this? In one of the messages above, someone mentioned they set MAX_SYNC_WORKERS to 50 and all the jobs were scheduled; after some time, all the jobs moved to the pending state and never started again. How can we handle this scenario?